Evaluation Parameter Settings for Large Model Training
1. The Parameter Problem
File "/home/wuwenliang/py_workspace/llm_finetune/qwenvl_latext_ocr/train_latex_ocr_sft_yaml_3.py", line 260, in <module>
trainer.train()
File "/data2/llmtuner/lib/python3.10/site-packages/transformers/trainer.py", line 2238, in train
return inner_training_loop(
File "/data2/llmtuner/lib/python3.10/site-packages/transformers/trainer.py", line 2698, in _inner_training_loop
self._maybe_log_save_evaluate(
File "/data2/llmtuner/lib/python3.10/site-packages/transformers/trainer.py", line 3138, in _maybe_log_save_evaluate
is_new_best_metric = self._determine_best_metric(metrics=metrics, trial=trial)
File "/data2/llmtuner/lib/python3.10/site-packages/transformers/trainer.py", line 3208, in _determine_best_metric
raise KeyError(
KeyError("The `metric_for_best_model` training argument is set to 'eval_accuracy', which is not found in the evaluation metrics. The available evaluation metrics are: ['eval_loss', 'eval_runtime', 'eval_samples_per_second', 'eval_steps_per_second', 'epoch']. Consider changing the `metric_for_best_model` via the TrainingArguments.")
17%|█▋ | 62/372 [03:53<19:27, 3.77s/it]
Exception ignored in: <function tqdm.__del__ at 0x7f91837d6f80>
Traceback (most recent call last):
File "/data2/llmtuner/lib/python3.10/site-packages/tqdm/std.py", line 1148, in __del__
File "/data2/llmtuner/lib/python3.10/site-packages/tqdm/std.py", line 1303, in close
File "/data2/llmtuner/lib/python3.10/site-packages/tqdm/std.py", line 1287, in fp_write
File "/data2/llmtuner/lib/python3.10/site-packages/tqdm/utils.py", line 196, in inner
File "/data2/llmtuner/lib/python3.10/site-packages/swanlab/log/log.py", line 93, in write_handler
File "/data2/llmtuner/lib/python3.10/site-packages/swankit/env.py", line 90, in create_time
AttributeError: 'NoneType' object has no attribute 'datetime'
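2. Analyzing the Problem
This error means you set metric_for_best_model="eval_accuracy", but no accuracy metric is computed during evaluation. When no compute_metrics function is passed to the Trainer, the only metrics available are: ['eval_loss', 'eval_runtime', 'eval_samples_per_second', 'eval_steps_per_second', 'epoch'].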
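The failing lookup can be pictured with a minimal sketch. This is a simplified illustration and not the library's actual source; the helper name `resolve_best_metric` is hypothetical, and the metric values are made up. It assumes the behavior the error message describes: the configured name is looked up in the dictionary returned by evaluation, with an `eval_` prefix added if it is missing.

```python
def resolve_best_metric(metrics: dict, metric_for_best_model: str) -> float:
    """Simplified illustration of the lookup (hypothetical helper, not library code)."""
    key = metric_for_best_model
    # A name without the "eval_" prefix is treated as if it had one.
    if not key.startswith("eval_"):
        key = f"eval_{key}"
    if key not in metrics:
        raise KeyError(f"'{key}' not found; available metrics: {list(metrics)}")
    return metrics[key]


# Keys copied from the error message above; the values are invented for the demo.
metrics = {"eval_loss": 0.42, "eval_runtime": 12.3, "epoch": 1.0}

best = resolve_best_metric(metrics, "eval_loss")   # found

try:
    resolve_best_metric(metrics, "eval_accuracy")  # same failure mode as the traceback
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

Under these assumptions, "loss" and "eval_loss" resolve to the same entry, while "eval_accuracy" fails because nothing ever produced that key.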
3. Solutions
Option 1: Use a metric that already exists (recommended)
args = TrainingArguments(
    # ... other arguments ...
    metric_for_best_model="eval_loss",  # use a metric that is actually reported
    greater_is_better=False,            # for loss, lower is better
    load_best_model_at_end=True,
)
Option 2: Add a function that computes accuracy
If you really do need accuracy, pass a metric function to the Trainer:
import numpy as np
from transformers import EvalPrediction, Trainer

def compute_metrics(eval_pred: EvalPrediction):
    predictions, labels = eval_pred
    # Implement the accuracy calculation for your task type.
    # For example, for a classification task:
    predictions = np.argmax(predictions, axis=1)
    accuracy = (predictions == labels).astype(np.float32).mean()
    # The Trainer prefixes this key with "eval_", so it is reported as "eval_accuracy".
    return {"accuracy": float(accuracy)}

# Pass compute_metrics when creating the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,  # add this
    # ... other arguments ...
)
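Option 3: Check the compute_metrics already in your code
Check whether your code already defines a compute_metrics function that is either implemented incorrectly or never passed to the Trainer.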
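One way to run that check is a small diagnostic, sketched below. The helper name `diagnose_metrics` is hypothetical; it assumes `trainer` is an already-constructed `Trainer`, that the function given at construction is stored on its `compute_metrics` attribute, and that `trainer.evaluate()` returns the metrics dictionary. Note that it runs one full evaluation pass.

```python
def diagnose_metrics(trainer) -> list:
    """Print whether compute_metrics is set and which metric keys evaluation returns.

    Hypothetical helper: `trainer` is assumed to be a transformers.Trainer
    (or any object with a `compute_metrics` attribute and an `evaluate()` method).
    """
    if getattr(trainer, "compute_metrics", None) is None:
        print("compute_metrics is not set; only loss and timing metrics will be reported.")
    metrics = trainer.evaluate()  # one full pass over the evaluation dataset
    keys = sorted(metrics)
    print("available metrics:", keys)
    return keys
```

If the printed keys do not include the name configured in metric_for_best_model, the function is either missing, not passed in, or returning a differently named key.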
Recommended change
Based on your error message, the simplest fix is:
args = TrainingArguments(
    # ... other arguments ...
    eval_strategy="steps",
    save_strategy="steps",
    eval_steps=500,   # example value; keep it below the total step count (372 in the run above)
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",  # switch to eval_loss
    greater_is_better=False,            # lower loss is better
    # ... other arguments ...
)
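The settings above have to agree with each other for load_best_model_at_end to work. The sketch below is a plain-Python sanity check, not part of the library; the function name `check_best_model_config` and its parameters are hypothetical. It encodes the constraints as I understand them (evaluation enabled, matching evaluation and save strategies, save_steps a multiple of eval_steps), plus one extra check that is only common sense: the evaluation interval should not exceed the total number of training steps. The exact rules may differ between library versions.

```python
def check_best_model_config(eval_strategy, save_strategy,
                            eval_steps=None, save_steps=None, total_steps=None):
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    if eval_strategy == "no":
        problems.append("evaluation is disabled, so no best model can be selected")
    elif eval_strategy != save_strategy:
        problems.append(
            f"eval_strategy={eval_strategy!r} does not match save_strategy={save_strategy!r}"
        )
    elif eval_strategy == "steps" and eval_steps and save_steps:
        if save_steps % eval_steps != 0:
            problems.append("save_steps should be a multiple of eval_steps")
    # Not a library rule, only a sanity check on the schedule itself.
    if eval_strategy == "steps" and eval_steps and total_steps and eval_steps > total_steps:
        problems.append("eval_steps exceeds the total training steps, so evaluation never runs")
    return problems


# The progress bar in the traceback shows 372 total steps.
print(check_best_model_config("steps", "steps",
                              eval_steps=500, save_steps=500, total_steps=372))
```

For the run in the traceback, an interval of 500 steps would be flagged, which suggests choosing a smaller interval or an epoch-based strategy.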
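If you need the accuracy metric
If your task is a classification task, you can implement it as follows. In that case keep metric_for_best_model="eval_accuracy" and set greater_is_better=True, since higher accuracy is better.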
import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    accuracy = np.mean(predictions == labels)
    return {"accuracy": float(accuracy)}

trainer = Trainer(
    # ... other arguments ...
    compute_metrics=compute_metrics,
)
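The classification version above compares one prediction per example. The script path in the traceback suggests a generative fine-tuning task, where accuracy is usually measured per token instead. The following is a minimal sketch under several assumptions: the predictions arrive as full logits of shape (batch, seq_len, vocab), the labels use -100 for positions to ignore, and the model is a causal language model, so the prediction at position t is compared with the label at position t+1. The name `token_accuracy` is hypothetical.

```python
import numpy as np


def token_accuracy(logits: np.ndarray, labels: np.ndarray, ignore_index: int = -100) -> float:
    """Token-level accuracy for next-token prediction (illustrative sketch)."""
    preds = np.argmax(logits, axis=-1)   # (batch, seq_len)
    # Shift by one: the output at position t predicts the token at position t + 1.
    preds = preds[:, :-1]
    targets = labels[:, 1:]
    mask = targets != ignore_index       # skip padding and prompt tokens
    if mask.sum() == 0:
        return 0.0
    return float((preds[mask] == targets[mask]).mean())


def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": token_accuracy(np.asarray(logits), np.asarray(labels))}
```

Keeping full-vocabulary logits for the whole evaluation set can use a lot of memory; the Trainer's preprocess_logits_for_metrics argument can reduce them to token ids beforehand, in which case the argmax step here would need to be dropped.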
Option 1 is the quickest fix: get training running normally first.