dpo_training.log
2025-08-14 13:25:26,066 - __main__ - INFO - Starting DPO training...
2025-08-14 13:25:26,067 - __main__ - INFO - Configuration file loaded
2025-08-14 13:25:26,067 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 13:25:26,068 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 13:25:26,081 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 13:25:26,081 - model_utils - INFO - Loading model: Qwen/Qwen2.5-4B-Instruct
2025-08-14 13:25:26,544 - __main__ - ERROR - Error during training: Qwen/Qwen2.5-4B-Instruct is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `hf auth login` or by passing `token=<your_token>`
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/utils/_http.py", line 409, in hf_raise_for_status
    response.raise_for_status()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/Qwen/Qwen2.5-4B-Instruct/resolve/main/tokenizer_config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/utils/hub.py", line 479, in cached_files
    hf_hub_download(
    ~~~~~~~~~~~~~~~^
        path_or_repo_id,
        ^^^^^^^^^^^^^^^^
    ...<10 lines>...
        local_files_only=local_files_only,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 1010, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
        # Destination
    ...<14 lines>...
        force_download=force_download,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 1117, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 1658, in _raise_on_head_call_error
    raise head_call_error
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 1546, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
        url=url, proxies=proxies, timeout=etag_timeout, headers=headers, token=token, endpoint=endpoint
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 1463, in get_hf_file_metadata
    r = _request_wrapper(
        method="HEAD",
    ...<5 lines>...
        timeout=timeout,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 286, in _request_wrapper
    response = _request_wrapper(
        method=method,
    ...<2 lines>...
        **params,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/file_download.py", line 310, in _request_wrapper
    hf_raise_for_status(response)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/huggingface_hub/utils/_http.py", line 459, in hf_raise_for_status
    raise _format(RepositoryNotFoundError, message, response) from e
huggingface_hub.errors.RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-689e1c06-10675ea85b09036c30d666b0;c14d0913-b990-4be2-a6a3-17b02f106628)

Repository Not Found for url: https://huggingface.co/Qwen/Qwen2.5-4B-Instruct/resolve/main/tokenizer_config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
Invalid username or password.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 208, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 154, in main
    model, tokenizer = load_model_and_tokenizer(
                       ~~~~~~~~~~~~~~~~~~~~~~~~^
        model_name=model_config['base_model'],
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
        device_map=hardware_config['device_map']
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/model_utils.py", line 48, in load_model_and_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
        model_name,
        trust_remote_code=True,
        padding_side="left"
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/models/auto/tokenization_auto.py", line 1049, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/models/auto/tokenization_auto.py", line 881, in get_tokenizer_config
    resolved_config_file = cached_file(
        pretrained_model_name_or_path,
    ...<12 lines>...
        _commit_hash=commit_hash,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/utils/hub.py", line 321, in cached_file
    file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/utils/hub.py", line 511, in cached_files
    raise OSError(
    ...<4 lines>...
    ) from e
OSError: Qwen/Qwen2.5-4B-Instruct is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `hf auth login` or by passing `token=<your_token>`
2025-08-14 13:26:03,365 - __main__ - INFO - Starting DPO training...
2025-08-14 13:26:03,367 - __main__ - INFO - Configuration file loaded
2025-08-14 13:26:03,367 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 13:26:03,367 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 13:26:03,387 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 13:26:03,387 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 13:26:04,282 - __main__ - ERROR - Error during training: The installed version of bitsandbytes (<0.43.1) requires CUDA, but CUDA is not available. You may need to install PyTorch with CUDA support or upgrade bitsandbytes to >=0.43.1.
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 208, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 154, in main
    model, tokenizer = load_model_and_tokenizer(
                       ~~~~~~~~~~~~~~~~~~~~~~~~^
        model_name=model_config['base_model'],
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
        device_map=hardware_config['device_map']
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/model_utils.py", line 59, in load_model_and_tokenizer
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
    ...<4 lines>...
        attn_implementation="flash_attention_2" if torch.cuda.is_available() else "eager"
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/modeling_utils.py", line 317, in _wrapper
    return func(*args, **kwargs)
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/modeling_utils.py", line 4887, in from_pretrained
    hf_quantizer.validate_environment(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        torch_dtype=torch_dtype,
        ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
        weights_only=weights_only,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 88, in validate_environment
    raise ImportError(
    ...<2 lines>...
    )
ImportError: The installed version of bitsandbytes (<0.43.1) requires CUDA, but CUDA is not available. You may need to install PyTorch with CUDA support or upgrade bitsandbytes to >=0.43.1.
2025-08-14 13:26:25,995 - __main__ - INFO - Starting DPO training...
2025-08-14 13:26:25,999 - __main__ - INFO - Configuration file loaded
2025-08-14 13:26:25,999 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 13:26:26,001 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 13:26:26,041 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 13:26:26,042 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 21:52:53,947 - __main__ - INFO - Starting DPO training...
2025-08-14 21:52:53,950 - __main__ - INFO - Configuration file loaded
2025-08-14 21:52:53,950 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 21:52:53,950 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 21:52:53,962 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 21:52:53,962 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 21:52:57,323 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 21:52:57,323 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 21:52:57,323 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 21:52:57,718 - model_utils - INFO - PEFT config applied
2025-08-14 21:52:57,718 - __main__ - ERROR - Error during training: TrainingArguments.__init__() got an unexpected keyword argument 'evaluation_strategy'
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 208, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 173, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 88, in create_dpo_trainer
    training_args = create_training_arguments(
        output_dir=output_config['output_dir'],
    ...<13 lines>...
        logging_dir=output_config['logging_dir']
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/model_utils.py", line 171, in create_training_arguments
    training_args = TrainingArguments(
        output_dir=output_dir,
    ...<24 lines>...
        ddp_find_unused_parameters=False,
    )
TypeError: TrainingArguments.__init__() got an unexpected keyword argument 'evaluation_strategy'
2025-08-14 21:55:06,763 - __main__ - INFO - Starting DPO training...
2025-08-14 21:55:06,766 - __main__ - INFO - Configuration file loaded
2025-08-14 21:55:06,766 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 21:55:06,766 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 21:55:06,771 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 21:55:06,771 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 21:55:09,339 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 21:55:09,339 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 21:55:09,339 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 21:55:10,095 - model_utils - INFO - PEFT config applied
2025-08-14 21:55:10,096 - model_utils - INFO - Training arguments created
2025-08-14 21:55:10,096 - __main__ - ERROR - Error during training: DPOTrainer.__init__() got an unexpected keyword argument 'beta'
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 208, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 173, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 108, in create_dpo_trainer
    dpo_trainer = DPOTrainer(
        model=model,
    ...<16 lines>...
        data_collator_kwargs={},
    )
TypeError: DPOTrainer.__init__() got an unexpected keyword argument 'beta'
2025-08-14 22:09:14,373 - __main__ - INFO - Starting DPO training...
2025-08-14 22:09:14,376 - __main__ - INFO - Configuration file loaded
2025-08-14 22:09:14,376 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:09:14,376 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:09:14,461 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:09:14,461 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:09:18,012 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:09:18,013 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:09:18,013 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:09:18,523 - model_utils - INFO - PEFT config applied
2025-08-14 22:09:18,523 - __main__ - ERROR - Error during training: cannot access local variable 'dpo_config' where it is not associated with a value
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 207, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 172, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 104, in create_dpo_trainer
    beta=dpo_config['beta'],
         ^^^^^^^^^^
UnboundLocalError: cannot access local variable 'dpo_config' where it is not associated with a value
2025-08-14 22:12:55,447 - __main__ - INFO - Starting DPO training...
2025-08-14 22:12:55,449 - __main__ - INFO - Configuration file loaded
2025-08-14 22:12:55,449 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:12:55,450 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:12:55,467 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:12:55,467 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:12:58,008 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:12:58,008 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:12:58,008 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:12:58,420 - model_utils - INFO - PEFT config applied
2025-08-14 22:12:58,421 - model_utils - INFO - Training arguments created
2025-08-14 22:12:58,421 - __main__ - ERROR - Error during training: DPOTrainer.__init__() got an unexpected keyword argument 'tokenizer'
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 208, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 173, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 112, in create_dpo_trainer
    dpo_trainer = DPOTrainer(
        model=model,
    ...<12 lines>...
        data_collator_kwargs={},
    )
TypeError: DPOTrainer.__init__() got an unexpected keyword argument 'tokenizer'
2025-08-14 22:13:50,779 - __main__ - INFO - Starting DPO training...
2025-08-14 22:13:50,782 - __main__ - INFO - Configuration file loaded
2025-08-14 22:13:50,782 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:13:50,782 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:13:50,786 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:13:50,786 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:13:52,908 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:13:52,909 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:13:52,909 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:13:53,375 - model_utils - INFO - PEFT config applied
2025-08-14 22:13:53,376 - model_utils - INFO - Training arguments created
2025-08-14 22:13:53,967 - __main__ - ERROR - Error during training: 'TrainingArguments' object has no attribute 'padding_value'
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 199, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 164, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 112, in create_dpo_trainer
    dpo_trainer = DPOTrainer(
        model=model,
    ...<3 lines>...
        peft_config=None, # PEFT has already been applied to the model
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/trl/trainer/dpo_trainer.py", line 278, in __init__
    if args.padding_value is not None:
       ^^^^^^^^^^^^^^^^^^
AttributeError: 'TrainingArguments' object has no attribute 'padding_value'
2025-08-14 22:14:25,153 - __main__ - INFO - Starting DPO training...
2025-08-14 22:14:25,154 - __main__ - INFO - Configuration file loaded
2025-08-14 22:14:25,154 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:14:25,155 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:14:25,158 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:14:25,158 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:14:27,629 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:14:27,629 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:14:27,629 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:14:28,115 - model_utils - INFO - PEFT config applied
2025-08-14 22:14:28,116 - model_utils - INFO - Training arguments created
2025-08-14 22:14:28,116 - __main__ - ERROR - Error during training: DPOConfig.__init__() got an unexpected keyword argument 'max_target_length'. Did you mean 'max_prompt_length'?
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 217, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 182, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 115, in create_dpo_trainer
    dpo_training_args = DPOConfig(
        **training_args.to_dict(),
    ...<10 lines>...
        generate_during_eval=False,
    )
TypeError: DPOConfig.__init__() got an unexpected keyword argument 'max_target_length'. Did you mean 'max_prompt_length'?
2025-08-14 22:15:13,861 - __main__ - INFO - Starting DPO training...
2025-08-14 22:15:13,863 - __main__ - INFO - Configuration file loaded
2025-08-14 22:15:13,864 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:15:13,864 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:15:13,868 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:15:13,868 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:15:16,417 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:15:16,418 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:15:16,419 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:15:16,935 - model_utils - INFO - PEFT config applied
2025-08-14 22:15:16,936 - model_utils - INFO - Training arguments created
2025-08-14 22:15:16,937 - __main__ - ERROR - Error during training: DPOConfig.__init__() got an unexpected keyword argument 'ref_model'
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 216, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 181, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 115, in create_dpo_trainer
    dpo_training_args = DPOConfig(
        **training_args.to_dict(),
    ...<9 lines>...
        generate_during_eval=False,
    )
TypeError: DPOConfig.__init__() got an unexpected keyword argument 'ref_model'
2025-08-14 22:15:43,905 - __main__ - INFO - Starting DPO training...
2025-08-14 22:15:43,907 - __main__ - INFO - Configuration file loaded
2025-08-14 22:15:43,907 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:15:43,907 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:15:43,910 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:15:43,910 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:15:46,003 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:15:46,005 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:15:46,005 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:15:46,425 - model_utils - INFO - PEFT config applied
2025-08-14 22:15:46,425 - model_utils - INFO - Training arguments created
2025-08-14 22:15:47,000 - __main__ - INFO - DPO trainer created
2025-08-14 22:15:47,000 - __main__ - INFO - Starting training...
2025-08-14 22:15:47,109 - __main__ - ERROR - Error during training: Can only automatically infer lengths for datasets whose items are dictionaries with an 'input_ids' key.
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 211, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 182, in main
    train_result = dpo_trainer.train()
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 2238, in train
    return inner_training_loop(
        args=args,
    ...<2 lines>...
        ignore_keys_for_eval=ignore_keys_for_eval,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 2288, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/trl/trainer/dpo_trainer.py", line 836, in get_train_dataloader
    return super().get_train_dataloader()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 1060, in get_train_dataloader
    return self._get_dataloader(
           ~~~~~~~~~~~~~~~~~~~~^
        dataset=self.train_dataset,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<3 lines>...
        is_training=True,
        ^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 1029, in _get_dataloader
    dataloader_params["sampler"] = sampler_fn(dataset)
                                   ~~~~~~~~~~^^^^^^^^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 992, in _get_train_sampler
    return LengthGroupedSampler(
        self.args.train_batch_size * self.args.gradient_accumulation_steps,
    ...<2 lines>...
        model_input_name=model_input_name,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer_pt_utils.py", line 641, in __init__
    raise ValueError(
    ...<2 lines>...
    )
ValueError: Can only automatically infer lengths for datasets whose items are dictionaries with an 'input_ids' key.
2025-08-14 22:17:33,008 - __main__ - INFO - Starting DPO training...
2025-08-14 22:17:33,010 - __main__ - INFO - Configuration file loaded
2025-08-14 22:17:33,010 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:17:33,010 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:17:33,025 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:17:33,025 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:17:35,332 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:17:35,333 - __main__ - INFO - Formatting dataset...
2025-08-14 22:17:35,485 - __main__ - INFO - Dataset formatting complete
2025-08-14 22:17:35,485 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:17:35,485 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:17:35,985 - model_utils - INFO - PEFT config applied
2025-08-14 22:17:35,986 - model_utils - INFO - Training arguments created
2025-08-14 22:17:36,351 - __main__ - ERROR - Error during training: Column to remove ['rejected', 'chosen'] not in the dataset. Current columns in the dataset: ['input_ids', 'attention_mask', 'chosen_input_ids', 'chosen_attention_mask', 'rejected_input_ids', 'rejected_attention_mask']
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 217, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 182, in main
    dpo_trainer = create_dpo_trainer(
        model, tokenizer, train_dataset, eval_dataset, config
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 124, in create_dpo_trainer
    dpo_trainer = DPOTrainer(
        model=model,
    ...<3 lines>...
        peft_config=None, # PEFT has already been applied to the model
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/trl/trainer/dpo_trainer.py", line 443, in __init__
    train_dataset = self._prepare_dataset(train_dataset, processing_class, args, "train")
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/trl/trainer/dpo_trainer.py", line 644, in _prepare_dataset
    dataset = dataset.map(
        self.tokenize_row if not self.is_vision_model else self.process_row,
    ...<8 lines>...
        **map_kwargs,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/datasets/arrow_dataset.py", line 560, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
                                           ~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/datasets/arrow_dataset.py", line 3086, in map
    raise ValueError(
        f"Column to remove {list(missing_columns)} not in the dataset. Current columns in the dataset: {self._data.column_names}"
    )
ValueError: Column to remove ['rejected', 'chosen'] not in the dataset. Current columns in the dataset: ['input_ids', 'attention_mask', 'chosen_input_ids', 'chosen_attention_mask', 'rejected_input_ids', 'rejected_attention_mask']
2025-08-14 22:18:26,761 - __main__ - INFO - Starting DPO training...
2025-08-14 22:18:26,764 - __main__ - INFO - Configuration file loaded
2025-08-14 22:18:26,764 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:18:26,764 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:18:26,776 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:18:26,776 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:18:28,887 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:18:28,887 - __main__ - INFO - Formatting dataset...
2025-08-14 22:18:29,072 - __main__ - INFO - Dataset formatting complete
2025-08-14 22:18:29,072 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:18:29,072 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:18:29,479 - model_utils - INFO - PEFT config applied
2025-08-14 22:18:29,480 - model_utils - INFO - Training arguments created
2025-08-14 22:18:30,071 - __main__ - INFO - DPO trainer created
2025-08-14 22:18:30,071 - __main__ - INFO - Starting training...
2025-08-14 22:18:30,219 - __main__ - ERROR - Error during training: '<=' not supported between instances of 'float' and 'str'
Traceback (most recent call last):
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 217, in <module>
    main()
    ~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/dpo_train.py", line 188, in main
    train_result = dpo_trainer.train()
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 2238, in train
    return inner_training_loop(
        args=args,
    ...<2 lines>...
        ignore_keys_for_eval=ignore_keys_for_eval,
    )
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 2345, in _inner_training_loop
    self.create_optimizer_and_scheduler(num_training_steps=max_steps)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 1178, in create_optimizer_and_scheduler
    self.create_optimizer()
    ~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/transformers/trainer.py", line 1244, in create_optimizer
    self.optimizer = optimizer_cls(optimizer_grouped_parameters, **optimizer_kwargs)
                     ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/torch/optim/adamw.py", line 37, in __init__
    super().__init__(
    ~~~~~~~~~~~~~~~~^
        params,
        ^^^^^^^
    ...<10 lines>...
        decoupled_weight_decay=True,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/evelyndu/Desktop/To-do/RA/Wangwenwen-LLM Tuning/venv/lib/python3.13/site-packages/torch/optim/adam.py", line 58, in __init__
    if not 0.0 <= lr:
           ^^^^^^^^^
TypeError: '<=' not supported between instances of 'float' and 'str'
2025-08-14 22:21:53,173 - __main__ - INFO - Starting DPO training...
2025-08-14 22:21:53,176 - __main__ - INFO - Configuration file loaded
2025-08-14 22:21:53,176 - __main__ - WARNING - CUDA is not available; training will run on CPU (this will be very slow)
2025-08-14 22:21:53,177 - data_utils - INFO - Successfully loaded 5 DPO examples
2025-08-14 22:21:53,190 - __main__ - INFO - Dataset split complete: 4 training examples, 1 validation example
2025-08-14 22:21:53,191 - model_utils - INFO - Loading model: Qwen/Qwen2.5-0.5B-Instruct
2025-08-14 22:21:55,202 - model_utils - INFO - Model and tokenizer loaded
2025-08-14 22:21:55,202 - __main__ - INFO - Formatting dataset...
2025-08-14 22:21:55,353 - __main__ - INFO - Dataset formatting complete
2025-08-14 22:21:55,353 - model_utils - INFO - PEFT config created: r=16, alpha=32, dropout=0.1
2025-08-14 22:21:55,353 - model_utils - INFO - Applying PEFT config to model...
2025-08-14 22:21:55,688 - model_utils - INFO - PEFT config applied
2025-08-14 22:21:55,689 - model_utils - INFO - Training arguments created
2025-08-14 22:21:56,191 - __main__ - INFO - DPO trainer created
2025-08-14 22:21:56,191 - __main__ - INFO - Starting training...
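
Note on the 22:18:30 failure: the TypeError raised at `if not 0.0 <= lr` in torch/optim/adam.py means the learning rate reached the optimizer as a string. The log does not show where that string came from. One common cause, offered here only as an assumption, is a YAML config that writes the value as `5e-5`: PyYAML's default resolver needs a decimal point in the mantissa to treat exponent notation as a float, so `5e-5` loads as a str while `5.0e-5` loads as a float. The sketch below coerces such values before they are passed on. The helper name `coerce_numeric` and the key names are hypothetical and do not come from this repository.

```python
from typing import Any, Iterable, Mapping

# Hypothetical config keys that are expected to hold numbers.
NUMERIC_KEYS = ("learning_rate", "weight_decay", "warmup_ratio")


def coerce_numeric(
    config: Mapping[str, Any], keys: Iterable[str] = NUMERIC_KEYS
) -> dict:
    """Return a copy of config with string values under keys cast to float.

    float() raises ValueError for non-numeric text, so a bad value fails
    here with a clear message instead of deep inside the optimizer.
    """
    fixed = dict(config)
    for key in keys:
        value = fixed.get(key)
        if isinstance(value, str):
            fixed[key] = float(value)
    return fixed


# Example: a learning rate that arrived as the string "5e-5".
raw = {"learning_rate": "5e-5", "num_train_epochs": 3}
clean = coerce_numeric(raw)
```

Writing the value as `5.0e-5` in the YAML file avoids the issue at the source; the coercion step is a guard for configs that cannot be changed.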