本文為魔搭社群輕量級訓練推理工具SWIFT微調實戰教程系列
SWIFT(Scalable lightWeight Infrastructure for Fine-Tuning)是 魔搭ModelScope開源社群推出的一套完整的輕量級訓練推理工具 , 基於PyTorch的輕量級、開箱即用的模型微調、推理框架,讓AI愛好者用自己的消費級顯卡就能玩轉大模型和AIGC。
魔搭官方,公眾號:魔搭ModelScope社群
SWIFT支持了開源模型,尤其是中小型模型(7B、14B等)對Agent場景的訓練,並將loss-scale技術( https://arxiv.org/pdf/2309.00986.pdf)套用到agent訓練中 ,使中小模型API Call能力更穩定,並支持使用單張商業級顯卡進行Agent推理和部署,可以直接在生產場景中全鏈路閉環落地使用。
接下來進入手把手Agent微調實操:
環境安裝
# 設定pip全域映像 (加速下載)
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
# 安裝ms-swift
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .[llm]
# 環境對齊 (通常不需要執行. 如果你執行錯誤, 可以跑下面的程式碼, 倉庫使用最新環境測試)
pip install -r requirements/framework.txt -U
pip install -r requirements/llm.txt -U
數據準備
為訓練Agent能力,魔搭官方提供了兩個開源數據集:
魔搭通用問答知識數據集(包含38萬條通用知識多輪對話數據)
連結: https://www.modelscope.cn/datasets/iic/ms_bench/summary
魔搭通用Agent訓練數據集(包含3萬條Agent格式的API呼叫數據)
連結:https://www.modelscope.cn/datasets/iic/ms_agent/summary
魔搭通用問答 數據集數據格式如下:
{
"id": "MS_Agent_Bench_126374",
"conversations": [{
"from": "system",
"value": "Answer the following questions as best you can. You have access to the following APIs:\n1. hm_recipe_recommend: Call this tool to interact with the hmreciperecommend API. What is the hmreciperecommend API useful for? . Parameters: [{\"name\": \"keywords_dict\", \"description\": \"盒馬推薦菜譜關鍵詞字典。\", \"required\": \"True\"}]\n\n2. hm_product_marketing: Call this tool to interact with the hmproductmarketing API. What is the hmproductmarketing API useful for? . Parameters: [{\"name\": \"sku_code_list\", \"description\": \"盒馬商品的編碼列表\", \"required\": \"True\"}]\n\n3. hm_product_info: Call this tool to interact with the hmproductinfo API. What is the hmproductinfo API useful for? . Parameters: [{\"name\": \"sku_code_list\", \"description\": \"盒馬商品的編碼列表\", \"required\": \"True\"}, {\"name\": \"sku_name_list\", \"description\": \"盒馬商品的名稱列表\", \"required\": \"True\"}, {\"name\": \"property_list\", \"description\": \"盒馬商品的內容列表\", \"required\": \"True\"}]\n\n4. hm_product_recommend: Call this tool to interact with the hmproductrecommend API. What is the hmproductrecommend API useful for? . Parameters: [{\"name\": \"keywords_dict\", \"description\": \"盒馬推薦商品關鍵詞字典。\", \"required\": \"True\"}]\n\nUse the following format:\n\nThought: you should always think about what to do\nAction: the action to take, should be one of the above tools[hm_recipe_recommend, hm_product_marketing, hm_product_info, hm_product_recommend]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can be repeated zero or more times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\nBegin!"
}, {
"from": "user",
"value": "你好,請問你們有哪些手機可以推薦?"
}, {
"from": "assistant",
"value": "Action: hm_product_recommend\nAction Input: {'keywords_dict': {'商品類目': '手機'}}\nObservation: [{'sku_code': '10518244314', 'product_name': '小米11 Ultra 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '5999', 'product_description': '擁有120Hz高重新整理率和驍龍888處理器,還配備了高品質相機系統'}, {'sku_code': '10222726664', 'product_name': 'vivo X60 Pro 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '4498', 'product_description': '搭載Exynos 1080處理器與ZEISS聯合調校的相機系統,拍照實力十足'}, {'sku_code': '18227073452', 'product_name': '華為暢享 20 Pro 5G手機 128GB', 'bussiness_type': '標品', 'product_price': '2699', 'product_description': '擁有4000mAh的電池容量和優秀的相機而價格非常親民'}]\nThought: I now know the final answer\nFinal Answer: 您好,我們有以下幾款手機供您選擇:1.小米11 Ultra 5G手機 256GB:價格為5999元,擁有120Hz高重新整理率和驍龍888處理器,還配備了高品質相機系統;2. vivo X60 Pro 5G手機 256GB:售價為4498元,搭載Exynos 1080處理器與ZEISS聯合調校的相機系統,拍照實力十足;3. 華為暢享 20 Pro 5G手機 128GB:價格親民,只需2699元即可擁有優秀的相機和4000mAh的電池容量。這些手機都具有不同的特點和功能,您可以根據自己的需求進行選擇。如果您有其他問題或需求,請隨時告訴我。"
}, {
"from": "user",
"value": "我想買一款拍照好的手機,有推薦嗎?"
}, {
"from": "assistant",
"value": "Action: hm_product_recommend\nAction Input: {'keywords_dict': {'商品類目': '手機', '適用場景': '拍照'}}\nObservation: [{'sku_code': '15956662254', 'product_name': '小米11 Ultra 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '5999', 'product_description': '擁有120Hz高重新整理率和驍龍888處理器,還配備了高品質相機系統'}, {'sku_code': '19424216918', 'product_name': 'vivo X60 Pro 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '4498', 'product_description': '搭載Exynos 1080處理器與ZEISS聯合調校的相機系統,拍照實力十足'}, {'sku_code': '14347982715', 'product_name': 'OPPO Ace2 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '3999', 'product_description': '拍照實力十足,支持65W超級閃充'}]\nThought: I now know the final answer\nFinal Answer: 如果您想要一款拍照表現出色的手機,我為您推薦以下幾款產品:1.小米11 Ultra 5G手機 256GB:價格為5999元,配備了高品質相機系統,可以拍攝出色的照片;2. vivo X60 Pro 5G手機 256GB:售價為4498元,相機系統由ZEISS聯合調校,拍照實力十足;3. OPPO Ace2 5G手機 256GB:價格為3999元,不僅拍照實力十足,還支持65W超級閃充,讓你的手機充電更加便捷。這些手機都有不同的特點和功能,您可以根據自己的需求進行選擇。如果您還有其他問題或需求,請隨時告訴我。"
}, {
"from": "user",
"value": "小米11 Ultra拍照效能怎麽樣?"
}, {
"from": "assistant",
"value": "Action: hm_product_info\nAction Input: {'sku_code_list': ['15956662254'], 'sku_name_list': ['小米11 Ultra 5G手機 256GB'], 'property_list': [['拍照效能']]}\nObservation: [{'sku_code': '15956662254', 'property': {'拍照效能': '小米11 Ultra 5G手機配備了1億像素的主網路攝影機,還有120X超長焦電影網路攝影機,以及20MP超廣角網路攝影機等組合,拍照效能非常出色。'}}]\nThought: I now know the final answer\nFinal Answer: 小米11 Ultra 5G手機配備了1億像素的主網路攝影機,還有120X超長焦電影網路攝影機,以及20MP超廣角網路攝影機等組合,拍照效能非常出色。如果您還有其他問題或需求,請隨時告訴我。"
}]
}
Agent數據集的system欄位具體格式如下(將\"字元轉換為"字元, \n轉換為換行):
Answer the following questions as best you can. You have access to the following APIs:
1. hm_recipe_recommend: Call this tool to interact with the hmreciperecommend API. What is the hmreciperecommend API useful for? . Parameters: [{"name": "keywords_dict", "description": "盒馬推薦菜譜關鍵詞字典。", "required": "True"}]
2. hm_product_marketing: Call this tool to interact with the hmproductmarketing API. What is the hmproductmarketing API useful for? . Parameters: [{"name": "sku_code_list", "description": "盒馬商品的編碼列表", "required": "True"}]
3. hm_product_info: Call this tool to interact with the hmproductinfo API. What is the hmproductinfo API useful for? . Parameters: [{"name": "sku_code_list", "description": "盒馬商品的編碼列表", "required": "True"}, {"name": "sku_name_list", "description": "盒馬商品的名稱列表", "required": "True"}, {"name": "property_list", "description": "盒馬商品的內容列表", "required": "True"}]
4. hm_product_recommend: Call this tool to interact with the hmproductrecommend API. What is the hmproductrecommend API useful for? . Parameters: [{"name": "keywords_dict", "description": "盒馬推薦商品關鍵詞字典。", "required": "True"}]
Use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of the above tools[hm_recipe_recommend, hm_product_marketing, hm_product_info, hm_product_recommend]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
API格式:
Answer the following questions as best you can. You have access to the following APIs:
序號: API名稱: API作用 API參數
...
Use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of the above tools[API名稱列表]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Agent數據集呼叫API的response的結構如下:
Agent數據集呼叫API的response的結構如下:
Action: hm_product_recommend
Action Input: {'keywords_dict': {'商品類目': '手機', '適用場景': '拍照'}}
Observation: [{'sku_code': '15956662254', 'product_name': '小米11 Ultra 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '5999', 'product_description': '擁有120Hz高重新整理率和驍龍888處理器,還配備了高品質相機系統'}, {'sku_code': '19424216918', 'product_name': 'vivo X60 Pro 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '4498', 'product_description': '搭載Exynos 1080處理器與ZEISS聯合調校的相機系統,拍照實力十足'}, {'sku_code': '14347982715', 'product_name': 'OPPO Ace2 5G手機 256GB', 'bussiness_type': '標品', 'product_price': '3999', 'product_description': '拍照實力十足,支持65W超級閃充'}]
Thought: I now know the final answer
Final Answer: 如果您想要一款拍照表現出色的手機,我為您推薦以下幾款產品:1.小米11 Ultra 5G手機 256GB:價格為5999元,配備了高品質相機系統,可以拍攝出色的照片;2. vivo X60 Pro 5G手機 256GB:售價為4498元,相機系統由ZEISS聯合調校,拍照實力十足;3. OPPO Ace2 5G手機 256GB:價格為3999元,不僅拍照實力十足,還支持65W超級閃充,讓你的手機充電更加便捷。這些手機都有不同的特點和功能,您可以根據自己的需求進行選擇。如果您還有其他問題或需求,請隨時告訴我。
Action:實際呼叫的API名稱
Action Input: 實際的輸入參數
Observation: 該部份是實際呼叫結果,訓練時不參與loss,推理時需要外部呼叫後填入模型
Thought: 模型思考輸出
Final Answer: 模型的最終回答
微調
在Agent訓練中,為了避免訓練後造成嚴重知識遺忘,我們的數據配比為
ms-agent
:
ms-
bench
數據集1比2,其中ms_agent共30000條,隨機抽樣ms_bench數據集60000條,同時為了改變模型認知,增加自我認知數據3000條。
數據集 | 條數 |
---|---|
ms-agent | 30000(全數據集) |
ms-bench | 60000(抽樣) |
self-recognition | 3000(重復抽樣) |
我們也支持使用自己的Agent數據集。數據集格式需要符合 自訂數據集 的要求。更具體地,Agent的response/system應該符合上述的Action/Action Input/Observation格式。
我們將
MLP
和
Embedder
加入了lora_target_modules. 你可以透過指定
--lora_target_modules ALL
在所有的linear層(包括qkvo以及mlp和embedder)加lora. 這
通常是效果最好的。
微調使用了qwen-7b-chat模型,超參數如下:
超參數 | 值 |
---|---|
LR | 5e-5 |
Epoch | 2 |
lora_rank | 8 |
lora_alpha | 32 |
lora_target_modules | ALL |
batch_size | 2 |
gradient_accumulation_steps | 32 total |
執行命令和其他超參數如下:
# Experimental environment: A100
nproc_per_node=8
PYTHONPATH=../../.. \
torchrun \
--nproc_per_node=$nproc_per_node \
--master_port 29500 \
llm_sft.py \
--model_id_or_path qwen/Qwen-7B-Chat \
--model_revision master \
--sft_type lora \
--tuner_backend swift \
--dtype AUTO \
--output_dir output \
--dataset ms-agent \
--train_dataset_mix_ratio 2.0 \
--train_dataset_sample -1 \
--num_train_epochs 2 \
--max_length 2048 \
--check_dataset_strategy warning \
--lora_rank 8 \
--lora_alpha 32 \
--lora_dropout_p 0.05 \
--lora_target_modules ALL \
--self_cognition_sample 3000 \
--model_name 卡卡羅特 \
--model_author 陶白白 \
--gradient_checkpointing true \
--batch_size 2 \
--weight_decay 0.01 \
--learning_rate 5e-5 \
--gradient_accumulation_steps $(expr 32 / $nproc_per_node) \
--max_grad_norm 0.5 \
--warmup_ratio 0.03 \
--eval_steps 100 \
--save_steps 100 \
--save_total_limit 2 \
--logging_steps 10
訓練過程使用了8*A100硬體環境,訓練時長3小時。該訓練使用單卡也可以執行,使用者可以將DDP改為單卡命令即可。
推理
我們針對通用知識和Agent進行評測。下面列出了一個簡單的評測結果。
原始模型
通用知識
西湖醋魚怎麽做
新冠和普通感冒有什麽區別
Agent能力
我們使用一個火焰報警場景作為測試用例:
Answer the following questions as best you can. You have access to the following APIs:
1. fire_recognition: Call this tool to interact with the fire recognition API. This API is used to recognize whether there is fire in the image. Parameters: [{"name": "image", "description": "The input image to recognize fire", "required": "True"}]
2. fire_alert: Call this tool to interact with the fire alert API. This API will start an alert to warn the building's administraters. Parameters: []
3. call_police: Call this tool to interact with the police calling API. This API will call 110 to catch the thief. Parameters: []
4. call_fireman: Call this tool to interact with the fireman calling API. This API will call 119 to extinguish the fire. Parameters: []
Use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of the above tools[fire_recognition, fire_alert, call_police, call_fireman]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
可以看到,人工輸入Observation後模型答案並不正確。
訓練後
通用知識
西湖醋魚怎麽做
新冠和普通感冒有什麽區別
Agent能力
可以看到,訓練後模型可以正確呼叫API並給出最終答案。
自我認知
在命令列中使用Agent
目前命令列的Agent推理支持需要指定
--eval_human true
,因為該參數為false的時候會讀取數據集內容,此時無法手動傳入
Observation:
後面的API呼叫結果。也可用--ckpt_dir指定訓練後輸出
swift infer --model_type chatglm3-6b-32k --eval_human true --stop_words Observation: --infer_backend pt
執行命令後,改變system欄位:
# 單行system
<<< reset-system
<<< Answer the following questions as best you can. You have access to the following APIs:\n1. fire_recognition: Call this tool to interact with the fire recognition API. This API is used to recognize whether there is fire in the image. Parameters: [{"name": "image", "description": "The input image to recognize fire", "required": "True"}]\n\n2. fire_alert: Call this tool to interact with the fire alert API. This API will start an alert to warn the building's administraters. Parameters: []\n\n3. call_police: Call this tool to interact with the police calling API. This API will call 110 to catch the thief. Parameters: []\n\n4. call_fireman: Call this tool to interact with the fireman calling API. This API will call 119 to extinguish the fire. Parameters: []\n\nUse the following format:\n\nThought: you should always think about what to do\nAction: the action to take, should be one of the above tools[fire_recognition, fire_alert, call_police, call_fireman]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can be repeated zero or more times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\nBegin!
如果需要以多行方式輸入,可以用下面的命令(多行資訊以#號結束):
# 多行system
<<< multi-line#
<<<[M] reset-system#
<<<[MS] Answer the following questions as best you can. You have access to the following APIs:
1. fire_recognition: Call this tool to interact with the fire recognition API. This API is used to recognize whether there is fire in the image. Parameters: [{"name": "image", "description": "The input image to recognize fire", "required": "True"}]
2. fire_alert: Call this tool to interact with the fire alert API. This API will start an alert to warn the building's administraters. Parameters: []
3. call_police: Call this tool to interact with the police calling API. This API will call 110 to catch the thief. Parameters: []
4. call_fireman: Call this tool to interact with the fireman calling API. This API will call 119 to extinguish the fire. Parameters: []
Use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of the above tools[fire_recognition, fire_alert, call_police, call_fireman]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!#
下面就可以進行Agent問答:
<<< 輸入圖片是/tmp/1.jpg,協助判斷圖片中是否存在著火點
Thought: I need to use the fire\_recognition API to analyze the input image and determine if there are any signs of fire.
Action: Use the fire\_recognition API to analyze the input image.
Action Input: /tmp/1.jpg
Observation:
<<< [{'coordinate': [101.1, 200.9], 'on_fire': True}]
Thought: The fire\_recognition API has returned a result indicating that there is fire in the input image.
Final Answer: There is fire in the input image.
可以看到,模型已經返回了API呼叫的結果分析。使用者可以繼續問問題進行多輪Agent場景。也可以指定
--infer_backend vllm
和
--stream true
來使用vllm和流式推理。
在部署中使用Agent
由於部署不支持history管理,因此agent的API呼叫結果拼接需要使用者自行進行,下面給出一個OpenAI格式可執行的程式碼範例。
伺服端:
swift deploy --model_type chatglm3-6b-32k --stop_words Observation
客戶端:
from openai import OpenAI
client = OpenAI(
api_key='EMPTY',
base_url='http://localhost:8000/v1',
)
model_type = client.models.list().data[0].id
print(f'model_type: {model_type}')
system = """Answer the following questions as best you can. You have access to the following APIs:
1. fire_recognition: Call this tool to interact with the fire recognition API. This API is used to recognize whether there is fire in the image. Parameters: [{\"name\": \"image\", \"description\": \"The input image to recognize fire\", \"required\": \"True\"}]
2. fire_alert: Call this tool to interact with the fire alert API. This API will start an alert to warn the building's administraters. Parameters: []
3. call_police: Call this tool to interact with the police calling API. This API will call 110 to catch the thief. Parameters: []
4. call_fireman: Call this tool to interact with the fireman calling API. This API will call 119 to extinguish the fire. Parameters: []
Use the following format:
Thought: you should always think about what to do
Action: the action to take, should be one of the above tools[fire_recognition, fire_alert, call_police, call_fireman]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!"""
messages = [{
'role': 'system',
'content': system
}, {
'role': 'user',
'content': '輸入圖片是/tmp/1.jpg,協助判斷圖片中是否存在著火點'
}]
resp = client.chat.completions.create(
model=model_type,
messages=messages,
stop=['Observation:'],
seed=42)
response = resp.choices[0].message.content
print(f'response: {response}')
# # 流式
messages.append({'role': 'assistant', 'content': response + "\n[{'coordinate': [101.1, 200.9], 'on_fire': True}]"})
print(messages)
stream_resp = client.chat.completions.create(
model=model_type,
messages=messages,
stop=['Observation:'],
stream=True,
seed=42)
print('response: ', end='')
for chunk in stream_resp:
print(chunk.choices[0].delta.content, end='', flush=True)
print()
## Output:
# model_type: chatglm3-6b-32k
# response: Thought: I need to check if there is fire in the image
# Action: Use fire\_recognition API
# Action Input: /tmp/2.jpg
# Observation:
# [{'role': 'system', 'content': 'Answer the following questions as best you can. You have access to the following APIs:\n1. fire_recognition: Call this tool to interact with the fire recognition API. This API is used to recognize whether there is fire in the image. Parameters: [{"name": "image", "description": "The input image to recognize fire", "required": "True"}]\n\n2. fire_alert: Call this tool to interact with the fire alert API. This API will start an alert to warn the building\'s administraters. Parameters: []\n\n3. call_police: Call this tool to interact with the police calling API. This API will call 110 to catch the thief. Parameters: []\n\n4. call_fireman: Call this tool to interact with the fireman calling API. This API will call 119 to extinguish the fire. Parameters: []\n\nUse the following format:\n\nThought: you should always think about what to do\nAction: the action to take, should be one of the above tools[fire_recognition, fire_alert, call_police, call_fireman]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can be repeated zero or more times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\nBegin!'}, {'role': 'user', 'content': '輸入圖片是/tmp/2.jpg,協助判斷圖片中是否存在著火點'}, {'role': 'assistant', 'content': "Thought: I need to check if there is fire in the image\nAction: Use fire\\_recognition API\nAction Input: /tmp/2.jpg\nObservation:\n[{'coordinate': [101.1, 200.9], 'on_fire': True}]"}]
# response:
# Final Answer: There is fire in the image at coordinates [101.1, 200.9]
總結
透過SWIFT支持的Agent訓練能力,我們使用ms-agent和ms-bench對qwen-7b-chat模型進行了微調。可以看到微調後模型保留了通用知識問答能力,並在system欄位增加了API的情況下可以正確呼叫並完成任務。需要註意的是:
訓練從LoRA變為全參數訓練,知識遺忘問題會更加嚴重,數據集混合比例需要實際測試調整
部份模型可能在訓練後仍然呼叫效果不佳,可以測試該模型本身預訓練能力是否紮實
本文為SWIFT LLM&AIGC微調場景化最佳實踐系列之一,後續將繼續透過魔搭社群推出場景化教程。目前 SWIFT已支持182個大模型,71個數據集,支持LoRA、Q LoRA、 LongLoRA等 十余種tuners,一行程式碼即可開啟模型訓練,歡迎對大模型和AIGC微調部署感興趣的開發者小夥伴們多多交流!
Github:
https://github.com/modelscope/swift
官方交流群:
點選 閱讀原文 ,直達SWIFT開源連結,歡迎star~
👇點選關註ModelScope公眾號獲取
更多技術資訊~