What optimizations were made in the checkpoints that support Chinese? #159
Comments
research4pan
commented
Apr 7, 2023
Thanks for your interest in LMFlow! We extended the finetune dataset with more Chinese samples. That is the main modification; otherwise we made no major changes. Hope that answers your question 😄
faradayin
commented
Apr 7, 2023
Roughly how many samples were added?
shizhediao
commented
Apr 7, 2023
Roughly 70K samples were added, mainly from BELLE's 0.5M dataset. We still need to add more, higher-quality data.
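For illustration only, here is a minimal sketch of what "extending a finetune dataset with extra samples" could look like. It is not LMFlow's actual pipeline: the function name `mix_in_samples`, the record fields, and the toy sizes are all hypothetical, and it assumes the added samples are drawn at random from a larger pool, which this thread does not confirm.

```python
import random


def mix_in_samples(base, extra_pool, n_extra, seed=0):
    """Return `base` plus up to `n_extra` examples drawn from `extra_pool`.

    Hypothetical helper: samples without replacement, then shuffles so the
    added examples are interleaved with the original ones.
    """
    rng = random.Random(seed)
    n_extra = min(n_extra, len(extra_pool))  # never ask for more than exists
    picked = rng.sample(extra_pool, n_extra)
    mixed = list(base) + picked
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for an existing instruction set and a larger Chinese pool.
base = [{"lang": "en", "text": f"english instruction {i}"} for i in range(10)]
pool = [{"lang": "zh", "text": f"chinese instruction {i}"} for i in range(50)]

mixed = mix_in_samples(base, pool, n_extra=7)
zh_share = sum(ex["lang"] == "zh" for ex in mixed) / len(mixed)
print(len(mixed), round(zh_share, 2))
```

Fixing the seed keeps the mix reproducible, which makes it easier to compare checkpoints trained with different amounts of added data.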
shizhediao
commented
Apr 19, 2023
This issue has been marked as stale because it has not had recent activity. If you think this still needs to be addressed, please feel free to reopen this issue. Thanks.
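In other issues asking how well Chinese is supported, the answer was still that it does not work very well, because the training data contains little Chinese.
As I understand it, LLaMA itself has essentially no support for Chinese, so what optimizations did you make? Did you change the vocabulary, or did you only finetune on Chinese instruction data?
A rough description would make this clearer. Thanks!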