
China's first ChatGPT-like model: Fudan University's MOSS officially open-sourced today, runs on an RTX 3090 graphics card


Thanks to CTOnews.com reader Colorful M for the tip! CTOnews.com, April 21 -- the new version of the MOSS model developed by the Natural Language Processing Laboratory at Fudan University officially launched today, becoming China's first plugin-augmented open-source dialogue language model.

The MOSS model is now online and open source, with the relevant code, data, and model parameters published on GitHub, Hugging Face, and other platforms for researchers to download.

According to the release, MOSS is an open-source dialogue language model that supports both Chinese and English and a variety of plugins. The moss-moon series models have 16 billion parameters and can run on a single A100/A800 or two RTX 3090 graphics cards at FP16 precision, and on a single RTX 3090 at INT4/INT8 precision. The MOSS base language model was pre-trained on roughly 700 billion Chinese, English, and code tokens, then underwent dialogue instruction fine-tuning, plugin-augmented learning, and human preference training, giving it multi-turn dialogue ability and the ability to use multiple plugins.
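Concretely, loading such a checkpoint at FP16 with Hugging Face transformers might look like the sketch below. This is a minimal illustration, assuming the checkpoint id fnlp/moss-moon-003-sft and the library's standard loading path; the lab's repository may ship its own inference scripts.

```python
# Minimal sketch: loading a moss-moon model at FP16 with Hugging Face
# transformers. The model id "fnlp/moss-moon-003-sft" is an assumption
# based on the article's Hugging Face release; adjust as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fnlp/moss-moon-003-sft"  # assumed Hugging Face id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # FP16: fits on one A100/A800 or two 3090s
    trust_remote_code=True,      # the release ships custom model code
    device_map="auto",           # spread layers across available GPUs
)

prompt = "你好，请介绍一下你自己。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At INT4/INT8 precision, as the article notes, the same model reportedly fits on a single 3090; how quantized weights are loaded depends on the released code.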

MOSS comes from Professor Qiu Xipeng's team at the Natural Language Processing Laboratory of Fudan University and shares its name with the AI in the film The Wandering Earth. It has been made available on a public platform (https://moss.fastnlp.top/), inviting the public to take part in internal testing.

CTOnews.com reviewed the MOSS GitHub page and found that the project's code is licensed under Apache 2.0, its data under CC BY-NC 4.0, and its model weights under GNU AGPL 3.0. To use the project's models commercially or deploy them publicly, you must sign the authorization document and send it to robot@fudan.edu.cn. Commercial use is registered for record-keeping only; no fee is charged.

MOSS use cases:

▲ Solving equations

▲ Generating images

▲ Harmlessness test

Models

moss-moon-003-base: the MOSS-003 base language model, obtained by self-supervised pre-training on a high-quality Chinese and English corpus. The pre-training corpus contains about 700B tokens, and the training compute was about 6.67×10²² floating-point operations.
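The quoted figures are mutually consistent: under the common rule of thumb of roughly 6 floating-point operations per parameter per training token, 16 billion parameters trained on about 700B tokens gives 6 × 16×10⁹ × 700×10⁹ ≈ 6.7×10²² FLOPs, matching the reported 6.67×10²². A quick check in Python:

```python
# Sanity check of the reported pre-training compute using the common
# ~6 FLOPs per parameter per training token estimate for transformers.
params = 16e9   # moss-moon parameter count (16 billion)
tokens = 700e9  # pre-training corpus size (~700B tokens)
flops = 6 * params * tokens
print(f"{flops:.2e}")  # ~6.72e+22, close to the reported 6.67e22
```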

moss-moon-003-sft: the base model fine-tuned on more than 1.1 million multi-turn dialogue samples; it can follow instructions, hold multi-turn conversations, and decline harmful requests.

moss-moon-003-sft-plugin: the base model fine-tuned on more than 1.1 million multi-turn dialogue samples plus about 300,000 plugin-augmented samples. On top of moss-moon-003-sft's abilities, it can use four plugins: search engine, text-to-image, calculator, and equation solver (a sketch of such a plugin dispatch loop follows the model list).

moss-moon-003-pm: the preference model trained on preference feedback data collected with moss-moon-003-sft; it will be open-sourced in the near future.

moss-moon-003: the final model, trained with the preference model moss-moon-003-pm on top of moss-moon-003-sft. It shows better factuality and safety and more stable response quality, and will be released in the near future.

moss-moon-003-plugin: the final model, trained with the preference model moss-moon-003-pm on top of moss-moon-003-sft-plugin. It has stronger intent understanding and plugin-use abilities, and will be released in the near future.
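As mentioned for moss-moon-003-sft-plugin above, plugin use comes down to the model emitting a structured tool call that the host program executes and feeds back into the dialogue. The article does not specify the wire format MOSS uses; the following is a hypothetical dispatch loop for a calculator-style plugin, with the JSON call format as an illustrative assumption.

```python
# Hypothetical sketch of a plugin dispatch loop for a plugin-augmented
# dialogue model. The call format (a JSON object with "plugin" and
# "query" fields) is an illustrative assumption, not MOSS's protocol.
import json

def calculator(query: str) -> str:
    # Evaluate a simple arithmetic expression; eval() is for illustration only.
    return str(eval(query, {"__builtins__": {}}, {}))

PLUGINS = {"calculator": calculator}

def dispatch(model_output: str) -> str:
    """Parse a model-emitted plugin call and return the plugin's result."""
    call = json.loads(model_output)
    handler = PLUGINS[call["plugin"]]
    return handler(call["query"])

# Example: the model asks the calculator plugin to evaluate an expression;
# the result would be appended to the context for the model's next turn.
print(dispatch('{"plugin": "calculator", "query": "(3 + 5) * 12"}'))  # 96
```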

Data

moss-002-sft-data: the multi-turn dialogue data used by MOSS-002, covering helpfulness, faithfulness, and harmlessness, comprising about 570,000 English and 590,000 Chinese dialogues generated with text-davinci-003.

moss-003-sft-data: the multi-turn dialogue data used by moss-moon-003-sft, constructed with gpt-3.5-turbo from about 100,000 user inputs collected during the MOSS-002 internal test phase (a sketch of this kind of generation pipeline follows the data list). Compared with moss-002-sft-data, moss-003-sft-data better matches the real distribution of user intent, with finer-grained helpfulness category labels, broader harmlessness coverage, and longer conversations, totaling about 1.1 million dialogue samples. Only a small sample is available at present; the complete dataset will be released in the near future.

moss-003-sft-plugin-data: the plugin-augmented multi-turn dialogue data used by moss-moon-003-sft-plugin, comprising about 300,000 multi-turn dialogue samples that use the four plugins: search engine, text-to-image, calculator, and equation solver. Only a small sample is available at present; the complete dataset will be released in the near future.

moss-003-pm-data: the preference data used by moss-moon-003-pm, comprising preference comparison data built from about 180,000 additional dialogue contexts and the corresponding responses generated by moss-moon-003-sft; it will be released in the near future.
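The article says moss-003-sft-data was built by expanding collected user inputs with gpt-3.5-turbo. A minimal sketch of that kind of pipeline, using the OpenAI chat API current in April 2023, is below; the self-chat expansion step and prompts are illustrative assumptions, not the lab's documented recipe.

```python
# Hypothetical sketch: expanding collected user inputs into multi-turn
# dialogue data with gpt-3.5-turbo, in the spirit the article describes.
# Uses the openai<1.0 ChatCompletion API that was current at the time.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def expand_to_dialogue(user_input: str, turns: int = 3) -> list[dict]:
    """Grow a single collected user input into a short multi-turn dialogue."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(turns):
        # Generate the assistant's reply for the current context.
        reply = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=messages,
        )["choices"][0]["message"]["content"]
        messages.append({"role": "assistant", "content": reply})
        # Ask the same model to play the user and continue the conversation
        # (an illustrative self-chat step, assumed for this sketch).
        follow_up = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user",
                       "content": "Write a plausible user follow-up to this "
                                  "conversation:\n" + str(messages)}],
        )["choices"][0]["message"]["content"]
        messages.append({"role": "user", "content": follow_up})
    return messages
```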

MOSS's GitHub page: https://github.com/OpenLMLab/MOSS
