
Tsinghua team launches the ChatGLM chatbot, which supports deployment and fine-tuning on personal computers.


Thanks to CTOnews.com netizens Xiao Zhan cut, Chao Tianjiao, fat cat, _pupil_, and orchid for the tips! CTOnews.com reported on March 22 that the core team behind ChatGPT includes more than a few developers who graduated from Tsinghua University and joined OpenAI. On the same day GPT-4 was released, Tsinghua University's top NLP team also unveiled its self-developed ChatGPT-like large model: the Chinese-English bilingual dialogue model ChatGLM-6B, which already supports question-and-answer and dialogue, and has opened an invitation-only internal beta (application URL: http://chatglm.cn). The scope of the beta will be expanded gradually.

According to the official blog, ChatGLM is a hundred-billion-parameter Chinese-English language model with question-and-answer and dialogue functions, optimized for Chinese. The smaller ChatGLM-6B is based on the General Language Model (GLM) architecture and has 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (as little as 6 GB of video memory at the INT4 quantization level). ChatGLM-6B uses the same technology as ChatGLM and is optimized for Chinese Q&A and dialogue. After bilingual pre-training on about 1T tokens of Chinese and English, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B, though far smaller than the hundred-billion-parameter model, greatly reduces inference cost, improves efficiency, and can already generate answers that are quite consistent with human preferences.
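For readers who want to try local deployment, here is a minimal sketch following the loading pattern documented in the THUDM/chatglm-6b repository. Note that quantize() and chat() are helper methods shipped with the model's bundled remote code, so treat this as an illustrative sketch rather than a guaranteed, stable API:

```python
# Minimal sketch: load ChatGLM-6B with INT4 quantization and run one chat turn.
# Assumes the THUDM/chatglm-6b checkpoint plus its bundled remote code,
# which provides the quantize() and chat() helpers (per the project README).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()       # FP16 weights (~13 GB VRAM on their own)
    .quantize(4)  # INT4 quantization (~6 GB VRAM); quantize(8) targets ~10 GB
    .cuda()
    .eval()
)

# chat() threads a running history list, enabling multi-turn dialogue.
response, history = model.chat(tokenizer, "Hello! What can you do?", history=[])
print(response)
```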

Specifically, ChatGLM-6B has the following characteristics:

Adequate bilingual pre-training in Chinese and English: ChatGLM-6B was trained on about 1T tokens of Chinese and English corpus at a 1:1 ratio, giving it ability in both languages.

Optimized model architecture and size: drawing on the training experience of GLM-130B, the implementation of two-dimensional RoPE positional encoding was revised, and a traditional FFN structure is used. The 6B (6.2 billion) parameter size also makes it feasible for researchers and individual developers to fine-tune and deploy ChatGLM-6B themselves.

Lower deployment threshold: at FP16 half precision, ChatGLM-6B needs at least 13 GB of video memory for inference. Combined with model quantization, this requirement can be reduced further, to 10 GB (INT8) or 6 GB (INT4), so that ChatGLM-6B can be deployed on consumer-grade graphics cards (a rough weight-memory estimate appears after this list).

Longer sequence length: compared with GLM-10B (sequence length 1024), ChatGLM-6B reaches a sequence length of 2048, supporting longer conversations and applications.

Human intention alignment training: supervised fine-tuning (Supervised Fine-Tuning), feedback bootstrapping (Feedback Bootstrap), and reinforcement learning from human feedback (RLHF) are used to give the model the ability to understand human instructions. The output format is markdown, which is easy to display.
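As a rough sanity check on those memory figures, the weight-only footprint of a 6.2-billion-parameter model can be estimated from bytes per parameter (2 for FP16, 1 for INT8, 0.5 for INT4); the gap between these estimates and the quoted 13/10/6 GB requirements is the runtime overhead of activations, KV cache, and the framework itself:

```python
# Back-of-envelope weight-memory estimate for a 6.2B-parameter model.
# Weight-only: real VRAM needs (13 / 10 / 6 GB per the article) also
# cover activations, KV cache, and framework overhead.
PARAMS = 6.2e9

for precision, bytes_per_param in [("FP16", 2.0), ("INT8", 1.0), ("INT4", 0.5)]:
    gib = PARAMS * bytes_per_param / 1024**3
    print(f"{precision}: ~{gib:.1f} GiB of weights")
# Prints roughly: FP16 ~11.5 GiB, INT8 ~5.8 GiB, INT4 ~2.9 GiB
```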

Given all this, ChatGLM-6B has good dialogue and question-answering ability under certain conditions. That said, ChatGLM-6B also has quite a few known limitations and shortcomings:

The model capacity is small: the small 6B capacity means relatively weak model memory and language ability. ChatGLM-6B may generate incorrect information when faced with factual knowledge tasks, and it is also not good at solving logical problems (such as mathematics or programming).

Harmful or biased content may be produced: ChatGLM-6B is only a language model preliminarily aligned with human intent, and it may produce harmful or biased content.

Weak multi-turn dialogue ability: ChatGLM-6B's context understanding is limited; when generating long answers or handling multi-turn dialogue scenarios, context loss and misunderstanding may occur.

Insufficient English ability: most of the instructions used in training are in Chinese, with only a small fraction in English. Replies to English instructions may therefore be lower in quality than replies to Chinese instructions, or may even contradict them.

Easily misled: ChatGLM-6B's "self-perception" can be problematic, and it is easily misled. For example, the current version of the model deviates in its self-perception when misled. Even though the model has undergone bilingual pre-training on about 1 trillion tokens, instruction fine-tuning, and reinforcement learning from human feedback (RLHF), it may still produce misleading content under certain instructions because of its small capacity.

The team said that it has been exploring and experimenting, and that the GLM series models have made some progress, but they still lag far behind top international large-model research and products, such as OpenAI's ChatGPT and its next-generation GPT models. For Chinese large-model research to catch up and break through in original algorithms, AI chips, and industry, it will take not only everyone's efforts but also the cultivation of the next generation of AI talent.

CTOnews.com attached screenshots demonstrating dialogue with ChatGLM-6B.
