
Tsinghua's ChatGLM-6B and ChatGLM2-6B models now allow free commercial use


Shulou (Shulou.com) 11/24 report --

Thanks to the CTOnews.com netizen "Crucian Carp Snow Fox" for the tip! CTOnews.com, July 15 -- Zhipu AI and Tsinghua's KEG Lab have announced that the weights of ChatGLM-6B and ChatGLM2-6B are completely open to academic research, and that free commercial use is allowed after completing enterprise registration.

CTOnews.com previously reported that the Tsinghua NLP team released the Chinese-English bilingual dialogue model ChatGLM-6B on March 14. It supports question answering and dialogue, is built on the General Language Model (GLM) architecture, and has 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (at least 6GB of video memory is needed at the INT4 quantization level).
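As an illustration of that low deployment barrier, here is a minimal sketch following the usage pattern published in the THUDM/chatglm-6b repository on Hugging Face. The quantize() and chat() methods come from that repo's custom modeling code (hence trust_remote_code=True), and exact method names or call order may vary between releases:

```python
# Sketch only: follows the THUDM/chatglm-6b README's usage pattern;
# quantize() and chat() are defined by the repo's remote modeling code.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = model.quantize(4).half().cuda()  # INT4 quantization: fits in ~6GB of VRAM
model = model.eval()

# Single-turn chat; pass the returned history back in for multi-turn dialogue.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```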

The ChatGLM2-6B model was released on June 25. It retains many strengths of the original model, such as smooth dialogue and a low deployment barrier, while introducing a number of new features:

More powerful performance: ChatGLM2-6B uses GLM's hybrid objective function and has undergone pre-training on 1.4T Chinese and English tokens as well as human preference alignment training. Evaluation results show that, compared with the first-generation model, ChatGLM2-6B's performance on datasets such as MMLU (+23%), CEval (+33%), GSM8K (+571%), and BBH (+60%) has improved substantially.

Longer context: the context length has been extended from ChatGLM-6B's 2K to 32K.

More efficient inference: inference speed has increased by 42% over the first generation, and under INT4 quantization, the dialogue length supported by 6GB of video memory has increased from 1K to 8K.

A more open license: the ChatGLM2-6B weights are completely open to academic research.

ChatGLM2-6B uses Multi-Query Attention, which both improves generation speed and reduces the video memory occupied by the KV cache during generation. ChatGLM2-6B is also trained for dialogue with a causal mask, so the KV cache from previous rounds can be reused in multi-turn conversation, further reducing video memory usage.
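To make that memory saving concrete, here is a minimal, self-contained PyTorch sketch of multi-query attention with toy dimensions and random weights (none of which reflect ChatGLM2-6B's actual hyperparameters or implementation). All query heads share a single key/value head, so the KV cache shrinks by a factor of the head count, and the causal mask means earlier tokens' cache entries never change and can be reused across rounds:

```python
import torch

def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Illustrative multi-query attention: n_heads query heads share one
    key/value head, shrinking the KV cache n_heads-fold versus standard MHA."""
    B, T, D = x.shape
    d_head = D // n_heads
    q = (x @ w_q).view(B, T, n_heads, d_head).transpose(1, 2)  # (B, H, T, d)
    k = (x @ w_k).view(B, T, 1, d_head).transpose(1, 2)        # (B, 1, T, d): one shared head
    v = (x @ w_v).view(B, T, 1, d_head).transpose(1, 2)
    scores = q @ k.transpose(-2, -1) / d_head ** 0.5           # (B, H, T, T) via broadcasting
    # Causal mask: each token attends only to itself and earlier positions,
    # which is why cached K/V from previous dialogue rounds stay valid.
    causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal, float("-inf"))
    out = torch.softmax(scores, dim=-1) @ v                    # broadcast over heads
    return out.transpose(1, 2).reshape(B, T, D)

# Toy usage: model dim 64, 8 query heads sharing a single KV head.
B, T, D, H = 2, 16, 64, 8
x = torch.randn(B, T, D)
w_q = torch.randn(D, D)
w_k = torch.randn(D, D // H)  # single KV head: projects to d_head, not D
w_v = torch.randn(D, D // H)
print(multi_query_attention(x, w_q, w_k, w_v, H).shape)  # torch.Size([2, 16, 64])
```

In standard multi-head attention, w_k and w_v would project to the full dimension D; projecting to a single d_head slice is what cuts the cached K/V tensors (and the memory they occupy during generation) by the number of heads.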
