Updated 2025-03-26. Source: SLTechnology News&Howtos (shulou), IT Information
Shulou(Shulou.com)11/24 Report--
On October 9, Beijing Yuezhi Anmian Technology Co., Ltd. (Moonshot AI) announced a breakthrough in long-text processing and launched Kimi Chat, the first intelligent assistant product to support input of 200,000 Chinese characters. According to the company, this is the longest context input length supported by any large model service currently available on the global market, which would put Moonshot AI at a world-leading level in this technology.
Volcano Engine works closely with Moonshot AI, serving as its exclusive provider of a highly stable, cost-effective solution for accelerating AI training and inference. The two companies carry out joint research and development to promote the application of large language models in both vertical fields and general scenarios. Kimi Chat will also soon be available on Volcano Ark, Volcano Engine's large model service platform, and the two companies will continue to bring richer AI applications to enterprises and consumers across the large model ecosystem.
Compared with the large model services on the market today, which are trained mainly on English, Kimi Chat has strong multilingual capabilities. Its advantage is most pronounced in Chinese: in actual use it supports a context of about 200,000 Chinese characters, which is 2.5 times that of Anthropic's Claude-100k (about 80,000 characters in testing) and 8 times that of OpenAI's GPT-4-32k (about 25,000 characters in testing). Through an innovative network structure and engineering optimization, Kimi Chat also achieves a lossless long-range attention mechanism at a scale of hundreds of billions of parameters, without relying on "shortcuts" that badly damage performance, such as sliding windows, downsampling, or smaller models.
Yang Zhilin, founder of Moonshot AI, said in an interview that lossless compression of massive amounts of data can produce a high degree of intelligence, whether the data is text, voice, or video. The upper limit of a large model's capability (that is, its lossless compression ratio) is determined by two factors: its single-step capability, which depends on the number of parameters, and the number of steps it executes, which corresponds to the context length.
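To make the "shortcut" in the paragraph above concrete, here is a minimal sketch of how a sliding-window attention mask differs from a full causal mask. It is a toy illustration only: the article does not describe Kimi Chat's actual attention mechanism, and the function names, sequence length, and window size are assumptions chosen for the example.

```python
def causal_mask(n):
    """Full causal attention: token i may attend to every token j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def sliding_window_mask(n, window):
    """Sliding-window attention: token i only sees the most recent `window` tokens."""
    return [[i - window < j <= i for j in range(n)] for i in range(n)]


# Hypothetical toy sizes, chosen only to keep the output readable.
n, window = 8, 3
full = causal_mask(n)
windowed = sliding_window_mask(n, window)

# Positions the last token can attend to under each scheme.
full_visible = [j for j in range(n) if full[n - 1][j]]
windowed_visible = [j for j in range(n) if windowed[n - 1][j]]

print("full causal:", full_visible)
print("sliding window:", windowed_visible)
```

With the window in place, the last token can no longer look directly at the earliest positions, which is why the article groups sliding windows with approaches that trade long-range recall for lower cost.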
Tackling the challenges of deploying large language models and bringing them to industry applications
Moonshot AI believes that longer context lengths can open a new chapter for large model applications, moving large models from the LLM era into the Long LLM (LLLM) era. Even as large models find effective ways to handle long-text scenarios, new approaches are still needed to reduce model hallucinations, improve the controllability of generated content, and personalize model capabilities. Developing a large language model also means clearing many hurdles, such as scaling computing resources, unstable training jobs, high project costs, and security and trust, in order to improve training efficiency.
To address these problems, Moonshot AI has worked with Volcano Engine on AI technology innovation and AGI practice on veMLP, Volcano Engine's machine learning platform. Drawing on the GPU resource pool and large-scale pre-trained models, Moonshot AI achieved routine, stable day-to-day training at the scale of thousands of GPUs and trained Kimi Chat, a large language model with hundreds of billions of parameters, within six months. The model unlocks complex scenarios such as professional writing, understanding and analysis of ultra-long texts, personalized dialogue with ultra-long memory, and question answering over large document collections, and it has already been applied successfully at many well-known enterprises.
Zhou Xinyu, co-founder of Moonshot AI, said: "Moonshot AI focuses on exploring the boundaries of artificial general intelligence and strives to find the optimal way to turn computing power into intelligence. Volcano Engine has leading infrastructure capabilities and computing power reserves in China. In the future, the two sides will cooperate further on AI computing infrastructure and the expansion of application scenarios, jointly advancing artificial intelligence technology and bringing users a stable, efficient and intelligent service experience."
With Volcano Engine's machine learning platform, large model training is more stable and faster
Volcano Engine provides a highly stable and cost-effective AI training and inference acceleration solution for building and training large models. Refined over years of serving Douyin and other businesses with massive user bases, its machine learning platform veMLP offers full-stack engineering optimization for AI development, self-healing of failed jobs, experiment observability, and other solutions and best practices. It provides one-stop AI algorithm development and iteration services that are efficient, stable, secure and trustworthy, making large model training faster, more stable and more cost-effective. With the ultra-large-scale AI training and inference acceleration solution provided by Volcano Engine, the Moonshot AI team can carry out continuous training iterations, fine-tuning and inference for large language models quickly, stably and at low cost.
1. Large-scale scheduling of IaaS computing power and storage resources
Volcano Engine builds high-performance computing clusters that support large model training at the 10,000-GPU level over a network with microsecond-level latency, and elastic computing can save 70% of computing costs. The vePFS+TOS hot and cold tiered acceleration scheme meets the high-throughput requirements of training data while reducing overall storage costs by 65%. The two companies also jointly developed a dedicated file cache system tailored to the file system read and write patterns of large models, greatly improving GPU utilization.
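The hot and cold tiering idea above can be sketched as a small read-through cache: a limited fast tier sits in front of a large, cheaper cold tier, and repeated reads are served from the fast tier. This is a toy model under stated assumptions, not the vePFS or TOS API and not the jointly developed cache system; the `TieredStore` class, the shard names, and the least-recently-used eviction policy are all hypothetical.

```python
from collections import OrderedDict


class TieredStore:
    """Toy read-through cache: a small hot tier in front of a large cold tier."""

    def __init__(self, cold, hot_capacity):
        self.cold = cold                  # stands in for cheap object storage
        self.hot = OrderedDict()          # stands in for the fast file system tier
        self.hot_capacity = hot_capacity
        self.hot_hits = 0
        self.cold_reads = 0

    def read(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)     # mark as most recently used
            self.hot_hits += 1
            return self.hot[key]
        value = self.cold[key]            # slow path: fetch from the cold tier
        self.cold_reads += 1
        self.hot[key] = value
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used entry
        return value


# Hypothetical training shards and access pattern.
cold = {f"shard-{i}": f"data-{i}" for i in range(10)}
store = TieredStore(cold, hot_capacity=2)
for key in ["shard-0", "shard-1", "shard-0", "shard-2", "shard-0"]:
    store.read(key)

print("hot hits:", store.hot_hits, "cold reads:", store.cold_reads)
```

The design choice being illustrated is that only the working set needs to live on expensive fast storage; how much this saves in practice depends on the access pattern of the training job.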
2. Stability guarantees for PaaS computing clusters
Volcano Engine optimizes the stability of very large training clusters, providing hardware fault self-healing and self-service diagnosis so that user jobs can quickly retry and resume training, which enables stable training over month-long periods. Communication affinity optimization for multi-machine training jobs reduces RingAllReduce traffic that crosses switches.
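As a rough illustration of the communication affinity point above, the sketch below counts how many hops in an all-reduce ring connect machines under different switches, first for a naive ordering and then for an ordering that groups machines by switch. The cluster layout, node names, and helper functions are hypothetical; the article does not say how Volcano Engine's scheduler actually places jobs.

```python
def cross_switch_edges(ring, switch_of):
    """Count ring edges whose two endpoints sit under different switches."""
    n = len(ring)
    return sum(
        1
        for i in range(n)
        if switch_of[ring[i]] != switch_of[ring[(i + 1) % n]]
    )


def affinity_order(nodes, switch_of):
    """Order nodes so machines under the same switch are adjacent in the ring."""
    return sorted(nodes, key=lambda node: (switch_of[node], node))


# Hypothetical cluster: 8 machines alternating between 2 switches.
switch_of = {f"node{i}": f"sw{i % 2}" for i in range(8)}
naive_ring = [f"node{i}" for i in range(8)]
tuned_ring = affinity_order(naive_ring, switch_of)

naive_crossings = cross_switch_edges(naive_ring, switch_of)
tuned_crossings = cross_switch_edges(tuned_ring, switch_of)

print("naive ring crossings:", naive_crossings)
print("affinity ring crossings:", tuned_crossings)
```

In this toy layout, grouping by switch leaves only the two hops that join the groups crossing a switch boundary, while the naive ordering sends every hop across one.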
3. High experiment observability
The platform manages experiments across multiple training jobs and compares training results through visualization to decide which model iteration to put online. Complete monitoring logs help the business tune 3D parallelism parameters and assist in locating training faults.
4. Security and mutual trust for large model services
The platform combines trusted privacy computing with LLM applications to provide a security sandbox and refine developer permission control. Volcano Engine also works with Moonshot AI to design workflows that fit the habits of large model research and development, maintaining work efficiency while enforcing tiered access to data and ensuring data security.
Wu Di, head of intelligent algorithms at Volcano Engine, said: "Volcano Engine has always upheld a cooperative approach of focusing on technology, empowering partners, and creating shared value. Moonshot AI has one of the most advanced large model research and development teams in China, with a deep understanding of AI technology and extensive application experience. The cooperation between the two sides will further provide enterprises and consumers with richer AI applications in the field of multi-model ecosystem services."
Functional panorama of Volcano Ark
At present, Volcano Ark, Volcano Engine's large model service platform, hosts large models from many AI technology companies and research institutes, including MiniMax and ByteDance's Skylark, among others, and Moonshot AI's large model service Kimi Chat will also launch on Volcano Ark. Volcano Engine will continue to work with leading domestic large model providers to offer a full range of functions and services, including model training, inference, evaluation and fine-tuning, to help industries of all kinds accelerate their adoption of AI. Enterprises are welcome to try out large models on Volcano Ark, which is ready to grow together with your business.
© 2024 shulou.com SLNews company. All rights reserved.