Meta launches LLaMA, a large language model with up to 65 billion parameters

CTOnews.com, February 25 (Xinhua): Meta announced on Friday, local time, that it will release a new AI-based large language model to the research community, joining Microsoft, Google, and other companies in the AI race spurred by ChatGPT.

LLaMA stands for "Large Language Model Meta AI." The model will be made available under a non-commercial license to researchers in government, civil society, and academia, as well as to industry research laboratories.

The company will provide the underlying code, so users can adjust the model themselves and apply it to research use cases. Meta says the model requires "much less computing power."

According to reports, the company is offering LLaMA in several sizes: 7B, 13B, 33B, and 65B parameters. LLaMA 65B and LLaMA 33B were trained on 1.4 trillion tokens, while the smallest model, LLaMA 7B, was trained on 1 trillion tokens.

Like other large language models, LLaMA takes a sequence of words as input and predicts the next word, repeating the process to generate text. To train the model, Meta chose text from the 20 most widely spoken languages, focusing on those that use the Latin and Cyrillic alphabets.
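The predict-and-append loop described above can be illustrated with a minimal sketch. This is not LLaMA's architecture: a toy word-pair (bigram) counter stands in for the neural network, and the function names `train_bigram`, `predict_next`, and `generate` are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows


def predict_next(follows, context):
    """Toy stand-in for a language model: return the most frequent
    follower of the last word in the context, or None if unseen."""
    candidates = follows.get(context[-1])
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(follows, prompt, max_new_words=5):
    """Repeatedly predict the next word and append it to the input."""
    words = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(follows, words)
        if nxt is None:
            break  # the toy model has never seen this word
        words.append(nxt)  # the output becomes part of the next input
    return " ".join(words)


corpus = (
    "a language model predicts the next word and "
    "a language model predicts the next word again"
)
follows = train_bigram(corpus)
print(generate(follows, "a language", max_new_words=4))
```

A real model conditions on the whole context rather than only the last word, and it operates on tokens rather than whitespace-separated words, but the outer loop has the same shape: each predicted word is fed back in as input for the next prediction.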

Like other models, LLaMA still faces the challenges of bias, toxic output, and hallucination, and Meta says more research is needed to address these shortcomings in large language models.

Meta says that LLaMA, as a foundation model, is designed to be versatile and can be applied to many different use cases, unlike a fine-tuned model built for a specific task. By sharing LLaMA's code, Meta hopes other researchers can more easily test new ways to limit or eliminate these problems. In its paper, Meta also provides a set of benchmark evaluations of model bias and toxicity, both to show the model's limitations and to support further research in this critical area.

Meta also released a large language model, OPT-175B, in May last year. That project was likewise aimed at researchers, and it forms the basis of a new iteration of Meta's chatbot, BlenderBot.

Later, the company launched a model called Galactica, said to be able to write scientific articles and solve math problems, but its demo was taken down because it repeatedly generated content that sounded authoritative but was inaccurate.

CTOnews.com provides the official links:

Introduction on the official website

GitHub

Apply for access to LLaMA
