2025-04-06 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
The training data for the PaLM 2 model is nearly five times that of its predecessor, reaching 3.6 trillion tokens, and the PaLM 2-based Bard has eight advantages over ChatGPT.
The capability of a large model is largely determined by two factors: its parameter count and the size of its training corpus. Google's PaLM 2 seems to have chosen the latter as its main path to improvement: reportedly, Google trained PaLM 2 on nearly five times as much text as its predecessor.
Yet when PaLM 2 was announced last week, Google made clear that the model is smaller than the earlier PaLM. Google's internal documents show that PaLM was trained with 540 billion parameters, while the newly launched PaLM 2 cut that figure by more than a third, to 340 billion. On the other key dimension of model training, corpus size, however, Google piled on aggressively, pushing PaLM's 780 billion training tokens all the way to 3.6 trillion.
Beyond the surge in token count, PaLM 2 also improved markedly in data quality. Judging from the technical documentation's breakdown of languages in the training data, the second generation's English performance improved significantly over PaLM without a significant increase in the amount of English corpus data, partly thanks to the higher quality of that English data.
The route choice for large models
OpenAI has not disclosed GPT-4's parameter count, but Google made no secret of PaLM 2's and disclosed it proactively. At Google I/O, four variants of different sizes were released at the same time; the smallest, Gecko, can even run on a smartphone. The move reflects Google's ambition to deploy its large models on more platforms. In that context, piling on ever more parameters to improve performance is, in the long run, a road Google can hardly take, and raising the quantity and quality of the training corpus becomes almost inevitable.
PaLM 2: the most powerful model yet?
When announcing PaLM 2 at the I/O conference, Google confirmed that the model was trained in 100 languages and can perform a wide range of tasks. It already powers 25 features and products, including Google's experimental chatbot Bard. PaLM 2 comes in four sizes, from smallest to largest: Gecko, Otter, Bison and Unicorn.
Based on publicly disclosed figures, PaLM 2 appears to have been trained on more text than any existing model. Meta's LLaMA, launched in February, was trained on 1.4 trillion tokens. The last time OpenAI shared a training scale was at the launch of GPT-3, which it said was trained on 300 billion tokens. And Google's LaMDA, introduced two years ago, was trained on 1.5 trillion tokens.
The AI arms race heats up, and the public demands transparency
On the details of large-model training data, the big companies have tacitly chosen to stay closed.
When releasing GPT-4, OpenAI disclosed no details of its architecture (including model size), hardware, training compute, dataset construction or training methods, citing "the competitive landscape and the safety implications of large-scale models like GPT-4."
Google, under pressure from OpenAI, has been eager to demonstrate the power of its AI technology, including how it can be embedded in search, email, word processing and spreadsheets, but has likewise been reluctant to release the size of its training data or other details.
The reason for confidentiality, of course, is the competitive nature of the business.
Both Google and OpenAI are vying for users who want to use chatbots instead of traditional search engines.
But as the AI arms race heats up, the research community is demanding more transparency.
Now, as AI applications rapidly become mainstream, the controversy around the underlying technology is intensifying.
In February, El Mahdi, a senior research scientist at Google, resigned over the company's lack of transparency.
OpenAI CEO Sam Altman testified on Tuesday at a Senate Judiciary subcommittee hearing on privacy and technology, agreeing with lawmakers that new AI systems need to be regulated.
"For a very new technology, we need a new framework," Altman said. "Certainly, companies like ours bear a lot of responsibility for the tools we put out into the world."
What Bard can do that ChatGPT can't
1. Internet access
One significant advantage Bard has over ChatGPT is that it can access the Internet.
Ask about today's (May 17) sports events, and Bard quickly summarizes them.
ChatGPT has no direct Internet access; it can only reach the web through plug-ins in the paid Plus version.
2. Image generation
In image generation, Bard also goes beyond both the free and paid versions of ChatGPT. Google announced that it will provide AI image generation through an integration with Adobe Firefly, a feature that enriches conversations visually and gives users more context.
3. Voice input
Bard also beats ChatGPT on voice input: users can interact with the model entirely by voice. This gives users a convenient, hands-free way to get a quick response when multitasking or when typing is inconvenient. The editor read today's sports-news question aloud and Bard picked it up automatically; the only caveat is that your English pronunciation needs to be clear enough. 🤣
4. Coding ability
Bard also surpasses ChatGPT in coding. It supports more than 20 programming languages, including C++, Python, Java, TypeScript and JavaScript, and can assist developers with code generation, explanation and debugging. ChatGPT has coding capabilities too, but it falls short on some of these auxiliary tasks, and OpenAI's Codex may be better suited to perform them.
Asked to generate the Fibonacci sequence in Python and print the first 10 numbers, Bard completed the task successfully.
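For reference, a minimal Python sketch of what such a prompt asks for (the function name and structure here are our own illustration, not Bard's actual output):

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting from 0."""
    seq = []
    a, b = 0, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b  # advance the pair to the next two terms
    return seq

# Print the first 10 numbers of the sequence.
print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```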
5. Deep Gmail integration
Integration with Gmail is another important advantage of Bard.
With more than 2 billion users, Gmail is the largest email service in the world, and being able to use Bard inside email undoubtedly opens up new possibilities for interaction. Microsoft, however, is also adding ChatGPT-powered features to Microsoft 365, embedding them in competing products such as Word, Excel and PowerPoint.
6. Sharing output
In addition, Bard can export its results to Gmail and Docs instantly. Users can send generated content directly to these platforms and share it easily with others, which greatly simplifies sharing information and drafting emails. OpenAI, for its part, offers a roughly similar export option in ChatGPT's settings: users can export their account details and conversations as a downloadable file sent to their email address.
7. Support for image prompts
Another major feature of Bard is the ability to use images as prompts.
Users can simply click on a picture, or scan an image with Google Lens, to ask Bard about it. For example, a user can look for resorts similar to the one in a photo, or ask about a site's historical significance.
GPT-4 is likewise a large multimodal model that accepts image and text input, but as of this writing the image-input feature had not been rolled out even in the paid version.
8. Web page summarization
Because Bard can connect to the Internet, it can summarize a web page from nothing more than a shared link. ChatGPT, by contrast, cannot connect to the Internet, so users must manually copy and paste the text they want summarized.
Bard has its limitations, however, especially around toxicity. In testing, when given explicitly toxic prompts, Bard produced toxic responses more than 30% of the time, and overall PaLM 2 showed more toxic behavior in English, German and Portuguese.
Generally speaking, a direct comparison between PaLM 2 and GPT-4 is challenging because of their different architectures and testing methods. On reasoning tasks, PaLM 2 performs on par with, or even better than, GPT-4. On coding tasks, however, PaLM 2 needs multiple attempts, as well as additional coding tokens, to reach good performance.
Reference:
https://www.cnbc.com/2023/05/16/googles-palm-2-uses-nearly-five-times-more-text-data-than-predecessor.html
https://analyticsindiamag.com/8-things-that-bard-can-do-but-chatgpt-cant/
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).