Here comes new work from Tsinghua's Tang Jie team:
WebGLM, a web-connected question-answering chatbot with 10 billion parameters (the paper was accepted to KDD 2023).
You can ask it any question, and it will list links to relevant articles on the internet (such as Wikipedia or related official websites) and organize them into an answer.
For example:
What is the core technology of ChatGPT?
Or:
Who proposed Music Transformer? What is its principle?
Or:
What do you think of Genshin Impact version 3.5?
How can you live in a first-tier city without a high-paying job? (tongue in cheek)
……
It can give reasonable answers to all of them.
According to reports, in performance comparison tests WebGLM already outperforms OpenAI's 13-billion-parameter WebGPT, and in human evaluation it is even comparable to the 175-billion-parameter model.
So, how is it trained?
According to the team's introduction, the goal of WebGLM, Tsinghua's internet-connected model, is to augment a pre-trained large language model with web search and retrieval capabilities while keeping it efficient enough for real-world deployment.
To that end, the authors build on three strategies.
The first is the LLM-augmented retriever.
It strengthens the model's ability to retrieve relevant web content, finding pertinent references for a given query so that questions can be answered better downstream.
It has two stages: coarse-grained web search and fine-grained LLM-augmented dense retrieval.
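A minimal sketch of what such a two-stage retriever could look like (the function names, the toy chunks, and the bag-of-characters embedding are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def coarse_web_search(query: str) -> list[str]:
    # Stage 1 (stubbed): a real system would query a search engine
    # and split the fetched pages into text chunks.
    return [
        "ChatGPT builds on large language models fine-tuned with RLHF.",
        "The weather in Beijing is sunny today.",
        "Reinforcement learning from human feedback aligns model outputs.",
    ]

def embed(texts: list[str]) -> np.ndarray:
    # Toy bag-of-characters embedding, standing in for a real dense
    # encoder such as Contriever.
    vecs = np.zeros((len(texts), 256))
    for i, t in enumerate(texts):
        for ch in t.lower():
            vecs[i, ord(ch) % 256] += 1.0
    return vecs / (np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-9)

def dense_rerank(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    # Stage 2: fine-grained dense retrieval, ranking chunks by cosine
    # similarity to the query and keeping the top-k as references.
    scores = (embed(chunks) @ embed([query]).T).ravel()
    return [chunks[i] for i in scores.argsort()[::-1][:top_k]]

query = "What is the core technology of ChatGPT?"
print(dense_rerank(query, coarse_web_search(query)))
```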
The second is the bootstrapped generator.
It harnesses the ability of GLM (for example GLM-130B, the open bilingual pre-trained model previously released by Tsinghua) to generate responses to questions and provide detailed answers.
Using this generator, the authors obtain WebGLM-QA, an LLM-bootstrapped, quotation-grounded long-form QA dataset.
It is cleaned and filtered with strategies such as in-context learning, ultimately comprising 45k high-quality filtered samples and 83k noisy samples.
WebGLM's backbone is a GLM model trained on this dataset.
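As an illustration of what such filtering might involve (a hedged sketch: the citation-validity rule is in the spirit of the paper, but the overlap heuristic and threshold are my assumptions), one simple check keeps only samples whose citation markers point at real references and whose answers overlap lexically with the cited text:

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keep_sample(answer: str, references: list[str], min_overlap: float = 0.3) -> bool:
    # Rule 1: every citation marker like [2] must point at a real reference.
    cited = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    if not cited or any(i < 1 or i > len(references) for i in cited):
        return False
    # Rule 2 (assumed heuristic): the answer should share enough tokens with
    # the text it cites, a crude proxy for being supported by the reference.
    cited_text = " ".join(references[i - 1] for i in cited)
    a = tokens(answer)
    return len(a & tokens(cited_text)) / max(len(a), 1) >= min_overlap

refs = ["GLM-130B is an open bilingual pre-trained model."]
print(keep_sample("GLM-130B is a bilingual pre-trained model [1].", refs))  # True
print(keep_sample("The moon is made of cheese [3].", refs))                 # False
```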
Finally, there is a scorer based on human preferences.
It scores the quality of generated responses against human preference signals rather than expensive expert feedback, ensuring that the system produces useful and engaging content.
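Preference scorers of this kind are commonly trained with a pairwise ranking objective: given a preferred and a rejected answer to the same question, the scorer learns to rate the preferred one higher. A minimal PyTorch sketch of that idea (the tiny encoder and the random toy batch are placeholders, not WebGLM's actual scorer):

```python
import torch
import torch.nn as nn

class TinyScorer(nn.Module):
    # Placeholder encoder: embed token ids, mean-pool, map to a scalar score.
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.emb(token_ids).mean(dim=1)).squeeze(-1)

scorer = TinyScorer()
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

# Toy batch: token ids of answers humans preferred vs. rejected.
preferred = torch.randint(0, 1000, (8, 32))
rejected = torch.randint(0, 1000, (8, 32))

# Pairwise (Bradley-Terry style) ranking loss: push the preferred answer's
# score above the rejected one's.
loss = -torch.nn.functional.logsigmoid(scorer(preferred) - scorer(rejected)).mean()
opt.zero_grad()
loss.backward()
opt.step()
```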
In order, these three components form the WebGLM pipeline:
As you can see, there are exactly three modules, corresponding to the three parts described earlier:
The LLM-augmented retriever takes the five most relevant pages as reference sources, the bootstrapped generator produces multiple candidate answers from them, and the scorer selects the one most likely to match human preferences as the final output.
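Put end to end, the flow can be sketched as follows (each function below is a hypothetical stand-in for the corresponding WebGLM module):

```python
def retrieve(question: str, top_k: int = 5) -> list[str]:
    # LLM-augmented retriever: top-k relevant page snippets (stubbed).
    return [f"reference snippet {i} about: {question}" for i in range(top_k)]

def generate(question: str, references: list[str], n: int = 4) -> list[str]:
    # Bootstrapped generator: n candidate answers grounded in references (stubbed).
    return [f"candidate answer {i}, citing {len(references)} references" for i in range(n)]

def score(question: str, answer: str) -> float:
    # Human-preference scorer: higher means closer to human preference (stubbed).
    return float(len(answer))

def webglm_answer(question: str) -> str:
    refs = retrieve(question)              # 1. search and retrieve references
    candidates = generate(question, refs)  # 2. generate several candidate answers
    return max(candidates, key=lambda a: score(question, a))  # 3. keep the best

print(webglm_answer("What is the core technology of ChatGPT?"))
```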
Performance surpasses OpenAI's WebGPT
Beyond WebGLM itself, Tang Jie's team also proposed an evaluation standard for web-enhanced question-answering systems that covers both the references and the final answers.
The former is judged on relevance, information density, truthfulness (no factual errors), toxicity (excluding violence, pornography, and similar content), and degree of social bias; the latter on fluency, correctness, citation accuracy, objectivity, and redundancy.
For the comparison, they used 272 questions taken from the demo website of WebGPT (OpenAI's system, fine-tuned from GPT-3) and recruited 15 volunteers with master's degrees to score the outputs.
The final results are as follows ("Rel.", "Den.", and the other column abbreviations in the results table correspond to the ten metrics above).
As you can see, although WebGLM's retrieval results are slightly worse than WebGPT-175B's, they are far better than those of Perplexity.ai and WebGPT-13B (the reference evaluation, on the left).
It is worth noting that WebGLM's retrieval process uses only some traditional word-based algorithms plus two Contriever models totaling no more than 300M parameters.
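Contriever is an openly released dense retriever whose embeddings are conventionally computed by mean-pooling token states over non-padding positions, roughly as below (a sketch following the model's public usage pattern; the example sentences are mine):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/contriever")
model = AutoModel.from_pretrained("facebook/contriever")

def embed(texts: list[str]) -> torch.Tensor:
    # Mean-pool the last hidden states over non-padding tokens.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

query = embed(["Who proposed Music Transformer?"])
docs = embed([
    "Music Transformer was introduced by Huang et al. at Google Brain.",
    "Beijing is the capital of China.",
])
print((docs @ query.T).ravel())  # similarity scores; higher = more relevant
```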
In addition, WebGLM clearly beats WebGPT-13B on computational efficiency and time cost, and holds its own against the 175B model.
On the final answers, WebGLM scored highest on fluency, truthfulness, and redundancy, while its correctness score approached WebGPT-175B's, far above Perplexity.ai and WebGPT-13B.
This shows, the authors say, that WebGLM can achieve higher performance at lower cost.
Deployment and training
The released WebGLM is open source.
To deploy it, you first need to obtain a key from the SerpAPI website, which is used to fetch search results during retrieval.
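For reference, fetching search results through SerpAPI's Python client usually looks like this (a minimal sketch based on the public google-search-results package; the query string is just an example):

```python
# pip install google-search-results
from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "What is the core technology of ChatGPT?",
    "api_key": "YOUR_SERPAPI_KEY",  # obtained from the SerpAPI website
}
results = GoogleSearch(params).get_dict()

# Each organic result carries the title and link of a candidate reference page.
for item in results.get("organic_results", []):
    print(item["title"], "->", item["link"])
```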
The retriever's weights can be downloaded from Tsinghua Cloud.
There are two ways to run the model: through a command-line interface or as a web service, and both offer a choice of two models, WebGLM-2B and WebGLM-10B.
You can also train WebGLM yourself; the team has made the training data for the generator and the retriever available for download.
Paper address:
https://arxiv.org/abs/2306.07906
GitHub Home Page:
https://github.com/THUDM/WebGLM
This article is from the WeChat official account Quantum Bit (ID: QbitAI); author: Fengcai.