
Google didn't open-source PaLM, so a netizen did: miniature versions of the 100-billion-parameter model, topping out at 1 billion parameters with 8k context


Unexpectedly, Google's PaLM is now open source, albeit in miniature form.

Google didn't open-source PaLM, so a netizen has unexpectedly done it instead.

Yesterday, a developer released three miniature versions of the PaLM model on GitHub, with 150 million (PaLM-150m), 410 million (PaLM-410m), and 1 billion (PaLM-1b) parameters.

Project address: https://github.com/conceptofmind/PaLM

All three models were trained on the Google C4 dataset with a context length of 8k, and a 2-billion-parameter model is currently in training.

Here is an example of text generated by the open-source PaLM's 410-million-parameter model (trained on Google's C4 dataset):

My dog is very cute, but not very good at socializing with other dogs. The dog loves all new people and he likes to hang out with other dogs. I do need to take him to the park with other dogs. He does have some bad puppy breath, but it is only when he runs off in a direction he doesn't want to go. Currently my dog is being very naughty. He would like to say hi in the park, but would rather take great care of himself for a while. He also has bad breath. I am going to have to get him some oral braces. It's been 3 months. The dog has some biting pains around his mouth. The dog is very timid and scared. The dog gets aggressive towards people. The dog is very playful and they are a little spoiled. I am not sure if it's a dog thing or if he is spoiled. He loves his toys and just wants to play. He plays with his toys all the time and even goes on walks. He is a little picky, not very good with other dogs. The dog is just a little puppy that goes to the park. He is a super friendly dog. He has not had a bad mouth or bad breath


Although the parameter counts are indeed small, the results are, shall we say, somewhat indescribable.

These models are compatible with many of Lucidrains' popular repositories, such as toolformer-pytorch, PaLM-rlhf-pytorch, and PaLM-pytorch.

The three newly open-sourced models are all baseline models, and they will be trained on larger datasets.

All of the models will be further instruction-tuned on FLAN to provide flan-PaLM variants.

The open-source PaLM models are trained with Flash Attention and XPos rotary embeddings for better length extrapolation, and use a multi-query single-key-value attention mechanism for more efficient decoding.
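To make the decoding benefit concrete, here is a minimal, illustrative PyTorch sketch of multi-query attention. It is not the repository's actual implementation; the dimensions are placeholders matching the 410m configuration shown later in this article.

import torch
from torch import nn

class MultiQueryAttention(nn.Module):
    # Every query head gets its own projection, but all heads share a single
    # key/value head, which shrinks the KV cache and speeds up decoding.
    def __init__(self, dim=1024, heads=8, dim_head=128):
        super().__init__()
        self.heads, self.dim_head, self.scale = heads, dim_head, dim_head ** -0.5
        self.to_q = nn.Linear(dim, heads * dim_head, bias=False)
        self.to_kv = nn.Linear(dim, 2 * dim_head, bias=False)  # one K and one V head total
        self.to_out = nn.Linear(heads * dim_head, dim, bias=False)

    def forward(self, x):
        b, n, _ = x.shape
        q = self.to_q(x).view(b, n, self.heads, self.dim_head).transpose(1, 2)
        k, v = self.to_kv(x).chunk(2, dim=-1)  # each (b, n, dim_head), shared by all heads
        sim = torch.einsum('bhid,bjd->bhij', q, k) * self.scale
        causal = torch.triu(torch.ones(n, n, dtype=torch.bool, device=x.device), 1)
        attn = sim.masked_fill(causal, float('-inf')).softmax(dim=-1)
        out = torch.einsum('bhij,bjd->bhid', attn, v)
        return self.to_out(out.transpose(1, 2).reshape(b, n, -1))

During autoregressive decoding, only one key/value pair per position has to be cached, rather than one per head, which is where the efficiency gain comes from.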

For optimization, decoupled-weight-decay AdamW is used, though Mitchell Wortsman's StableAdamW can be used instead.
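In PyTorch the decoupled variant is simply torch.optim.AdamW; the hyperparameters below are illustrative placeholders, not the repository's actual settings:

import torch
from torch import nn

model = nn.Linear(1024, 1024)  # stand-in module; use the PaLM model in practice
# AdamW applies weight decay directly to the weights, decoupled from the
# gradient-based update (unlike plain Adam with L2 regularization).
optimizer = torch.optim.AdamW(model.parameters(), lr=6e-4, betas=(0.9, 0.95), weight_decay=0.1)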

The models have been uploaded to Torch Hub, with the files stored on the Hugging Face Hub.

If a model fails to download correctly from Torch Hub, be sure to clear the checkpoint and model folders in .cache/torch/hub/. If the problem persists, you can download the files from the Hugging Face repository. Hugging Face integration is currently in progress.
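One way to clear those caches from Python; treat the exact folder names here as assumptions about Torch Hub's usual layout rather than documented paths:

import shutil
from pathlib import Path

hub_dir = Path.home() / ".cache" / "torch" / "hub"
# remove downloaded checkpoints and any cached copy of the repo,
# so Torch Hub re-fetches everything on the next load
shutil.rmtree(hub_dir / "checkpoints", ignore_errors=True)
for repo_dir in hub_dir.glob("conceptofmind_PaLM_*"):
    shutil.rmtree(repo_dir, ignore_errors=True)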

All of the training data was pre-tokenized with the GPT-NeoX tokenizer and truncated to a sequence length of 8192. This saves a large amount of data-preprocessing cost.
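For illustration only (this is not the project's build script), pre-tokenizing a string with the GPT-NeoX tokenizer via Hugging Face transformers might look like:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
# tokenize and truncate to the 8192-token context used by these models
ids = tokenizer("My dog is very cute", truncation=True, max_length=8192)["input_ids"]
print(len(ids), ids[:8])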

These datasets are stored on Hugging Face in Parquet format, where the individual data chunks can be found: C4 Chunk 1, C4 Chunk 2, C4 Chunk 3, C4 Chunk 4, and C4 Chunk 5.

The distributed training script also includes an option to load and preprocess a different dataset, such as OpenWebText, instead of using the provided pre-tokenized C4 data.
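As a rough sketch of pulling in such a corpus with the Hugging Face datasets library (the dataset id and streaming flag below are illustrative, not necessarily what the script does internally):

from datasets import load_dataset

# stream OpenWebText instead of the pre-tokenized C4 chunks
dataset = load_dataset("Skylion007/openwebtext", split="train", streaming=True)
print(next(iter(dataset))["text"][:80])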

Installation: before attempting to run the models, a round of setup is required.

git clone https://github.com/conceptofmind/PaLM.git
cd PaLM/
pip3 install -r requirements.txt

Usage: you can load a pre-trained model via Torch Hub for additional training or fine-tuning:

import torch

model = torch.hub.load("conceptofmind/PaLM", "palm_410m_8k_v0").cuda()

Alternatively, you can load the PyTorch model checkpoints directly:

from palm_rlhf_pytorch import PaLM

model = PaLM(
    num_tokens=50304,
    dim=1024,
    depth=24,
    dim_head=128,
    heads=8,
    flash_attn=True,
    qk_rmsnorm=False,
).cuda()

model.load('/palm_410m_8k_v0.pt')

To generate text with the model, you can use the command line with these flags:

prompt - the prompt used to generate text.

seq_len - the sequence length of the generated text; default 256.

temperature - the sampling temperature; default 0.8.

filter_thres - the filter threshold used for sampling; default 0.9.

model - the model used for generation. There are three options (150m, 410m, 1b): palm_150m_8k_v0, palm_410m_8k_v0, palm_1b_8k_v0.

python3 inference.py "My dog is very cute" --seq_len 256 --temperature 0.8 --filter_thres 0.9 --model "palm_410m_8k_v0"

To improve performance, inference uses torch.compile(), Flash Attention, and Hidet.
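For reference, wrapping a loaded model with torch.compile() in PyTorch 2.x looks like this (a sketch, not the script's exact code):

import torch

model = torch.hub.load("conceptofmind/PaLM", "palm_410m_8k_v0").cuda()
model = torch.compile(model)  # graph compilation and kernel fusion for faster inference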

If you want to extend generation with streaming or other features, the author provides the generic inference script "inference.py".

Training of these "open-source PaLM" models was done on 64 A100 (80 GB) GPUs.

To make training easier, the author also provides a distributed training script, train_distributed.py.

You are free to change the layer and hyperparameter configuration to fit your hardware requirements, and you can also load the model weights and modify the training script to fine-tune the model.
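A minimal fine-tuning sketch along those lines; the loop, batch, and hyperparameters are assumptions for illustration (including the return_loss flag from palm_rlhf_pytorch), not the repository's training script:

import torch
from palm_rlhf_pytorch import PaLM

model = PaLM(
    num_tokens=50304, dim=1024, depth=24,
    dim_head=128, heads=8, flash_attn=True,
).cuda()
model.load('/palm_410m_8k_v0.pt')  # load the released weights

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.1)

seq = torch.randint(0, 50304, (1, 512)).cuda()  # stand-in batch of token ids
loss = model(seq, return_loss=True)  # language-modeling loss on the batch
loss.backward()
optimizer.step()
optimizer.zero_grad()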

Finally, the author says a dedicated fine-tuning script will be added in the future, along with exploration of LoRA.

A different dataset can be preprocessed in the same way as the C4 data used during training by running the build_dataset.py script. This pre-tokenizes the data, splits it into chunks of the specified sequence length, and uploads it to the Hugging Face Hub.

For example:

python3 build_dataset.py --seed 42 --seq_len 8192 --hf_account "your_hf_account" --tokenizer "EleutherAI/gpt-neox-20b" --dataset_name "EleutherAI/the_pile_deduplicated"

PaLM 2 is coming

In April 2022, Google announced its first official PaLM, with 540 billion parameters. Like other LLMs, PaLM can perform a variety of text generation and editing tasks.

PaLM marked the first large-scale use of Google's Pathways system, scaling training to 6,144 chips, by far the largest TPU-based configuration used for training at the time.

Its comprehension ability is outstanding: not only can it read jokes, it can also explain to you why a joke is funny.

Just this March, Google opened up the API for its PaLM large language model for the first time.

This means people can use it for tasks such as summarizing text and writing code, and can even train PaLM into a ChatGPT-like chatbot.

Sundar Pichai will announce the company's latest developments in AI at Google's upcoming annual I/O conference.

Reportedly, its latest and most advanced large language model, PaLM 2, will be launched soon.

PaLM 2 covers more than 100 languages and has been operating under the internal codename "Unified Language Model." It has also undergone extensive testing on coding, math, and creative writing.

Last month, Google said its medical LLM "Med-PaLM 2" can answer medical-exam questions at "expert doctor level," with 85 percent accuracy.

In addition, Google will also release its large-model-powered chatbot Bard, as well as a generative search experience.

Whether this latest round of AI releases can turn things around for Google remains to be seen.

Reference:

https://github.com/conceptofmind/PaLM

This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).
