Nvidia has unveiled ChipNeMo, its latest large language model with 43 billion parameters.
Nvidia has also been explicit in its official announcement about what it is for: AI-assisted chip design.
Specifically, ChipNeMo helps engineers with chip-design tasks, including answering general questions about chip design, summarizing bug documents, and writing scripts for EDA tools.
Bill Dally, Nvidia's chief scientist, said:
Our goal is to make Nvidia's designers more efficient.
Even if ChipNeMo improves our productivity by only a few percentage points, it is worth it.
Take the Nvidia H100 Tensor Core GPU: it is made up of tens of billions of transistors and, under a microscope, looks like a carefully planned city.
Those transistors are connected by "streets" 10,000 times thinner than a human hair, and it takes multiple engineering teams more than two years of coordinated work to build, which shows just how tedious and massive the workload is.
Nvidia therefore believes that, with the arrival of a large language model like ChipNeMo, all of these workflows will become more efficient.
In the words of Bill Dally:
This work marks an important first step in applying large language models to the complex work of semiconductor design.
It shows that even highly specialized areas can use their internal data to train useful generative AI models.
Nvidia's chip designers get a new helper
Overall, ChipNeMo helps Nvidia's internal chip designers in three major areas of work.
The first is question answering.
ChipNeMo includes a prototype chatbot that answers questions about GPU architecture and design; in early testing it has already helped many engineers quickly locate technical documentation.
For example, you can ask it a question:
Why do we add retiming stages?
Based on this question, ChipNeMo will give a precise answer:
Retiming stages are added to improve the design's timing and to account for the physical distance between different parts of the chip.
By inserting retiming stages, we can reduce the time a signal needs to travel between different parts of the chip, which helps improve the overall performance of the design.
Retiming also helps reduce the noise and interference that may appear on the signal, further improving the design's performance.
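To picture how such an engineering-assistant chatbot is typically wired together, here is a minimal retrieval-augmented Q&A sketch in Python. It is not Nvidia's implementation: the article only says ChipNeMo uses a domain-adapted retrieval model in front of the LLM, so the toy bag-of-words "embedding", the document snippets, and the prompt layout below are all illustrative assumptions.

```python
# Minimal retrieval-augmented Q&A sketch. ChipNeMo's actual chatbot uses a
# domain-adapted retrieval model and Nvidia's own LLM; the embedding and the
# final generation step here are toy stand-ins, purely for illustration.
import math
from collections import Counter

DOCS = [
    "Retiming stages are inserted to improve design timing across long wires.",
    "EDA scripts can query cell counts inside a rectangular layout region.",
    "Bug reports contain an ID, synopsis, module and detailed description.",
]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (stand-in for a domain-adapted retriever)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    # In the real system this prompt would be sent to the domain-adapted LLM;
    # here we just return it to show the structure.
    return prompt

print(answer("Why do we add retiming stages?"))
```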
The second is EDA script generation.
For example, you can simply ask ChipNeMo to generate code in natural language:
Write code in TOOL1 that outputs the number of flip-flop cells in a given rectangle (0, 0, 100, 100).
A moment later, an annotated code snippet is generated.
Nvidia is reportedly still developing this code generator, which will be integrated with its existing tools in the future so that engineers can use it more conveniently.
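The sketch below only illustrates how a natural-language request like the one above could be wrapped into a generation prompt. TOOL1 is an internal Nvidia tool whose API is not public, so the documentation snippet, function names, and prompt format here are entirely hypothetical.

```python
# Sketch of wrapping a natural-language request into a script-generation prompt.
# TOOL1's real API is internal to Nvidia; the snippet below is hypothetical.
TOOL1_DOC = """
# Hypothetical TOOL1 calls the model might have seen during training:
# get_cells(type, region) -> list of cell handles
# region(x0, y0, x1, y1)  -> rectangular layout region
"""

def build_script_prompt(request: str) -> str:
    """Combine tool documentation and the engineer's request into one prompt."""
    return (
        "You are an assistant that writes annotated TOOL1 scripts.\n"
        f"{TOOL1_DOC}\n"
        f"Request: {request}\n"
        "Script:"
    )

prompt = build_script_prompt(
    "Output the number of flip-flop cells in the rectangle (0, 0, 100, 100)."
)
# The prompt would then be passed to the domain-adapted model for generation.
print(prompt)
```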
The last area is bug summarization and analysis.
A chip designer only needs to describe the situation to ChipNeMo; the prompt may include the bug's ID, Synopsis, Module, Description and so on.
ChipNeMo then produces both a solid technical summary and a managerial summary from that prompt.
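As a rough illustration of that workflow, the sketch below assembles a prompt from the fields mentioned above (ID, Synopsis, Module, Description) and asks for both summaries. The field values and the output format are illustrative assumptions, not Nvidia's actual internal template.

```python
# Sketch of a bug-summarization prompt built from the fields named above.
from dataclasses import dataclass

@dataclass
class BugReport:
    bug_id: str
    synopsis: str
    module: str
    description: str

def build_summary_prompt(bug: BugReport) -> str:
    return (
        f"Bug ID: {bug.bug_id}\n"
        f"Synopsis: {bug.synopsis}\n"
        f"Module: {bug.module}\n"
        f"Description: {bug.description}\n\n"
        "Write (1) a technical summary for engineers and "
        "(2) a short managerial summary."
    )

example = BugReport(
    bug_id="BUG-1234",                      # hypothetical values
    synopsis="Timing violation on a clock domain crossing",
    module="mem_ctrl",
    description="Setup violation observed after the latest floorplan change.",
)
print(build_summary_prompt(example))        # prompt sent to the chip-design LLM
```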
How was ChipNeMo built?
First, for the dataset, Nvidia mainly draws on bug summaries, design source code (Design Source), documentation, Wikipedia, GitHub and other hardware-related code and natural-language text.
After a centralized data collection process, cleaning and filtering yield 24.1 billion tokens.
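The article does not spell out the cleaning and filtering steps, but a minimal sketch of that kind of pass might look like the following, with whitespace splitting standing in for the custom tokenizer and the thresholds chosen purely for illustration.

```python
# Toy sketch of a cleaning/filtering pass before pretraining: deduplicate
# documents, drop very short ones, and count the surviving tokens.
def clean_and_count(documents: list[str], min_tokens: int = 8) -> tuple[list[str], int]:
    seen: set[str] = set()
    kept: list[str] = []
    total_tokens = 0
    for doc in documents:
        normalized = " ".join(doc.split())      # collapse whitespace
        if normalized in seen:                  # exact-duplicate removal
            continue
        tokens = normalized.split()
        if len(tokens) < min_tokens:            # drop near-empty fragments
            continue
        seen.add(normalized)
        kept.append(normalized)
        total_tokens += len(tokens)
    return kept, total_tokens

docs = [
    "module counter (input clk, input rst, output reg [7:0] q);",
    "module counter (input clk, input rst, output reg [7:0] q);",  # duplicate
    "see bug 42",                                                   # too short
]
print(clean_and_count(docs))
```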
Second, in terms of algorithms and architecture, Nvidia does not simply deploy a commercial or open-source large language model as-is.
Instead, it relies on domain-adaptation (Domain-Adapted) techniques, including custom tokenizers, domain-adaptive continued pretraining, supervised fine-tuning (SFT) with domain-specific instructions, and domain-adapted retrieval models.
With this approach, the large language model performs better on the engineering assistant chatbot, EDA script generation, and bug summarization and analysis.
The results show that these domain-adaptation techniques let the large language model outperform general-purpose base models, while allowing the model size to be reduced by up to 5x with similar or better performance.
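As a rough sketch of two of those steps (extending the tokenizer with domain tokens and continuing pretraining on domain text), the following uses the Hugging Face transformers library with GPT-2 as a small stand-in base model; ChipNeMo itself is built on Nvidia's own foundation models, and the added tokens and tiny training corpus here are made-up examples.

```python
# Sketch of custom-tokenizer extension + domain-adaptive continued pretraining.
# GPT-2 is only a small stand-in base model; the added tokens and the tiny
# "domain corpus" are made-up examples, not ChipNeMo's actual data.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token                      # GPT-2 has no pad token
tok.add_tokens(["<flip_flop>", "<clock_tree>"])    # hypothetical domain tokens

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.resize_token_embeddings(len(tok))            # make room for new tokens

corpus = Dataset.from_dict({"text": [
    "The <clock_tree> is balanced to minimise skew across the die.",
    "Each <flip_flop> is characterised for setup and hold at all corners.",
]})
tokenized = corpus.map(
    lambda batch: tok(batch["text"], truncation=True, max_length=64),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt_sketch", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to=[]),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()   # continued pretraining on the (tiny) domain corpus
```

Supervised fine-tuning with domain-specific instructions and the domain-adapted retrieval model would follow as separate stages on top of a model adapted this way.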
But the paper's authors also say candidly:
Although the current results show progress, there is still a gap between them and the desired results. Further research on domain-adapted LLM methods will help narrow this gap.
Reference links:
[1] https://blogs.nvidia.com/blog/2023/10/30/llm-semiconductors-chip-nemo/
[2] https://www.eetimes.com/nvidia-trains-llm-on-chip-design/
[3] https://d1qx31qr3h6wln.cloudfront.net/publications/ChipNeMo%20%2824%29.pdf