
How to find and solve the problem that the interrupt driver register of PCIe interface is overwritten

Updated 2025-01-18. From: SLTechnology News&Howtos (shulou)


This article describes how we discovered and solved a problem in which the interrupt register handled by a PCIe interface driver was overwritten.

Recently, while debugging a PCIe network driver on Windows, we found that some interrupts were not being handled and suspected that they were being lost. Debugging narrowed the problem down to the following two areas.

Repeated DMA write starts

We use the WDF framework to implement the DMA read and write functions of the PCIe driver on Windows. A driver starts a DMA transfer in two steps:

Initialize the DMA transaction object

Execute the DMA transfer

When initializing the DMA transaction object, the driver writes the address and length of the data buffer for this transfer into the object and registers the callback function PCIeEvtProgramWriteDma, which configures the hardware and starts the DMA transfer.

To execute the transfer, the driver only needs to call the WDF function WdfDmaTransactionExecute. The operating system then calls the callback registered in the previous step to configure the hardware and start the DMA transfer.
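The two steps above can be modelled in plain user-space C. This is only a sketch of the calling pattern, not WDF code: the type `dma_transaction` and the functions `dma_transaction_init`, `dma_transaction_execute` and `program_write_dma` are hypothetical stand-ins for the transaction object, the initialize and execute calls, and a callback such as PCIeEvtProgramWriteDma.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a DMA transaction object (not the WDF type). */
typedef struct dma_transaction dma_transaction;
typedef bool (*program_dma_fn)(dma_transaction *t);

struct dma_transaction {
    const void    *buffer;        /* data buffer to transfer */
    size_t         length;        /* length of the buffer in bytes */
    program_dma_fn program;       /* callback that configures and starts the DMA */
    unsigned       program_calls; /* how many times the callback was invoked */
};

/* Step 1: store the buffer address and length, register the callback. */
static void dma_transaction_init(dma_transaction *t, const void *buf,
                                 size_t len, program_dma_fn fn)
{
    assert(buf != NULL && len > 0 && fn != NULL);
    t->buffer = buf;
    t->length = len;
    t->program = fn;
    t->program_calls = 0;
}

/* Step 2: the "framework" side invokes the registered callback. */
static bool dma_transaction_execute(dma_transaction *t)
{
    t->program_calls++;
    return t->program(t);
}

/* Stand-in for the program-DMA callback. A real callback would write the
 * buffer address and length to device registers and set a start bit. */
static bool program_write_dma(dma_transaction *t)
{
    return t->buffer != NULL && t->length > 0;
}
```

In this model one call to `dma_transaction_execute` always produces exactly one callback invocation, which is the behaviour the driver expects from the real framework.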

Normally, each call to WdfDmaTransactionExecute results in exactly one call to the callback for hardware configuration. After we moved to a new hardware platform (CPU + FPGA), however, the DMA write path developed a serious problem: a single call to WdfDmaTransactionExecute could result in several calls to the callback. Each callback ran to completion and triggered a DMA write completion interrupt, which threw the driver's interrupt state machine out of order. The visible symptom was that subsequent DMA write interrupts started to go missing and DMA writes could no longer be started normally.

Figure 1 is a screenshot showing how many times the driver called WdfDmaTransactionExecute and how many times the operating system called the callback.

Figure 1 DebugMonito monitoring

Here 5658 (5576 / 82 / 0) is the number of times the driver called WdfDmaTransactionExecute, and 5664 is the number of times the operating system called the callback. The difference of 6 (5664 - 5658) is the number of repeated callback invocations.

The hardware completed each DMA transfer normally and raised the DMA write completion interrupt. Our hypothesis at this point was that the operating system invokes the callback repeatedly because it considers the configuration to have failed and retries until it finally succeeds. The hardware is not aware of any such failure: every invocation starts a DMA write and raises a completion interrupt, which derails the driver's interrupt state machine.

Windows is closed source, so we could not dig further into the operating system to find the root cause. We changed approach and asked whether the interrupt state machine itself could be protected against this.

Driver interrupt state machine

To make debugging easier, we added extensive logging at key points in the interrupt handler, and the logs gave us a clue.

Figure 2 Log print record

The log in figure 2 shows two instances of the deferred interrupt handler, MPHandleInterrupt, executing in parallel. The direct consequence is an overwrite: the first instance has already read the saved interrupt register value, and the value belonging to the second interrupt is overwritten before the deferred handler can process it, so that interrupt is never handled.

This is clearly wrong. To fix it, we added a lock inside MPHandleInterrupt so that it can no longer execute in parallel with itself. With the lock in place, the interrupt register value is no longer overwritten.

That is how we found and solved the problem of the interrupt register being overwritten in the PCIe interface driver.
