How to analyze Linux Kernel SCSI IO Subsystem

2025-07-04 Update. From: SLTechnology News&Howtos

This article analyzes in detail how the Linux kernel SCSI subsystem processes IO, from the submission of a request to the handling of errors.

Overview

The SCSI subsystem in the Linux kernel is composed of the SCSI upper layer, the middle layer and the low-level driver modules [1]. It is mainly responsible for managing SCSI resources and for handling IO requests submitted to the SCSI subsystem by other subsystems, such as the file system. Understanding the IO processing mechanism of the SCSI subsystem is therefore essential to understanding the SCSI subsystem as a whole, and it also helps in understanding the IO processing mechanism of the entire Linux kernel. This article describes the IO processing mechanism of the SCSI subsystem from three aspects: the submission of SCSI device access requests, the processing of access requests by the SCSI subsystem, and the error handling of the SCSI subsystem.

Submission of SCSI device access request

The submission of a SCSI device access request is divided into two steps: user space submits the access request to the generic block layer, and the generic block layer submits the block access request to the SCSI subsystem.

User space submits access requests to the generic block layer

In Linux user space, there are three ways to submit requests for access to SCSI devices to the generic block layer:

Access through the file access interface provided by the file system. Reading and writing files in a Linux file system built on a SCSI device belongs to this access method.

RAW device access. A common application of this access method is the dd command. Unlike access through the file system, RAW device access provides linear address access to the SCSI device directly and needs no address mapping by the file system.

SCSI passthrough. Access through the SG interface provided by Linux belongs to this method. Users can issue CDB [2] commands directly to SCSI devices, so this interface can be used for SCSI management operations such as SES management.
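As an illustration of the passthrough method, the following is a minimal sketch that fills in the header used by the Linux SG_IO ioctl for a standard INQUIRY command. The helper names build_inquiry and send_inquiry, the buffer sizes and the timeout are assumptions made for this example; only sg_io_hdr_t, SG_IO and SG_DXFER_FROM_DEV come from <scsi/sg.h>. The caller is assumed to have already opened an sg device node.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define INQUIRY_OPCODE 0x12   /* standard SCSI INQUIRY opcode */
#define INQUIRY_LEN    96     /* illustrative allocation length */
#define SENSE_LEN      32     /* illustrative sense buffer size */

/* Hypothetical helper: fill an sg_io_hdr for a 6-byte INQUIRY CDB. */
void build_inquiry(sg_io_hdr_t *hdr, unsigned char cdb[6],
                   unsigned char *data, unsigned char *sense)
{
    memset(cdb, 0, 6);
    cdb[0] = INQUIRY_OPCODE;
    cdb[4] = INQUIRY_LEN;                 /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id    = 'S';           /* required by the sg driver */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmdp            = cdb;           /* the CDB sent to the device */
    hdr->cmd_len         = 6;
    hdr->dxferp          = data;          /* where INQUIRY data lands */
    hdr->dxfer_len       = INQUIRY_LEN;
    hdr->sbp             = sense;         /* where sense data lands */
    hdr->mx_sb_len       = SENSE_LEN;
    hdr->timeout         = 5000;          /* milliseconds, illustrative */
}

/* Hypothetical helper: issue the INQUIRY on an already opened sg device.
 * Returns the ioctl result, which is -1 on failure. */
int send_inquiry(int fd, unsigned char *data, unsigned char *sense)
{
    unsigned char cdb[6];
    sg_io_hdr_t hdr;

    build_inquiry(&hdr, cdb, data, sense);
    return ioctl(fd, SG_IO, &hdr);
}
```

The file system and RAW device paths need no such header, because the kernel builds the CDB itself from the block request.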

Figure 1 shows how the Linux kernel handles the three request submission methods.

Figure 1. How the Linux kernel handles three access requests

Requests submitted through the file system or RAW devices go through the underlying block device access layer (ll_rw_block()), which generates block IO requests (BIO) and submits them to the generic block layer [3]. Access requests submitted through the SG interface call the interface provided by the SCSI middle layer, which hands the request directly to the generic block layer for processing.

The generic block layer submits a block access request to the SCSI subsystem

Why go through the generic block layer? First, the generic block layer optimizes requests according to the characteristics of disk access. Second, it provides scheduling functions for requests. Third, its extensible structure makes it easy to integrate the block drivers of all kinds of devices.

When a request is submitted to the generic block layer, the generic block layer prepares and schedules the block access request and delivers it to the SCSI middle layer. A block access request describes the block area to access, the access method, and the associated BIOs; in the kernel it is represented by the 'struct request' structure. Each block device has a corresponding block access request queue that records the access requests the device still has to process, and a newly generated block access request is added to the queue of the corresponding device. The IO processing done by the SCSI subsystem is in fact the processing of the block access requests on this queue.

The generic block layer provides two ways to schedule processing of the block access request queue: direct scheduling, and scheduled execution through the Linux kernel work queue mechanism. Both ways ultimately call the block access request queue handler. A SCSI device registers the block access request queue handlers defined by the SCSI subsystem with the generic block layer during initialization; Listing 1 [4] shows this process. As a result, when the generic block layer processes the block access request queue of a SCSI device, the handlers defined by the SCSI middle layer are called, and the processing of block access requests is handed over to the SCSI subsystem.

Listing 1. Processing function

struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
{
    // ask the generic block layer to allocate a request queue
    // and register scsi_request_fn as its handler
    q = blk_init_queue(scsi_request_fn, NULL);
    ……
    blk_queue_prep_rq(q, scsi_prep_fn);    // prepare a SCSI request
    blk_queue_max_hw_segments(q, shost->sg_tablesize);    // define the sg table size
    ……
    blk_queue_softirq_done(q, scsi_softirq_done);
}
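The registration pattern in Listing 1 can be modelled in user space as a queue that stores a handler pointer and calls it whenever the queue is run. This is only a sketch of the idea: every name starting with toy_ is invented for the example and does not exist in the kernel, and the real request queue carries far more state.

```c
#include <stddef.h>

#define TOY_QUEUE_DEPTH 8

struct toy_request {
    unsigned long sector;       /* start of the block area */
    unsigned int nr_sectors;    /* length of the block area */
};

struct toy_queue;
typedef void (*toy_request_fn)(struct toy_queue *q);

struct toy_queue {
    toy_request_fn request_fn;                  /* handler registered by the driver */
    struct toy_request *pending[TOY_QUEUE_DEPTH];
    size_t count;                               /* requests waiting in the queue */
    unsigned int handled;                       /* requests consumed so far */
};

/* Plays the role of blk_init_queue(): remember the handler. */
void toy_init_queue(struct toy_queue *q, toy_request_fn fn)
{
    q->request_fn = fn;
    q->count = 0;
    q->handled = 0;
}

/* Block layer side: add a request; returns -1 if the queue is full. */
int toy_add_request(struct toy_queue *q, struct toy_request *rq)
{
    if (q->count == TOY_QUEUE_DEPTH)
        return -1;
    q->pending[q->count++] = rq;
    return 0;
}

/* Block layer side: run the queue by calling the registered handler. */
void toy_run_queue(struct toy_queue *q)
{
    if (q->request_fn)
        q->request_fn(q);
}

/* "SCSI" side: the registered handler drains every pending request. */
void toy_scsi_request_fn(struct toy_queue *q)
{
    while (q->count > 0) {
        q->count--;
        q->handled++;
    }
}
```

The block layer code in this model never names the SCSI handler directly; it only calls the pointer stored at initialization, which is what lets other block drivers plug in their own handlers.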

The SCSI subsystem handles block access requests

When the request queue handler of the SCSI subsystem is called by the generic block layer, the SCSI middle layer generates and initializes a SCSI command (struct scsi_cmnd) according to the content of the block access request and submits it to the SCSI target.

The SCSI command records SCSI-related information such as the command descriptor block (CDB), the sense buffer and the IO timeout, as well as other information the SCSI subsystem needs to process the command, such as callback functions. Listing 2 shows the main fields of this structure.

Listing 2. Main structure

struct scsi_cmnd {
    ……
    void (*done)(struct scsi_cmnd *);    /* Mid-level done function */
    ……
    int retries;    /* number of retries */
    int timeout_per_command;    /* timeout definition */
    ……
    enum dma_data_direction sc_data_direction;    /* data transfer direction */
    ……
    unsigned char cmnd[MAX_COMMAND_SIZE];    /* CDB */
    void *request_buffer;    /* Actual requested buffer */
    struct request *request;    /* The command we are working on */
    ……
    unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
        /* obtained by REQUEST SENSE when
         * CHECK CONDITION is received on original
         * command (auto-sense) */
    /* Low-level done function - can be used by
     * low-level driver to point to completion function. */
    void (*scsi_done)(struct scsi_cmnd *);
    ……
};

The initialization process first takes a block access request from the request queue of the block device according to the elevator scheduling algorithm, and sets the direction, length and address of the data transfer in the SCSI command according to the information in the block access request. It then sets up the CDB, the callback function of the SCSI middle layer, and so on.
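To make the CDB step concrete, here is a small sketch that turns a transfer direction, a starting block address and a block count into a 10-byte READ(10) or WRITE(10) CDB. The opcodes and byte layout follow the SCSI block command set; the function toy_build_rw10 and the enum are assumptions for this example and are not the kernel's own code.

```c
#include <stdint.h>
#include <string.h>

#define READ_10  0x28   /* SCSI READ(10) opcode */
#define WRITE_10 0x2A   /* SCSI WRITE(10) opcode */

enum toy_direction { TOY_FROM_DEVICE, TOY_TO_DEVICE };

/* Hypothetical helper: fill a READ(10)/WRITE(10) CDB.
 * Bytes 2..5 hold the logical block address, big-endian.
 * Bytes 7..8 hold the transfer length in blocks, big-endian. */
void toy_build_rw10(uint8_t cdb[10], enum toy_direction dir,
                    uint32_t lba, uint16_t nblocks)
{
    memset(cdb, 0, 10);
    cdb[0] = (dir == TOY_TO_DEVICE) ? WRITE_10 : READ_10;
    cdb[2] = (uint8_t)(lba >> 24);
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[7] = (uint8_t)(nblocks >> 8);
    cdb[8] = (uint8_t)nblocks;
}
```

In this model the direction decides the opcode, while the address and length of the block access request map onto the address and transfer length fields of the CDB.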

After initialization, the SCSI middle layer submits the SCSI command to the SCSI low-level driver by calling the queuecommand function declared in the scsi_host_template [5] structure. queuecommand is the SCSI command queuing function, and its concrete implementation is provided by the low-level driver. Calling queuecommand in the SCSI middle layer therefore invokes the implementation defined by the low-level driver and hands the SCSI command to the vendor-specific driver for processing. This mechanism mirrors the way the generic block layer calls the handlers of the SCSI middle layer to process block requests, and it reflects the good extensibility of the Linux kernel code.

Once the low-level driver receives the request, it starts processing the SCSI command. This layer is closely tied to the hardware, so its code is generally implemented by each vendor. The basic process can be summarized as follows: take a SCSI command from the queue maintained by the low-level driver, package it into a request format defined by the vendor, and submit the request to the SCSI target by DMA or other means. The SCSI target processes the request and returns the execution result to the SCSI low-level driver.
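The dispatch through the host template can be sketched as a structure of function pointers that a vendor driver fills in. The model below is a user-space illustration under stated assumptions: all toy_ names are invented, the driver queue is a fixed array, and toy_vendor_irq merely stands in for the hardware completion interrupt of a real driver. The queuecommand signature mirrors the (command, done callback) form used by kernels of the generation shown in the listings.

```c
#include <stddef.h>

#define TOY_LLD_DEPTH 4

struct toy_cmnd {
    unsigned char cdb[16];
    int result;                               /* filled in on completion */
    void (*scsi_done)(struct toy_cmnd *);     /* completion hook saved by the driver */
};

struct toy_host_template {
    const char *name;
    int (*queuecommand)(struct toy_cmnd *cmd,
                        void (*done)(struct toy_cmnd *));
};

/* Queue maintained by the hypothetical vendor driver. */
static struct toy_cmnd *toy_lld_queue[TOY_LLD_DEPTH];
static int toy_lld_count;
static int toy_midlayer_completions;

/* Vendor side: accept the command, or return 1 when the driver is busy. */
int toy_vendor_queuecommand(struct toy_cmnd *cmd,
                            void (*done)(struct toy_cmnd *))
{
    if (toy_lld_count == TOY_LLD_DEPTH)
        return 1;
    cmd->scsi_done = done;
    toy_lld_queue[toy_lld_count++] = cmd;
    return 0;
}

/* Vendor side: complete the oldest command, as an interrupt handler would. */
int toy_vendor_irq(int result)
{
    struct toy_cmnd *cmd;
    int i;

    if (toy_lld_count == 0)
        return -1;
    cmd = toy_lld_queue[0];
    for (i = 1; i < toy_lld_count; i++)
        toy_lld_queue[i - 1] = toy_lld_queue[i];
    toy_lld_count--;
    cmd->result = result;
    cmd->scsi_done(cmd);
    return 0;
}

/* Middle layer side: callback handed to the driver with each command. */
void toy_midlayer_done(struct toy_cmnd *cmd)
{
    if (cmd->result == 0)
        toy_midlayer_completions++;
}

/* Middle layer side: submit through whatever driver the template names. */
int toy_midlayer_submit(const struct toy_host_template *tmpl,
                        struct toy_cmnd *cmd)
{
    return tmpl->queuecommand(cmd, toy_midlayer_done);
}
```

Swapping in another driver only requires a different template; toy_midlayer_submit stays unchanged.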

Processing of the execution result of SCSI command

When the SCSI low-level driver receives the command execution result returned by the SCSI target, the SCSI subsystem processes the result through two callbacks. First, the low-level driver calls the callback function defined by the SCSI middle layer and hands the result to the middle layer for processing; this is the first callback. After the middle layer has finished its processing, it calls the callback function defined by the SCSI upper layer, which ends the processing of the IO in the SCSI subsystem; this is the second callback.

First callback:

When the SCSI middle layer calls queuecommand to submit a SCSI command to the low-level driver, it also passes a callback function pointer to the driver. After the low-level driver receives the execution result returned by the SCSI target, it calls this callback, which raises the BLOCK_SOFTIRQ soft interrupt to carry out the first callback processing. During this processing, the SCSI middle layer first determines, based on the result reported by the low-level driver, whether the request was processed successfully. Success here does not mean that no error occurred; it means the returned information makes clear to the middle layer that it does not need to process the command any further. For a successfully processed SCSI command, the middle layer therefore calls the second callback function and enters the second callback. Listing 3 shows the handler for this soft interrupt defined by the SCSI middle layer.

Listing 3. The handler function of the soft interrupt

static void scsi_softirq_done(struct request *rq)
{
    ……
    disposition = scsi_decide_disposition(cmd);
    ……
    switch (disposition) {
    case SUCCESS:
        scsi_finish_command(cmd);    // enter the second callback
        break;
    case NEEDS_RETRY:
        scsi_retry_command(cmd);
        break;
    case ADD_TO_MLQUEUE:
        scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
        break;
    default:
        if (!scsi_eh_scmd_add(cmd, 0))
            scsi_finish_command(cmd);
    }
}

Second callback:

Different SCSI upper-layer modules define their own second callback functions. The SD module, for example, sets its own second callback function, sd_rw_intr, in the sd_init_command function. This callback further processes the result of the SCSI command according to the needs of the SD module. Listing 4 shows the code with which the SD module registers the second callback. Although each SCSI upper-layer module can define its own second callback function, all of these callbacks eventually end the processing of the block access request by the SCSI subsystem.

Listing 4. Code for registering the second callback for the SD module

static int sd_init_command(struct scsi_cmnd *SCpnt)
{
    ……
    SCpnt->done = sd_rw_intr;
    return 1;
}
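The two callbacks chain together as sketched below. This is a simplified user-space model, assuming a single success flag instead of the real result codes; the toy_ names are invented for the example, with toy_sd_rw_intr and toy_sd_init_command standing in for the roles of sd_rw_intr and sd_init_command.

```c
#include <stddef.h>

struct toy_scsi_cmd {
    int status_ok;      /* 1 if the driver reported success */
    int finished;       /* set by the second callback */
    int sent_to_eh;     /* set when the command needs error handling */
    void (*done)(struct toy_scsi_cmd *);        /* second callback (upper layer) */
    void (*scsi_done)(struct toy_scsi_cmd *);   /* first callback (middle layer) */
};

/* Upper layer: second callback, ends the processing of the request. */
void toy_sd_rw_intr(struct toy_scsi_cmd *cmd)
{
    cmd->finished = 1;
}

/* Upper layer: register the second callback, as in Listing 4. */
int toy_sd_init_command(struct toy_scsi_cmd *cmd)
{
    cmd->done = toy_sd_rw_intr;
    return 1;
}

/* Middle layer: first callback, decides whether to go on to the second. */
void toy_first_callback(struct toy_scsi_cmd *cmd)
{
    if (cmd->status_ok)
        cmd->done(cmd);
    else
        cmd->sent_to_eh = 1;
}

/* Low-level driver: report a result through the first callback. */
void toy_lld_complete(struct toy_scsi_cmd *cmd, int ok)
{
    cmd->status_ok = ok;
    cmd->scsi_done(cmd);
}
```

The low-level driver in this model knows only about scsi_done, and the middle layer knows only about done, so each layer can be replaced without touching the others.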

Error handling of SCSI Subsystem

Error handling in the SCSI low-level driver is implemented by each vendor and is not discussed here. The error handling of the SCSI subsystem is done mainly by the SCSI middle layer. During the first callback, the low-level driver returns the processing result of the SCSI command and the SCSI status information to the middle layer. The middle layer first examines the execution result returned by the low-level driver. If no clear conclusion can be drawn from it, the middle layer examines the SCSI status and sense data returned by the driver. If the command is judged to have been processed successfully, the middle layer makes the second callback directly; a command that needs to be retried is re-added to the request queue of the block device and processed again. This process can be called the basic judgment method that the SCSI middle layer applies to the execution result of a SCSI command.
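The basic judgment method can be sketched as a function that looks at the driver result first, then at the SCSI status, and only then at the sense data. The mapping below is a deliberately reduced assumption for illustration and is much smaller than the table the kernel actually applies in scsi_decide_disposition; the status codes and sense keys are the standard SCSI values, and all toy_ names are invented.

```c
#include <stddef.h>

enum toy_disposition {
    TOY_SUCCESS,          /* go straight to the second callback */
    TOY_NEEDS_RETRY,      /* put the command back on the request queue */
    TOY_ADD_TO_MLQUEUE,   /* device busy, requeue in the middle layer */
    TOY_FAILED            /* no clear verdict, hand to error handling */
};

/* Standard SCSI status codes. */
#define STATUS_GOOD            0x00
#define STATUS_CHECK_CONDITION 0x02
#define STATUS_BUSY            0x08
#define STATUS_TASK_SET_FULL   0x28

/* Standard SCSI sense keys. */
#define SENSE_RECOVERED_ERROR  0x01
#define SENSE_UNIT_ATTENTION   0x06

/* Simplified decision: host_ok is the driver's own verdict, status is the
 * SCSI status byte, sense points to fixed-format sense data or is NULL. */
enum toy_disposition toy_decide(int host_ok, unsigned char status,
                                const unsigned char *sense)
{
    if (!host_ok)
        return TOY_FAILED;

    switch (status) {
    case STATUS_GOOD:
        return TOY_SUCCESS;
    case STATUS_BUSY:
    case STATUS_TASK_SET_FULL:
        return TOY_ADD_TO_MLQUEUE;
    case STATUS_CHECK_CONDITION:
        break;                       /* need the sense data to decide */
    default:
        return TOY_FAILED;
    }

    /* Fixed-format sense data: response code 0x70, key in byte 2. */
    if (sense == NULL || (sense[0] & 0x7f) != 0x70)
        return TOY_FAILED;

    switch (sense[2] & 0x0f) {
    case SENSE_RECOVERED_ERROR:
        return TOY_SUCCESS;
    case SENSE_UNIT_ATTENTION:
        return TOY_NEEDS_RETRY;
    default:
        return TOY_FAILED;
    }
}
```

The TOY_FAILED result corresponds to the default branch of Listing 3, where the command is passed to scsi_eh_scmd_add.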

It is not always that simple. Some errors provide no clear basis for judgment, such as sense data errors or timeout errors. To handle them, the SCSI subsystem of the Linux kernel introduces a thread dedicated to error handling, and SCSI commands whose cause of error cannot be determined are handed over to this thread. The thread works with two queues: the error handling queue (eh_work_q) and the error handling completion queue (eh_done_q). The error handling queue records the SCSI commands that still need error handling, and the error handling completion queue records the SCSI commands that have been dealt with during error handling. Listing 5 shows how the thread handles the commands recorded on the error handling queue.

Listing 5. The process of error handling

static void scsi_unjam_host(struct Scsi_Host *shost)
{
    ……
    if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))    // get sense data
        if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))    // abort commands
            scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);    // reset
    scsi_eh_flush_done_q(&eh_done_q);    // complete error IO on done_q
    ……
}

The whole process can be divided into four stages:

Sense data query stage

Querying the sense data provides a basis for judging the SCSI command, and the judgment follows the basic judgment method described above. If the result is success or retry, the command is moved from the error handling queue to the error handling completion queue. If no judgment can be made, the command remains on the error handling queue, and error handling enters the ABORT stage.

ABORT stage

In this stage, the SCSI commands on the error handling queue are actively aborted. Aborted commands are added to the error handling completion queue. If commands that could not be handled remain on the error handling queue after the ABORT stage ends, error handling enters the START STOP UNIT stage.

START STOP UNIT stage

In this stage, the START STOP UNIT [6] command is sent to the SCSI devices associated with the commands on the error handling queue in an attempt to recover those devices. If commands still remain on the error handling queue after the START STOP UNIT stage ends, they are handled in the RESET stage.

RESET stage

The RESET stage is divided into three levels: DEVICE RESET, BUS RESET and HOST RESET. First, the SCSI devices related to the commands on the error handling queue are reset. If a SCSI device returns to a normal state after DEVICE RESET, the commands on the error handling queue that relate to this device are added to the error handling completion queue. If DEVICE RESET cannot handle all the failed commands, error handling enters the BUS RESET level, which resets the buses related to the commands on the error handling queue. If BUS RESET cannot handle all the SCSI commands on the error handling queue, error handling enters the HOST RESET level, which resets the hosts related to the commands on the error handling queue. HOST RESET may also fail to handle all the failed commands. In that case the SCSI devices associated with the commands still on the error handling queue are considered unusable: they are marked as unusable, and the related commands are added to the error handling completion queue.
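The escalation through these stages can be modelled as a loop that tries each stage in order and stops as soon as the error handling queue is empty. In the sketch below every failed command carries a made-up field, recovers_at, naming the stage that would fix it; that field and all toy_ names are assumptions for the example, since a real recovery outcome depends on the hardware.

```c
#include <stddef.h>

enum toy_stage {
    TOY_GET_SENSE,
    TOY_ABORT,
    TOY_START_UNIT,
    TOY_DEVICE_RESET,
    TOY_BUS_RESET,
    TOY_HOST_RESET,
    TOY_STAGE_COUNT     /* also means "no stage can recover this command" */
};

struct toy_eh_cmd {
    enum toy_stage recovers_at;   /* hypothetical: the stage that fixes it */
    int in_done_q;                /* 1 once moved to the completion queue */
    int device_offline;           /* 1 if the device was marked unusable */
};

/* Run the stages in order over n failed commands.
 * Returns the number of stages that were attempted. */
int toy_unjam_host(struct toy_eh_cmd *cmds, size_t n)
{
    size_t remaining = n;
    size_t i;
    int stage;

    for (stage = 0; stage < TOY_STAGE_COUNT && remaining > 0; stage++) {
        for (i = 0; i < n; i++) {
            if (!cmds[i].in_done_q && (int)cmds[i].recovers_at == stage) {
                cmds[i].in_done_q = 1;    /* move to the completion queue */
                remaining--;
            }
        }
    }

    /* Nothing helped: mark the device unusable and complete the command. */
    for (i = 0; i < n; i++) {
        if (!cmds[i].in_done_q) {
            cmds[i].device_offline = 1;
            cmds[i].in_done_q = 1;
        }
    }
    return stage;
}
```

Every command ends up on the completion queue in this model, either because some stage recovered it or because its device was taken offline.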

For each command added to the error handling completion queue, if the device state is normal and the number of retries of the command is less than the allowed number, the command is re-added to the block access request queue and processed again. Otherwise, the second callback is made directly to complete the processing of the block access request by the SCSI subsystem. With this, the SCSI subsystem has completed the whole error handling process for SCSI commands.
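The decision made for each command on the completion queue can be written as a short check. This is a sketch of the rule as stated above, with invented names (toy_flush_one, struct toy_done_cmd); the kernel's own check in scsi_eh_flush_done_q takes further conditions into account.

```c
enum toy_flush_action {
    TOY_REQUEUE,   /* back onto the block access request queue */
    TOY_FINISH     /* go to the second callback and complete the request */
};

struct toy_done_cmd {
    int device_online;   /* 1 if the device state is normal */
    int retries;         /* retries used so far */
    int allowed;         /* maximum number of retries */
};

/* Decide what happens to one command taken off the completion queue. */
enum toy_flush_action toy_flush_one(struct toy_done_cmd *cmd)
{
    if (cmd->device_online && cmd->retries < cmd->allowed) {
        cmd->retries++;          /* this requeue consumes one retry */
        return TOY_REQUEUE;
    }
    return TOY_FINISH;
}
```

A command on an offline device is finished immediately, whatever its retry count, because resubmitting it could not succeed.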
