
(Technical Analysis) KVM Virtualization Principles

2025-03-29 Update From: SLTechnology News&Howtos


VMCS structure

The VMCS is a data structure kept in memory. It holds the register contents of a virtual CPU together with the control information for that virtual CPU. Each VMCS corresponds to one virtual CPU.

A VMCS must be bound to a physical CPU before it can be used. At any given time the binding between VMCS and physical CPU is one-to-one: a physical CPU can be bound to only one VMCS, and a VMCS can be bound to only one physical CPU. A VMCS can be bound to different physical CPUs at different times. For example, a VMCS may first be bound to physical CPU1, be unbound at some point, and then be rebound to physical CPU2. This change of binding is called VMCS migration.

VT-x provides two instructions for binding and unbinding VMCS.

VMPTRLD: binds the specified VMCS to the physical CPU that executes the instruction.

VMCLEAR: unbinds the VMCS from the physical CPU that executes the instruction. The instruction also writes the VMCS data cached in the physical CPU back to memory, so that the values in memory are up to date when the VMCS is bound to a new physical CPU.
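The binding rules above can be pictured with a small toy model. This is only a sketch of the one-to-one relationship as the article describes it, not real VMX code: the names `vmcs_model`, `model_vmptrld`, `model_vmclear`, and `model_migrate` are hypothetical, and the rule that a bind fails instead of silently replacing an existing binding is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

#define NO_CPU   (-1)
#define MAX_CPUS 4

/* Toy bookkeeping for one VMCS (hypothetical, for illustration only). */
struct vmcs_model {
    int bound_cpu;              /* physical CPU it is bound to, or NO_CPU */
};

/* Which VMCS each physical CPU is currently bound to (NULL = none). */
static struct vmcs_model *cpu_current[MAX_CPUS];

/* Models VMCLEAR: the write-back to memory is implicit here; the binding
 * between the VMCS and its physical CPU is dropped. */
static void model_vmclear(struct vmcs_model *v)
{
    if (v->bound_cpu != NO_CPU) {
        cpu_current[v->bound_cpu] = NULL;
        v->bound_cpu = NO_CPU;
    }
}

/* Models VMPTRLD executed on `cpu`. Returns 0 on success, or -1 when the
 * bind would break the one-to-one rule. */
static int model_vmptrld(struct vmcs_model *v, int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    if (v->bound_cpu != NO_CPU && v->bound_cpu != cpu)
        return -1;              /* VMCS is still bound to another CPU */
    if (cpu_current[cpu] != NULL && cpu_current[cpu] != v)
        return -1;              /* CPU is already bound to another VMCS */
    cpu_current[cpu] = v;
    v->bound_cpu = cpu;
    return 0;
}

/* VMCS migration: unbind from the old CPU, then bind on the new one. */
static int model_migrate(struct vmcs_model *v, int new_cpu)
{
    model_vmclear(v);
    return model_vmptrld(v, new_cpu);
}
```

In this model, moving a VMCS from CPU1 to CPU2 only succeeds through `model_migrate`, which mirrors the VMCLEAR-then-VMPTRLD order described above.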

VT-x defines the format and content of the VMCS. A VMCS is a block of memory that is no larger than 4 KB and must be 4 KB aligned. The fields of the VMCS format are described below:

At offset 0 is the VMCS revision identifier, which indicates the version of the VMCS data format.

At offset 4 is the VMX-abort indicator. A VMX abort occurs when a VM-Exit does not complete successfully. The CPU stores the reason for the VMX abort here to help with debugging.

At offset 8 is the VMCS data area. Its format is CPU-specific: different CPU models may use different formats, and the VMCS revision identifier determines which format is in use.
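The layout just described can be written down as a C structure. This is a minimal sketch: the type and function names are hypothetical, and the data area is treated as opaque bytes because its format is CPU-specific. Real code would obtain the revision identifier from the IA32_VMX_BASIC MSR; here it is simply passed in as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VMCS_REGION_SIZE 4096u

/* Hypothetical view of a VMCS region. Only the first two words have a
 * fixed position; the data area must be accessed with VMREAD/VMWRITE. */
struct vmcs_region {
    uint32_t revision_id;                   /* offset 0 */
    uint32_t abort_indicator;               /* offset 4 */
    uint8_t  data[VMCS_REGION_SIZE - 8];    /* offset 8, opaque */
};

_Static_assert(offsetof(struct vmcs_region, data) == 8,
               "data area must start at offset 8");
_Static_assert(sizeof(struct vmcs_region) == VMCS_REGION_SIZE,
               "region must be exactly 4 KB");

/* Returns 1 if `addr` satisfies the 4 KB alignment requirement. */
static int vmcs_addr_is_aligned(uint64_t addr)
{
    return (addr & (VMCS_REGION_SIZE - 1u)) == 0;
}

/* Clears the region and stores the revision identifier at offset 0. */
static void vmcs_region_init(struct vmcs_region *r, uint32_t revision_id)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
    r->revision_id = revision_id;
}
```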

The main content of the VMCS is stored in the VMCS data area, and VT-x provides two instructions for accessing it.

VMREAD <index>: reads the field specified by the index in the VMCS.

VMWRITE <index>: writes the field specified by the index in the VMCS.

VT-x defines an index for every field in the VMCS data area, so each field can be accessed directly with the two instructions above.
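The index values passed to VMREAD and VMWRITE are not arbitrary. According to the Intel SDM's field-encoding rules (which the article does not spell out), bit 0 selects the high half of a 64-bit field, bits 9:1 are an index, bits 11:10 give the field type, and bits 14:13 give the field width. The sketch below decodes an encoding under those rules; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum vmcs_width { VMCS_W16 = 0, VMCS_W64 = 1, VMCS_W32 = 2, VMCS_WNATURAL = 3 };
enum vmcs_type  { VMCS_T_CONTROL = 0, VMCS_T_EXIT_INFO = 1,
                  VMCS_T_GUEST = 2, VMCS_T_HOST = 3 };

struct vmcs_encoding {
    unsigned high;          /* bit 0: 1 = high 32 bits of a 64-bit field */
    unsigned index;         /* bits 9:1: index within its type/width group */
    enum vmcs_type type;    /* bits 11:10 */
    enum vmcs_width width;  /* bits 14:13 */
};

/* Splits a VMCS field encoding into its components. */
static struct vmcs_encoding vmcs_decode(uint32_t enc)
{
    struct vmcs_encoding e;

    assert((enc & ~0x7FFFu) == 0);      /* bits 31:15 are reserved */
    e.high  = enc & 1u;
    e.index = (enc >> 1) & 0x1FFu;
    e.type  = (enum vmcs_type)((enc >> 10) & 3u);
    e.width = (enum vmcs_width)((enc >> 13) & 3u);
    return e;
}
```

For example, `0x00006800` (listed later as GUEST_CR0) decodes to a natural-width guest-state field with index 0, and `0x00002001` (IO_BITMAP_A_HIGH) decodes to the high half of a 64-bit control field.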

Specifically, the VMCS data area contains the following six categories of information.

Guest-state area: saves the CPU state of the guest while it runs, that is, in non-root mode. When a VM-Exit occurs, the CPU stores the current state in the guest-state area; when a VM-Entry occurs, the CPU restores the state from the guest-state area.

Host-state area: saves the CPU state of the VMM while it runs, that is, in root mode. When a VM-Exit occurs, the CPU restores its state from this area.

VM-Entry control area: controls the behavior of the processor during VM-Entry.

VM-Execution control area: controls processor behavior in VMX non-root mode. Typically it selects which conditions trigger a VM-Exit, and it also enables VMX virtualization features such as APIC virtualization and EPT.

VM-Exit control area: controls the behavior of the processor when a VM-Exit occurs.

VM-Exit information area: provides the cause and details of a VM-Exit, which the VMM uses to decide how to manage and control the VM. The VM-Exit information area is read-only.

Detailed analysis of each area of the VMCS:

VM-execution control class field

VIRTUAL_PROCESSOR_ID = 0x00000000, /* valid when SECONDARY_EXEC_ENABLE_VPID is 1; provides the 16-bit VPID */

POSTED_INTR_NV = 0x00000002, /* valid when PIN_BASED_POSTED_INTR is 1 */

IO_BITMAP_A = 0x00002000, /* takes effect when CPU_BASED_USE_IO_BITMAPS is enabled */

IO_BITMAP_A_HIGH = 0x00002001

IO_BITMAP_B = 0x00002002

IO_BITMAP_B_HIGH = 0x00002003

/* Valid when CPU_BASED_USE_MSR_BITMAPS is 1. The MSR bitmap area is 4 KB. When a bit is 1, accessing the MSR corresponding to that bit causes a VM-exit.

The low half of the read bitmap covers MSRs 00000000H to 00001FFFH and controls read access to those MSRs.

The high half of the read bitmap covers MSRs C0000000H to C0001FFFH and controls read access to those MSRs.

The low half of the write bitmap covers MSRs 00000000H to 00001FFFH and controls write access to those MSRs.

The high half of the write bitmap covers MSRs C0000000H to C0001FFFH and controls write access to those MSRs.

When a bit of the MSR bitmap is 0, accessing the MSR corresponding to that bit does not cause a VM-exit. */

MSR_BITMAP = 0x00002004

MSR_BITMAP_HIGH = 0x00002005
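The four MSR bitmap regions described above can be indexed with a little arithmetic. The sketch below assumes the architectural ordering of the 4 KB page (read-low at byte 0, read-high at 1024, write-low at 2048, write-high at 3072), which the article implies by its ordering but does not state as offsets. Function names are hypothetical. MSRs outside both ranges have no bit, and accesses to them always cause a VM-exit when MSR bitmaps are in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_BITMAP_BYTES 4096u

enum msr_access { MSR_ACCESS_READ, MSR_ACCESS_WRITE };

/* Finds the byte and bit for `msr` in the bitmap page. Returns 0 on
 * success, or -1 if the MSR is outside both covered ranges. */
static int msr_bitmap_locate(uint32_t msr, enum msr_access acc,
                             size_t *byte, unsigned *bit)
{
    size_t base = (acc == MSR_ACCESS_WRITE) ? 2048u : 0u;

    if (msr >= 0xC0000000u && msr <= 0xC0001FFFu) {
        base += 1024u;              /* high half of the read/write bitmap */
        msr -= 0xC0000000u;
    } else if (msr > 0x1FFFu) {
        return -1;
    }
    *byte = base + msr / 8u;
    *bit  = msr % 8u;
    return 0;
}

/* Sets (intercept != 0) or clears the bit for one MSR access type. */
static void msr_bitmap_set(uint8_t *bitmap, uint32_t msr,
                           enum msr_access acc, int intercept)
{
    size_t byte;
    unsigned bit;

    assert(bitmap != NULL);
    if (msr_bitmap_locate(msr, acc, &byte, &bit) != 0)
        return;                     /* no bit exists for this MSR */
    if (intercept)
        bitmap[byte] |= (uint8_t)(1u << bit);
    else
        bitmap[byte] &= (uint8_t)~(1u << bit);
}

/* Returns 1 if the access would cause a VM-exit under this bitmap. */
static int msr_access_exits(const uint8_t *bitmap, uint32_t msr,
                            enum msr_access acc)
{
    size_t byte;
    unsigned bit;

    if (msr_bitmap_locate(msr, acc, &byte, &bit) != 0)
        return 1;                   /* out-of-range MSRs always exit */
    return (bitmap[byte] >> bit) & 1;
}
```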

EXCUTIVE_VMCSP = 0x0000200c

EXCUTIVE_VMCSP_HIGH = 0x0000200d

/* When CPU_BASED_USE_TSC_OFFSETING is 1, this field provides a 64-bit offset. When the guest reads the TSC with the RDTSC, RDTSCP, or RDMSR instruction, the value returned is TSC + TSC offset. */

TSC_OFFSET = 0x00002010

TSC_OFFSET_HIGH = 0x00002011
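The TSC offset rule is plain 64-bit modular arithmetic. As a minimal sketch (the function names are hypothetical), a VMM that wants the guest to see a particular TSC value can compute the offset by subtraction, relying on unsigned wrap-around:

```c
#include <assert.h>
#include <stdint.h>

/* Value the guest observes when it reads the TSC with offsetting on. */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t tsc_offset)
{
    return host_tsc + tsc_offset;       /* wraps modulo 2^64 */
}

/* Offset to program so that the guest sees `wanted` at time `host_tsc`. */
static uint64_t tsc_offset_for(uint64_t host_tsc, uint64_t wanted)
{
    return wanted - host_tsc;           /* may wrap to a large value */
}
```

For instance, to make a guest that starts at host TSC 1000 see a TSC of 0, the offset is 2^64 - 1000, and 500 host ticks later the guest reads 500.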

/* Valid when CPU_BASED_TPR_SHADOW is 1. Must hold the physical address of a 4 KB page. */

VIRTUAL_APIC_PAGE_ADDR = 0x00002012, /* Virtual-APIC address (full) */

VIRTUAL_APIC_PAGE_ADDR_HIGH = 0x00002013, /* Virtual-APIC address (high) */

/* Valid when SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES is 1. Must hold the physical address of a 4 KB page. */

APIC_ACCESS_ADDR = 0x00002014, /* APIC-access address (full) */

APIC_ACCESS_ADDR_HIGH = 0x00002015, /* APIC-access address (high) */

POSTED_INTR_DESC_ADDR = 0x00002016

POSTED_INTR_DESC_ADDR_HIGH = 0x00002017

/* When SECONDARY_EXEC_ENABLE_EPT is 1, guest-physical addresses are translated into final host-physical addresses.

bit2:0: memory type of the EPT paging structures (UC or WB); bit5:3: depth of the EPT page-table structure, where this value plus 1 is the actual number of levels

bit6 = 1: the accessed and dirty flags in EPT paging-structure entries (bits 8 and 9 of an EPT entry) are enabled, and the processor updates these two flags

bitN-1:12: physical address of the EPT PML4 table

The EPT page table is loaded into the dedicated EPT pointer register, EPTP. The EPT page table maps addresses in the same way as the guest page table. */

EPT_POINTER = 0x0000201a, /* EPT pointer (EPTP; full) */

EPT_POINTER_HIGH = 0x0000201b, /* EPT pointer (EPTP; high) */
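The EPTP bit layout in the comment above can be packed and unpacked as follows. This is a sketch with hypothetical helper names; the memory-type encodings (0 for UC, 6 for WB) come from the Intel SDM and are not given in the article. The address helper assumes the reserved bits above the physical-address width are zero.

```c
#include <assert.h>
#include <stdint.h>

#define EPT_MEMTYPE_UC  0u          /* uncacheable */
#define EPT_MEMTYPE_WB  6u          /* write-back */
#define EPTP_AD_ENABLE  (1ull << 6)

/* Builds an EPTP value from a 4 KB aligned PML4 table address. */
static uint64_t eptp_make(uint64_t pml4_phys, unsigned memtype,
                          unsigned levels, int enable_ad)
{
    assert(levels >= 1 && levels <= 8);
    assert((pml4_phys & 0xFFFull) == 0);
    return pml4_phys
         | (uint64_t)(memtype & 7u)             /* bits 2:0 */
         | ((uint64_t)(levels - 1u) << 3)       /* bits 5:3 = levels - 1 */
         | (enable_ad ? EPTP_AD_ENABLE : 0ull); /* bit 6 */
}

static unsigned eptp_memtype(uint64_t eptp)
{
    return (unsigned)(eptp & 7u);
}

static unsigned eptp_levels(uint64_t eptp)
{
    return (unsigned)((eptp >> 3) & 7u) + 1u;
}

static int eptp_ad_enabled(uint64_t eptp)
{
    return (eptp & EPTP_AD_ENABLE) != 0;
}

static uint64_t eptp_pml4_phys(uint64_t eptp)
{
    return eptp & ~0xFFFull;
}
```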

/* Valid when SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY is 1. Controls whether an EOI causes a VM-exit: if the bit corresponding to the vector is 1, a VM-exit is generated. */

EOI_EXIT_BITMAP0 = 0x0000201c, /* vectors 0H to 3FH */

EOI_EXIT_BITMAP0_HIGH = 0x0000201d

EOI_EXIT_BITMAP1 = 0x0000201e, /* vectors 40H to 7FH */

EOI_EXIT_BITMAP1_HIGH = 0x0000201f

EOI_EXIT_BITMAP2 = 0x00002020, /* vectors 80H to BFH */

EOI_EXIT_BITMAP2_HIGH = 0x00002021

EOI_EXIT_BITMAP3 = 0x00002022, /* vectors C0H to FFH */

EOI_EXIT_BITMAP3_HIGH = 0x00002023
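Since each EOI-exit bitmap covers 64 vectors, mapping a vector to its bitmap field and bit is a division and a remainder. A small sketch with hypothetical helper names, using the field encodings listed above:

```c
#include <assert.h>
#include <stdint.h>

/* VMCS field encoding of the EOI-exit bitmap that covers `vector`. */
static uint32_t eoi_exit_bitmap_field(uint8_t vector)
{
    return 0x0000201cu + 2u * (vector / 64u);
}

/* Bit position of `vector` inside its 64-bit bitmap. */
static unsigned eoi_exit_bitmap_bit(uint8_t vector)
{
    return vector % 64u;
}

/* Marks `vector` so that an EOI for it causes a VM-exit. */
static void eoi_exit_set(uint64_t bitmaps[4], uint8_t vector)
{
    bitmaps[vector / 64u] |= 1ull << (vector % 64u);
}

/* Returns 1 if an EOI for `vector` would cause a VM-exit. */
static int eoi_causes_vmexit(const uint64_t bitmaps[4], uint8_t vector)
{
    return (int)((bitmaps[vector / 64u] >> (vector % 64u)) & 1u);
}
```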

/* VMCS shadowing bitmap addresses */

VMREAD_BITMAP = 0x00002026

VMWRITE_BITMAP = 0x00002028

/* bit0 = 1: an external interrupt causes a VM-exit; bit2:1: reserved, fixed to 1

bit3 = 1: an NMI causes a VM-exit; bit4: reserved, fixed to 1

bit5 = 1: enable virtual NMIs; bit6 = 1: enable the VMX-preemption timer

bit7 = 1: enable posted-interrupt processing to handle virtual interrupts

bit31:8: reserved, fixed to 0 */

PIN_BASED_VM_EXEC_CONTROL = 0x00004000, /* Pin-based VM-execution controls */

/* bit0: reserved, fixed to 0; bit1: reserved, fixed to 1

bit2 = 1: a VM-exit occurs when IF=1 and interrupts are not blocked (interrupt-window exiting); bit3 = 1: reads of the TSC return the TSC value plus the offset

bit6:4: reserved, fixed to 1; bit7 = 1: executing HLT causes a VM-exit; bit8: reserved, fixed to 1

bit9 = 1: executing INVLPG causes a VM-exit; bit10 = 1: executing MWAIT causes a VM-exit

bit11 = 1: executing RDPMC causes a VM-exit; bit12 = 1: executing RDTSC causes a VM-exit; bit14:13: reserved, fixed to 1

bit15 = 1: writing CR3 causes a VM-exit; bit16 = 1: reading CR3 causes a VM-exit; bit18:17: reserved, fixed to 0

bit19 = 1: writing CR8 causes a VM-exit; bit20 = 1: reading CR8 causes a VM-exit; bit21 = 1: use the virtual-APIC page to virtualize the local APIC (TPR shadow)

bit22 = 1: a VM-exit occurs when the virtual-NMI window opens; bit23 = 1: reading or writing the DR registers causes a VM-exit

bit24 = 1: executing IN/OUT or INS/OUTS instructions causes a VM-exit; bit25 = 1: enable the I/O bitmaps; bit26: reserved, fixed to 1

bit27 = 1: enable MTF (monitor trap flag) debugging; bit28 = 1: enable the MSR bitmaps; bit29 = 1: executing MONITOR causes a VM-exit

bit30 = 1: executing PAUSE causes a VM-exit; bit31 = 1: the secondary processor-based VM-execution controls field is valid */

CPU_BASED_VM_EXEC_CONTROL = 0x00004002, /* Primary processor-based VM-execution controls */

/* The EXCEPTION_BITMAP field is a 32-bit value in which each bit corresponds to one exception vector. In VMX non-root operation, when an exception occurs the processor checks the corresponding bit of EXCEPTION_BITMAP. If the bit is 1, a VM-exit is generated. If it is 0, the exception is delivered to the handler through the guest IDT. A triple fault causes a VM-exit directly. */

EXCEPTION_BITMAP = 0x00004004, /* Exception bitmap, exception control */

PAGE_FAULT_ERROR_CODE_MASK = 0x00004006

PAGE_FAULT_ERROR_CODE_MATCH = 0x00004008
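The exception bitmap check can be sketched as a single function. The article does not explain the two page-fault fields; the rule used below comes from the Intel SDM: for a page fault (vector 14), bit 14 of the bitmap is followed as-is when `(error_code & MASK) == MATCH`, and its meaning is inverted otherwise. The function name and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PF_VECTOR 14u

/* Returns 1 if an exception with this vector causes a VM-exit.
 * `pf_error_code` is only consulted for page faults. */
static int exception_causes_vmexit(uint32_t exception_bitmap, unsigned vector,
                                   uint32_t pf_error_code,
                                   uint32_t pfec_mask, uint32_t pfec_match)
{
    int bit;

    assert(vector < 32u);
    bit = (int)((exception_bitmap >> vector) & 1u);

    /* Page faults are filtered by error code: on a mismatch the meaning
     * of bitmap bit 14 is reversed. */
    if (vector == PF_VECTOR && (pf_error_code & pfec_mask) != pfec_match)
        return !bit;
    return bit;
}
```

With MASK and MATCH both 0, the comparison always succeeds, so bit 14 alone decides whether page faults exit.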

/* maximum value is 4 */

CR3_TARGET_COUNT = 0x0000400a

/* Valid when CPU_BASED_TPR_SHADOW is 1. Provides a threshold for the interrupt priority; a VM-exit occurs when the virtual TPR falls below it. */

TPR_THRESHOLD = 0x0000401c

/* bit0 = 1: virtualize accesses to the APIC-access page; bit1 = 1: enable EPT; bit2 = 1: accesses to GDTR, LDTR, IDTR, and TR cause a VM-exit

bit3 = 0: executing RDTSCP raises a #UD exception; bit4 = 1: virtualize accesses to the x2APIC MSRs; bit5 = 1: enable the VPID mechanism

bit6 = 1: executing WBINVD causes a VM-exit; bit7 = 1: the guest may run in unpaged protected mode or real mode

bit8 = 1: support access to virtual registers in the virtual-APIC page; bit9 = 1: support delivery of virtual interrupts

bit10 = 1: PAUSE-loop exiting decides whether the PAUSE instruction causes a VM-exit; bit11 = 1: executing RDRAND causes a VM-exit

bit12 = 0: executing INVPCID raises a #UD exception; bit13 = 1: VMFUNC can be executed in VMX non-root operation

bit31:14: reserved, fixed to 0 */

SECONDARY_VM_EXEC_CONTROL = 0x0000401e, /* Secondary processor-based VM-execution controls */

PLE_GAP = 0x00004020

PLE_WINDOW = 0x00004022

/* A mask bit of 1 means the bit is owned by the host; a mask bit of 0 means the guest is allowed to set the bit. */

CR0_GUEST_HOST_MASK = 0x00006000, /* speeds up guest writes to CR0 */

CR4_GUEST_HOST_MASK = 0x00006002

CR0_READ_SHADOW = 0x00006004, /* speeds up guest reads of CR0 */

CR4_READ_SHADOW = 0x00006006
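The interplay of the guest/host mask and the read shadow can be sketched as two bitwise expressions. The sketch assumes the SDM behavior: a guest read returns shadow bits for host-owned positions and real register bits elsewhere, and a guest write causes a VM-exit only if it tries to give a host-owned bit a value different from the shadow. Function names are hypothetical; the same logic applies to CR4.

```c
#include <assert.h>
#include <stdint.h>

/* Value the guest sees when it reads CR0 without a VM-exit. */
static uint64_t guest_cr0_read(uint64_t real_cr0, uint64_t mask,
                               uint64_t read_shadow)
{
    return (read_shadow & mask) | (real_cr0 & ~mask);
}

/* Returns 1 if a guest write of `new_value` to CR0 causes a VM-exit. */
static int guest_cr0_write_exits(uint64_t new_value, uint64_t mask,
                                 uint64_t read_shadow)
{
    return ((new_value ^ read_shadow) & mask) != 0;
}
```

For example, if the host owns PE (bit 0) and PG (bit 31) and the shadow holds 0 for both, the guest believes paging is off even while the real CR0 has it on, and only a guest attempt to change PE or PG traps to the VMM.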

CR3_TARGET_VALUE0 = 0x00006008

CR3_TARGET_VALUE1 = 0x0000600a

CR3_TARGET_VALUE2 = 0x0000600c

CR3_TARGET_VALUE3 = 0x0000600e

VM-entry control class field

VM_ENTRY_MSR_LOAD_ADDR = 0x0000200a

VM_ENTRY_MSR_LOAD_ADDR_HIGH = 0x0000200b

/* bit1:0: reserved, fixed to 1; bit2 = 1: load the debug registers; bit8:3: reserved, fixed to 1

bit9 = 1: enter IA-32e mode; bit10 = 1: enter SMM mode; bit11 = 1: return to the executive monitor and turn off SMM dual-monitor treatment

bit12: reserved, fixed to 1; bit13 = 1: load IA32_PERF_GLOBAL_CTRL; bit14 = 1: load IA32_PAT; bit15 = 1: load IA32_EFER

bit31:16: reserved, fixed to 0 */

VM_ENTRY_CONTROLS = 0x00004012, /* VM-entry controls, constrained by the MSR_IA32_VMX_ENTRY_CTLS register */

VM_ENTRY_MSR_LOAD_COUNT = 0x00004014

/* bit7:0: interrupt or exception vector number

bit10:8: interruption type: 0 = external interrupt; 1 = reserved; 2 = non-maskable interrupt (NMI); 3 = hardware exception; 4 = software interrupt; 5 = privileged software exception; 6 = software exception; 7 = other event

bit11 = 1: an error code is to be delivered; bit30:12: reserved; bit31 = 1: the VM_ENTRY_INTR_INFO_FIELD field is valid */

VM_ENTRY_INTR_INFO_FIELD = 0x00004016, /* event injection control field */

VM_ENTRY_EXCEPTION_ERROR_CODE = 0x00004018, /* VM-entry exception error code */

VM_ENTRY_INSTRUCTION_LEN = 0x0000401a, /* VM-entry instruction length */
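To inject an event, the VMM packs the vector, type, error-code flag, and valid bit into VM_ENTRY_INTR_INFO_FIELD using the layout in the comment above. The helper names and enum constants below are hypothetical; the bit positions follow that comment.

```c
#include <assert.h>
#include <stdint.h>

enum intr_type {
    INTR_TYPE_EXT_INTR       = 0,
    INTR_TYPE_RESERVED       = 1,
    INTR_TYPE_NMI            = 2,
    INTR_TYPE_HW_EXCEPTION   = 3,
    INTR_TYPE_SW_INTR        = 4,
    INTR_TYPE_PRIV_SW_EXCEPT = 5,
    INTR_TYPE_SW_EXCEPTION   = 6,
    INTR_TYPE_OTHER          = 7
};

#define INTR_INFO_DELIVER_ERROR_CODE (1u << 11)
#define INTR_INFO_VALID              (1u << 31)

/* Builds the 32-bit interruption-information value for event injection. */
static uint32_t intr_info_make(uint8_t vector, enum intr_type type,
                               int has_error_code)
{
    assert((unsigned)type <= 7u && type != INTR_TYPE_RESERVED);
    return (uint32_t)vector
         | ((uint32_t)type << 8)
         | (has_error_code ? INTR_INFO_DELIVER_ERROR_CODE : 0u)
         | INTR_INFO_VALID;
}

static uint8_t intr_info_vector(uint32_t info)
{
    return (uint8_t)(info & 0xFFu);
}

static enum intr_type intr_info_type(uint32_t info)
{
    return (enum intr_type)((info >> 8) & 7u);
}

static int intr_info_is_valid(uint32_t info)
{
    return (info & INTR_INFO_VALID) != 0;
}
```

Injecting a general-protection fault (vector 13, hardware exception, with error code) gives `0x80000B0D`; the error code itself goes into VM_ENTRY_EXCEPTION_ERROR_CODE.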

VM-exit control class field

VM_EXIT_MSR_STORE_ADDR = 0x00002006

VM_EXIT_MSR_STORE_ADDR_HIGH = 0x00002007

VM_EXIT_MSR_LOAD_ADDR = 0x00002008

VM_EXIT_MSR_LOAD_ADDR_HIGH = 0x00002009

/* bit1:0: reserved, fixed to 1; bit2 = 1: save the debug registers; bit8:3: reserved, fixed to 1; bit9 = 1: return to IA-32e mode

bit11:10: reserved, fixed to 1; bit12 = 1: load IA32_PERF_GLOBAL_CTRL; bit14:13: reserved, fixed to 1

bit15 = 1: on VM-exit, the processor acknowledges the interrupt controller and reads the interrupt vector number; bit17:16: reserved, fixed to 1

bit18 = 1: save IA32_PAT; bit19 = 1: load IA32_PAT; bit20 = 1: save IA32_EFER; bit21 = 1: load IA32_EFER

bit22 = 1: save the VMX-preemption timer value on VM-exit; bit31:23: reserved, fixed to 0 */

VM_EXIT_CONTROLS = 0x0000400c, /* VM-exit controls */

VM_EXIT_MSR_STORE_COUNT = 0x0000400e

VM_EXIT_MSR_LOAD_COUNT = 0x00004010

VM-exit information field

VM_INSTRUCTION_ERROR = 0x00004400, /* instruction failure class */

/* basic information class */

GUEST_PHYSICAL_ADDRESS = 0x00002400, /* guest-physical address (GPA) saved when the VM-exit is caused by an EPT violation or an EPT misconfiguration */

GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401

VM_EXIT_REASON = 0x00004402, /* exit reason */

EXIT_QUALIFICATION = 0x00006400, /* details of a VM-exit caused by an instruction; the format of this field differs by instruction */

GUEST_LINEAR_ADDRESS = 0x0000640a, /* linear address saved for some events that cause a VM-exit */

/* direct vector event class */

VM_EXIT_INTR_INFO = 0x00004404, /* VM-exit interruption information */

VM_EXIT_INTR_ERROR_CODE = 0x00004406

/* indirect vector event class */

IDT_VECTORING_INFO_FIELD = 0x00004408

IDT_VECTORING_ERROR_CODE = 0x0000440a

/* instruction information class */

VM_EXIT_INSTRUCTION_LEN = 0x0000440c

VMX_INSTRUCTION_INFO = 0x0000440e

/* end of VM-exit information fields */

/* start of guest-state area fields */

GUEST_DR7 = 0x0000681a, /* debug register */

GUEST_RSP = 0x0000681c, /* stack pointer */

GUEST_RIP = 0x0000681e, /* instruction pointer */

GUEST_RFLAGS = 0x00006820, /* flags register */

/* control registers */

GUEST_CR0 = 0x00006800

GUEST_CR3 = 0x00006802

GUEST_CR4 = 0x00006804

/* Fields for six data/code segment registers (ES, CS, SS, DS, FS, GS) and two system segment registers (LDTR and TR).

Each segment register has four fields that describe it:

selector: 16 bits; base: 64 bits on a 64-bit system, otherwise 32 bits

limit: 32 bits; access rights: 32 bits

Access-rights field format:

bit3:0: segment type; bit4: S flag, 0 = system segment, 1 = code or data segment; bit6:5: DPL (descriptor privilege level)

bit7: P flag, 0 = not present, 1 = present; bit11:8: reserved; bit12: AVL, available for system software

bit13: L flag in IA-32e mode, reserved in legacy mode; bit14: D/B flag, default operand size, 0 = 16 bits, 1 = 32 bits

bit15: G flag, limit granularity, 0 = 1 byte, 1 = 4 KB; bit16: unusable flag, 0 = usable, 1 = unusable; bit31:17: reserved */

/* ES */

GUEST_ES_SELECTOR = 0x00000800

GUEST_ES_LIMIT = 0x00004800

GUEST_ES_AR_BYTES = 0x00004814

GUEST_ES_BASE = 0x00006806

/* CS */

GUEST_CS_SELECTOR = 0x00000802

GUEST_CS_LIMIT = 0x00004802

GUEST_CS_AR_BYTES = 0x00004816

GUEST_CS_BASE = 0x00006808

/* SS */

GUEST_SS_SELECTOR = 0x00000804

GUEST_SS_LIMIT = 0x00004804

GUEST_SS_AR_BYTES = 0x00004818

GUEST_SS_BASE = 0x0000680a

/* DS */

GUEST_DS_SELECTOR = 0x00000806

GUEST_DS_LIMIT = 0x00004806

GUEST_DS_AR_BYTES = 0x0000481a

GUEST_DS_BASE = 0x0000680c

/* FS */

GUEST_FS_SELECTOR = 0x00000808

GUEST_FS_LIMIT = 0x00004808

GUEST_FS_AR_BYTES = 0x0000481c

GUEST_FS_BASE = 0x0000680e

/* GS */

GUEST_GS_SELECTOR = 0x0000080a

GUEST_GS_LIMIT = 0x0000480a

GUEST_GS_AR_BYTES = 0x0000481e

GUEST_GS_BASE = 0x00006810

/* LDTR: local descriptor table register, loaded by the LLDT instruction */

GUEST_LDTR_SELECTOR = 0x0000080c

GUEST_LDTR_LIMIT = 0x0000480c

GUEST_LDTR_AR_BYTES = 0x00004820

GUEST_LDTR_BASE = 0x00006812

/* TR: task register */

GUEST_TR_SELECTOR = 0x0000080e

GUEST_TR_LIMIT = 0x0000480e

GUEST_TR_AR_BYTES = 0x00004822

GUEST_TR_BASE = 0x00006814
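The access-rights format described in the comment before the ES fields can be decoded field by field. The sketch below uses hypothetical names and follows that bit layout; each `GUEST_*_AR_BYTES` value read with VMREAD could be passed to it.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a 32-bit segment access-rights value. */
struct seg_access_rights {
    unsigned type;      /* bits 3:0  segment type */
    unsigned s;         /* bit 4     0 = system, 1 = code/data */
    unsigned dpl;       /* bits 6:5  descriptor privilege level */
    unsigned present;   /* bit 7     P flag */
    unsigned avl;       /* bit 12    available for system software */
    unsigned l;         /* bit 13    64-bit code segment (IA-32e mode) */
    unsigned db;        /* bit 14    default operand size */
    unsigned g;         /* bit 15    limit granularity (1 = 4 KB) */
    unsigned unusable;  /* bit 16    1 = segment unusable */
};

static struct seg_access_rights seg_ar_decode(uint32_t ar)
{
    struct seg_access_rights r;

    r.type     = ar & 0xFu;
    r.s        = (ar >> 4) & 1u;
    r.dpl      = (ar >> 5) & 3u;
    r.present  = (ar >> 7) & 1u;
    r.avl      = (ar >> 12) & 1u;
    r.l        = (ar >> 13) & 1u;
    r.db       = (ar >> 14) & 1u;
    r.g        = (ar >> 15) & 1u;
    r.unusable = (ar >> 16) & 1u;
    return r;
}
```

As an illustration, `0xA09B` decodes to a present, ring-0, 64-bit code segment with 4 KB granularity, and `0x10000` marks a segment as unusable.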

/* Two descriptor-table registers, GDTR and IDTR, each consisting of two fields: base gives the base address of the descriptor table, and limit gives its length. GDTR is the global descriptor table register; the LGDT instruction loads the address of the GDT into it. */

GUEST_GDTR_LIMIT = 0x00004810

GUEST_GDTR_BASE = 0x00006816

/* IDTR: interrupt descriptor table register */

GUEST_IDTR_LIMIT = 0x00004812

GUEST_IDTR_BASE = 0x00006818

/* MSR */

GUEST_IA32_DEBUGCTL = 0x00002802

GUEST_IA32_DEBUGCTL_HIGH = 0x00002803

GUEST_IA32_PAT = 0x00002804

GUEST_IA32_PAT_HIGH = 0x00002805

GUEST_IA32_EFER = 0x00002806

GUEST_IA32_EFER_HIGH = 0x00002807

GUEST_IA32_PERF_GLOBAL_CTRL = 0x00002808

GUEST_IA32_PERF_GLOBAL_CTRL_HIGH= 0x00002809

GUEST_SYSENTER_CS = 0x0000482A

GUEST_SYSENTER_ESP = 0x00006824

GUEST_SYSENTER_EIP = 0x00006826

Non-register field

GUEST_INTR_STATUS = 0x00000810, /* status of the virtual interrupt */

VMCS_LINK_POINTER = 0x00002800

VMCS_LINK_POINTER_HIGH = 0x00002801

GUEST_PDPTR0 = 0x0000280a, /* fields used when EPT is enabled */

GUEST_PDPTR0_HIGH = 0x0000280b

GUEST_PDPTR1 = 0x0000280c

GUEST_PDPTR1_HIGH = 0x0000280d

GUEST_PDPTR2 = 0x0000280e

GUEST_PDPTR2_HIGH = 0x0000280f

GUEST_PDPTR3 = 0x00002810

GUEST_PDPTR3_HIGH = 0x00002811

GUEST_ACTIVITY_STATE = 0x00004826, /* guest-state: activity state of the virtual processor at VM entry/exit */

GUEST_INTERRUPTIBILITY_INFO = 0x00004824, /* interruptibility of the current virtual processor */

VMX_PREEMPTION_TIMER_VALUE = 0x0000482E

GUEST_PENDING_DBG_EXCEPTIONS = 0x00006822, /* pending debug exceptions */

Host-state area field

HOST_RSP = 0x00006c14, /* stack pointer */

HOST_RIP = 0x00006c16, /* instruction pointer */

/* control registers */

HOST_CR0 = 0x00006c00

HOST_CR3 = 0x00006c02

HOST_CR4 = 0x00006c04

/* segment selector registers */

HOST_ES_SELECTOR = 0x00000c00

HOST_CS_SELECTOR = 0x00000c02

HOST_SS_SELECTOR = 0x00000c04

HOST_DS_SELECTOR = 0x00000c06

HOST_FS_SELECTOR = 0x00000c08

HOST_GS_SELECTOR = 0x00000c0a

HOST_TR_SELECTOR = 0x00000c0c

/* segment base address registers */

HOST_FS_BASE = 0x00006c06

HOST_GS_BASE = 0x00006c08

HOST_TR_BASE = 0x00006c0a

HOST_GDTR_BASE = 0x00006c0c

HOST_IDTR_BASE = 0x00006c0e

/* MSR registers */

HOST_IA32_PAT = 0x00002c00

HOST_IA32_PAT_HIGH = 0x00002c01

HOST_IA32_EFER = 0x00002c02

HOST_IA32_EFER_HIGH = 0x00002c03

HOST_IA32_PERF_GLOBAL_CTRL = 0x00002c04

HOST_IA32_PERF_GLOBAL_CTRL_HIGH = 0x00002c05

HOST_IA32_SYSENTER_CS = 0x00004c00

HOST_IA32_SYSENTER_ESP = 0x00006c10

HOST_IA32_SYSENTER_EIP = 0x00006c12
