2025-04-05 Update. Source: SLTechnology News & Howtos (Shulou), Servers section.
Shulou (Shulou.com) 05/31 Report
In this issue, the editor walks through how to solve common problems encountered during live migration. The article is rich in content and analyzes the topic from a practical point of view; I hope you get something out of it.
A rough description of the scenario: the system uses a loosely coupled compute/storage architecture, with the virtual machines' image files on remote shared storage, so migration is very fast. In our system the fastest migration completed in 6 seconds. This is true live migration: we deliberately wrote data inside the virtual machine while it was migrating, and the migration still completed normally.
Configuration scheme
1. Modify the nova.conf file
Add:
image_cache_manager_interval=0
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_UNSAFE
Modify:
vncserver_listen=0.0.0.0
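Collected into one fragment, the nova.conf edits from step 1 look like the following. Placing them under the [DEFAULT] section is an assumption based on the nova.conf layout of that era; check your own file's sections.

```ini
[DEFAULT]
# Disable the periodic image cache cleanup
image_cache_manager_interval=0
# VIR_MIGRATE_UNSAFE is needed when the disk cache mode is writethrough (see problem 1 below)
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_UNSAFE
# Listen on all interfaces so the destination node can bind the VNC address
vncserver_listen=0.0.0.0
```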
2. The hostnames of all participating compute nodes must be mutually resolvable (each node can ping the others by name).
3. Modify /etc/libvirt/libvirtd.conf on the compute node:
Before: # listen_tls = 0
After: listen_tls = 0
Before: # listen_tcp = 1
After: listen_tcp = 1
Add: auth_tcp = "none"
4. Modify /etc/sysconfig/libvirtd:
Before: #LIBVIRTD_ARGS="--listen"
After: LIBVIRTD_ARGS="--listen"
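Steps 3 and 4 are easy to automate. Below is a minimal sketch (not the author's actual tool) that applies the edits to file contents in memory, so it can be dry-run and tested before touching the real files under /etc:

```python
import re

def enable_libvirt_tcp(conf_text, sysconfig_text):
    """Apply steps 3-4: uncomment listen_tls/listen_tcp in libvirtd.conf,
    add auth_tcp = "none", and make libvirtd start with --listen.

    Takes and returns file contents as strings; write them back (after a
    backup) and restart libvirtd to take effect.
    """
    conf_text = re.sub(r'(?m)^#\s*listen_tls\s*=.*$', 'listen_tls = 0', conf_text)
    conf_text = re.sub(r'(?m)^#\s*listen_tcp\s*=.*$', 'listen_tcp = 1', conf_text)
    if not re.search(r'(?m)^auth_tcp', conf_text):
        # Insecure (see the remarks below): any host reaching the TCP port
        # gets full control of libvirt.
        conf_text += '\nauth_tcp = "none"\n'
    sysconfig_text = re.sub(r'(?m)^#\s*LIBVIRTD_ARGS=.*$',
                            'LIBVIRTD_ARGS="--listen"', sysconfig_text)
    return conf_text, sysconfig_text
```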
5. On the source compute node, edit /var/run/libvirt/qemu/instance-xxx.xml for the virtual machine to be migrated: delete the line containing migrate-qemu-fd, and change the vnc listen parameter to 0.0.0.0.
6. Restart the nova-compute service on the compute node.
7. Remarks:
1. Since existing cloud machines were not configured for live migration, you need to restart a virtual machine once before migrating it.
2. Because auth_tcp="none" is added to the libvirtd configuration on the compute nodes, which is a security hole, you should find a more secure method, or comment out this line and restart libvirtd after the migration is complete.
3. An auxiliary program has been written to perform the migration automatically.
Problems encountered and solutions
1. The disk cache mode of the virtual machine is writethrough, and an error is reported during migration.
On CentOS, OpenStack considers migration unsafe when the disk cache mode is writethrough.
Solution: append VIR_MIGRATE_UNSAFE to the live_migration_flag parameter in nova.conf. This flag is not mentioned in the official live migration configuration documentation.
2. A qemu 1.4 bug caused the migration to fail
Migration failed; on the destination node, /var/log/libvirt/qemu/instance-xxx.log shows:
char device redirected to /dev/pts/1 (label charserial1)
qemu: warning: error while loading state section id 2
load of migration failed
Solution:
1. On the source compute node, delete the line containing migrate-qemu-fd from /var/run/libvirt/qemu/instance-xxx.xml for the virtual machine to be migrated
2. Restart libvirtd on the source compute node
3. Then execute the nova live-migration command
A program has been written to perform this operation automatically.
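The auxiliary program itself is not shown in the article. A minimal sketch of step 1, assuming (as the text says) that the offending entry sits on its own line in the status XML, could be:

```python
def strip_migrate_qemu_fd(xml_text):
    """Drop any line mentioning migrate-qemu-fd from libvirt's live status XML.

    Mirrors step 1 of the workaround: the file under
    /var/run/libvirt/qemu/instance-xxx.xml is treated as plain text, since
    the fix described is simply deleting the line containing the token.
    Restart libvirtd after writing the result back.
    """
    kept = [line for line in xml_text.splitlines()
            if "migrate-qemu-fd" not in line]
    return "\n".join(kept) + "\n"
```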
3. Due to a problem with vncserver, the virtual machine needs to be restarted before it can be migrated.
Because vncserver_listen in nova.conf was previously set to the compute node's IP, the vnc parameter of the virtual machine's kvm process is also that node's IP. During migration an error is reported: the destination node cannot bind the source node's IP. So you need to modify the libvirt.xml configuration file and restart the virtual machine before the migration can proceed.
Solution:
1. On the source compute node, change the vnc listen parameter to 0.0.0.0 in /var/run/libvirt/qemu/instance-xxx.xml
2. Restart libvirtd on the source compute node
3. Then execute the nova live-migration command
This operation has been programmed to complete automatically.
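The program that automates this is likewise not shown. A minimal sketch of rewriting the VNC listen address, assuming a standard libvirt domain XML layout with a `<graphics type='vnc'>` element (adjust the element search to your status file's actual structure):

```python
import xml.etree.ElementTree as ET

def rebind_vnc_to_any(xml_text):
    """Set the VNC listen address to 0.0.0.0 in a libvirt domain XML string.

    Rewrites both the listen attribute on <graphics type='vnc'> and any
    nested <listen address=.../> sub-element, then returns the new XML.
    """
    root = ET.fromstring(xml_text)
    for graphics in root.iter("graphics"):
        if graphics.get("type") == "vnc":
            graphics.set("listen", "0.0.0.0")
            listen = graphics.find("listen")
            if listen is not None:
                listen.set("address", "0.0.0.0")
    return ET.tostring(root, encoding="unicode")
```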
4. After the migration, the owner of console.log and disk becomes root:root.
After the migration completes, the ownership of the virtual machine's console.log and disk files is found to have changed from qemu:qemu to root:root. This is presumably a problem in the OpenStack migration code; for now it does not affect the virtual machine.
Solution:
Change the files' owner back; this operation has been programmed to run automatically.
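The fix can be sketched as a small helper (hypothetical, not the author's actual program) that chowns the files back to the qemu user:

```python
import os

def restore_owner(paths, uid, gid):
    """Chown files (console.log, disk) back after live migration resets
    their ownership to root:root.

    Pass the qemu user's numeric ids, e.g. from pwd.getpwnam('qemu') and
    grp.getgrnam('qemu') on the compute node; must run as root there.
    """
    for path in paths:
        os.chown(path, uid, gid)
```

On the compute node this would be run as root against the instance directory's console.log and disk files (the article's storage is mounted under /mnt/instances).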
5. CPU incompatibility between the source node and the destination node
Migration failed; in /var/log/nova/compute.log:
"InvalidCPUInfo: Unacceptable CPU info: CPU doesn't have compatibility.\n\n0\n\nRefer to http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult\n"]
Solution:
There is no solution yet.
6. Destination node memory is negative
The migration failed; the error output in the control node's api.log says the destination node's free memory is -3400, which cannot meet the needs of the migration.
Solution:
Creating a virtual machine on that compute node by explicitly specifying the host with the nova command succeeds. Presumably the scheduling algorithm used for migration is inconsistent with the one used when creating virtual machines.
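As a sketch of that workaround (the host name, flavor, and image id below are hypothetical), a VM can be forced onto a specific compute node with the zone:host form of the availability-zone argument, bypassing normal scheduling:

```shell
# Hypothetical example: force placement on compute-13
nova boot --flavor 2 --image <image-id> --availability-zone nova:compute-13 test-vm
```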
7. The error message appears only in api.log
The logs usually checked during a migration are as follows:
1. /var/log/libvirt/qemu/instance-xxx.log on the destination node
2. /var/log/nova/compute.log on the destination node
3. /var/log/nova/compute.log on the source node
Sometimes the migration fails and an error is reported after the command line is executed:
ERROR: Live migration of instance bd785968-72f6-4f70-a066-b22b63821c3b to host compute-13 failed (HTTP 400) (Request-ID: req-180d27b5-9dc7-484f-9d9a-f34cccd6daa2)
However, no error message appears in any of the three log files above.
Solution:
/var/log/nova/api.log on the control node (or on the node where the migration command was run) contains the error message.
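A small helper (hypothetical) for pulling the relevant api.log lines: the request id from the CLI error message (e.g. req-180d27b5-...) is the key to filter on.

```python
def find_migration_errors(api_log_text, request_id):
    """Filter nova api.log content for a failed migration's request id.

    The compute-side logs can be clean while the real error sits in
    /var/log/nova/api.log on the control node; request_id comes from the
    "(Request-ID: req-...)" suffix of the CLI error.
    """
    return [line for line in api_log_text.splitlines()
            if request_id in line and ("ERROR" in line or "TRACE" in line)]
```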
Detours taken
1. Trying to avoid changing the vncserver_listen parameter in nova.conf to 0.0.0.0
Change the vnc address in /var/run/libvirt/qemu/instance-xxx.xml to the destination node's IP, restart libvirtd, and then migrate; this can succeed. But if the migration fails and the virtual machine needs to be restarted, it fails to start. The error in /var/log/libvirt/qemu/instance-xxx.log is:
Failed to start VNC server on 172.18.2.15: Failed to bind socket: Cannot assign requested address
Yet the IP in /mnt/instances/instance-xxx/libvirt.xml was never changed to the destination node's; it is unclear where this parameter is saved.
2. VNC port issue
After a failed migration, the error in the destination node's /var/log/libvirt/qemu/instance-xxx.log is:
2013-11-05 05:42:39.401+0000: shutting down
qemu: terminating on signal 15 from pid 10271
It was assumed that the vnc listening port used by the virtual machine on the source node was already occupied on the destination node, so the VM could not start. Later, tests on other machines found that the vnc port adjusts itself after the VM is migrated to the destination node.
The above is how to solve the common problems of live migration. If you happen to have similar doubts, you may refer to the above analysis. If you want to know more, you are welcome to follow the industry information channel.
© 2024 shulou.com SLNews company. All rights reserved.