How to Add a Linux Compute Node to a Microsoft HPC Cluster

2025-01-18 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

This article explains in detail how to add Linux compute nodes to a Microsoft HPC cluster. The steps are practical, so they are shared here as a reference.

This article uses HPC Pack 2016 Update 1 as the example.

Supported Linux versions include Ubuntu Server 14.04 LTS and 16.04 LTS. The walkthrough below uses CentOS 6.7.

Environment

08dc
    IP: 10.0.0.2, mask 255.0.0.0

hpc01-head.oa.com
    OS: Windows Server 2016 Datacenter Edition
    Enterprise network: 10.0.0.8, mask 255.0.0.0, DNS 10.0.0.2
    HPC network: 18.0.0.1, mask 255.0.0.0

linuxnode1
    OS: CentOS 6.7
    Enterprise network: 10.0.0.51, mask 255.0.0.0, DNS 10.0.0.2
    HPC network: 18.0.0.3, mask 255.0.0.0
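As a quick sanity check on the addressing above, the following minimal Python sketch confirms that the head node and the Linux node share a subnet on each network. It assumes Python 3 is available on some workstation (CentOS 6.7 itself ships an older Python), and the helper name same_subnet is made up for illustration.

```python
import ipaddress


def same_subnet(ip_a, ip_b, netmask):
    """Return True when both addresses fall in the same network under netmask."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{ip_a}/{netmask}", strict=False)
    return ipaddress.ip_address(ip_b) in network


# Addresses taken from the environment listing in this article.
print("enterprise:", same_subnet("10.0.0.8", "10.0.0.51", "255.0.0.0"))
print("hpc:", same_subnet("18.0.0.1", "18.0.0.3", "255.0.0.0"))
```

If either check printed False, the two hosts would need a router between them on that network, which this walkthrough does not cover.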

The configuration steps are summarized as follows:

Set the hostname of the Linux host.

Add a DNS record for the Linux host on the Windows DNS server, so that the head node can find the Linux compute node during installation.

Import the Windows environment's root certificate and the public key of the head node certificate on the Linux host, so that the certificate can be verified and the Linux host can open https://<head node FQDN> correctly (this step requires a restart to take effect).

Add the hostname and FQDN of the head node to /etc/resolv.conf on the Linux host, preferably in a way that survives a restart, so that the two hosts resolve each other normally.

Copy hpcnodeagent.tar.gz, setup.py, and the head node certificate with its private key to a directory on the Linux host, using FileZilla or another tool.

Give the Linux host temporary Internet access during installation, because installing the HPC Pack agent downloads dependent components through yum install.

Use the python command to run setup.py. Python is normally available once CentOS or Red Hat has been installed; if typing python in the terminal does not work, you need to download and install it yourself.

Install the agent strictly according to the instructions of the setup.py script.

Check the installation log and nodemanager.json. If the short hostname appears there, change it to the FQDN, because the short hostname cannot be verified by SSL.

If everything goes well, then after about the time it takes to drink a cup of tea the new Linux compute node appears on the head node and can be brought online.

1. Set the hostname of the Linux host. It is best to specify it during operating system installation; otherwise you can change it with hostname linuxnode01.

2. Add the DNS record for the Linux host on the Windows DNS server.

3. Export the Windows CA root certificate and the head node installation certificate, the latter without its private key. Select Base64 encoding as the export format.

After the export you have two files: the .cer of the head node installation certificate without the private key, and the .cer of the enterprise root trust certificate. If the head node uses a self-signed certificate, you only need to export the installation certificate .cer without the private key.

Rename the file suffix directly to .pem so that the certificates can be imported on Linux.

Using FileZilla, copy the two certificates to /etc/pki/ca-trust/source/anchors on CentOS or Red Hat.

On CentOS or Red Hat, go to the bin directory and run update-ca-trust to update the certificate list and import the certificates placed in anchors.
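The rename-and-copy part of this step can be sketched in Python. This is only an illustration under assumptions: it needs Python 3, the file name root-ca.cer is hypothetical, and the check merely looks for the standard PEM header to confirm that the export really used Base64 rather than binary DER.

```python
from pathlib import PurePosixPath

# Trust anchor directory named in this article for CentOS / Red Hat.
ANCHORS_DIR = "/etc/pki/ca-trust/source/anchors"


def looks_like_base64_cert(text):
    """True when the exported file is Base64 (PEM) text rather than binary DER."""
    return text.lstrip().startswith("-----BEGIN CERTIFICATE-----")


def anchor_path(cer_name, anchors_dir=ANCHORS_DIR):
    """Map an exported .cer file name to its .pem destination under anchors."""
    pem_name = PurePosixPath(cer_name).with_suffix(".pem").name
    return str(PurePosixPath(anchors_dir) / pem_name)


# Hypothetical file name, used only to show the mapping.
print(anchor_path("root-ca.cer"))
```

A file that fails looks_like_base64_cert was probably exported as DER and would need to be exported again with the Base64 option before renaming.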

After the import, restart the operating system and enter https://hpc01-head.oa.com in the browser address bar. The page opening normally indicates that the configuration is successful.

If you are warned that the certificate is not trusted, reconfigure: check the certificate location and whether the certificate was actually imported, and make sure the address opens correctly. Certificate requirements on Linux are very strict; if the SSL certificate is not trusted, the SSL website cannot be opened directly.

This matters because the last step of installing the HPC Pack agent on Linux is to contact the head node through the URL written in NamingServiceUri in nodemanager.json, https://hpc01-head.oa.com:443/HpcNaming/api/fabric/resolve/singleton/, in order to register the compute node. If trust is not established in this step, that address cannot be opened, so even if the agent installs successfully on the Linux side, the Linux compute node will not appear on the Windows head node.
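The browser check can also be done from a script. The sketch below is an assumption-laden illustration, not part of HPC Pack: it requires Python 3, it assumes Python's default SSL context reads the same system trust store that update-ca-trust maintains, and both function names are invented here.

```python
import ssl
import urllib.error
import urllib.request


def naming_service_url(head_fqdn, port=443):
    """Build the naming service URL quoted in this article for a head node FQDN."""
    return f"https://{head_fqdn}:{port}/HpcNaming/api/fabric/resolve/singleton/"


def certificate_is_trusted(url, timeout=10):
    """Return True if the TLS certificate verifies against the system trust store."""
    context = ssl.create_default_context()
    try:
        urllib.request.urlopen(url, context=context, timeout=timeout)
    except urllib.error.HTTPError:
        # An HTTP error status still means the TLS handshake was accepted.
        return True
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, ssl.SSLError):
            return False
        raise  # DNS failure, refused connection, etc. are a different problem
    return True


url = naming_service_url("hpc01-head.oa.com")
print(url)
# On the Linux node, after the restart:
# print(certificate_is_trusted(url))
```

Because certificate verification also checks the host name, calling certificate_is_trusted with the short name hpc01-head in the URL should fail when the certificate is issued only for the FQDN.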

4. Add the hostname and FQDN of the head node to /etc/resolv.conf on the Linux host. This step lets the Linux host resolve the head node normally. In theory, once DNS is set on the Linux host, resolution should already work, and both the hostname and the FQDN of the head node can be pinged. However, some users abroad have reported that this is a bug and that the entries still need to be added to /etc/resolv.conf. As double insurance, it is best to add them.

Add the entries using the Linux vi editor.

Enter vi /etc/resolv.conf in the terminal to open the editor. When you finish typing, press ESC and then type :w to save the document.

This is a temporary modification that is lost after a restart. It is enough to get through the agent installation, but if you are familiar with Linux, the permanent modification method is recommended.

After adding the entries, ping the hostname and domain name of the Linux node from the head node, and ping the hostname and FQDN of the head node from the Linux node. The configuration is complete once all of them respond.
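Name resolution can be checked without ping as well. The following is a minimal Python 3 sketch; resolve_all is a made-up helper, and it only tests resolution, not reachability.

```python
import socket


def resolve_all(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if resolution fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


# On the Linux node, both forms of the head node name should resolve:
# print(resolve_all(["hpc01-head", "hpc01-head.oa.com"]))
```

Any name that maps to None has not been picked up from DNS or from the edited configuration file.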

5. Using FileZilla or another tool, copy hpcnodeagent.tar.gz, setup.py, and the head node certificate with its private key to a directory on the Linux host.

hpcnodeagent.tar.gz and setup.py can be found in the unzipped directory of the HPC Pack head node installation package.

HPCcom.pfx is the certificate we requested when installing the HPC Pack head node. Export that certificate together with its private key.

6. Give the Linux host temporary Internet access. Installing the HPC Pack agent downloads dependent components through yum install, so the host must be online for the duration of the installation.

In the experiment, the author (Lao Wang) temporarily connected the HPC network to the VMware NAT network. In a real enterprise environment, it is recommended to connect one of the existing networks temporarily, or to add a new network card temporarily.

If the Linux host cannot reach the network while the HPC Pack agent is being installed, the installation reports an error.

7. Use the python command to run setup.py. Python is normally available once CentOS or Red Hat has been installed; if typing python in the terminal does not work, you need to download and install it yourself.

Go to the directory where setup.py is located and run:

python setup.py -install -connectionstring:'hpc01-head' -certfile:'/opt/HPCcom.pfx' -certpassword:'123.com' -managehosts

To save effort, you can run python setup.py on its own. The script prints its help text, and you can copy the sample command from it and replace the values with your own.

Install the agent strictly according to the setup.py instructions. No parameter may be wrong, including its letter case, so it is best to copy the sample and then modify it.
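Since a single wrong character breaks the command, it can help to assemble it from its parts. The Python 3 sketch below does only that. The flag names are copied from the command shown in this article and have not been checked against the setup.py help text, so treat them as assumptions; build_install_command is a hypothetical helper.

```python
import shlex


def build_install_command(head_node, cert_file, cert_password, manage_hosts=True):
    """Assemble the setup.py install command as an argument list."""
    # The shell removes the single quotes in the article's command, so the
    # values are passed here without them.
    args = [
        "python",
        "setup.py",
        "-install",
        f"-connectionstring:{head_node}",
        f"-certfile:{cert_file}",
        f"-certpassword:{cert_password}",
    ]
    if manage_hosts:
        args.append("-managehosts")
    return args


command = build_install_command("hpc01-head", "/opt/HPCcom.pfx", "123.com")
print(shlex.join(command))  # shell-safe form for copy and paste
```

The argument list can be handed to subprocess.run from the directory that contains setup.py, which avoids quoting mistakes entirely.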

At about this point in the installation, open /opt/hpcnodemanager/nodemanager.json and check the NamingServiceUri entry. If it contains a short hostname such as hpc01-head, be sure to change it to hpc01-head.oa.com.

The certificate bound on the Windows side uses the FQDN, so the certificate imported into Linux matches only the FQDN. If the head node is accessed by its short hostname here, the page cannot be opened because the name does not match the certificate, and the node has no way to register with the head node.
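The hostname-to-FQDN change can be sketched as a small Python 3 function. The layout of nodemanager.json is an assumption here: the sketch accepts either a single URL string or a list of URL strings under the key, and the key name is a parameter so it can be matched to the spelling in the actual file. use_fqdn is a hypothetical helper, not part of the agent.

```python
import json


def use_fqdn(config, short_name, fqdn, key="NamingServiceUri"):
    """Return a copy of config with https://short_name replaced by https://fqdn."""

    def fix(uri):
        # Match the short name only when it is the whole host part,
        # so a URL that already uses the FQDN is left untouched.
        for sep in (":", "/"):
            prefix = f"https://{short_name}{sep}"
            if uri.startswith(prefix):
                return f"https://{fqdn}{sep}" + uri[len(prefix):]
        return uri

    value = config[key]
    updated = dict(config)
    updated[key] = [fix(u) for u in value] if isinstance(value, list) else fix(value)
    return updated


# Sample data shaped after the URL quoted in this article.
sample = {"NamingServiceUri": ["https://hpc01-head:443/HpcNaming/api/fabric/resolve/singleton/"]}
print(json.dumps(use_fqdn(sample, "hpc01-head", "hpc01-head.oa.com"), indent=2))
```

To apply it to the real file, load the file with json.load, pass the result through use_fqdn, and write it back with json.dump, keeping a backup copy first.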

After the installation succeeds, wait about the time it takes to drink a cup of tea, and the added Linux compute node appears on the head node.

Logs for troubleshooting installation errors

Linux node

/opt/hpcnodemanager/logs/nodemanager.txt, hpclinuxagent.log

/opt/hpcnodemanager/nodemanager.json

Head node

Installation directory: Microsoft HPC Pack 2016\Data\LogFiles\Scheduler\HpcScheduler*.bin

Use hpctrace to convert the .bin files to .txt for viewing.

Bring the Linux compute node online. The Linux compute node has now joined the Microsoft HPC cluster and can run the jobs that the head node assigns to it.

Commands can be executed on Linux nodes directly from Cluster Administrator.

clusrun can be used to submit jobs that run directly on Linux compute nodes.

Cluster Administrator shows summary data for Linux compute nodes.

Parametric sweep jobs can be submitted to Linux nodes through the client program or the portal.

Scenarios not supported by Linux compute nodes

Linux compute nodes support only single-head-node deployments. If the head nodes are clustered, Linux compute nodes cannot be used.

To run an MPI application on Linux nodes, you must install your own MPI on the nodes; the Microsoft MPI (MS-MPI) included in HPC Pack runs only on Windows nodes. Mutual trust between the Linux nodes must also be established, and HPC Pack 2016 Update 1 automatically generates a key pair for the user.

GPU and SOA workloads are not supported: HPC Pack currently does not support scheduling GPUs or running SOA workloads on Linux nodes.

Apart from the scenarios above, the experience is the same as with Windows compute nodes.

This concludes the article on how to add Linux compute nodes to a Microsoft HPC cluster. I hope the content above is of some help to you.
