This article explains how to use Docker Compose to manage GPU resources. The approach described here is simple, fast, and practical, so let's walk through it step by step.
With the trend toward AI development, containerization lets an environment be migrated seamlessly and greatly reduces the cost of setting it up over and over. However, configuring CUDA inside a container and running TensorFlow can be troublesome, so we introduce the relevant tooling here.
Related documentation:
- Enabling GPU access with Compose
- Runtime options with Memory, CPUs, and GPUs
- The Compose Specification
- The Compose Specification - Deployment support
- The Compose Specification - Build support
Using GPU resources in Compose
If the host where we deploy the Docker service has the corresponding packages correctly installed and configured, and the host actually has GPU cards, then those GPUs can be declared and used in Compose.

# package that needs to be installed
$ apt-get install nvidia-container-runtime
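Before going further, it is worth checking that the host side works: nvidia-smi should succeed on the host, and the NVIDIA runtime should show up in docker info. A quick sanity check (output will vary by machine):

# the driver works on the host
$ nvidia-smi

# the nvidia runtime is visible to Docker
$ docker info | grep -i runtime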
With Docker Engine 19.03 and later, GPU devices can be passed to containers directly with the --gpus flag:

# with --gpus
$ docker run -it --rm --gpus all ubuntu nvidia-smi

# use device
$ docker run -it --rm --gpus \
    device=GPU-3a23c669-1f69-c64e-cf85-44e9b07e7a2a \
    ubuntu nvidia-smi

# specific gpu
$ docker run -it --rm --gpus '"device=0,2"' ubuntu nvidia-smi

# set nvidia capabilities
$ docker run --gpus 'all,capabilities=utility' --rm ubuntu nvidia-smi
In the old (v2.3) Compose file format, a service that wants to use GPU resources must be configured through the runtime parameter. Although this gives the container access to the GPUs, it does not allow control over specific properties of the GPU devices.

services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
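For the legacy runtime: nvidia setting to work, the NVIDIA runtime must be registered with the Docker daemon. A minimal sketch of that registration, assuming nvidia-container-runtime is installed at its default path, goes in /etc/docker/daemon.json, followed by a daemon restart:

{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

$ sudo systemctl restart docker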
From Compose v1.28.0+, which uses the Compose Specification file format, configuration properties are available that control GPU resources at a finer granularity, so our requirements can be expressed precisely at startup. Let's go through the fields one by one.
capabilities (required field)
Specifies the capabilities the device must support; multiple values can be listed; this field must always be set.
deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["gpu"]
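Since multiple capabilities can be requested at once, a sketch asking for both GPU access and the utility tools (nvidia-smi), the same combination used in the full example later, might look like this:

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["gpu", "utility"]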
count
Specifies the number of GPUs to use; the value is of type int; use either this field or the device_ids field, not both.

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["tpu"]
          count: 2
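Besides an integer, the Compose Specification also accepts the special value all for count, which reserves every matching device on the host (the same behavior you get when neither count nor device_ids is set, as noted later). A minimal sketch:

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["gpu"]
          count: all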
device_ids
Specifies the device ID values of the GPUs to use; use either this field or the count field, not both.

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["gpu"]
          device_ids: ["0", "3"]

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["gpu"]
          device_ids: ["GPU-f123d1c9-26bb-df9b-1c23-4a731f61d8c7"]
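To find out which indexes and UUIDs are valid on a given host, you can list the devices with nvidia-smi; the UUID form is what device_ids expects in the second example above (the output line here is illustrative):

$ nvidia-smi -L
GPU 0: Tesla T4 (UUID: GPU-f123d1c9-26bb-df9b-1c23-4a731f61d8c7)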
driver
Specifies the GPU device driver type.

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["nvidia-compute"]
          driver: nvidia
options
Specifies driver-specific options.

deploy:
  resources:
    reservations:
      devices:
        - capabilities: ["gpu"]
          driver: gpuvendor
          options:
            virtualization: false
Now that we have seen all the fields, let's write a simple example file that starts a CUDA container service using one GPU device, then run it and look at the output.

services:
  test:
    image: nvidia/cuda:10.2-base
    command: nvidia-smi
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
      resources:
        limits:
          cpus: "0.50"
          memory: 50M
        reservations:
          cpus: "0.25"
          memory: 20M
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu, utility]
      update_config:
        parallelism: 2
        delay: 10s
        order: stop-first
Note that if you set count: 2, you will see two graphics cards in the output below. If neither the count nor the device_ids field is set, all GPUs on the host are used by default.
# run in the foreground
$ docker-compose up
Creating network "gpu_default" with the default driver
Creating gpu_test_1 ... done
Attaching to gpu_test_1
test_1  | +-----------------------------------------------------------------------------+
test_1  | | NVIDIA-SMI 450.80.02    Driver Version: 450.80.02    CUDA Version: 11.1     |
test_1  | |-------------------------------+----------------------+----------------------+
test_1  | | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
test_1  | | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
test_1  | |                               |                      |               MIG M. |
test_1  | |===============================+======================+======================|
test_1  | |      Tesla T4            On   | 00000000:00:1E.0 Off |                      |
test_1  | | N/A   23C    P8     9W /  70W |       MiB / 15109MiB |       %      Default |
test_1  | |                               |                      |                  N/A |
test_1  | +-------------------------------+----------------------+----------------------+
test_1  |
test_1  | +-----------------------------------------------------------------------------+
test_1  | | Processes:                                                                  |
test_1  | |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
test_1  | |        ID   ID                                                   Usage      |
test_1  | |=============================================================================|
test_1  | |  No running processes found                                                 |
test_1  | +-----------------------------------------------------------------------------+
gpu_test_1 exited with code
Of course, if you set the count or device_ids field, the program inside the container can use multiple graphics cards. You can verify this with the following deployment configuration file.

services:
  test:
    image: tensorflow/tensorflow:latest-gpu
    command: python -c "import tensorflow as tf;tf.test.gpu_device_name()"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "3"]
              capabilities: [gpu]
The result is shown below; we can see that both graphics cards are usable.
# run in the foreground
$ docker-compose up
...
Created TensorFlow device (/device:GPU:0 with 13970 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:1b.0, compute capability: 7.5)
...
Created TensorFlow device (/device:GPU:1 with 13970 MB memory) -> physical GPU (device: 1, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5)
...
gpu_test_1 exited with code
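As a side note, tf.test.gpu_device_name() only reports the name of the first GPU; if you want the container to print every visible device, a small variation of the command using the TensorFlow 2 tf.config API could look like this (a sketch, not part of the original setup):

services:
  test:
    image: tensorflow/tensorflow:latest-gpu
    command: python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "3"]
              capabilities: [gpu]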