
Running the PyTorch GPU version on an old graphics card under Windows: an analysis


This is an analysis of running the PyTorch GPU version on an old graphics card in a Windows environment. Many inexperienced users do not know how to approach this, so this article summarizes the cause of the problem and its solution. I hope it helps you solve the problem.

Running the PyTorch GPU version on an old graphics card under Windows

Starting with PyTorch 1.3, graphics cards with CUDA compute capability 3.5 and below are no longer supported by the official builds. At that point you can only install the official PyTorch 1.2; if that is enough for you, install it and save yourself the trouble. But some other packages need a newer torch, for example torch_geometric requires at least version 1.4, and then the official builds leave you only the CPU version, because they no longer directly support compute capability below 3.5 (the distribution was slimmed down; Torch is already very large). By Torch 1.7, the official binaries require a compute capability of at least 5.2. The remaining option is to recompile the PyTorch source on Windows to get a Torch suited to your own graphics card. When compiling, the lower limit of the supported compute capability is set to that of the current machine's card, for example a GT 730M (1 GB) with compute capability 3.5.
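If you are unsure of your card's compute capability, you can query it first; a minimal sketch, assuming a CUDA-enabled torch build (for example the official 1.2) is already installed:

# Minimal sketch: query the local GPU's compute capability.
# Assumes a CUDA-enabled torch build (e.g. the official 1.2) is installed.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(torch.cuda.get_device_name(0), "compute capability: %d.%d" % (major, minor))
else:
    print("No CUDA device is visible to this torch build.")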

The following takes Torch 1.7 as an example to illustrate the whole process.

1. Compilation tools and third-party libraries

1.1 Visual Studio 2019

The original post reported problems with the latest VS 2019, which I did not confirm. Personally I doubt this is a key issue, but to be safe I installed the 16.6.5 Professional version as the post suggested.

Version 16.6.5 can be downloaded from https://docs.microsoft.com/en-us/visualstudio/releases/2019/history

Only the "Desktop development with C++" workload needs to be installed.

1.2 Cuda toolkit

Go to https://developer.nvidia.com/cuda-toolkit on Nvidia's official website.

The version downloaded here is CUDA 10.1: cuda_10.1.105_418.96_win10.exe.

You need to install NVCC and the Visual Studio integration components.

1.3 cudnn

For this installation, please refer to windows_cudnn_install. (https://github.com/pytorch/pytorch/blob/master/.circleci/scripts/windows_cudnn_install.sh)

There you can see that the version corresponding to cuda 10.1 is v7.6.4.38, i.e. cudnn-10.1-windows10-x64-v7.6.4.38.zip (the script is continually updated).

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/7.6.4.38/Production/10.1_20190923/cudnn-10.1-windows10-x64-v7.6.4.38.zip

After unzipping, move the files into the CUDA toolkit installation path. (This just simplifies the directory settings during configuration.)
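If you prefer to script the copy, here is a minimal sketch; it assumes the zip was extracted to a local cuda folder, and the paths are illustrative:

# Sketch: copy the unzipped cuDNN files into the CUDA toolkit directory.
# The cuDNN zip unpacks to a "cuda" folder containing bin, include and lib\x64.
import os
import shutil

src = r"cuda"  # extraction directory of cudnn-10.1-windows10-x64-v7.6.4.38.zip
dst = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1"

for sub in ("bin", "include", os.path.join("lib", "x64")):
    s = os.path.join(src, sub)
    d = os.path.join(dst, sub)
    for name in os.listdir(s):
        shutil.copy2(os.path.join(s, name), d)
        print("copied", name, "->", d)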

1.4 mkl

You can refer to install_mkl for this installation.

https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/win-test-helpers/installation-helpers/install_mkl.bat

Download according to the current official script:

https://s3.amazonaws.com/ossci-windows/mkl_2020.0.166.7z

1.5 magma

Refer to install_magma:

https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/win-test-helpers/installation-helpers/install_magma.bat

Note the difference between the release and debug packages: the release package for cuda110 is xxx_cuda110_release, the debug package xxx_cuda110_debug. A release compilation must use the release package, and a debug compilation the debug package.

Download according to the current official script:

https://s3.amazonaws.com/ossci-windows/magma_2.5.4_cuda101_release.7z

1.6 sccache

Refer to install_sccache:

https://github.com/pytorch/pytorch/blob/master/.jenkins/pytorch/win-test-helpers/installation-helpers/install_sccache.bat

Download according to the official documentation:

https://s3.amazonaws.com/ossci-windows/sccache.exe

https://s3.amazonaws.com/ossci-windows/sccache-cl.exe

1.7 ninja

Download: ninja-win.zip

https://github.com/ninja-build/ninja/releases

mkl, magma, sccache and ninja should all be downloaded and unpacked into the same directory.

Ignore nvcc and randomtemp.exe for now; they are discussed later.
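Before moving on, it may help to double-check the layout; a minimal sketch, assuming the tools were unpacked under D:\pytorch\needed (the TMP_DIR_WIN used by the batch file in section 2) with the executables in its bin subdirectory — the exact placement of ninja.exe is an assumption:

# Sketch: verify the tool directory layout the build script will expect.
import os

root = r"D:\pytorch\needed"  # TMP_DIR_WIN in the batch file of section 2
expected = [
    os.path.join("mkl", "include"),
    os.path.join("mkl", "lib"),
    "magma",
    os.path.join("bin", "sccache.exe"),
    os.path.join("bin", "sccache-cl.exe"),
    os.path.join("bin", "ninja.exe"),
]
for rel in expected:
    path = os.path.join(root, rel)
    print("OK " if os.path.exists(path) else "MISSING", path)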

1.8 Install Anaconda or Miniconda

The official Torch documentation recommends a conda environment. I use pip here instead; in practice this causes no problems.

1.9 Install Python packages

pip install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses

2. Set environment variables

Generate a batch file (set_env.bat, run later in section 4) that sets the environment variables:

set BUILD_TYPE=release
set USE_CUDA=1
set DEBUG=
rem set DEBUG=1 for debug version
set USE_DISTRIBUTED=0
set CMAKE_VERBOSE_MAKEFILE=1

set TMP_DIR_WIN=D:\pytorch\needed\
set CMAKE_INCLUDE_PATH=%TMP_DIR_WIN%\mkl\include
set LIB=%TMP_DIR_WIN%\mkl\lib;%LIB%
set MAGMA_HOME=%TMP_DIR_WIN%\magma

rem version transformer, for example 10.1 to 10_1
set CUDA_SUFFIX=cuda10_1
set CUDA_PATH_V%VERSION_SUFFIX%=%CUDA_PATH%
set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
set CUDNN_LIB_DIR=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\lib\x64
set CUDA_TOOLKIT_ROOT_DIR=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
set CUDNN_ROOT_DIR=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
set NVTOOLSEXT_PATH=C:\Program Files\NVIDIA Corporation\NvToolsExt
set CUDNN_INCLUDE_DIR=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include
set NUMBAPRO_CUDALIB=%CUDA_PATH%\bin
set NUMBAPRO_LIBDEVICE=%CUDA_PATH%\nvvm\libdevice
set NUMBAPRO_NVVM=%CUDA_PATH%\nvvm\bin\nvvm64_33_0.dll
set PATH=%CUDA_PATH%\bin;%CUDA_PATH%\libnvvp;%PATH%

set DISTUTILS_USE_SDK=1
set TORCH_CUDA_ARCH_LIST=3.5
set TORCH_NVCC_FLAGS=-Xfatbin -compress-all

set PATH=%TMP_DIR_WIN%\bin;%PATH%

set SCCACHE_IDLE_TIMEOUT=0
sccache --stop-server
sccache --start-server
sccache --zero-stats
set CC=sccache-cl
set CXX=sccache-cl

set CMAKE_GENERATOR=Ninja

if "%USE_CUDA%"=="1" (
  copy %TMP_DIR_WIN%\bin\sccache.exe %TMP_DIR_WIN%\bin\nvcc.exe

  :: randomtemp is used to resolve the intermittent build error related to CUDA.
  :: code: https://github.com/peterjc123/randomtemp
  :: issue: https://github.com/pytorch/pytorch/issues/25393
  :: Previously, CMake uses CUDA_NVCC_EXECUTABLE for finding nvcc and then
  :: the calls are redirected to sccache. sccache looks for the actual nvcc
  :: in PATH, and then passes the arguments to it.
  :: Currently, randomtemp is placed before sccache (%TMP_DIR_WIN%\bin\nvcc)
  :: so we are actually pretending sccache instead of nvcc itself.
  :: curl -kL https://github.com/peterjc123/randomtemp/releases/download/v0.3/randomtemp.exe --output %TMP_DIR_WIN%\bin\randomtemp.exe
  set RANDOMTEMP_EXECUTABLE=%TMP_DIR_WIN%\bin\nvcc.exe
  set CUDA_NVCC_EXECUTABLE=%TMP_DIR_WIN%\bin\randomtemp.exe
  set RANDOMTEMP_BASEDIR=%TMP_DIR_WIN%\bin
)

set CMAKE_GENERATOR_TOOLSET_VERSION=14.26
set CMAKE_GENERATOR=Visual Studio 16 2019
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Auxiliary\Build\vcvarsall.bat" x64 -vcvars_ver=14.26

Note a few points:

1) If you are compiling for CUDA 10, the original post gives "set TORCH_CUDA_ARCH_LIST=3.7+PTX;5.0;6.0;6.1;7.0;7.5;8.0", but 8.0 has to be removed from TORCH_CUDA_ARCH_LIST (CUDA 10 does not support compute capability 8.0).

I set it directly to 3.5 here; this value is the lower limit of the compute capability the build will support.

2) To compile a debug/release version, download the corresponding magma debug/release package.

3) If you compile only the CPU version, set USE_CUDA=0.

4) randomtemp.exe was downloaded manually to the local machine beforehand, which is why the curl line in the script is commented out.

5) The script automatically copies sccache.exe into the configured bin directory under the name nvcc.exe (nvcc is the tool that compiles GPU code).

6) Distributed training is turned off: set USE_DISTRIBUTED=0.
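As a quick sanity check after running the batch file, you can confirm from Python (started in the same shell) that the key variables were set; a minimal sketch:

# Sketch: confirm the variables set by set_env.bat are visible to child processes.
import os

for var in ("BUILD_TYPE", "USE_CUDA", "USE_DISTRIBUTED", "TORCH_CUDA_ARCH_LIST",
            "CUDA_PATH", "MAGMA_HOME", "CMAKE_GENERATOR", "CC", "CXX"):
    print(var, "=", os.environ.get(var, "<not set>"))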

3. Prepare the Torch source code

According to the official Torch documentation:

# Get the PyTorch source
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive

Here --recursive recursively downloads the third-party git repositories that Torch depends on. The problem is that connections to github from China are not stable; to avoid problems with the downloaded source, I chose to download the packaged zip source instead.

(Note: downloading directly from github is the most convenient option, but at a download speed of only 20~50 KB/s the 600+ MB project takes hours to fetch. I did not persist with a complete download; my connection was cut off midway.)

If you are impatient, use the following method instead. It is fast, but you need to be careful.

The important point is that the packaged zip does not automatically include the third-party libraries Torch depends on; you have to download the third-party libraries matching your Torch version manually. (Making the third-party libraries match the current Torch version is what gets the tree ready to compile; this is very important. Tears!)

Through the repository tags, find the 1.7 version you are going to compile.

Download the packaged source code

https://github.com/pytorch/pytorch/archive/v1.7.0.zip

The source can also be downloaded from the domestic Gitee mirror. The command git clone --recursive should also work directly against Gitee (I have not tried it, but it is feasible in principle; note, however, that the third-party submodules will still be fetched from github):

https://gitee.com/mirrors/pytorch?_from=gitee_search

Then open the third_party directory of the v1.7.0 tag and download the third-party libraries it depends on:

https://github.com/pytorch/pytorch/tree/v1.7.0/third_party

There are 36 libraries in total; I downloaded the packaged zip source of every one.

Among them, fbgemm is a key library; it in turn depends on three more third-party libraries.

Pay attention to the version correspondence between them (follow the version links on github mentioned above).

The FBGEMM version Torch needs is not necessarily the latest; different versions have different function interfaces, and a mismatch will make the compilation fail.

In addition, ideep depends on a third-party library of its own, mkl-dnn @ 5ef631a. This is also a key library.

Finally, assemble all the source code into the same layout as on Git. At this point the source is fully prepared.

Note: because the source is assembled by hand, the main thing is to ensure that the versions of the third-party libraries match the current Torch version; they must match exactly. If you clone the source directly with git, the versions cannot be wrong, but the download may fail or end up incomplete.
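One way to catch a missing library before committing to a multi-hour build is to walk .gitmodules and check that every submodule path is populated; a minimal sketch, run from the root of the assembled source tree:

# Sketch: check that every submodule path in .gitmodules exists and is non-empty.
import os
import re

missing = []
with open(".gitmodules") as f:
    for line in f:
        m = re.match(r"\s*path\s*=\s*(\S+)", line)
        if m and (not os.path.isdir(m.group(1)) or not os.listdir(m.group(1))):
            missing.append(m.group(1))

print("missing or empty submodule directories:", missing if missing else "none")

Note that this only covers the top-level third_party entries; nested submodules such as ideep's mkl-dnn need the same check inside their own directories.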

As the saying goes: do it yourself, and you will be well fed and clothed!

4. Start compilation

Use a Powershell terminal, not cmd. (cmd has problems displaying Chinese; if the compilation goes wrong, you cannot even read what the error says. Administrator privileges are not required, though you can grant them.)

In the Powershell terminal, first run the previously generated batch file ./set_env.bat to set the environment variables.

Then, compile libtorch:

python tools\build_libtorch.py

I only compiled the library files. On an i5-4200M machine, the whole compilation takes nearly 6 hours and generates about 12 GB of temporary files and build output.

(Note: do not try to run cmake and compile the libraries directly from within VS, because the official script does other things besides.)

5. Grafting the new libraries onto the official build

Download version 1.7 GPU cu101 from the official website.

After installing it, replace the official library files with the two you compiled yourself. The official CUDA libraries come to about 600 MB; the self-compiled ones are only about 1/8 of that. Of course, they have no distributed training support, because that was not selected earlier.
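The article does not name the two files, so as an illustration this sketch simply lists the libraries in the installed torch\lib directory with their sizes, which makes the candidates for replacement easy to spot:

# Sketch: list the installed torch library files and their sizes.
import os
import torch

lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
for name in sorted(os.listdir(lib_dir)):
    size_mb = os.path.getsize(os.path.join(lib_dir, name)) / 2**20
    print("%8.1f MB  %s" % (size_mb, name))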

Then run a test.
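A minimal smoke-test sketch (the expected device name is illustrative):

# Sketch: quick smoke test after swapping in the self-compiled libraries.
import torch

print(torch.__version__)              # expect 1.7.0
print(torch.cuda.is_available())      # expect True
print(torch.cuda.get_device_name(0))  # e.g. "GeForce GT 730M"

x = torch.randn(64, 64, device="cuda")
y = x @ x.t()                         # run a matrix multiply on the GPU
print(y.norm().item())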

After reading the above, have you mastered how to run the PyTorch GPU version on an old graphics card in a Windows environment? If you want to learn more skills or find out more about related topics, you are welcome to follow the industry information channel. Thank you for reading!
