Configuring a Python environment on a Linux system

        The overall process of configuring the environment is much the same on Linux and Windows, so the steps here can be cross-referenced between the two systems.

PyTorch does not require a separate cuDNN installation; it only needs the graphics card driver. TensorFlow, by contrast, requires CUDA and cuDNN to be installed manually.

        First, install the NVIDIA graphics card driver, because some packages detect the graphics card when they are downloaded and installed, and may fail if no card is found; remember to install the driver's dependencies as well. The driver is required for CUDA to work at all, and even the nvidia-smi command cannot be used without it; once the driver is installed, nvidia-smi shows the highest CUDA version the driver supports.          Graphics card driver installation          On Linux (Ubuntu) the driver can be installed directly from the Ubuntu Software application. So run nvidia-smi first to see whether a driver is already present, and install one if it is not.

       The nvidia-smi command reports the highest CUDA version the installed driver supports. The nvcc command may not work at this point, because the full CUDA toolkit may not be installed (for just running deep learning code the full toolkit is not needed; the minimal CUDA and cuDNN runtime that ships with PyTorch is enough). Output from nvidia-smi only means the graphics card driver is installed; it does not mean CUDA is installed. After the driver is installed, you can either use nvidia-smi to check the highest supported CUDA version, or open the NVIDIA control panel, click the system information button, and read the same value there.
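As a quick sanity check (a minimal sketch; the exact numbers depend on your machine), you can confirm that the driver is present and see the highest CUDA version it supports:

nvidia-smi                                           # the table header shows "Driver Version: ..." and "CUDA Version: ..."
nvidia-smi --query-gpu=driver_version --format=csv   # print only the driver version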

Install NVIDIA driver on ubuntu!

       Someone in a deep learning QQ discussion group put it this way: the nvidia-smi command shows the highest CUDA version supported by the GPU driver, while the nvcc -V command shows the CUDA version currently installed on the machine (the full CUDA toolkit).

Why the versions displayed by nvidia-smi and nvcc -V differ

 The versions displayed by nvcc and nvidia-smi are inconsistent

        nvcc is the CUDA compiler; it compiles CUDA programs into executable binaries. nvidia-smi, short for NVIDIA System Management Interface, is a command-line utility for managing and monitoring NVIDIA GPU devices.

        CUDA has both a runtime API and a driver API, and each has its own CUDA version: nvcc --version reports the version of the runtime API, while nvidia-smi reports the version of the driver API.

        The files that support the driver API are installed by the GPU driver installer, and nvidia-smi belongs to that side; the files that support the runtime API are installed by the CUDA Toolkit installer. nvcc is the CUDA compiler driver shipped with the CUDA Toolkit; it only knows the CUDA runtime version it was built with, and knows nothing about which GPU driver is installed, or even whether one is installed at all.

        The CUDA Toolkit installer usually bundles a GPU driver installer. If your CUDA was installed through the CUDA Toolkit installer, the runtime API and driver API versions should match, i.e. nvcc --version and nvidia-smi show the same version. Otherwise, the GPU driver was probably installed with a separate driver installer, which is why nvidia-smi and nvcc --version report different versions.

       Usually the driver API is backward compatible with the runtime API, so as long as the version shown by nvidia-smi is greater than or equal to the version shown by nvcc --version, there is normally no problem.
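A minimal sketch of comparing the two on one machine (assuming both the driver and the full CUDA toolkit are installed; the version numbers in the comments are only examples):

nvidia-smi | grep "CUDA Version"    # driver API version, e.g. CUDA Version: 11.6
nvcc --version | grep release       # runtime API version, e.g. release 11.1
# The driver API version (nvidia-smi) should be >= the runtime API version (nvcc).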

Why torch.cuda.is_available() returns False, and how to upgrade the driver          torch.cuda.is_available returns False

When installing PyTorch, it is best not to switch to a mirror source, because the version that gets installed may differ from the one you selected. I ran into this: I clearly selected the GPU build, but because pip was temporarily pointed at the Tsinghua mirror, the CPU build was installed instead, and torch.cuda.is_available() returned False. You can check which build you have with torch.__version__, which tells you whether it is the CPU or GPU version.
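A quick way to check which build actually got installed (a small sketch; pip-installed wheels typically carry a "+cpu" or "+cuXXX" suffix in the version string):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# A version ending in "+cpu" (or is_available() printing False) means the CPU-only build was installed;
# a "+cuXXX" suffix together with True means the GPU build is working.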


Install Miniconda, and remember to switch the conda source and the pip source to a mirror.          miniconda installation method          Tsinghua open source mirror station

To change the conda source, you can either run the relevant commands in the terminal or copy the mirror configuration directly into the hidden .condarc file in the home directory; the terminal commands do nothing more than modify this file.
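For example, a minimal sketch of switching conda to the Tsinghua mirror from the terminal (the channel URLs here follow the mirror station's documentation; editing ~/.condarc by hand achieves the same thing):

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes
cat ~/.condarc   # the commands above simply write these entries into this file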

Temporary pip source change (per command):

pip install <package> -i https://pypi.tuna.tsinghua.edu.cn/simple

Permanent pip source change:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

Install OpenCV: pip install opencv-python


Install the Vim text editor.          Vim Getting Started          vim Editor

Change the Linux software source.          Linux change source

Install PyCharm and create a new Python environment named torch_env. Since the environment is brand new, it contains no packages yet.
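A minimal sketch of creating and activating that environment (the Python version here is only an example; pick whatever your project needs):

conda create -n torch_env python=3.8   # create the empty environment
conda activate torch_env               # activate it before installing anything
conda list                             # a fresh environment contains only a handful of base packages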


What sudo means
           sudo stands for "superuser do": it executes the command that follows it with superuser privileges.
           Some operations need sudo and some do not:
           if you are working inside your own home directory (~), sudo is normally not needed, because you own those files;
           if you are working on system locations (for example under the root directory /), sudo is required.
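For example (a small illustration, not specific to this tutorial):

touch ~/test.txt            # writing inside your own home directory: no sudo needed
sudo apt-get install vim    # installing software system-wide: sudo is required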

sudo apt-get update  # update the software sources, i.e. refresh the package index of the system's "app store"

Linux E: Unable to acquire lock /var/lib/dpkg/lock-frontend - open

When running sudo apt-get install/update or other apt commands, the following errors sometimes appear for various obscure reasons:
E: Unable to acquire lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)

E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?
Reason: on Ubuntu, if an apt-get install or apt install is interrupted before the download finishes (for example the terminal is forcibly closed), the apt-get process may not exit cleanly. When you run apt-get install again, the error above appears because another process is still occupying apt: while apt runs it holds a system-wide lock on the package database (a "system update lock"), and that resource is still locked.          unlock method
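A commonly used recovery path (a hedged sketch; make sure no apt/dpkg process is genuinely still working before removing lock files):

ps aux | grep -i apt                    # look for leftover apt/dpkg processes
sudo kill <pid>                         # kill a stuck process (replace <pid> with the real one)
sudo rm /var/lib/dpkg/lock-frontend     # remove the stale lock files
sudo rm /var/lib/dpkg/lock
sudo dpkg --configure -a                # let dpkg finish any interrupted configuration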


PyTorch installation (remember to activate the torch_env environment before installing, otherwise the packages will be installed straight into the base environment)

      If conda install does not work well for a given package, you can switch to pip install with the same effect. However, a brand-new environment may not even contain pip, so first use conda install to install pip, and then use pip to install the other packages.

      When installing the GPU version of PyTorch, a minimal CUDA and cuDNN runtime is bundled and installed automatically. These minimal runtimes are only valid inside the PyTorch virtual environment; my understanding is that they can only be used by PyTorch and cannot serve other needs. That is why nvcc --version / nvcc -V returns nothing (if the nvcc command cannot be found, it proves that the full CUDA toolkit is not installed or its environment variables have not been added); those commands only return output once the full CUDA toolkit is installed, and the same goes for cuDNN. Conversely, if nvcc -V does return a version, the full CUDA toolkit is installed; nvcc is the compiler for the CUDA language. When installing CUDA separately from the official site, remember to deselect the driver option: the installer installs a driver by default, which removes the existing driver and causes many problems. Generally speaking, the full CUDA toolkit is only needed for deployment; for running algorithm models, PyTorch alone is enough.
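For reference, a hedged sketch of installing the GPU build into torch_env with conda; the channel and cudatoolkit version here are only examples, and the exact command should be taken from the official PyTorch site for your driver:

conda activate torch_env
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
nvcc -V   # still returns nothing unless the full CUDA toolkit is installed system-wide, which is expected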

       The cuDNN version must match the CUDA version, and whether CUDA is installed separately or bundled with PyTorch, its version must not exceed the highest supported CUDA version shown by nvidia-smi.          Full CUDA and cuDNN installation tutorial          nvcc -V and nvidia-smi show different CUDA versions          PyTorch, CUDA and cuDNN installation illustrated in detail (win11)          Why the versions reported by nvidia-smi and nvcc differ (short and clear)          PyTorch installation (do not install CUDA separately)

How to install CUDA in the Anaconda virtual environment instead of installing CUDA in the system

       If you install CUDA inside a virtual environment, running nvcc -V afterwards is useless, because that CUDA lives in the virtual environment rather than being installed into the system. The expected files therefore do not exist in the system folders, and naturally the version cannot be checked with nvcc -V / nvcc --version. Instead, the CUDA and cuDNN in the environment are used through PyTorch, which calls them internally.

       So if you want the full CUDA toolkit, install it directly into the system rather than into a virtual environment; a conda-installed CUDA is harder to maintain.


Linux server environment configuration (without a graphical interface)          Server environment configuration

         After creating a new user on the server and logging in, only the preconfigured .ssh directory exists at first. Everything else under the user directory, including Python, Conda, Git, Bash, and so on, needs to be configured one by one.

/home/user directories usually have limited space, so on shared servers with broad access permissions the user directory is used only to store personal configuration files; projects and large files go into a separate /data/user directory to save space on the system disk. Below, the /home/user directory is written as ~.
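If your server follows this layout, one optional convenience (the /data/user path is just the example name used above) is to symlink the data directory into the home directory so paths stay short:

$ mkdir -p /data/user/projects           # projects and large files live on the big disk
$ ln -s /data/user/projects ~/projects   # ~/projects now points there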

Anaconda environment configuration                            Install Anaconda and pytorch on a Linux server

         Use the wget command (available on most Unix systems) to download the installer from the Tsinghua mirror; there is no need to grab the very latest version. I chose this version, so the command is:

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2021.11-Linux-x86_64.sh   

         After the download completes, a script-style .sh installer appears in the current directory. Run it with the following command:

 bash /home/ysy/Anaconda3-2021.11-Linux-x86_64.sh

          Press Enter through the prompts and type yes at the end. After installation you can check the version with conda -V. If the command is not recognized, perform the following steps.

          In the ~ directory there is also a .bashrc file. On most Unix systems Bash is installed as the default shell, and any modification to .bashrc takes effect the next time a terminal is started. To make it take effect immediately, run the following command:

$ source ~/.bashrc

At this point a (base) prefix appears in front of the user name, indicating that you are now in the base environment.

       Conda ships with a Python distribution and manages Python virtual environments. Commonly used commands are:

  • conda create -n <env_name> python=<version>: create a conda environment.
  • conda create -n <env_name> --file <file>: create a conda environment from a requirements file at the given path.
  • conda create -n <new_env_name> --clone <old_env_name>: copy an existing conda environment.
  • conda env list or conda info -e: list existing conda environments.
  • conda activate <env_name>: switch to a conda environment.
  • conda deactivate: leave the current virtual environment.
  • conda remove -n <env_name> --all: delete an environment.

Environments created by the user are saved under ~/anaconda3/envs/ and can be copied directly when needed.

CUDA environment and PyTorch

       After installing Anaconda, we can use both the pip and conda package managers (previously only apt, which comes with Linux, was available).

On the server, it is recommended to use conda only for installing cudatoolkit, cudnn, jupyter and related packages; all other packages should be installed with pip. (I don't quite get this recommendation.)

        Here it is recommended to use the 11.1 version of the CUDA environment and install it from the conda-forge channel with conda. (I don't understand why CUDA is being installed here, and installing it with conda may cause problems if you want to use it from other languages; I don't know this part well, and I think installing PyTorch alone is enough.)          Install CUDA in the Anaconda virtual environment instead of installing CUDA in the system          If you install CUDA in the virtual environment, running nvcc -V afterwards is useless, because that CUDA lives in the virtual environment rather than in the system; the expected files are not in the system folders, so the version cannot be checked with nvcc -V / nvcc --version, and the CUDA and cuDNN are instead used through PyTorch. So I think it is better to install CUDA directly in the system, even though that is a bit more troublesome, and a conda-installed CUDA is harder to maintain.

$ conda install -c conda-forge cudatoolkit=11.1 cudnn 

Some servers can use the GPU without installing CUDA, because the Nvidia graphics card driver is already present. If the nvidia-smi command shows the status of the graphics card, the driver is installed (the driver version and supported CUDA version are indicated at the top of the output, but the nvcc command may still not be available at this point).

You can run nvidia-smi as a test when you first log in; if it fails, you need to install the NVIDIA driver manually:

$ sudo add-apt-repository ppa:graphics-drivers/ppa  # add the graphics driver PPA
$ ubuntu-drivers devices  # list the drivers suitable for the current device
$ sudo apt-get install nvidia-driver-418  # install the matching version

Third-party libraries

        When installing with pip, it is recommended to do so inside a virtual environment, especially for uncommon packages. If the download is too slow or fails, switch to the Tsinghua mirror; the following commands upgrade pip and change the source in the personal configuration:

$ pip install -U pip -i https://pypi.tuna.tsinghua.edu.cn/simple
$ pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

If it is inconvenient to change the source permanently, the temporary method is:

$ pip install <package> -i https://pypi.tuna.tsinghua.edu.cn/simple

Taking PyTorch as an example, install it inside a virtual environment with the following command; separate the different libraries with spaces to install them together:

$ pip install torch torchvision torchaudio

Why is PyTorch not installed in the base environment? Because installing a newer version of PyTorch automatically replaces dependencies such as NumPy with matching versions, and these automatically replaced base libraries can easily conflict with other higher-level libraries, breaking functionality that used to work; in other words, the original working environment gets polluted. Keeping PyTorch in its own environment also makes the virtual environment easier to copy.

After the installation succeeds, you can open Python on the command line and test it:

>>> import torch
>>> torch.__version__
>>> torch.cuda.is_available()
>>> torch.cuda.device_count()
>>> torch.tensor([1.0, 2.0]).cuda()
# 1.9.1+cu11.1
# True
# 4
# tensor([1., 2.], device='cuda:0')

The following lists common libraries that need to be installed in the base environment; the installation method is the same as above:

  • gpustat: view dynamic GPU status (requires the Nvidia driver); use with watch -n1 -c gpustat --color.
  • ipdb: simple breakpoint debugging, an upgraded version of Python's pdb; use with python -m ipdb main.py.

Notebook

Sometimes you need to use Jupyter Notebook on a remote server; first install it on the remote machine:

$ conda install -c conda-forge jupyter notebook

Then you can start the Jupyter service on the remote machine, so that the kernel runs remotely:

$ jupyter notebook --port=8889

After starting, you will get the URL of the service. This address can be opened directly in a local browser, or inside VSCode after installing the Jupyter plug-in.
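If the server's port is not directly reachable from your machine, a common workaround (a sketch; 8889 matches the port used above, and user@server_ip is a placeholder) is to forward the port over SSH and open the notebook at localhost:

$ ssh -NL 8889:localhost:8889 user@server_ip   # run on the local machine; forwards local port 8889 to the server
# then open http://localhost:8889/?token=... in the local browser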

personal configuration

If you will use this server for a long time, it is worth doing the following configuration. Most of the following content is installed with the apt package manager, which requires temporary administrator privileges via sudo; if you do not have them, ask the administrator.

Git configuration

After installing Git, there will be a .gitconfig or .config/git/config file under ~; if Git is not installed, install it yourself:

$ sudo apt-get install git -y  # -y means answer yes by default

Git configuration on the server has three layers: system (--system), global (--global), and repository (--local), and each layer overrides the configuration of the previous one. To modify the personal configuration under ~, use --global:

# Replace with your own information!
$ git config --global user.name "hewei2001"
$ git config --global user.email "[email protected]"
# frequently used commands can be given an alias
$ git config --global alias.showlog "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative"

If you need to set up remote SSH authentication, use the following commands to print the public key and copy it to GitHub manually:

$ ssh-keygen -t rsa -C "[email protected]"  # generate the key pair
$ cat ~/.ssh/id_rsa.pub  # print the public key
$ ssh -T [email protected]  # test the connection

oh-my-zsh

First check whether zsh is among the server's available shells; if it is, switch to it directly and install the plugin manager oh-my-zsh:

$ cat /etc/shells  # list the available shells and check whether /bin/zsh is present
$ chsh -s /bin/zsh 
$ git clone git://github.com/robbyrussell/oh-my-zsh.git ~/.oh-my-zsh
$ cp ~/.zshrc ~/.zshrc.bak  # back up the original configuration
$ cp ~/.oh-my-zsh/templates/zshrc.zsh-template ~/.zshrc
$ source ~/.zshrc

If zsh is not available, the following steps install both zsh and oh-my-zsh:

$ sudo apt-get install zsh -y
$ git clone git://github.com/robbyrussell/oh-my-zsh.git ~/.oh-my-zsh
$ cp ~/.zshrc ~/.zshrc.bak  # back up the original configuration
$ cp ~/.oh-my-zsh/templates/zshrc.zsh-template ~/.zshrc
$ chsh -s /bin/zsh  # switch the default shell

If you do not have sudo permission for the first step, you need to install the package and its dependencies yourself; refer to: Install zsh as a non-root user on Linux.

Restart the session; you are now in the new zsh shell. Download the theme and plugins to use:

# theme: powerlevel10k
$ git clone --depth=1 https://gitee.com/romkatv/powerlevel10k.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/themes/powerlevel10k
# two plugins
$ git clone --depth=1 https://github.com/zsh-users/zsh-autosuggestions ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-autosuggestions
$ git clone --depth=1 https://github.com/zsh-users/zsh-syntax-highlighting.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/zsh-syntax-highlighting

Open .zshrc and manually modify the following configuration:

# theme; if you want a built-in theme, bira is recommended
ZSH_THEME="powerlevel10k/powerlevel10k"
# plugins
plugins=(git 
		 tmux 
		 z 
		 extract 
		 zsh-autosuggestions
		 zsh-syntax-highlighting)

oh-my-zsh's built-in plugins define commonly used aliases, which can be viewed with the alias command:

# git related
gaa = git add --all 
gcmsg = git commit -m 
ga = git add 
gst = git status 
gp = git push
# tmux related
tl = tmux list-sessions
tkss = tmux kill-session -t
ta = tmux attach -t
ts = tmux new-session -s

The following introduces some built-in plugins and some that need to be installed manually:

  • z: jump directly between frequently used directories, use z <dir_name>.
  • extract: decompress all kinds of archives with one command, use x <file_name>.
  • gitignore: generate a gitignore template with one command, use gi python > .gitignore.
  • cpv: copy files with a progress bar, use cpv <a> <b>.
  • colored-man-pages: man pages with color.
  • sudo: if you forgot to prefix the last command with sudo, press ESC twice and it will be added automatically.
  • zsh-autosuggestions: command completion plugin; needs to be installed manually.
  • zsh-syntax-highlighting: command highlighting plugin; needs to be installed manually.

Some of the configuration previously in .bashrc also needs to be moved here, mainly the path configuration of the Conda environment and some aliases. After the configuration is complete, use source ~/.zshrc to reload the shell. You will then enter the theme's first-run setup; to reconfigure it later, run p10k configure.
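For the Conda part specifically, a simple option (assuming Anaconda was installed as described above) is to let conda write its own initialization block into .zshrc instead of copying it by hand:

$ conda init zsh      # appends the conda initialization snippet to ~/.zshrc
$ source ~/.zshrc     # reload; the (base) prefix should appear again in zsh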

reverse proxy

When loading pre-trained models, it is often unavoidable to access foreign download sources, which a server on a domestic network cannot reach. In this case you can use the VPN on your local machine, and let the server share it through a "reverse proxy".

Enter the following command in the local terminal and keep the terminal open:

$ ssh -NR 12306:localhost:7890 [email protected]
# 12306 can be any port; 7890 must be changed to the VPN's proxy port!

Enter the following command in the server terminal:

$ export http_proxy=http://127.0.0.1:12306/
$ export https_proxy=http://127.0.0.1:12306/
# 12306 must match the port used above!

Once done, you can test whether it works with wget google.com. For convenience on the server, you can add aliases in ~/.zshrc:

$ alias proxyon='export http_proxy=http://127.0.0.1:12306 https_proxy=http://127.0.0.1:12306 && echo Proxy On!'
$ alias proxyoff='unset http_proxy https_proxy && echo Proxy Off!'

If the above method does not work, besides downloading the pre-trained model locally and copying it to the server with scp, you can also try this solution: How to gracefully download the huggingface-transformers model.

One-key configuration

The following is a summary of the commands used above, executed in one go as a shell script. Just create a new hello.sh file in the /home/user directory and enter the following:

#!/bin/bash

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2020.02-Linux-x86_64.sh
bash Anaconda3-2020.02-Linux-x86_64.sh
source .bashrc

conda install -c conda-forge cudatoolkit=11.1 cudnn

pip install -U pip -i https://pypi.tuna.tsinghua.edu.cn/simple
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip install gpustat
pip install ipdb

# alias configuration
echo "alias gpus='watch -n1 -c gpustat --color'" >> ~/.bashrc

echo "alias ca='conda activate'" >> ~/.bashrc
echo "alias cda='conda deactivate'" >> ~/.bashrc

echo "alias proxyon='export http_proxy=http://127.0.0.1:12306 https_proxy=http://127.0.0.1:12306 && echo Proxy On!'" >> ~/.bashrc
echo "alias proxyoff='unset http_proxy https_proxy && echo Proxy Off!'" >> ~/.bashrc

echo "alias tl='tmux list-sessions'" >> ~/.bashrc
echo "alias tkss='tmux kill-session -t'" >> ~/.bashrc
echo "alias ta='tmux attach -t'" >> ~/.bashrc
echo "alias ts='tmux new-session -s'" >> ~/.bashrc

echo "alias ll='ls -alF'" >> ~/.bashrc
echo "alias la='ls -A'" >> ~/.bashrc
echo "alias l='ls -CF'" >> ~/.bashrc
echo "alias ls='ls --color=tty'" >> ~/.bashrc

echo "alias cd..='cd ..'" >> ~/.bashrc
echo "alias md='mkdir -p'" >> ~/.bashrc
echo "alias rd=rmdir" >> ~/.bashrc

source .bashrc

# git configuration, replace with your own info
git config --global user.name "hewei2001"
git config --global user.email "[email protected]"

Run it from the command line with bash hello.sh.

How to make your model run on the server? (CSDN blog)


Origin blog.csdn.net/weixin_47441391/article/details/127283685