Docker on Ubuntu: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

Foreword

The first time I hit this problem, I ran the following command and got the error below.

Run:
sudo docker run --rm --gpus=all nvidia/cuda:10.0-base


Error:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

I then followed the CSDN blog post "docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]] Problem Solved" (A rookie's struggling blog) and solved it by installing nvidia-container-toolkit.

But the next day the same error came back for no obvious reason. I tried most of the methods found online, such as installing nvidia-container-toolkit and nvidia-container-runtime, and none of them solved it.

Later I found that the cause was an incorrectly configured /etc/docker/daemon.json file. It should be configured as shown in the Solution below. This assumes nvidia-container-runtime is installed; if /etc/docker/daemon.json does not exist, create it. For installing nvidia-container-runtime, I referred to the CSDN blog post "docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]" (stoneyshi's blog).

Solution

1. Install nvidia-container-runtime

Create a script in the current directory:

vi t.sh

Copy the following into it:

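# Add NVIDIA's package signing key (apt-key is deprecated on recent Ubuntu releases)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
# Build the distribution string, e.g. ubuntu20.04
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
# Register the nvidia-container-runtime apt repository
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
# Refresh the package index
sudo apt-get update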

Then execute it:

sudo bash t.sh

Then install the package:

sudo apt-get install nvidia-container-runtime

2. Configure the /etc/docker/daemon.json file

sudo vi /etc/docker/daemon.json

Copy the following content into it.

(Note: the value of "path" must be an absolute path; writing just nvidia-container-runtime is not enough.)

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
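
Before restarting Docker, it can help to sanity-check the file you just wrote. The following is a minimal sketch, not an official tool: check_daemon_json is a hypothetical helper name, and it uses a simple text match rather than a real JSON parser, so it only catches the mistakes described in the note above (missing file, relative path, binary not present).

```shell
# Hypothetical helper (not part of Docker or the NVIDIA packages).
# Checks that the file exists, that its "path" value is absolute,
# and that the binary at that path is executable.
check_daemon_json() {
    local file="${1:-/etc/docker/daemon.json}"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    # Extract the value of the first "path" key (text match only, not a JSON parser).
    local rt_path
    rt_path=$(sed -n 's/.*"path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1)
    case "$rt_path" in
        /*) echo "runtime path: $rt_path" ;;
        "") echo "no \"path\" entry found in $file"; return 1 ;;
        *)  echo "path is not absolute: $rt_path"; return 1 ;;
    esac
    # Docker can only use the runtime if the binary is actually there.
    if [ ! -x "$rt_path" ]; then
        echo "not executable: $rt_path"
        return 1
    fi
}
```

Calling check_daemon_json with no argument checks /etc/docker/daemon.json; pass another file name to check a draft before copying it into place.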

3. Restart Docker for the configuration to take effect

sudo systemctl restart docker

I have since hit this error (docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]) three times in a row, and the fix was different each time, so I have gathered some experience. These are the checks I go through.

1. First, check whether Docker is installed correctly with the following command:

sudo docker version

If the command prints complete version information, Docker is installed correctly. If the output contains an nvidia entry, the --gpus all option is available and the error docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]] should not occur.

If required parts of the output are missing, Docker is not installed correctly, and it is best to uninstall and reinstall it. An empty GitCommit: field also indicates a broken installation, for example:

runc:
    version:    1.12.0
    GitCommit: 

I recommend installing the latest version of Docker directly. For installation steps, see the "Ubuntu Docker installation" page of the rookie tutorial.
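
The empty GitCommit: check can be done by eye, but a small filter makes it harder to miss. This is a sketch only: count_empty_gitcommit is a hypothetical helper name, and it assumes the output uses "GitCommit:" lines like the example above.

```shell
# Hypothetical helper: reads `docker version` output on stdin and prints
# how many GitCommit: fields have no value after the colon.
count_empty_gitcommit() {
    # grep -c exits non-zero when the count is 0, so ignore its exit status.
    grep -cE '^[[:space:]]*GitCommit:[[:space:]]*$' || true
}

# Usage on a real system:
#   sudo docker version | count_empty_gitcommit
```

A result of 0 means every GitCommit: field is filled in; anything higher points to the broken installation described above.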

2. The Docker container cannot be run with the --gpus=all option.

The NVIDIA driver may not be installed properly, or nvidia-container-runtime or nvidia-container-toolkit may not be installed or configured properly.

Some people say the NVIDIA driver can be incompatible with the Docker version. In my view, as long as your NVIDIA driver was installed within the past two years, you do not need to worry about this.

Use the following command to check whether the NVIDIA driver is installed and working:

nvidia-smi

If it prints a table showing the driver version and your GPU(s), the NVIDIA driver is installed and running normally.

As long as nvidia-container-toolkit and nvidia-container-runtime are installed as described in the two blog posts mentioned in the foreword, this part should be fine.
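
If you want the driver check in a script rather than reading the table, nvidia-smi can print selected fields as CSV. A minimal sketch, assuming nvidia-smi is on the PATH when the driver is installed; check_nvidia_driver is a hypothetical helper name.

```shell
# Hypothetical helper: prints one "GPU name, driver version" line per GPU,
# or a hint when nvidia-smi itself cannot be found.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the NVIDIA driver is probably not installed"
        return 1
    fi
    # --query-gpu and --format are standard nvidia-smi options.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Usage:
#   check_nvidia_driver
```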

3. Check whether the Docker daemon is using the nvidia runtime.

You can verify this with the following command:

sudo docker info |grep -i nvidia

If it prints "nvidia:yes" or other lines mentioning nvidia, the Docker daemon is using the nvidia runtime, and the --gpus=all option will not report an error.

If it prints only WARNING: No swap limit support, or nothing at all, the Docker daemon is not using the nvidia runtime. This case is a bit more troublesome: most likely /etc/docker/daemon.json is not configured correctly, but Docker itself may also be installed incorrectly.

In my case Docker was not installed properly: everything worked one day, and the next day running a container with --gpus=all failed with docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]. In this situation you only need to reinstall the latest version of Docker.
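
The grep above matches any line containing nvidia, including unrelated ones. A slightly stricter sketch is shown here; it assumes docker info prints "Runtimes:" and "Default Runtime:" lines (the exact layout can differ between Docker versions), and both function names are hypothetical.

```shell
# Hypothetical helpers; both read `docker info` output on stdin.

# Succeeds only if the Runtimes: line lists nvidia as one of the runtimes.
has_nvidia_runtime() {
    grep -Eiq '^[[:space:]]*Runtimes:.*[[:space:]]nvidia([[:space:]]|$)'
}

# Prints the value of the Default Runtime: line (for example runc or nvidia).
default_runtime() {
    sed -n 's/^[[:space:]]*Default Runtime:[[:space:]]*//p'
}

# Usage on a real system:
#   info=$(sudo docker info 2>/dev/null)
#   printf '%s\n' "$info" | has_nvidia_runtime && echo "nvidia runtime registered"
#   printf '%s\n' "$info" | default_runtime
```

With the daemon.json from the Solution in place and Docker restarted, the second function would be expected to print nvidia.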

PS: The Docker daemon is Docker's background service process. By default there is no daemon.json configuration file after Docker is installed, so it has to be created manually (create it as root, for example with sudo). The default path of the configuration file is /etc/docker/daemon.json.

Configuring Docker through daemon.json requires a Docker version higher than 1.12.6 (it does not take effect on that version; 1.13.1 and above work).

When you need to adjust the Docker service's configuration, you do not have to modify the parameters in the main docker.service unit file; the settings can be managed through daemon.json.

Origin blog.csdn.net/qinzhihao12345/article/details/128787822