47 common causes and solutions of Docker failures

This article explains the causes of, and solutions to, problems and failures that occur when deploying and maintaining Docker containers. It is intended to help readers quickly locate and resolve similar problems.

Docker is relatively simple to use. When something goes wrong, diagnostic information is available from the following sources:

1. The output returned by the command itself, for example docker run.

2. The container logs from docker logs, filtered as needed.

3. The Docker service status from systemctl status docker.

4. The daemon logs from journalctl -u docker.service.
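The four sources above can be gathered in one pass. The following is a minimal sketch, assuming a systemd host; the function name docker_diag and the default of 50 lines are illustrative choices, not part of Docker.

```shell
# Collect the diagnostic sources listed above for one container.
# Usage: docker_diag <container-name> [lines]
docker_diag() {
    name="$1"
    lines="${2:-50}"

    echo "== systemctl status docker =="
    systemctl status docker --no-pager 2>&1 | head -n "$lines"

    echo "== journalctl -u docker.service =="
    journalctl -u docker.service --no-pager -n "$lines" 2>&1

    echo "== docker logs $name =="
    # --tail limits output to the last N lines; 2>&1 is needed because
    # containers may write their logs to stderr.
    docker logs --tail "$lines" "$name" 2>&1
}
```

For example, docker_diag web 100 prints the last 100 lines from each source for a container named web (a hypothetical name). For targeted filtering, pipe the output through grep, for example docker_diag web | grep -i error.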

The following is a collection of docker container problems and failures, divided into 9 categories:

1. Startup failures

1、docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Reason: The Docker daemon is not running

Solution: systemctl start docker

2、can't create unix socket /var/run/docker.sock: is a directory

Reason: /var/run/docker.sock exists as a directory, so the daemon cannot create its socket file there

Solution: rm -rf /var/run/docker.sock

Then restart Docker: systemctl restart docker

3、Job for docker.service failed. Failed to start Docker Application

Reason: Caused by SELinux

Solution: In /etc/sysconfig/selinux, change the SELINUX value to disabled, then reboot so that the change takes effect

Start Docker again after the reboot

4、docker: Error response from daemon: /var/lib/docker/overlay/XXXXXXXXXXXXXXXXXXXXXXX: no such file or directory.

Reason: A layer directory that Docker expects under /var/lib/docker is missing

Solution:

systemctl stop docker

rm -rf /var/lib/docker/*

systemctl start docker

This deletes all local images, containers and volumes. Pull the image again and run it to recreate the container

5、docker: Error response from daemon: Conflict. The container name "XXX" is already in use by container "XXX". You have to remove (or rename) that container to be able to reuse that name.

Reason: Another container already uses this name

Solution: Rename the existing container (docker rename), or delete it (docker rm) and create the container again

6、Error: Connection activation failed: No suitable device found for this connection

Reason: Network interface configuration problem

Solution: Restart the network interface

7. Docker cannot start after a system restart

The error reported is: docker0: iptables: No chain/target/match by that name

Reason: The iptables chains that Docker created are missing

Solution: Restart the Docker service: systemctl restart docker

8、Error starting daemon: error initializing graphdriver: driver not supported

This error occurs when starting the Docker daemon with the overlay2 storage driver

Reason: The daemon configuration lacks the storage driver settings

Solution:

Add the following to /etc/docker/daemon.json, then restart Docker:

{"storage-driver": "overlay2",

"storage-opts": ["overlay2.override_kernel_check=true"]}

9、Failed to start docker.service: Unit docker.service is masked.

Reason: The docker unit is masked in systemd

Solution:

systemctl unmask docker.service

systemctl unmask docker.socket

systemctl start docker.service

10、Failed to start docker.service: Unit is not loaded properly: Invalid argument.

Reason: The docker.service unit file is invalid and cannot be loaded

Solution: Uninstall Docker, delete the docker.service unit file, then reinstall Docker

11. docker-compose reports a warning when starting the container:

/usr/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.22) or chardet (2.2.1) doesn't match a supported version! RequestsDependencyWarning)

Reason: The installed urllib3 and chardet versions are not supported by the requests package

Solution:

pip uninstall urllib3

pip uninstall chardet

pip install requests

12. A container fails to restart

After the Docker process is killed and Docker is restarted, a container cannot start and reports an error

docker restart XXXXXXX Error response from daemon: Cannot restart container XXXXXXX: container "XXXXXXXXXXXXXXXX": already exists

Reason: The old container did not exit cleanly, so a stale entry remains in containerd

Solution: docker-containerd-ctr --address /run/docker/containerd/docker-containerd.sock --namespace moby c rm <container hash_id>

Then start the container again with docker start

13. The Docker restart command hangs

systemctl restart docker never returns

Reason: Unclear. Possibly too many containers are running, or there is a disk I/O problem

Solution:

systemctl start docker-cleanup.service

systemctl start docker

2. Permission errors

14、Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock

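Reason: The current user is not allowed to access the Docker socket

Solution:

Check which group owns /var/run/docker.sock: ls -l /var/run/docker.sock

Add the user to the docker group, then log out and back in so that the new membership takes effect: usermod -aG docker ${USER}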
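Whether a user already belongs to the docker group can be checked before changing anything. A small sketch; the function name in_docker_group is an assumption made for illustration.

```shell
# Succeed (exit status 0) if the user belongs to the "docker" group.
# Usage: in_docker_group [user]
in_docker_group() {
    user="${1:-$(id -un)}"
    # id -nG prints the user's group names separated by spaces.
    for g in $(id -nG "$user"); do
        if [ "$g" = "docker" ]; then
            return 0
        fi
    done
    return 1
}
```

A possible use is in_docker_group "$USER" || sudo usermod -aG docker "$USER", which only modifies the account when the membership is missing.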

15、chown socket at step GROUP: No such process

Reason: Docker cannot find the docker group. The group may have been deleted accidentally.

Solution: groupadd docker

16、Post http:///var/run/docker.sock/v1.XXX /auth: dial unix /var/run/docker.sock: permission denied. Are you trying to connect to a TLS-enabled daemon without TLS?

Reason: A non-root user is managing Docker without sufficient permissions.

Solution:

groupadd docker

usermod -a -G docker <user>

17. docker commit reports an error

Error processing tar file(exit status 1): unexpected EOF

Reason: Possibly a file permission problem

Solution: Add execute permission to the affected files with chmod +x

3. Image and registry errors

18、Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io

Reason: The Docker registry cannot be reached

Solution:

Point Docker at a reachable registry mirror or a self-hosted registry

Configure it in /etc/docker/daemon.json (the registry-mirrors key), then restart Docker

19. Error when pushing a local image

The push refers to a repository [XXXX] Get https://xxx/v1/_ping: http: server gave HTTP response to HTTPS client

Reason: The Docker registry does not serve HTTPS

Solution:

Add the registry address to /etc/docker/daemon.json, then restart Docker:

{ "insecure-registries": ["<registry host>:<port>"] }

20、/usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go: starting container process caused "exec: \"/bin/bash\": executable file not found in $PATH".

Reason: The image does not contain /bin/bash (minimal images such as Alpine ship only /bin/sh), or the Docker engine version is too old.

Solution: Start the container with a shell that exists in the image, such as /bin/sh, or upgrade Docker

21. chown -R is very slow when building an image

Reason: Docker uses a copy-on-write strategy. When chown runs, every affected file from the lower image layers is first copied up into the current layer, then its ownership is changed and written to the file system.

Solution: Avoid commands such as chown -R that modify large numbers of files in one step. Set ownership when the files are added instead, for example with COPY --chown in the Dockerfile.

22. docker build reports an error when building the image:

Message from syslogd kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

Reason: The Docker engine version is too new for the host kernel

Solution: Use a Docker engine version that matches the host kernel version

23、docker: Error response from daemon: containerd: container did not start before the specified time-out.

ERRO[0133] error getting events from daemon: context canceled

Reason: The Docker root dir was changed and Docker was restarted; the error then occurs when pulling an image.

Solution: Restart the Docker service, or reboot the server

4. Resource errors

24、Docker no space left on device

Reason: The disk is full

Solution: Free up space by deleting unused containers, images and other resources

docker system prune -a
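When pruning everything is too aggressive, space can also be reclaimed from oversized container log files alone. A minimal sketch, assuming the json-file log driver and the default data root /var/lib/docker; the function name truncate_docker_logs and the 100 MiB default threshold are illustrative assumptions.

```shell
# Empty container log files that are larger than a threshold.
# Usage: truncate_docker_logs [dir] [min-size-in-MiB]
truncate_docker_logs() {
    dir="${1:-/var/lib/docker/containers}"
    min_mib="${2:-100}"
    min_bytes=$((min_mib * 1024 * 1024))
    # Truncating in place (rather than deleting) keeps the file that the
    # Docker daemon has open valid, so logging continues afterwards.
    find "$dir" -type f -name '*-json.log' -size "+${min_bytes}c" \
        -exec truncate -s 0 {} +
}
```

Run as root, truncate_docker_logs with no arguments empties every container log above 100 MiB and leaves smaller logs untouched.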

25. /var/lib/docker/containers takes up too much space

Reason: The container log files have grown too large

Solution:

Empty the log files in place:

truncate -s 0 /var/lib/docker/containers/*/*-json.log

or

Limit the log size in /etc/docker/daemon.json and restart Docker (the limits apply to newly created containers):

{"log-driver":"json-file",

"log-opts": {"max-size":"2G", "max-file":"10"}}

26、max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

Reason: The default value of the kernel parameter vm.max_map_count is too small

Solution: Set vm.max_map_count=262144 in /etc/sysctl.conf and apply it with sysctl -p
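The edit to /etc/sysctl.conf can be scripted so that repeated runs do not leave duplicate entries. A sketch; persist_sysctl is a hypothetical helper written for this article, not a system command.

```shell
# Write "key = value" to a sysctl configuration file, replacing any
# existing line that sets the same key.
# Usage: persist_sysctl <key> <value> [file]
persist_sysctl() {
    key="$1"
    value="$2"
    file="${3:-/etc/sysctl.conf}"
    tmp="$(mktemp)"
    if [ -f "$file" ]; then
        # Keep every line whose key (text before "=", blanks removed)
        # differs from the one being set.
        awk -F= -v k="$key" '{
            name = $1
            gsub(/[ \t]/, "", name)
            if (name != k) print
        }' "$file" > "$tmp"
    fi
    echo "${key} = ${value}" >> "$tmp"
    # Overwrite the contents but keep the original file's permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

As root, persist_sysctl vm.max_map_count 262144 followed by sysctl -p makes the new value both persistent and active.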

27、Got starting container process caused "process_linux.go:301: running exec setns process for init caused \"exit status 40\"": unknown.

This error appears intermittently

Reason: Possibly caused by the kernel page cache

Solution: echo 1 > /proc/sys/vm/drop_caches

28. After several containers are started on one host, further containers fail to start

Reason: First check whether the disk is full. If it is not, the kernel limit on asynchronous I/O requests (fs.aio-max-nr) may be too low

Solution:

vim /etc/sysctl.conf

Add the parameter fs.aio-max-nr = 1048576

sysctl -p

29. A container starts abnormally and keeps restarting

Run docker logs <container name> to view the exception logs

Check /var/log/messages

Reason: Memory is exhausted and the kernel OOM killer terminates the container

Solution: Free up memory, then start the container
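Whether the OOM killer ended a container can also be read from the container state that docker inspect reports. A minimal sketch; was_oom_killed is an illustrative name.

```shell
# Succeed (exit status 0) if the container was killed for running out
# of memory.
# Usage: was_oom_killed <container-name>
was_oom_killed() {
    # .State.OOMKilled is "true" when the container was OOM killed.
    [ "$(docker inspect --format '{{.State.OOMKilled}}' "$1")" = "true" ]
}
```

For a container named web (a hypothetical name), was_oom_killed web && echo "web ran out of memory" confirms the cause before any memory is freed.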

5. Version incompatibility errors

30、overlayfs: Can't delete file moved from base layer to newly created dir even on ext4

Reason: A compatibility problem between overlay and the XFS file system shipped with CentOS.

Solution: The problem is fixed in kernel 4.4.6 and later, so upgrade the kernel

31、docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:297: getting the final child's pid from pipe caused \"read init-p: connection reset by peer\"": unknown.

Reason: The Docker version does not match the operating system version

Solution: Install a Docker version that the operating system kernel supports

6. Network and port errors

32、WARNING: IPv4 forwarding is disabled. Networking will not work.

Reason: IPv4 forwarding is disabled on the host

Solution:

Edit /usr/lib/sysctl.d/00-system.conf

Add net.ipv4.ip_forward=1 as the last line

Restart the network service, then delete the affected container and create it again

33、Creating network "xxxxxxx" with the default driver

Reason: Docker gateway conflict

After docker-compose starts the container, the network connection is lost.

Solution: Set network_mode: "bridge" for the service in docker-compose.yml

34、Unable to find a node that satisfies the following conditions [port xxxx]

Reason: When a container uses port mapping (docker run -p xxxx:xxxx, or ports in a compose file), Docker opens a port on the host and forwards traffic to the container's port through NAT. If the host port is already occupied by another container or a system process, port allocation fails.

Solution: Stop the container or process occupying the port, or change the host port in the container's port mapping to avoid the conflict.

35、Error response from daemon: service endpoint with name xxx already

Reason: The port is already occupied

Solution: Restart the docker container

36、docker: Error response from daemon: driver failed programming external connectivity on endpoint XXXXX: Bind for 0.0.0.0:80 failed: port is already allocated

Reason: The host port is already allocated to another container

Solution: Map the container to a different host port
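Before choosing a host port, it can be checked whether something is already listening on it. A sketch assuming the ss tool from iproute2 is installed; port_in_use is a hypothetical helper.

```shell
# Succeed (exit status 0) if a TCP socket is listening on the port.
# Usage: port_in_use <port>
port_in_use() {
    # ss -ltn lists listening TCP sockets; after the header line,
    # column 4 holds the local address in the form address:port.
    ss -ltn | awk -v p="$1" '
        NR > 1 {
            n = split($4, parts, ":")
            if (parts[n] == p) found = 1
        }
        END { exit !found }'
}
```

For example, port_in_use 80 && echo "port 80 is taken". To see which container publishes a port, docker ps --filter publish=80 lists it.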

7. Docker installation errors

37. Installing Docker reports Requires: container-selinux >= 2.9

Reason: The container-selinux version is low or not installed.

Solution:

wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo

yum install epel-release

yum makecache

yum install container-selinux

38. An error occurred when installing docker-compose.

“ImportError: 'module' object has no attribute 'check_specifier'”

Reason: The installed setuptools version is too old

Solution:

Upgrade setuptools to version 30.1.0 or later

pip install --upgrade setuptools

39. An error occurred when installing docker-compose.

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.

Reason: pip is running on Python 2.7 and prints a deprecation warning

Solution: pip install -i https://pypi.douban.com/simple docker-compose

8. Deletion errors

40. Docker reports an error when deleting a container

Error response from daemon: Driver overlay failed to remove root filesystem xxxxx: remove /var/lib/docker/overlay2/xxxxx/merged: device or resource busy

Reason: The container's file system is still mounted by another process, so it cannot be deleted directly.

Solution:

Find the processes that hold the mount: grep docker /proc/*/mountinfo | grep xxxxx

Kill those processes

Delete the container again
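The grep step above prints whole mountinfo lines; extracting only the process IDs makes the kill step easier. A sketch, assuming xxxxx is the ID from the error message; pids_holding_mount and its optional second argument (an alternative proc root, useful for testing) are illustrative assumptions.

```shell
# Print the PIDs of processes whose mount table mentions the given ID.
# Usage: pids_holding_mount <id> [proc-root]
pids_holding_mount() {
    id="$1"
    proc="${2:-/proc}"
    # grep -l prints only the names of matching files, for example
    # /proc/1234/mountinfo; the PID is the second-to-last path element.
    grep -l "$id" "$proc"/[0-9]*/mountinfo 2>/dev/null |
        awk -F/ '{ print $(NF - 1) }'
}
```

Each PID can then be inspected with ps -p before deciding whether it is safe to kill.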

41. Error when deleting a container in dead status

Error response from daemon: Driver aufs failed to remove root filesystem XXXXXXXXXXXXXXXX: aufs: unmount error after retries: /var/lib/docker/aufs/mnt/xxxxxxxx: device or resource busy

Reason: A container in the Dead state cannot be deleted while it still holds resources.

Solution: Run docker rm -fv <container id>; the container is removed automatically after a few minutes

42. Docker reports error when deleting image

Error response from daemon: conflict: unable to remove repository reference "XXXX" (must force) - container XXXX is using its referenced image YYYY

Reason: The image is being used by a container

Solution: Delete the container that uses the image first, then delete the image.

43. Docker reports error when deleting image

Error response from daemon: conflict: unable to delete XXXXXXXXXX (must be forced) - image is referenced in multiple repositories

Reason: The same image ID is tagged in more than one repository

Solution: If the image is no longer needed, force its deletion with docker rmi -f

44. Docker reports error when deleting image

Error response from daemon: conflict: unable to delete XXX (cannot be forced) - image has dependent child images

Reason: Child images depend on this parent image

Solution: The error states that deletion cannot be forced. Delete the dependent child images, and any containers created from them, first; then delete the image

9. Other errors

45、docker: Error response from daemon: driver failed programming external connectivity on end-point XXXXXXX: (iptables failed: iptables --wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.2 --dport 8080 -j ACCEPT: iptables: No chain/target/match by that name.

Reason: A firewall problem: the iptables chain that Docker needs is missing

Solution: Turn off the firewall and restart Docker

46. The following warnings appear when running docker info

WARNING: bridge-nf-call-iptables is disabled

WARNING: bridge-nf-call-ip6tables is disabled

Reason: A configuration issue: bridge-nf-call-iptables needs to be enabled

Solution:

vi /etc/sysctl.conf

Add the following lines, then apply them with sysctl -p

net.bridge.bridge-nf-call-ip6tables = 1

net.bridge.bridge-nf-call-iptables = 1

net.bridge.bridge-nf-call-arptables = 1

47. Database container errors

A MySQL container created with Docker exits immediately with the message

Database is uninitialized and password option is not specified

Reason: No root password option was passed to the container

Solution: docker run -d -e MYSQL_ROOT_PASSWORD=[password] -p 3306:3306 <mysql image>

To avoid strange and intermittent problems, operations staff and developers should use Docker containers in a standardized way, which prevents most failures caused by improper use. The following recommendations can serve as a reference:

Recommended Docker usage practices

1. Use a recent stable Docker version, released within the last 1-2 years

Do not install very old versions. Many bugs have been fixed in newer releases.

2. Avoid very large images, such as 5 GB, 10 GB or more

Keep images as lightweight as possible and remove unnecessary software and data.

3. Mount host configuration files into the container read-only

When a container needs a host configuration file through -v, add the ro option wherever possible

4. Data must be stored on the host's physical disk or on a storage node.

Do not keep data only inside the container, to avoid losing it when the container goes down.

5. Application logs must be mounted on the host machine

Do not write logs only inside the container. Avoid relying on docker logs as the only way to view logs, and avoid having to go into the volume directory to read them.

6. Do not rely only on the latest tag

Manage tags according to a defined standard, so that the corresponding version can be found from the tag.

7. Do not use container IP addresses (172.17.0.x by default), and do not hard-code them in configuration

A container's IP address is likely to change after the container restarts.
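Instead of IP addresses, containers attached to the same user-defined network can reach each other by container name, because Docker provides name resolution on such networks. A sketch; the function name run_on_network and the names appnet, cache and redis:7 in the usage line are illustrative assumptions.

```shell
# Start a container on a user-defined network so that other containers
# on the same network can reach it by name.
# Usage: run_on_network <network> <container-name> <image>
run_on_network() {
    net="$1"
    name="$2"
    image="$3"
    # Create the network only if it does not exist yet.
    docker network inspect "$net" >/dev/null 2>&1 ||
        docker network create "$net"
    docker run -d --name "$name" --network "$net" "$image"
}
```

After run_on_network appnet cache redis:7, other containers on appnet can connect to the host name cache, which stays valid even when the container's IP address changes.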

8. Avoid running multiple processes in a single container

Containers are not virtual machines. Aim for one process per container.

9. Keep images consistent across environments

Use the same unchanged image in testing, UAT and production. When moving between environments, change only the environment variables.

10. Monitor Docker containers so that problems are found promptly

Prometheus is recommended for container monitoring

11. Limit the resources of every Docker container

Limit CPU, memory and disk space in particular, and the network if needed, so that containers do not exhaust the host's hardware resources.
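Limits are set per container when it is started. A sketch; run_limited is a hypothetical wrapper, and the defaults of 512m of memory, 1.5 CPUs and 256 processes are placeholders to be tuned for each workload.

```shell
# Start a container with memory, CPU and process-count limits.
# Usage: run_limited <container-name> <image> [memory] [cpus]
run_limited() {
    name="$1"
    image="$2"
    memory="${3:-512m}"   # hard memory limit (assumed default)
    cpus="${4:-1.5}"      # number of CPUs the container may use
    docker run -d --name "$name" \
        --memory "$memory" \
        --cpus "$cpus" \
        --pids-limit 256 \
        "$image"
}
```

With a memory limit in place, a runaway process is stopped inside its own container instead of exhausting the host's memory.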

Origin blog.csdn.net/wuds_158/article/details/133162166