Docker storage driver knowledge summary

Most of the content of this article is taken from the official Docker documentation page "Understand images, containers, and storage drivers".

    Docker is an open source application container engine that lets developers package an application and its dependencies into a portable container, which can then be distributed to and run on any popular Linux machine as a lightweight form of virtualization. Containers are sandboxed and, by default, isolated from one another. This lets developers quickly build simple, ready-to-run containerized applications and makes applications easier to manage and deploy.

    To truly understand docker's storage driver, you need to first understand how docker images are built and stored, and how containers use images.

    Images and Layers

    Below is the layer structure of the ubuntu:15.04 image. It has 4 layers in total, and each layer consists of read-only files that describe a set of differences from the layer beneath it.

    [Figure: layers of the ubuntu:15.04 image]

 

    Comparing the figures above and below, you can clearly see how the image is layered (the figure above is the one from the official documentation; the image sizes in it have been simplified, but the layer structure of the ubuntu:15.04 image is unchanged).

    [Figure: layers of the ubuntu:15.04 image, second view]

    The role of the Docker storage driver is to stack these image layers and present them as a single unified view, so that the container's file system looks no different from an ordinary file system.
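
    The idea of a unified view can be sketched in a few lines of Python. This is only a toy model, not how any real storage driver is implemented: each layer is a plain dictionary, and the paths and contents are made up for illustration.

```python
from collections import ChainMap

# Hypothetical layer contents: each dict maps a path to file data.
# Layers are listed from bottom (base) to top, as in an image.
layers = [
    {"/etc/os-release": "ubuntu 15.04", "/bin/sh": "shell v1"},
    {"/etc/apt/sources.list": "package sources"},
    {"/bin/sh": "shell v2"},  # a later layer replaces /bin/sh
]


def unified_view(layers):
    """Merge layers so that upper layers take precedence over lower ones."""
    # ChainMap searches its maps left to right, so the top layer goes first.
    return ChainMap(*reversed(layers))


view = unified_view(layers)
print(view["/bin/sh"])  # shell v2
print(sorted(view))     # every path from every layer, listed once
```

    A lookup walks the layers from top to bottom and stops at the first match, which is why the newer /bin/sh hides the older one while the other files remain visible.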

    When a new container is created, a new layer, the container layer, is added on top of the image layers. All subsequent changes made inside the container affect only this layer.

    [Figure: container layer on top of the image layers]

    Note

    Container layer: read-write layer (writable layer)

    Image layer: read-only layer

    Containers and Layers

    One of the main differences between an image and a container is the top-level read-write layer (writable layer), which only a container has. Data added to or modified in a container is stored in the writable layer. When you delete a container, its writable layer is deleted as well (note that the writable layer is not the same thing as a data volume). The image layers, however, remain unchanged.

    The figure below shows multiple containers sharing one image. The image layers are read-only and immutable. Each container has its own container layer on top of the same image layers, and these container layers are independent and do not affect one another.
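
    A minimal sketch of this sharing, assuming a toy model in which the image is a read-only mapping and each container holds a private dictionary as its writable layer. The Container class and the paths are hypothetical and exist only for this illustration.

```python
from types import MappingProxyType

# Hypothetical image: a read-only mapping of path -> content.
image = MappingProxyType({"/app/config": "default", "/app/run.sh": "#!/bin/sh"})


class Container:
    """Toy container: a private writable layer on top of a shared image."""

    def __init__(self, image):
        self.image = image   # shared, read-only
        self.writable = {}   # private to this container

    def read(self, path):
        # The writable layer hides any file with the same path in the image.
        if path in self.writable:
            return self.writable[path]
        return self.image[path]

    def write(self, path, data):
        self.writable[path] = data  # never touches the image


a = Container(image)
b = Container(image)
a.write("/app/config", "tuned")

print(a.read("/app/config"))  # tuned
print(b.read("/app/config"))  # default
del a                         # deleting a container discards only its writable layer
print(image["/app/config"])   # default
```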

    [Figure: multiple containers sharing one image]

    The Docker storage driver is responsible for managing the image layers and the writable container layer. Different drivers implement this in different ways. The two key techniques behind container and image management are stackable image layers and copy-on-write (CoW).

    Copy-on-write

    For example: Xiaowen and Xiaowu are taught math by different teachers, but they share a single workbook. Xiaowen's homework is on page 11 of the workbook. So as not to affect Xiaowu, Xiaowen makes a copy of page 11, completes the homework on the copy, and hands that in. This is typical copy-on-write.

    The first time a file is modified, the file is first copied from the read-only layer below the read-write layer to the read-write layer. The read-only version of the file still exists, but is hidden by the copy of the file in the read-write layer.
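
    The copy-up step can be modeled roughly as follows. This is a simplified sketch, not the logic of any particular driver: CowFileSystem, its bytes_copied counter, and the 1 MB log file are assumptions made for the example.

```python
class CowFileSystem:
    """Toy copy-on-write store: read-only lower layer, writable upper layer."""

    def __init__(self, lower):
        self.lower = lower      # read-only files: path -> bytes
        self.upper = {}         # writable copies: path -> bytearray
        self.bytes_copied = 0   # total size of all copy-up operations

    def read(self, path):
        # A copy in the upper layer hides the read-only original.
        if path in self.upper:
            return bytes(self.upper[path])
        return self.lower[path]

    def append(self, path, data):
        if path not in self.upper:
            # First modification: copy the whole file up, whatever its size.
            original = self.lower.get(path, b"")
            self.upper[path] = bytearray(original)
            self.bytes_copied += len(original)
        self.upper[path] += data


lower = {"/var/log/app.log": b"x" * 1_000_000}  # hypothetical 1 MB file
fs = CowFileSystem(lower)

fs.append("/var/log/app.log", b"!")  # copies 1,000,000 bytes, then adds 1
fs.append("/var/log/app.log", b"!")  # already copied, so no second copy
print(fs.bytes_copied)                   # 1000000
print(len(fs.read("/var/log/app.log")))  # 1000002
print(len(lower["/var/log/app.log"]))    # 1000000
```

    In this model, appending a single byte costs a full copy of the file the first time, while later writes go straight to the copy and the read-only original never changes.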

    With copy-on-write in mind, there is one problem to watch for: if a file in an image layer is very large, modifying it for the first time causes a lot of disk I/O overhead, because the whole file has to be copied. It is therefore not recommended to build large files that need to be modified into the image. Use data volumes for them instead.

    Data Volumes and Storage Drivers

    When a container is deleted, all data written to the container is deleted with it, except data stored in data volumes.

    A data volume is a directory or file on the Docker host that is mounted into a container. Reads and writes to a data volume are not handled by the storage driver, so they run at close to the speed of the host's local file system. Multiple data volumes can be mounted into one container, and multiple containers can share one or more data volumes.

    As shown in the figure, a Docker host runs 2 containers. Each container has its own storage space, which is kept in the host's local file system under /var/lib/docker/… In addition, a shared data volume located at /data on the host is mounted into both containers.

    [Figure: two containers on one Docker host sharing a data volume]

    How to choose a storage driver

    The storage drivers currently supported by docker are: OverlayFS, AUFS, Btrfs, Device Mapper, VFS, ZFS.

    There is currently no single storage driver that is ideal for every environment, so you need to choose one based on your own environment.

    Storage drivers are constantly being improved and developed.

    For the sake of stability, Docker selects a default storage driver based on your system configuration when it is installed. Using this default driver usually reduces your chance of running into bugs.

    If your team has used RHEL and its related forks, you probably have experience with LVM and Device Mapper. In this case, it is recommended that you use the devicemapper storage driver.

    Viewing the storage driver of the current Docker engine

    [Figure: output showing the current storage driver]

    As shown in the figure, the storage driver is aufs and the backing file system on the host is extfs.
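
    The same two values can be read programmatically from the text printed by the docker info command. The sketch below assumes the output contains "Storage Driver:" and "Backing Filesystem:" lines. The sample text is illustrative only, written to match the values described above, and the function names are made up for this example.

```python
import subprocess


def parse_storage_info(info_text):
    """Extract the storage driver and backing file system from `docker info` text."""
    wanted = {"Storage Driver": "driver", "Backing Filesystem": "backing_fs"}
    result = {}
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            result[wanted[key]] = value.strip()
    return result


def current_storage_info():
    """Run `docker info` on this host (requires a working Docker installation)."""
    completed = subprocess.run(
        ["docker", "info"], capture_output=True, text=True, check=True
    )
    return parse_storage_info(completed.stdout)


# Illustrative sample only, not real output captured from a host.
sample = """\
Containers: 2
Storage Driver: aufs
 Backing Filesystem: extfs
"""
print(parse_storage_info(sample))  # {'driver': 'aufs', 'backing_fs': 'extfs'}
```

    On a machine with Docker installed, calling current_storage_info() would apply the same parsing to the live output. Lines that are missing from the output are simply left out of the result.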

    Storage drivers and host file systems

    [Figure: storage drivers and host file systems]

    Setting the Docker storage driver

    [Figure: setting the Docker storage driver]
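
    One way to set the driver is the "storage-driver" key in the Docker daemon configuration file (daemon.json). This is not necessarily the method shown in the original figure. The sketch below only builds the new file contents and does not write anything to disk; the helper name and the existing "log-level" setting are assumptions for illustration.

```python
import json


def with_storage_driver(config_text, driver):
    """Return daemon.json text with "storage-driver" set, keeping other keys."""
    config = json.loads(config_text) if config_text.strip() else {}
    config["storage-driver"] = driver
    return json.dumps(config, indent=2, sort_keys=True)


existing = '{"log-level": "warn"}'  # hypothetical current file contents
print(with_storage_driver(existing, "devicemapper"))
```

    The result would then be saved as the daemon configuration file and the Docker daemon restarted. Be aware that images and containers created under the previous driver are not visible under the new one, so it is safer to decide on a driver before building up local data.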

    Current status and future

    Many see OverlayFS as the future of Docker storage drivers. However, at the time of writing it is still not mature enough, and it is less stable than more established storage drivers such as AUFS and devicemapper.

    The chart below summarizes the advantages and disadvantages of each storage driver for reference:

    [Figure: advantages and disadvantages of each storage driver]

    Specifics of each storage driver

    This part covers the implementation details of individual storage drivers, which researchers of big data technology may want to study. Application practitioners can stop here for now.

 
