Casual Conversation - How I View Docker

I was first introduced to Docker by innei, and I have learned a lot from the Mix Space CMS project.

The Essence of Docker

First of all, let's clarify what Docker is. The official pitch is "Build once, run anywhere", but that statement needs some interpretation. An image built with Docker 23.x may fail on Docker 19.x. This resembles the familiar compatibility issue between Java JDK versions, where a lower version cannot run class files compiled for a higher version. What about images built with an older Docker? Docker handles the migration, and such images usually run normally. This differs from the JDK, where code that relies on methods removed in a newer version no longer works there.

With all that said, what exactly is Docker?

I define Docker as a daemon with an isolated environment. Of course, this is an informal definition. To go a little deeper, Docker is a containerization platform that lets applications run in independent containers by giving each one an isolated environment.

So why do I emphasize the daemon aspect? Because of how closely Docker resembles Linux itself. On Unix-like operating systems, once kernel initialization completes, an init process with PID 1 is started to handle the startup of the remaining user-space processes and their dependencies. Linux is no exception. Some distributions link their service manager in as init and some do not, but either way that program carries the responsibilities of init. In other words, every user-space process descends from this init process, and you can inspect the process tree with tools such as htop.
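The parent chain that htop draws can also be followed by hand. Here is a minimal sketch, assuming a Linux system with /proc mounted; the helper names `parse_ppid` and `ancestors` are made up for this illustration.

```python
from pathlib import Path
from typing import List


def parse_ppid(stat_line: str) -> int:
    """Extract the parent PID from the contents of /proc/<pid>/stat.

    The second field (the command name) is wrapped in parentheses and may
    itself contain spaces or parentheses, so split after the last ')'.
    """
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is the process state, after_comm[1] is the parent PID.
    return int(after_comm[1])


def ancestors(pid: int) -> List[int]:
    """Return the chain [pid, parent, grandparent, ...] (Linux only)."""
    chain = [pid]
    while pid > 1:
        pid = parse_ppid(Path(f"/proc/{pid}/stat").read_text())
        if pid == 0:
            # The parent is not visible from here, for example because it
            # lives outside the current PID namespace.
            break
        chain.append(pid)
    return chain
```

Calling `ancestors(os.getpid())` (after `import os`) on an ordinary Linux host should give a chain that ends at PID 1. Inside a container the chain typically ends at the container's own PID 1, or earlier if the process was injected from outside.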

Applicable Scenarios for Containerization

So what kind of programs are suitable for running in Docker? My practical experience tells me that programs that do not generate their own state (roughly, stateless services) are a good fit, such as common web backends, PHP CMSs, and so on.

So what is a program that generates its own state? A database is one example: its state is produced by the database itself rather than supplied from outside. If the container fails, the self-generated data that still needed to be written to a file may be lost. For programs whose state is supplied entirely from outside, this problem does not arise. If that distinction makes sense to you, great. Now think about whether a containerized database cluster is appropriate.

My answer is no. You may ask: can't I use data volumes to persist the data? That overlooks the main advantage of a container cluster, which is that you can add or remove containers freely as demand changes. If you persist data in data volumes, the container ends up forcibly bound to the node that holds the volume. I cannot accept that kind of tight coupling, so I do not think databases are a good fit for containers. You may then ask: can't I just use external storage and write the data there? That seems like a good solution, but how do you ensure data consistency? With a single writer and multiple readers there is nothing to worry about, but can a single database instance handle the real write load?
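The coupling between a data volume and a node can be pictured with a toy placement model. This is only a sketch: `Service`, `volume_node`, `eligible_nodes`, and the node names are hypothetical, and no real orchestrator's scheduler is being reproduced.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Service:
    name: str
    # Node that holds this service's local data volume; None means the
    # service keeps no state on any node's disk.
    volume_node: Optional[str] = None


def eligible_nodes(service: Service, nodes: List[str]) -> List[str]:
    """Nodes on which a replacement container could be started."""
    if service.volume_node is None:
        return list(nodes)
    # A local volume exists on exactly one node, so only that node qualifies.
    return [n for n in nodes if n == service.volume_node]


nodes = ["node-a", "node-b", "node-c"]
web = Service("web")
db = Service("db", volume_node="node-b")

print(eligible_nodes(web, nodes))  # ['node-a', 'node-b', 'node-c']
print(eligible_nodes(db, nodes))   # ['node-b']
```

The web service can be rescheduled anywhere, while the database with a local volume has exactly one place to go, which is the binding described above.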

What Problems Does Containerization Solve

As we all know, technology usually develops in response to a problem, and my definition of Docker as a daemon with an isolated environment reflects that. Suppose you need to run several Node.js services of the same type on one node, and they all listen on the same port. With a process manager such as pm2 or systemd, you have to change the ports before the services can run side by side. In Docker you don't have to worry about this. Each container has its own network stack, so the services can all listen on the same port, and other containers on the same user-defined Docker network can reach them by container name plus port. In addition, combining a Dockerfile with docker-compose makes single-machine container orchestration easy.
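The port clash that arises without isolation is easy to reproduce. The sketch below binds two TCP listeners to the same address and port inside one network stack, which is the situation two identical services are in when they run directly on the host; `second_bind_error` is a name chosen only for this example.

```python
import errno
import socket
from typing import Optional


def second_bind_error(host: str = "127.0.0.1") -> Optional[int]:
    """Bind two listeners to the same host:port and return the errno
    raised by the second bind (None if it unexpectedly succeeds)."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))  # port 0 lets the OS pick a free port
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # same port, same network stack
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_error()
    print("address already in use" if err == errno.EADDRINUSE else err)
```

Inside separate containers the two listeners would live in different network stacks, so the second bind would not see the first one at all.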

The advantage of Docker lies in its isolation, but it is also a daemon. Inside the container there is likewise a process with PID 1. Is that process a daemon? Inside the container it plays that role, and from outside the container it is managed by Docker. If this process exits abnormally, Docker treats the whole container as failed. This is one reason many people struggle to troubleshoot Docker: they don't know why the container cannot start, or why the process with PID 1 failed. You may say that Docker's judgment is too crude, since it only looks at whether the PID 1 process is healthy. I can only say that, as a daemon, this is all it can do. As long as you keep that process running, Docker will not consider the container abnormal. For the same reason, Docker is a poor fit for programs that exit on their own after running for a while.
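The rule "the container is whatever its main process is" can be mimicked with a tiny supervisor. This is a model of the idea only, not Docker's implementation; `supervise` and the status strings are invented for this sketch.

```python
import subprocess
import sys
from typing import List, Tuple


def supervise(cmd: List[str]) -> Tuple[str, int]:
    """Start cmd as the main process, wait for it, and derive the status
    of the whole 'container' from that one process alone."""
    exit_code = subprocess.Popen(cmd).wait()
    status = "exited (ok)" if exit_code == 0 else "exited (error)"
    return status, exit_code


# A job that finishes its work normally still ends the "container".
ok = supervise([sys.executable, "-c", "print('job done')"])
# A main process that fails takes the whole "container" down with it.
bad = supervise([sys.executable, "-c", "import sys; sys.exit(3)"])

print(ok)   # ('exited (ok)', 0)
print(bad)  # ('exited (error)', 3)
```

Nothing else that happens inside the "container" is consulted, which is why troubleshooting usually starts with the main process and its exit code.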

In summary, isolation is the core of Docker. By isolating processes, networks, and file systems, it solves a whole series of problems, such as port conflicts when deploying several identical services, and resource allocation. Building on this isolation, we also get more flexible setups such as container clusters, which in turn make our services more flexible.

Cost

Of course, containerization is not free. There is a slight loss of CPU performance, some extra memory usage (libc is not shared with the host, so each container has to load its own copy), and some loss of disk I/O performance.

This article is synchronized to xLog by Mix Space
The original link is https://www.timochan.cn/posts/any_pen/understanding_docker_for_me