A Dockerfile is used to build your own Docker images. It contains a sequence of instructions that answer questions such as:
–what base OS to use?
–what files to download?
–what commands to run?
–what service to start?
These instructions define how the image is built so that it runs as per your requirements.
Listed below are the Dockerfile instructions and topics that are commonly needed to build a Docker image.
- FROM: Docker base image
- Shell and Exec method for commands
- Docker build syntax
- RUN vs CMD vs ENTRYPOINT command instructions
- USER: default user
- VOLUME: volumes inside docker
- ADD: copy files into a docker image
- WORKDIR: default working directory for commands
- EXPOSE: open docker ports
- MAINTAINER: name of image maintainer
- Simple Dockerfile example for a httpd application image
Building a Docker image is somewhat similar to starting a Linux systemd service. For instance, when you start a systemd service with ‘systemctl start servicename’, the systemd daemon runs the command that was defined for starting the service. If that command returns a non-zero exit value, systemd reports a “failed” status for the service.
Similarly, when you build an image with Docker, an instruction that runs a failing command, say a file download from an invalid URL or an invalid command, returns a non-zero exit value. The docker build fails as soon as one of its instructions returns a non-zero exit value.
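This fail-fast behaviour can be modelled in plain shell. The sketch below is not how Docker is implemented; run_steps is a hypothetical helper and the steps are only stand-ins for Dockerfile instructions. It illustrates that the first step returning a non-zero exit value stops everything after it.

```shell
# Hypothetical helper: runs each step in order, like the instructions of a build.
run_steps() {
    for step in "$@"; do
        if /bin/sh -c "$step"; then
            echo "ok: $step"
        else
            status=$?
            echo "failed with exit value $status: $step"
            return "$status"
        fi
    done
}

# The second step fails, so the third one is never run.
run_steps "true" "exit 3" "echo never reached" || echo "build stopped"
```

The exit value of the failing step is passed back to the caller, which is how a script wrapped around a build can tell that something went wrong.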
Now that we have a basic understanding of how a Dockerfile functions, let's look at the different sections of a Dockerfile that define the flow of building a Docker image.
FROM: Docker base image
FROM – Select the base image to build the new image on top of.
Let's say you want your application to run on a CentOS base image. You begin the Dockerfile with the lines below to choose a CentOS 7 base image.
#Download base image centos 7
FROM centos:7
Shell and Exec method for commands
There are two methods of writing a command in a Dockerfile: the shell method and the exec method.
shell method syntax:
'instruction' command param1 param2 ...
exec method syntax:
'instruction' ["executable", "param1", "param2", ...]
Here, 'instruction' can be one of three types: RUN, CMD or ENTRYPOINT.
All three are used to execute a command. We will look at when each one is used later. Let's first look at the two methods of using these instructions.
Regardless of the type of instruction, when the shell method is used, as in the example below,
'instruction' echo hello
the command is run by invoking the default shell, that is,
/bin/sh -c "echo hello"
Whereas if the exec method is used, as in the example below,
'instruction' ["/bin/echo", "Hello world"]
a shell is not invoked. The exec method is parsed as a JSON array, so its elements must be wrapped in straight double quotes.
You can declare variables in a Dockerfile with ENV, as in the example below.
ENV mystring="Hello world"
When the command is written in the shell method, as in the example below,
'instruction' echo $mystring
it produces the output below.
Hello world
The variable is successfully substituted with its value.
Whereas when the exec method is used,
'instruction' ["echo", "$mystring"]
it produces the output below.
$mystring
The variable is not substituted with its value, because no shell is present to expand it.
To work around this, we can invoke a shell explicitly in the exec method, as in the example below.
'instruction' ["/bin/bash", "-c", "echo $mystring"]
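The difference between the two methods can be tried out on any Linux machine without building an image. The lines below are only a sketch: the export is an assumed stand-in for the ENV instruction, and env is used to start echo directly, the way the exec method starts a program with no shell in between.

```shell
# Assumed stand-in for the Dockerfile's ENV instruction.
export mystring="Hello world"

# Shell method: the command string is handed to /bin/sh -c, which expands $mystring.
shell_output=$(/bin/sh -c 'echo $mystring')

# Exec method: the program receives its arguments verbatim.
# env starts echo directly, so the literal text $mystring is printed.
exec_output=$(env echo '$mystring')

# Workaround: name a shell explicitly as the executable.
workaround_output=$(env /bin/sh -c 'echo $mystring')

echo "shell:      $shell_output"
echo "exec:       $exec_output"
echo "workaround: $workaround_output"
```

Only the first and third commands involve a shell, and those are the only ones where the variable is replaced by its value.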
Now that we have seen the two methods of executing a command, let's look at the three command instructions, RUN, CMD and ENTRYPOINT, to understand the difference between them.
RUN vs CMD vs ENTRYPOINT command instructions
Let's understand the difference between RUN and CMD first. Both instructions are used to execute commands. What makes them different is that any command passed to a RUN instruction is executed at the time of image build, whereas any command passed to a CMD instruction is executed when a container is started.
For example, you would want to execute commands such as ‘yum install’ or ‘mkdir’ only at the time of image build and not every time a container is started. Hence these commands are executed with RUN.
#Download base image centos 7
FROM centos:7
#shell method (-y answers the install prompt, which would otherwise abort the build)
RUN yum -y install httpd
#exec method
RUN ["yum", "-y", "install", "httpd"]
#Running multiple commands with one RUN instruction
#Each command could also be given its own RUN instruction on a separate line.
RUN yum -y update && \
    yum -y install httpd && \
    yum clean all
Commands that start a service, on the other hand, need to run when the container is started and not when the image is built.
#CMD instruction example: executed only at run time, i.e. when the container is started, not at build time
CMD ["/usr/sbin/httpd", "-D", "FOREGROUND"]
Now that we understand that RUN is for build-time commands and CMD is for run-time commands, let's look at ENTRYPOINT.
ENTRYPOINT is very similar to CMD and serves the same purpose, but the key difference lies in the definitions below.
ENTRYPOINT: can be used to specify the command to run when the container starts.
CMD: can be used to specify the command to run when the container starts, OR to specify arguments that are passed to the ENTRYPOINT command.
Let's look at some examples to understand the purpose of, and the difference between, CMD and ENTRYPOINT.
If you have ever tried running a base image such as CentOS or Ubuntu with ‘docker run centos’, you will have seen the container go into an exited state immediately, unless you start an interactive shell with ‘docker run -it centos’. This is because most base images set /bin/bash or /bin/sh as the default ENTRYPOINT or CMD. Without an interactive terminal attached, the shell has no input to read, so it exits right away and the container stops.
docker run docker.io/centos
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Thus the ENTRYPOINT/CMD is the command that is started first inside the container when the container is run. This command has PID 1. A Docker container stops running, or goes into an exited state, when PID 1 no longer exists.
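The immediate exit of a shell that has nothing to read can be reproduced without Docker. As a rough sketch, assuming that a container started without ‘-it’ gives the shell an empty standard input (modelled here with /dev/null), the shell reads end-of-file at once and exits cleanly:

```shell
# /dev/null stands in for the empty standard input of a non-interactive container.
/bin/sh </dev/null
shell_status=$?

# The shell had nothing to read, so it finished immediately with exit value 0.
echo "shell exited with status $shell_status"
```

Inside a container that shell would be PID 1, so its exit is also the end of the container.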
Let's look at a CMD example.
#Download base image centos 7
FROM centos:7
CMD ping www.google.com
Let's build an image with this Dockerfile.
Docker build syntax
docker build [path/to/directory/containing/Dockerfile]
# ‘-t test_image’ names the image that is going to be built as ‘test_image’
docker build -t test_image images/
Sending build context to Docker daemon 3.072 kB
Step 1/2 : FROM centos:7
 ---> 49f7960eb7e4
Step 2/2 : CMD ping www.google.com
 ---> Using cache
 ---> 677cf5ac77a7
Successfully built 677cf5ac77a7
docker images
REPOSITORY         TAG      IMAGE ID       CREATED          SIZE
test_image         latest   36fdbd610afd   31 minutes ago   200 MB
docker.io/centos   latest   e934aafc2206   2 months ago     199 MB
We now see the newly built image, test_image, listed.
Let's try running the image.
docker run test_image
PING www.google.com (220.127.116.11) 56(84) bytes of data.
64 bytes from 18.104.22.168 (22.214.171.124): icmp_seq=1 ttl=39 time=22.7 ms
64 bytes from 126.96.36.199 (188.8.131.52): icmp_seq=2 ttl=39 time=22.8 ms
64 bytes from 184.108.40.206 (220.127.116.11): icmp_seq=3 ttl=39 time=22.7 ms
64 bytes from 18.104.22.168 (22.214.171.124): icmp_seq=4 ttl=39 time=22.7 ms
64 bytes from 126.96.36.199 (188.8.131.52): icmp_seq=5 ttl=39 time=22.7 ms
^C
--- www.google.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4002ms
rtt min/avg/max/mdev = 22.750/22.777/22.815/0.098 ms
docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
We see that the command specified in the Dockerfile starts running immediately. Once we kill it, the container is no longer running.
Let's start the container again in detached mode and then list all the processes running inside it. No command is given after the image name, because a command given there would replace the CMD from the Dockerfile.
docker run -d test_image
docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED          STATUS          PORTS   NAMES
6fb2ba011883   test_image   "/bin/sh -c 'ping ..."   30 seconds ago   Up 29 seconds           sharp_swartz
docker exec 6fb2ba011883 ps -ef
UID   PID   PPID   C   STIME   TTY   TIME       CMD
root  1     0      0   13:47   ?     00:00:00   ping www.google.com
root  9     0      0   13:49   ?     00:00:00   ps -ef
We see that the command from the Dockerfile, ‘ping www.google.com’, has taken PID 1.
NOTE: replacing CMD with ENTRYPOINT in this Dockerfile would produce exactly the same result.
From the definitions above, we know that when both CMD and ENTRYPOINT exist in a Dockerfile, the values passed to CMD become arguments to the command passed to ENTRYPOINT. Let's look at an example of using CMD as an argument to an ENTRYPOINT.
NOTE: Passing CMD as an argument to ENTRYPOINT works only with the exec method.
FROM centos:7
ENTRYPOINT ["/bin/ping"]
CMD ["yahoo.com"]
Running the image without any argument will ping yahoo.com:
docker run -it test_image
PING yahoo.com (184.108.40.206): 48 data bytes
56 bytes from 220.127.116.11: icmp_seq=0 ttl=64 time=0.096 ms
56 bytes from 18.104.22.168: icmp_seq=1 ttl=64 time=0.088 ms
56 bytes from 22.214.171.124: icmp_seq=2 ttl=64 time=0.088 ms
^C--- localhost ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.088/0.091/0.096/0.000 ms
Now, let's pass an argument at run time.
docker run -it test_image google.com
PING google.com (126.96.36.199): 48 data bytes
56 bytes from 188.8.131.52: icmp_seq=0 ttl=55 time=32.583 ms
56 bytes from 184.108.40.206: icmp_seq=2 ttl=55 time=30.327 ms
56 bytes from 220.127.116.11: icmp_seq=4 ttl=55 time=46.379 ms
^C--- google.com ping statistics ---
5 packets transmitted, 3 packets received, 40% packet loss
round-trip min/avg/max/stddev = 30.327/36.430/46.379/7.095 ms
We see that the run-time argument google.com is pinged instead of the default from the Dockerfile. Thus the default argument given with CMD can be overridden from the docker run command at run time.
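The rule seen in the two runs above can be written down as a small shell function. This is only a model of the behaviour, not Docker code: effective_command is a hypothetical helper that takes an exec-method ENTRYPOINT, a default CMD, and any arguments given after the image name, and prints the command line the container would start with.

```shell
# Hypothetical helper modelling how ENTRYPOINT, CMD and run-time arguments combine.
effective_command() {
    entrypoint=$1
    default_cmd=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        # Arguments given after the image name replace CMD entirely.
        echo "$entrypoint $*"
    else
        # No run-time arguments: CMD supplies the default arguments.
        echo "$entrypoint $default_cmd"
    fi
}

effective_command /bin/ping yahoo.com              # prints: /bin/ping yahoo.com
effective_command /bin/ping yahoo.com google.com   # prints: /bin/ping google.com
```

The first call corresponds to ‘docker run -it test_image’ and the second to ‘docker run -it test_image google.com’.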
NOTE: the RUN instruction can appear multiple times in a Dockerfile, whereas only one CMD and one ENTRYPOINT instruction take effect. If either is listed more than once, only the last one is used.
USER: default user
USER – Defines the default user that the following RUN, CMD and ENTRYPOINT instructions, and any container created from your image, will run as. It can be either a UID or a username.
VOLUME: volumes inside docker
VOLUME – Creates a mount point within the container and links it back to a file system accessible by the Docker host. New volumes get populated with the pre-existing contents of the specified location in the image. It is not recommended to define volumes in a Dockerfile; volumes are better managed with docker-compose or ‘docker run’ commands. A Dockerfile cannot map a path in the container to a path on the host; this can be done with the docker run command instead.
The syntax to map a host path to a container path is as follows.
docker run -v /host/path:/container/path [image]
ADD: copy files into a docker image
ADD – Allows you to use a URL, or a path to a file in the build context on the Docker host, as the source of a file to be copied to a path in the image. When a URL is provided, the file is downloaded from the URL and copied to the specified destination. A local source path is resolved relative to the build context.
ADD http://example.com/image1 /tmp/image1
ADD /data/image2 /tmp/image2
WORKDIR: default working directory for commands
WORKDIR – Defines the working directory for the RUN, CMD, ENTRYPOINT, COPY and ADD instructions that follow it.
EXPOSE: open docker ports
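EXPOSE – Defines which ports the container listens on. EXPOSE on its own does not publish the port to the host; to make it reachable from the host, publish it with ‘docker run -p’.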
MAINTAINER: sets the Author field of the generated images
MAINTAINER – Optional field to let you identify yourself as the maintainer of this image. Newer Docker releases deprecate MAINTAINER in favour of a LABEL instruction.
MAINTAINER UnixUtils "email@example.com"
Simple Dockerfile example for a httpd application image
#use centos base image
FROM centos:7
#default user to run commands
USER root
#default working directory to run commands
WORKDIR /root/
#install httpd (-y answers the install prompt, which would otherwise abort the build)
RUN ["yum", "-y", "install", "httpd"]
#share /var/www/html on container with host
VOLUME /var/www/html
#open port 80 on container
EXPOSE 80
#start httpd when the container is started
CMD ["-D", "FOREGROUND"]
ENTRYPOINT ["/usr/sbin/httpd"]
#sets the Author field of the generated images
MAINTAINER UnixUtils "firstname.lastname@example.org"