Dockerfile & Volumes in Docker-Part 2

Jan 27, 2023

In my previous article, we went through the basic concepts of Docker. Now let's take a deeper dive. I'll be covering some very interesting features of Docker.

Dockerfile

A Dockerfile is a set of instructions that tells the Docker engine how to build an image.

The instructions are executed one by one, from top to bottom. Instructions that change the filesystem (such as RUN, COPY and ADD) each build a layer on top of the previous layer, and the final stack of layers is the image.

So basically, it lets us customize our image by starting from a base image pulled from a repository and building on top of it. Let's see a few commonly used instructions.

FROM

This should be the first instruction (only ARG and comments may come before it). It sets the base image, which is pulled from Docker Hub or a private repository if it is not already available locally.

Usage : FROM ubuntu:20.04

ENV

This command is used for setting environment variables.

Usage : ENV RUN_TIME=300

COPY

This command copies files from the local build context into a folder inside the image.

Usage : COPY . app/my-docker-project/

ADD

This command is similar to the COPY command, with the added ability to use a URL as the source.

Usage : ADD <src> app/my-docker-project/

Note : It can also be used to copy a local tar archive (optionally compressed with gzip, bzip2 or xz) and auto-extract it into the image. Zip files and archives downloaded from a URL are not extracted.

WORKDIR

It sets the working directory inside the image for the instructions that follow (RUN, CMD, ENTRYPOINT, COPY and ADD), and creates the directory if it does not exist.

Usage : WORKDIR app/my-docker-project/

RUN

This command runs a command at build time. It is mostly used to install packages and dependencies into the image.

Usage : RUN ["apt-get", "update", "-y"]
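RUN accepts two forms, and the difference matters for how the arguments are written. A minimal sketch, assuming the ubuntu:20.04 base image from the FROM example and curl as an arbitrary package to install:

```dockerfile
FROM ubuntu:20.04

# Exec form: a JSON array, one element per argument, no shell involved.
RUN ["apt-get", "update", "-y"]

# Shell form: the line is run through /bin/sh -c, so operators such as && work.
# curl is only an example package.
RUN apt-get update -y && apt-get install -y curl
```

In the exec form each argument must be its own array element; a single string such as "apt-get update -y" would be treated as the name of one executable.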

CMD

This command sets the default command (or the default arguments for ENTRYPOINT) for the container, which can be overridden from the CLI.

Usage : CMD ["/bin/bash"]

ENTRYPOINT

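This command allows us to specify the command to run when the container starts. Unlike CMD, it is not replaced by arguments passed to docker run (they are appended to it); it can only be overridden explicitly with the --entrypoint flag.

Usage : ENTRYPOINT ["./app.sh"]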

EXPOSE

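It documents the port the application listens on. It does not publish the port by itself; for the user to access the application from the host, the port must also be published with the -p (or -P) flag of docker run.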

Usage : EXPOSE 8080

VOLUME

It creates a mount point at the specified path (if not already present) and marks it as a volume, so that data written there by the container is persisted outside the container's own filesystem.

Usage : VOLUME /Data

ARG

It is used to set build-time arguments. It's similar to ENV, but an ARG value can't be used after the image is created.

Usage : ARG MY_NAME=Linu

LABEL

It is used to specify metadata for Docker images.

Usage : LABEL desc="This is a Label instruction"

Some commonly confused commands and their subtle differences:

1.ENV vs ARG :

The ARG command is used to pass an argument at build time (its default can be overridden with the --build-arg flag of docker build), but it can't be used after the image is created. Hence, for variables that must be available inside the running container, we use the ENV command.
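The difference can be seen in a small Dockerfile. This is a minimal sketch that reuses the MY_NAME and RUN_TIME names from the usage lines above; BUILT_BY is a hypothetical variable added only for illustration:

```dockerfile
FROM ubuntu:20.04

# Build-time only. The default can be overridden with:
#   docker build --build-arg MY_NAME=<value> .
ARG MY_NAME=Linu

# Stored in the image and visible in every container started from it.
ENV RUN_TIME=300

# To keep a build argument available at runtime, copy it into an ENV.
# BUILT_BY is a hypothetical name.
ENV BUILT_BY=${MY_NAME}

# During the build, both ARG and ENV values are visible.
RUN echo "building for ${MY_NAME}, RUN_TIME=${RUN_TIME}"

# At runtime only the ENV values are set, so MY_NAME expands to an empty string.
CMD echo "MY_NAME='${MY_NAME}' RUN_TIME=${RUN_TIME} BUILT_BY=${BUILT_BY}"
```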

2.COPY vs ADD :

The COPY command is used to transfer files and folders from our local machine into the image. The source is a path inside the build context; if it is the whole current directory that we want to use, we make use of the '.' argument. The destination is a folder inside the container.

The ADD command is similar to COPY, but has the added advantages that the source can be a URL, whose content is downloaded to the destination inside the container, and that a local tar archive is extracted automatically.
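Side by side, the two instructions could be used like this. It is a sketch only: vendor.tar.gz and the example.com URL are placeholders, not files from this project.

```dockerfile
FROM ubuntu:20.04
WORKDIR /app/my-docker-project

# COPY: plain copy from the build context ('.') into the working directory.
COPY . .

# ADD with a local tar archive: the archive is unpacked into the destination.
# vendor.tar.gz is a hypothetical file in the build context.
ADD vendor.tar.gz ./vendor/

# ADD with a URL: the file is downloaded and stored as-is, never extracted.
# The URL is a placeholder.
ADD https://example.com/data/config.json ./config/
```

When neither a URL nor auto-extraction is needed, COPY is the more predictable choice because it only ever copies.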

3.RUN vs CMD vs ENTRYPOINT :

The RUN command executes at build time and is mostly used to install packages and dependencies into the image.

The CMD command is used to provide the default command, or default arguments for ENTRYPOINT, if none are specified explicitly on the CLI.

The ENTRYPOINT command is used when we want a fixed executable that always runs when the container starts. Arguments given on the CLI replace the defaults specified in CMD and are passed to the ENTRYPOINT. If a Dockerfile contains more than one ENTRYPOINT, only the last one takes effect.

As a best practice, you should use both ENTRYPOINT and CMD (for the defaults). This is because, if CMD is absent and ENTRYPOINT is present, and no argument is provided on the CLI, a program that requires an argument will fail. In the presence of CMD, a default argument is passed to the ENTRYPOINT.

Example 1:

If I had the following instructions on my Dockerfile :
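A Dockerfile that behaves the way this example describes could look like the sketch below. It assumes the public docker/whalesay base image (which ships the cowsay program), and the greeting text is a placeholder:

```dockerfile
# Assumed base image: docker/whalesay, which provides the cowsay program.
FROM docker/whalesay:latest

# Default command, used only when "docker run" is given no extra arguments.
# The greeting text is a placeholder.
CMD ["cowsay", "Hello from CMD"]
```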

Now if I run the command "docker build . -t linu/whalesay", the image will be created with the tag name "linu/whalesay".

Now if I run "docker run linu/whalesay" with no extra arguments, the command actually executed inside the container is the one from CMD.

Notice that when I add an argument on the CLI, it ignores the command in CMD and runs the CLI argument instead.

Example 2:

I created another Dockerfile and built it using the tag linu/whalesay.

Now here, CMD acts as a default argument for the ENTRYPOINT.

Here, if I do not provide any arguments, it will add the default one specified in CMD.

When I specify an argument on the CLI, it overrides the default value and is passed to the ENTRYPOINT.
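Such a Dockerfile could be sketched as follows, again assuming the docker/whalesay base image and a placeholder greeting:

```dockerfile
FROM docker/whalesay:latest

# Fixed executable: always runs when the container starts.
ENTRYPOINT ["cowsay"]

# Default argument handed to the ENTRYPOINT; replaced by any CLI arguments.
# The greeting text is a placeholder.
CMD ["Hello from CMD"]

# docker run linu/whalesay           runs: cowsay "Hello from CMD"
# docker run linu/whalesay Hi there  runs: cowsay Hi there
```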

Sample Dockerfile for Reference:

FROM python:3.6 

# Create app directory
WORKDIR /app 

# Install app dependencies
COPY src/requirements.txt ./ 

RUN pip install -r requirements.txt

# Bundle app source
COPY src /app

EXPOSE 8080

CMD [ "python", "server.py" ]

That being said, the docker create command only creates a container without starting it, whereas the docker run command creates and starts the container. The docker start command is used to start an already existing container.

Docker Volumes :

Imagine that you have deployed an e-commerce application on Docker, where you store the order details in a MySQL container. All the data will be stored in it (we'll look into the "how" part of it in the next article ;) ).

But what happens if the container crashes and has to be removed? All the data goes poof! So, how can we save, or persist, the data?

Answer : Volumes.

Docker makes use of the filesystem of the Docker host (default path on Linux: /var/lib/docker). It contains many sub-folders for containers, images, volumes, etc.

So the absolute path for volumes would be /var/lib/docker/volumes.

We create a volume, which lives on the Docker host (i.e. our local machine), and mount it into the container. This means that whenever data is updated, it is written to a path outside the container, which we can specify.

docker volume create myvolume

docker run -v myvolume:/var/lib/mysql <image name>

The volume "myvolume" will be created under the default volume path: /var/lib/docker/volumes.

If we want to use some other path on the host than the default, we use a bind mount:


docker run --mount type=bind,source=<new path>,target=/var/lib/mysql <image name>

So now we know what a Dockerfile is and how its instructions work. We also know how to create persistent volumes so that our data is saved in case our containers crash.