Akash Singh

Docker Volumes

When a container writes files, it writes them inside the container itself. This means that when the container dies and is replaced (the host machine restarts, the container is moved from one node to another in a cluster, it simply fails, etc.), all of that data is lost. It also means that if you run the same container several times in a load-balancing scenario, each container has its own data, which may result in an inconsistent user experience.

A rule of thumb, for the sake of simplicity, is to keep containers stateless, for instance by storing their data in an external database (relational, like SQL Server, or document-based, like MongoDB) or in a distributed cache (like Redis). However, sometimes you want to store files in a place where they persist; this is done using volumes.

Using a volume, you map a directory inside the container to persistent storage. Persistent storage is managed through drivers, and the available options depend on the actual Docker host: for example, Azure File Storage on Azure or Amazon S3 on AWS. With Docker Desktop, you can map volumes to actual directories on the host system; this is done using the -v switch on the docker run command or, if you are using Docker Compose, by declaring the volume in the Compose file.

Suppose you run a container from the oraclelinux:7-slim image with no volume:
docker run -d oraclelinux:7-slim
Any data written inside that container lives only in the container, so it is lost when the container is removed and recreated. To avoid data loss, you can use a volume mount:
docker run -v /your/dir:/var/lib/oraclelinux -d oraclelinux:7-slim
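
For readers using Docker Compose, the same host-to-container mapping can be declared in the Compose file instead of on the command line. The fragment below is a minimal sketch, not a complete setup: the service name app is a placeholder chosen for illustration, and the paths are the ones from the docker run command above.

```yaml
# docker-compose.yml (sketch; "app" is a hypothetical service name)
services:
  app:
    image: oraclelinux:7-slim
    volumes:
      # host path : container path, the same mapping as the -v switch
      - /your/dir:/var/lib/oraclelinux
```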

The -v mount ensures that any data written to the /var/lib/oraclelinux directory inside the container is actually written to the /your/dir directory on the host system, so the data is not lost when the container is restarted or replaced. Another big advantage is that you can edit code stored in the mounted directory from the host, and the changes take effect when the container restarts. In other words, code changes become persistent as well.
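
One way to see this persistence in action is to write a file from one short-lived container and read it back from another. The script below is a rough sketch under a few assumptions: the host directory ./data, the file name note.txt, and the variable names are all made up for illustration, and the docker calls are skipped when the docker CLI is not installed.

```shell
#!/bin/sh
# Hypothetical names for illustration: HOST_DIR, CONTAINER_DIR, note.txt
HOST_DIR="${HOST_DIR:-$PWD/data}"       # bind mounts need an absolute host path
CONTAINER_DIR=/var/lib/oraclelinux      # path used in the article's example
IMAGE=oraclelinux:7-slim

mkdir -p "$HOST_DIR"

if command -v docker >/dev/null 2>&1; then
  # First container: write a file into the mounted directory, then exit.
  # --rm removes the container as soon as it finishes.
  docker run --rm -v "$HOST_DIR:$CONTAINER_DIR" "$IMAGE" \
    sh -c "echo hello > $CONTAINER_DIR/note.txt"

  # Second, brand-new container: the file is still there.
  docker run --rm -v "$HOST_DIR:$CONTAINER_DIR" "$IMAGE" \
    cat "$CONTAINER_DIR/note.txt"

  # The same file is also visible directly on the host.
  cat "$HOST_DIR/note.txt"
else
  echo "docker not found; skipping the container steps" >&2
fi
```

Both containers are removed as soon as they exit, yet the file survives because it was never stored in either container, only in the mounted host directory.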