First of all, I want to make it clear I've done due diligence in researching this topic. Very closely related is this SO question, which doesn't really address my confusion.
I understand that when VOLUME
is specified in a Dockerfile, this instructs Docker to create an unnamed volume for the duration of the container which is mapped to the specified directory inside of it. For example:
# Dockerfile
VOLUME ["/foo"]
This would create a volume to contain any data stored in /foo
inside the container. The volume (when viewed via docker volume ls
) would show up as a random jumble of numbers.
Each time you do docker run
, this volume is not reused. This is the key point causing confusion here. To me, the goal of a volume is to contain state persistent across all instances of an image (all containers started from it). So basically if I do this, without explicit volume mappings:
#!/usr/bin/env bash
# Run container for the first time
docker run -t foo
# Kill the container and re-run it again. Note that the previous
# volume would now contain data because services running in `foo`
# would have written data to that volume.
docker container stop foo
docker container rm foo
# Run container a second time
docker run -t foo
I expect the unnamed volume to be reused between the 2 run
commands. However, this is not the case. Because I did not explicitly map a volume via the -v
option, a new volume is created for each run
.
Here's important part number 2: Since I'm required to explicitly specify -v
to share persistent state between run
commands, why would I ever specify VOLUME
in my Dockerfile? Without VOLUME
, I can do this (using the previous example):
#!/usr/bin/env bash
# Create a volume for state persistence
docker volume create foo_data
# Run container for the first time
docker run -t -v foo_data:/foo foo
# Kill the container and re-run it again. Note that the previous
# volume would now contain data because services running in `foo`
# would have written data to that volume.
docker container stop foo
docker container rm foo
# Run container a second time
docker run -t -v foo_data:/foo foo
Now, truly, the second container will have data mounted to /foo
that was there from the previous instance. I can do this without VOLUME
in my Dockerfile. From the command line, I can turn any directory inside the container into a mount to either a bound directory on the host or a volume in Docker.
So my question is: What is the point of VOLUME
when you have to explicitly map named volumes to containers via commands on the host anyway? Either I'm missing something or this is just confusing and obfuscated.
Note that all of my assertions here are based on my observations of how docker behaves, as well as what I've gathered from the documentation.
See Question&Answers more detail:
os