The general scenario is that we have a cluster of servers and we want to set up virtual clusters on top of that using Docker.
For that we have created Dockerfiles for different services (Hadoop, Spark etc.).
Regarding the Hadoop HDFS service however, we have the situation that the disk space available to the docker containers equals to the disk space available to the server. We want to limit the available disk space on a per-container basis so that we can dynamically spawn an additional datanode with some storage size to contribute to the HDFS filesystem.
We had the idea to use loopback files formatted with ext4 and mount these on directories which we use as volumes in docker containers. However, this implies a large performance loss.
I found another question on SO (Limit disk size and bandwidth of a Docker container) but the answers are almost 1,5 years old which - regarding the speed of development of docker - is ancient.
Which way or storage backend would allow us to
- Limit storage on a per-container basis
- Has near bare-metal performance
- Doesn't require repartitioning of the server drives
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…