Container Filesystem Under The Hood: OverlayFS

adil
5 min read · Oct 17, 2021

I also wrote another post about containerization: Container Networking Under The Hood: Network Namespaces.

The underlying filesystem is one of the mysterious parts of containerization.

The underlying filesystems contain multiple OS images, folders, mount points, etc.

Before we dive into OverlayFS, let’s talk about a simple scenario.

Let’s say you have two different disk volumes and you want to mount them in a single folder. Thus, you will be able to list files and folders on those disks in the same folder.

This has been possible for a long time thanks to union mounts [1, 2] in the *nix family. However, administering a filesystem with a union mount is tough.

Virtual File Systems provide reliable and relatively easy filesystem management. Aufs, UnionFS, OverlayFS, mhddfs, and MergerFS are popular virtual file systems.

Docker deprecated Aufs in favor of OverlayFS a few years ago.

Why does Docker need a virtual file system?

  • Because the containers must be isolated from each other. A container must not interfere with another container’s file system.
  • OS images must be downloaded once and used multiple times. Thanks to OverlayFS’ lower/upper directory mechanism, OS images don’t need to be replicated.

The directory structure in OverlayFS

There are four types of directories in OverlayFS: lower directory, upper directory, work directory, and overlay directory (a.k.a. merged directory).

Lower directory: read-only directory

Upper directory: writable directory

Overlay directory: Combined lower/upper directories (OverlayFS merges the lower directory and the upper directory)

Work directory: Used by the Linux kernel to prepare files for atomic operations (must be an empty directory on the same filesystem as the upper directory)

Let’s get our hands dirty

mkdir lower upper workdir overlay
mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=workdir none overlay
echo "from lower" >> lower/file1.txt
echo "from upper" >> upper/file2.txt

We created an overlay filesystem between two folders. Now let’s take a look inside these directories:

root@adil:~# ls -li overlay/
549319 -rw-r--r-- 1 root root 11 Oct 16 18:45 file1.txt
549320 -rw-r--r-- 1 root root 11 Oct 16 18:45 file2.txt
root@adil:~# ls -li upper/file2.txt
549320 -rw-r--r-- 1 root root 11 Oct 16 18:45 upper/file2.txt
root@adil:~# ls -li lower/file1.txt
549319 -rw-r--r-- 1 root root 11 Oct 16 18:45 lower/file1.txt

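I want to draw your attention to the inodes of the files. In this setup, each file in the overlay directory reports the same inode number as its underlying file in the lower or upper directory. OverlayFS does not copy the files; the merged view exposes the originals.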

Let’s make it a little complicated

root@adil:~# echo "from lower" >> lower/file2.txt
root@adil:~# cat overlay/file2.txt
from upper
root@adil:~# cat overlay/file1.txt
from lower

If the lower directory and the upper directory have a file with the same name, then OverlayFS will access the file in the upper directory.
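
The precedence rule can be illustrated with a toy model in Node.js. This is only a sketch of the lookup order, not how the kernel implements it: each directory is assumed to be a Map from file name to content, and mergedView is a hypothetical helper name.

```javascript
// Toy model only: each directory is a Map from file name to file content.
function mergedView(lower, upper) {
  // Start from the lower entries, then let upper entries override them.
  return new Map([...lower, ...upper]);
}

const lower = new Map([
  ['file1.txt', 'from lower'],
  ['file2.txt', 'from lower'],
]);
const upper = new Map([['file2.txt', 'from upper']]);
const overlay = mergedView(lower, upper);

console.log(overlay.get('file1.txt')); // from lower
console.log(overlay.get('file2.txt')); // from upper
```

file1.txt exists only in the lower Map, so it passes through unchanged, while file2.txt exists in both and the upper entry wins.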

Let’s delete the files in the overlay directory:

root@adil:~# rm -rf overlay/file1.txt overlay/file2.txt
root@adil:~# ls -la overlay/
#empty
root@adil:~# ls -la lower/
-rw-r--r-- 1 root root 11 Oct 16 18:45 file1.txt
-rw-r--r-- 1 root root 11 Oct 16 19:03 file2.txt
root@adil:~# ls -la upper/
c--------- 3 root root 0, 0 Oct 16 19:08 file1.txt
c--------- 3 root root 0, 0 Oct 16 19:08 file2.txt
root@adil:~# cat upper/file1.txt
cat: upper/file1.txt: No such device or address
root@adil:~# cat upper/file2.txt
cat: upper/file2.txt: No such device or address
root@adil:~# cat lower/file1.txt
from lower
root@adil:~# cat lower/file2.txt
from lower

The files are still in the lower directory, and entries with the same names now appear in the upper directory. However, those entries cannot be read: they are character devices with device number 0, 0.

Remember that we never created file1.txt in the upper directory. OverlayFS created that entry itself when we deleted the file in the overlay directory. Such an entry is called a whiteout, and it tells OverlayFS to hide the file with the same name in the lower directory. It is mind-blowing.

The lower directory is read-only, and the upper directory is writable. So OverlayFS cannot remove anything from the lower directory. Instead, it records the deletion in the upper directory, even for a file that never existed there.
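
The `c` entries with device number 0, 0 shown above can be detected from a script. Here is a minimal sketch, assuming the classic character-device form of these markers; isWhiteout and listWhiteouts are hypothetical helper names, while lstatSync, isCharacterDevice() and rdev come from Node's fs module.

```javascript
const fs = require('fs');
const path = require('path');

// A deletion marker is a character device whose device number is 0/0.
// `stats` is anything shaped like fs.Stats (isCharacterDevice() and rdev).
function isWhiteout(stats) {
  return stats.isCharacterDevice() && stats.rdev === 0;
}

// Hypothetical helper: names in upperDir that are deletion markers.
// lstatSync is used so that symlinks are not followed.
function listWhiteouts(upperDir) {
  return fs
    .readdirSync(upperDir)
    .filter((name) => isWhiteout(fs.lstatSync(path.join(upperDir, name))));
}
```

Run against the upper directory from the example, listWhiteouts('upper') should list the deleted names, provided the markers are still there.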

Let’s run the unlink command in the upper directory

root@adil:~# unlink upper/file1.txt
root@adil:~# unlink upper/file2.txt
root@adil:~# ls -la upper/
#empty
root@adil:~# cat overlay/file1.txt
from lower
root@adil:~# cat overlay/file2.txt
from lower

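Oops! The files are back in the overlay directory. Why? Because we removed the special entries from the upper directory, nothing hides the lower files anymore, so OverlayFS shows them again. Note that file2.txt now reads "from lower": the upper copy is gone, and only the lower copy is left. The kernel documentation warns that changing the lower or upper directories directly while the overlay is mounted leads to undefined behavior, so treat this as an experiment only.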
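
The delete-then-unlink sequence can be replayed with a toy model. This is a sketch under simplifying assumptions, not the kernel's logic: directories are Maps, HIDDEN is a hypothetical marker standing in for the 0, 0 character device, and read and removeViaOverlay are made-up names.

```javascript
// Toy model only: HIDDEN stands in for the 0, 0 character device entry.
const HIDDEN = Symbol('hidden');

function read(name, lower, upper) {
  if (upper.has(name)) {
    const entry = upper.get(name);
    return entry === HIDDEN ? undefined : entry; // marker hides the lower file
  }
  return lower.get(name); // nothing in upper: fall through to lower
}

// Deleting through the overlay never modifies lower.
function removeViaOverlay(name, lower, upper) {
  if (lower.has(name)) upper.set(name, HIDDEN); // must hide the lower file
  else upper.delete(name); // upper-only file: just remove it
}

const lower = new Map([['file1.txt', 'from lower']]);
const upper = new Map();

removeViaOverlay('file1.txt', lower, upper);
console.log(read('file1.txt', lower, upper)); // undefined

upper.delete('file1.txt'); // like `unlink upper/file1.txt`
console.log(read('file1.txt', lower, upper)); // from lower
```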

Let’s create a Docker container

root@adil:~# docker run -dit --name=mycontainer -v /root/test:/root/test -v /run/lock/example:/root/example ubuntu

P.S.: The filesystem type of /run/lock/example is tmpfs. The filesystem type of /root/test is ext4 on my Ubuntu host.

I installed Node.js in my container:

root@a8cab50df83c:/# node --version
v16.11.1

I created a dummy file in /root/test in my container:

root@a8cab50df83c:/# touch /root/test/one.txt

I will run a Node.js script inside the container:

var fs = require('fs');

fs.rename('/root/test/one.txt', '/root/example/two.txt', (err) => {
  if (err) throw err;
  console.log('Succeed');
});

I want to move /root/test/one.txt to /root/example/two.txt via Node.js’ rename function:

root@a8cab50df83c:~# node test.js
/root/test.js:4
if (err) throw err;
^
[Error: EXDEV: cross-device link not permitted, rename '/root/test/one.txt' -> '/root/example/two.txt'] {
errno: -18,
code: 'EXDEV',
syscall: 'rename',
path: '/root/test/one.txt',
dest: '/root/example/two.txt'
}

This is a known issue. The rename syscall only works within a single mounted filesystem. Inside the container, /root/test (ext4) and /root/example (tmpfs) are separate mounts, so the kernel returns EXDEV instead of moving the file. You have to copy the file from the source directory to the destination directory. After that, you have to unlink the file from the source directory:

root@a8cab50df83c:~# cat test.js
var fs = require('fs');
fs.copyFile('/root/test/one.txt', '/root/example/two.txt', (err) => {
if (err) throw err;
console.log('Copy — Success');
fs.unlink('/root/test/one.txt', (err) => {
if (err) throw err;
console.log('Unlink — Success')
})
});

Let’s test:

root@a8cab50df83c:~# ls -l test/one.txt
-rw-r--r-- 1 root root 0 Oct 16 20:04 test/one.txt
root@a8cab50df83c:~# ls -l example/two.txt
ls: cannot access 'example/two.txt': No such file or directory
root@a8cab50df83c:~# node test.js
Copy — Success
Unlink — Success
root@a8cab50df83c:~# ls -l test/one.txt
ls: cannot access 'test/one.txt': No such file or directory
root@a8cab50df83c:~# ls -l example/two.txt
-rw-r--r-- 1 root root 0 Oct 16 20:04 example/two.txt
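
If you don't know in advance whether the two paths share a filesystem, one option is to try rename first and fall back to copy and unlink only on EXDEV. The following is a minimal sketch using the synchronous fs calls; moveFileSync and its fsImpl parameter are hypothetical names, not part of Node.js.

```javascript
const fs = require('fs');

// Move a file, falling back to copy + unlink when rename fails with EXDEV.
// `fsImpl` is a hypothetical injection point so the fallback can be exercised
// without two real filesystems; it defaults to Node's fs module.
function moveFileSync(src, dest, fsImpl = fs) {
  try {
    fsImpl.renameSync(src, dest);
    return 'renamed';
  } catch (err) {
    if (err.code !== 'EXDEV') throw err; // unrelated errors still propagate
    fsImpl.copyFileSync(src, dest);
    fsImpl.unlinkSync(src);
    return 'copied';
  }
}
```

In the container above, moveFileSync('/root/test/one.txt', '/root/example/two.txt') would take the fallback path. Keep in mind that copy followed by unlink is not atomic, unlike a rename within one filesystem.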

Some additional information:

You can use multiple lower directories:

mount -t overlay -o lowerdir=lower1:lower2:lower3,upperdir=upper,workdir=workdir none overlay

Lower directories are listed from the top layer to the bottom layer, left to right. If a file with the same name exists in each lower directory, the overlay directory shows the one from lower1.
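
The left-to-right order can be sketched with another toy model. The layer objects, lowerdirOption and resolveInLowers are hypothetical and exist only for illustration; the colon-separated lowerdir syntax is the one from the mount command above.

```javascript
// Toy model only: `layers` is ordered like the lowerdir option, leftmost first.
function lowerdirOption(dirs) {
  return `lowerdir=${dirs.join(':')}`;
}

function resolveInLowers(name, layers) {
  for (const { dir, files } of layers) {
    if (files.has(name)) return dir; // first (leftmost) match wins
  }
  return undefined; // not present in any lower directory
}

const layers = [
  { dir: 'lower1', files: new Set(['a.txt']) },
  { dir: 'lower2', files: new Set(['a.txt', 'b.txt']) },
  { dir: 'lower3', files: new Set(['a.txt', 'b.txt', 'c.txt']) },
];

console.log(lowerdirOption(layers.map((l) => l.dir))); // lowerdir=lower1:lower2:lower3
console.log(resolveInLowers('a.txt', layers)); // lower1
console.log(resolveInLowers('c.txt', layers)); // lower3
```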

What does none mean in the mount command?

The mount command expects a device argument, but an overlay mount has no backing disk partition. The none keyword is just a placeholder that fills that slot.
