Allow/Disallow Syscalls via Seccomp

root@adil:~# gcc uname.c -l seccomp && ./a.out
What's up?
Linux
root@adil:~# strace -c ./a.out
What's up?
Linux
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         1           read
  0.00    0.000000           0         2           write
  0.00    0.000000           0         2           close
  0.00    0.000000           0         3           fstat
  0.00    0.000000           0         7           mmap
  0.00    0.000000           0         4           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         6           pread64
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         2         1 arch_prctl
  0.00    0.000000           0         2           openat
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000                    36         2 total
root@adil:~# gcc uname.c -l seccomp && ./a.out
What's up?
Bad system call (core dumped)
root@adil:~# strace ./a.out 2>&1 | tail -3
seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=8, filter=0x55bdf22dbf30}) = 0
uname( <unfinished ...>) = ?
+++ killed by SIGSYS (core dumped) +++
type=SECCOMP msg=audit(1613512469.711:351): auid=1000 uid=0 gid=0 ses=11 pid=12427 comm="a.out" exe="/root/a.out" sig=31 arch=c000003e syscall=63 compat=0 ip=0x7f048a06dccb code=0x0
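
The source of uname.c is not reproduced here, so the following is only a rough sketch of a program that fails the same way. It is an assumption-laden stand-in, not the original: instead of libseccomp (which the gcc command above links against), it builds a tiny BPF filter by hand and installs it with prctl, so it needs no extra library. The filter allows every syscall except uname. The helper names install_uname_filter and run_uname_child are hypothetical, and the fork is only there so that the parent can report how the child ended. In the audit record above, syscall=63 is uname on x86_64 and sig=31 is SIGSYS.

```c
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/utsname.h>
#include <sys/wait.h>

/* Hypothetical helper: kill the caller on uname, allow everything else.
 * A production filter should also check seccomp_data.arch first. */
static int install_uname_filter(void)
{
    struct sock_filter insns[] = {
        /* Load the syscall number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* uname: fall through to KILL; anything else: skip to ALLOW. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_uname, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };

    /* Needed before installing a filter without CAP_SYS_ADMIN. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical helper: run uname in a child, optionally under the filter.
 * Returns the terminating signal, 0 on a clean exit, or -1 on setup failure. */
int run_uname_child(int block_uname)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct utsname u;
        if (block_uname && install_uname_filter() != 0)
            _exit(2);                     /* filter could not be installed */
        if (uname(&u) != 0)               /* killed here when filtered */
            _exit(3);
        if (write(STDOUT_FILENO, u.sysname, strlen(u.sysname)) < 0 ||
            write(STDOUT_FILENO, "\n", 1) < 0)
            _exit(4);
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Called from main, run_uname_child(0) should print the system name and return 0, while run_uname_child(1) is expected to return SIGSYS on a kernel that lets the filter be installed.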

Deny everything, allow some of them

Seccomp has a mode called strict mode. In strict mode, only the read, write, _exit, and sigreturn syscalls are allowed; any other syscall kills the process with SIGKILL.
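
As a minimal sketch of what enabling strict mode looks like (not the article's file.c, which is not shown), the snippet below turns it on in a forked child and then either exits cleanly or makes one syscall that is not on the list. The helper name run_strict_child and the choice of getpid as the offending call are assumptions made for illustration. Note that glibc's _exit() issues exit_group, which strict mode does not allow, so the child uses the raw exit syscall instead.

```c
#include <signal.h>
#include <unistd.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>

/* Hypothetical helper: enter strict mode in a child process.
 * Returns the terminating signal, 0 on a clean exit, or -1 if strict
 * mode could not be enabled (for example inside some containers). */
int run_strict_child(int misbehave)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        static const char msg[] = "still alive\n";

        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);                 /* strict mode unavailable */

        /* write is one of the four allowed syscalls. */
        if (write(STDOUT_FILENO, msg, sizeof(msg) - 1) < 0)
            syscall(SYS_exit, 4);

        if (misbehave)
            syscall(SYS_getpid);      /* not allowed: SIGKILL */

        /* Raw exit, because _exit() would call exit_group. */
        syscall(SYS_exit, 0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

With strict mode available, run_strict_child(0) should return 0 and run_strict_child(1) should return SIGKILL, which matches the sig=9 seen in the audit records.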

root@ip-172-31-43-168:~# gcc file.c -lseccomp && ./a.out
Killed
root@ip-172-31-43-168:~# cat /tmp/test.txt
qwe
xyz

Why is it killed?

Let’s have a look at the audit:

type=SECCOMP msg=audit(1613509181.394:262): auid=1000 uid=0 gid=0 ses=11 pid=11857 comm="a.out" exe="/root/a.out" sig=9 arch=c000003e syscall=3 compat=0 ip=0x7fcbe5fd04ab code=0x0
root@adil:~# gcc file.c -lseccomp && ./a.out
Killed
root@adil:~# cat /tmp/test.txt
qwe

Why is it killed?

Let’s have a look at the audit:

type=SECCOMP msg=audit(1613509624.996:263): auid=1000 uid=0 gid=0 ses=11 pid=11880 comm="a.out" exe="/root/a.out" sig=9 arch=c000003e syscall=5 compat=0 ip=0x7fd386719689 code=0x0
openat(AT_FDCWD, "/tmp/test.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
lseek(3, 0, SEEK_END) = 8
fstat(3, {st_mode=S_IFREG|0644, st_size=8, ...}) = 0
write(3, "qwe\n", 4) = 4
close(3) = 0
openat(AT_FDCWD, "/tmp/test.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
lseek(3, 0, SEEK_END) = 12
prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) = 0
fstat(3, <unfinished ...>) = ?

+++ killed by SIGKILL +++

It isn’t very clear

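In the first version of file.c the process was killed on the close syscall (syscall=3 in the audit record); in the second version it was killed on fstat (syscall=5). Yet in both versions we enabled strict mode just before the second fputs call. The difference comes from stdio buffering: fputs does not map to a single write. libc may call fstat to size the buffer the first time a stream is used, defer the actual write until the stream is flushed, and call close in fclose. The process is killed on the first of those calls that is not on the strict-mode list.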

root@adil:~# strace ./a.out 2>&1 | grep write -B2
fstat(3, {st_mode=S_IFREG|0644, st_size=28, ...}) = 0
prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) = 0
write(3, "qwe\nxyz", 7) = 7
root@adil:~# gcc file.c -lseccomp && ./a.out
root@adil:~# cat /tmp/test.txt
qwe
xyz
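
The final file.c is not shown either. One way to get a clean run like the one above, sketched here under assumptions, is to make sure that only write and the raw exit syscall happen after strict mode is enabled: do the first fputs beforehand so the buffer is already allocated, flush explicitly, skip fclose, and leave through syscall(SYS_exit, 0). The helper name append_under_strict and the use of a forked child (so the caller can inspect the result) are illustrative choices, not taken from the original program.

```c
#include <stdio.h>
#include <unistd.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>

/* Hypothetical helper: append "qwe\nxyz" to path, switching to strict
 * mode between the two fputs calls. Returns the terminating signal,
 * 0 on a clean exit, or -1 on setup failure. */
int append_under_strict(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        FILE *fp = fopen(path, "a");
        if (fp == NULL)
            _exit(3);

        /* First use of the stream: libc allocates the buffer here
         * (fstat and possibly brk/mmap) while those calls are still legal. */
        fputs("qwe\n", fp);

        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);                 /* strict mode unavailable */

        fputs("xyz", fp);             /* only copies into the buffer */
        fflush(fp);                   /* one write(2), which is allowed */

        /* No fclose (it would call close) and no _exit (exit_group). */
        syscall(SYS_exit, 0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The file descriptor is simply left open; the kernel releases it when the process exits, so nothing is lost by skipping fclose once the buffer has been flushed.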
