SystemTap: A Script to Help With Containerization
If you want to containerize your app but don't know which capabilities it draws on, there's a script you can use to root out exactly what you'll need to keep in mind.
Many developers would like to run their existing applications in a container with restricted capabilities to improve security. However, it may not be clear which capabilities an application uses, because its code relies on libraries or other code developed elsewhere. The developer could run the application in an unrestricted container that allows all syscalls and capabilities, which avoids hard-to-diagnose failures caused by the application’s use of forbidden capabilities or syscalls. Of course, this eliminates the enhanced security of restricted containers. At Red Hat, we have developed a SystemTap script (container_check.stp) to provide information about the capabilities that an application uses. Read the SystemTap Beginners Guide for information on how to set up SystemTap.
Below is an example of the container_check.stp script monitoring a sudo command and the child processes it creates to run the strace and ping commands. The SystemTap “-c” option sets up the SystemTap instrumentation, runs the command that follows the option, and shuts down the instrumentation once the command completes. The expected output of the ping and strace commands is printed first, followed by the output of the script. If the script warns about skipped probes, increase the number of active kretprobes allowed by using a larger number in the “-DKRETACTIVE=100” option on the command line.
The container_check.stp script lists the capabilities used by each executable. The first section of the script output for this example shows that ping uses the setuid and net_raw capabilities and that sudo uses the setgid, setuid, and audit_write capabilities. The next section provides more detail on the specific system calls that use those capabilities for each executable. Thus, to run this example in a container, the setuid, setgid, net_raw, and audit_write capabilities would be required.
$ ./container_check.stp -DKRETACTIVE=100 -c "sudo strace -c -f ping -c 1 people.redhat.com"
starting container_check.stp. monitoring 20146
PING people02.pubmisc.prod.ext.phx2.redhat.com (10.5.19.28) 56(84) bytes of data.
64 bytes from people02.pubmisc.prod.ext.phx2.redhat.com (10.5.19.28): icmp_seq=1 ttl=57 time=46.3 ms
--- people02.pubmisc.prod.ext.phx2.redhat.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 46.370/46.370/46.370/0.000 ms
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 30.90    0.000623          69         9         2 socket
 13.69    0.000276          14        20         1 open
  7.84    0.000158           7        22           mprotect
  7.14    0.000144           5        31           mmap
  5.41    0.000109           5        24           close
  4.37    0.000088           4        20           fstat
  4.07    0.000082           4        20           read
  3.08    0.000062          12         5         2 connect
  3.03    0.000061          31         2           sendto
  2.48    0.000050           8         6           write
  2.18    0.000044          44         1           sendmmsg
  1.93    0.000039           6         7           setsockopt
  1.84    0.000037           7         5           poll
  1.84    0.000037          12         3           munmap
  1.44    0.000029           6         5           ioctl
  1.24    0.000025           4         7           capget
  0.99    0.000020          20         1           recvmsg
  0.94    0.000019           6         3           recvfrom
  0.74    0.000015           5         3           rt_sigaction
  0.74    0.000015           5         3           capset
  0.55    0.000011           6         2         2 access
  0.50    0.000010          10         1           setuid
  0.50    0.000010           5         2           prctl
  0.45    0.000009           3         3           brk
  0.35    0.000007           4         2           getuid
  0.30    0.000006           6         1           setitimer
  0.30    0.000006           6         1           getsockname
  0.30    0.000006           6         1           getsockopt
  0.25    0.000005           5         1           rt_sigprocmask
  0.25    0.000005           5         1           geteuid
  0.20    0.000004           4         1           getpid
  0.20    0.000004           4         1           arch_prctl
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.002016                   215         7 total
capabilities used by executables
executable: prob capability
ping: cap_setuid
ping: cap_net_raw
sudo: cap_setgid
sudo: cap_setuid
sudo: cap_audit_write
capabilities used by syscalls
executable, syscall ( capability ) : count
ping, socket ( cap_net_raw ) : 2
ping, setuid ( cap_setuid ) : 1
sudo, setresuid ( cap_setuid ) : 11
sudo, setresgid ( cap_setgid ) : 10
sudo, setgroups ( cap_setgid ) : 5
sudo, setgid ( cap_setgid ) : 1
sudo, setuid ( cap_setuid ) : 1
sudo, sendto ( cap_audit_write ) : 5
forbidden syscalls
executable, syscall: count
failed syscalls
executable, syscall = errno: count
ping, connect = ENOENT: 2
ping, socket = EACCES: 2
ping, access = ENOENT: 2
ping, open = ENOENT: 1
stapio, execve = ENOENT: 5
stapio, rt_sigreturn = EINTR: 1
strace, wait4 = ECHILD: 1
strace, access = ENOENT: 1
sudo, read = EAGAIN: 1
sudo, ioctl = ENOTTY: 2
sudo, recvmsg = EAGAIN: 3
sudo, open = ENOENT: 83
sudo, stat = ENOENT: 7
sudo, access = ENOENT: 4
sudo, fstat = EBADF: 1
sudo, connect = ENOENT: 13
sudo, poll = : 1
sudo, rt_sigreturn = EINTR: 1
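The step from this report to a container configuration can be sketched in a few lines of Python. The sketch below reads the "capabilities used by executables" section and collects the union of capabilities across all executables. The function names are hypothetical, and the Docker-style "--cap-drop"/"--cap-add" flags (capability name uppercased, "cap_" prefix dropped) are an assumption about the container runtime, not something container_check.stp produces.

```python
import re


def capabilities_from_report(report: str) -> dict[str, set[str]]:
    """Map each executable to the capabilities listed for it in the
    'capabilities used by executables' section of the script output."""
    caps: dict[str, set[str]] = {}
    in_section = False
    for raw in report.splitlines():
        line = raw.strip()
        if line == "capabilities used by executables":
            in_section = True
            continue
        if not in_section or not line:
            continue
        match = re.fullmatch(r"(\S+):\s+(cap_\w+)", line)
        if match:
            caps.setdefault(match.group(1), set()).add(match.group(2))
        elif line.startswith("executable:"):
            continue  # column header line
        else:
            break  # the next section has started
    return caps


def cap_add_flags(caps_by_exe: dict[str, set[str]]) -> list[str]:
    """Build Docker-style flags (an assumption, see lead-in): drop every
    capability, then add back only the ones the report observed."""
    needed = sorted(set().union(*caps_by_exe.values()))
    flags = ["--cap-drop=ALL"]
    flags += ["--cap-add=" + cap[len("cap_"):].upper() for cap in needed]
    return flags


if __name__ == "__main__":
    import sys

    # Usage: python caps.py < saved_container_check_output.txt
    print(" ".join(cap_add_flags(capabilities_from_report(sys.stdin.read()))))
```

The report only reflects the code paths exercised during the monitored run, so a list produced this way is a starting point to verify rather than a guaranteed complete set.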
You can also monitor already running processes by using the “-x” option with the process ID and stopping the instrumentation with Ctrl-C when the data collection is done. Below is an example monitoring Wireshark, showing the dumpcap executable using the setgid, setuid, and net_raw capabilities:
$ pgrep wireshark
19015
$ ./container_check.stp -DKRETACTIVE=200 -x 19015
starting container_check.stp. monitoring 19015
^C
capabilities used by executables
executable: prob capability
dumpcap: cap_setgid
dumpcap: cap_setuid
dumpcap: cap_net_raw
capabilities used by syscalls
executable, syscall ( capability ) : count
dumpcap, setresgid ( cap_setgid ) : 1
dumpcap, setresuid ( cap_setuid ) : 1
dumpcap, socket ( cap_net_raw ) : 1
forbidden syscalls
executable, syscall: count
failed syscalls
executable, syscall = errno: count
dumpcap, select = : 1
dumpcap, rt_sigreturn = EINTR: 1
dumpcap, setsockopt = EBUSY: 1
dumpcap, stat = ENOENT: 1
dumpcap, access = ENOENT: 2
dumpcap, ioctl = EOPNOTSUPP: 2
dumpcap, recvfrom = EAGAIN: 1
wireshark, recvmsg = EAGAIN: 2840
wireshark, ioctl = EINVAL: 2
wireshark, open = ENOENT: 31
wireshark, stat = ENOENT: 57
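The two ways of invoking the script shown in this article, “-c” with a command and “-x” with a process ID, can be wrapped in a small helper when the check is run from a test harness. This is a minimal sketch: the function name and its parameters are hypothetical, and it only assembles the same command lines used in the examples above.

```python
import shlex


def stap_command(script="./container_check.stp", command=None, pid=None,
                 kretactive=100):
    """Return the argument list for one container_check.stp run.

    Exactly one of `command` (run and monitor it, "-c") or `pid`
    (attach to a running process, "-x") must be given.
    """
    if (command is None) == (pid is None):
        raise ValueError("pass exactly one of command or pid")
    argv = [script, f"-DKRETACTIVE={kretactive}"]
    if command is not None:
        argv += ["-c", command]
    else:
        argv += ["-x", str(pid)]
    return argv


if __name__ == "__main__":
    argv = stap_command(command="sudo strace -c -f ping -c 1 people.redhat.com")
    # Print a shell-quoted form for inspection; the list itself can be
    # handed to subprocess.run(argv) on a machine with SystemTap set up.
    print(shlex.join(argv))
```

In “-x” mode the script keeps collecting until it is interrupted, so a harness would need to send the interrupt (the Ctrl-C in the example above) once enough data has been gathered.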
Published at DZone with permission of Will Cohen, DZone MVB. See the original article here.