/contrib/famzah

Enthusiasm never stops


Leave a comment

Private networking per-process in Linux

This is a follow-up of the Private /tmp mount per-process in Linux. As already stated there, Linux namespaces offer great options for security.

In this article we will demonstrate the use of the “network” namespace which enables a process to have independent IPv4 and IPv6 stacks, network interfaces, IP routing tables, iptables firewall rules, the /proc/net and /sys/class/net directory trees, sockets, etc.

Here is a diagram to illustrate the concept:
Linux network namespace

First we start by creating a pair of “veth” network interfaces:

ip link add v-eth1 type veth peer name v-peer1
ip link set v-eth1 up
ip link set v-peer1 up

One of those interfaces will be used as a communication point from the side of the original default network namespace. We will assign “10.200.1.1” for IP address:

ifconfig v-eth1 10.200.1.1 netmask 255.255.255.0 up

It is time to enter the new network namespace. Once we have created the new namespace, we will associate the second interface “v-peer1” with it, then we will configure an IP address “10.200.1.2” and add a default route through the first interface which will act as a router:

export MAIN_NS_PID="$$"
unshare -n /bin/bash

#
# We are in a "/bin/bash" session in the NEW network namespace now.
#

ip link set lo up # activate the "loopback" interface

nsenter --net="/proc/$MAIN_NS_PID/ns/net" ip link set v-peer1 netns "$$" # join "v-peer1" into this namespace
ifconfig v-peer1 10.200.1.2 netmask 255.255.255.0 up
route add default gw 10.200.1.1 dev v-peer1

#
# Setup is done.
# You can now drop privileges and launch a daemon which will use this confined network namespace.
#

sudo -u www-data /etc/init.d/my-net-daemon start

The original default namespace, our original Linux installation, must be configured to act as a router. Otherwise the processes inside the new network namespace won’t have any Internet access. Configuring a Linux network router is a straightforward task:

echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -P FORWARD DROP
iptables -F FORWARD

iptables -t nat -F
iptables -t nat -A POSTROUTING -s 10.200.1.0/255.255.255.0 -o eth0 -j MASQUERADE

iptables -A FORWARD -i eth0 -o v-eth1 -j ACCEPT
iptables -A FORWARD -o eth0 -i v-eth1 -j ACCEPT

Finally, you can enable inbound connections to the processes in the confined new network namespace. Let’s assume that you have a daemon listening on TCP port 10105. Here is how you can forward any new incoming connections to the processes inside the new network namespace:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 10105 -j DNAT --to-destination 10.200.1.2

Pros: Using separate network namespaces gives us full network isolation and control over a group of processes. Additionally, we can match incoming packets against a process which is not possible in a standard “iptables” setup using the “-m owner” match extension. These are huge security benefits.

Cons: The technical implications are that the Linux host has to do (a lot) more work because of the DNAT/SNAT operations and their related connection tracking overhead. If you are running a high traffic server, you should plan and test accordingly. Furthermore, one additional network interfaces pair is created for each new network namespace. Linux can handle hundreds of network devices but still this is a factor to be considered.

The better security features outweigh the drawbacks in most use-cases though. Last but not least, it is very easily to run a process with completely detached network and this won’t cost us anything on the Linux host.

Resources:

Advertisements


1 Comment

Private /tmp mount per-process in Linux

I’ve been playing with Linux namespaces and the results are very satisfying. This process isolation has several benefits:

  • The setup is automatically destroyed when the process and its children exit — easy maintenance.
  • Non-privileged processes cannot alter the setup — great security.
  • The isolated resource type is completely invisible by processes in other namespaces — great security.
  • The setup is inherited by any forked children — great for security and maintenance.

If you review the man page of the “unshare” command or syscall, you will see that currently we can have the following private namespaces:

  • mount namespace — mounting and unmounting filesystems will not affect rest of the system, except for filesystems which are explicitly marked as shared
  • UTS namespace — setting hostname, domainname will not affect rest of the system
  • IPC namespace — the process will have independent namespace for System V message queues, semaphore sets and shared memory segments
  • network namespace — the process will have independent IPv4 and IPv6 stacks, IP routing tables, iptables firewall rules, the /proc/net and /sys/class/net directory trees, sockets, etc.
  • pid namespace (new) — children will have a distinct set of PID to process mappings from their parent
  • user namespace (new) — the process will have a distinct set of UIDs, GIDs and capabilities

In this article we will demonstrate the use of the “mount” namespace which lets us mount a filesystem per-process without affecting the rest of the system. Using such a private mount for “/tmp” has mainly security but also usability benefits.

Here are all the commands which you need, in order to start a process with a private “/tmp” directory:

TARGET_USER='www-data'
TARGET_CMD='/bin/bash' # but it can be any command
NEWTMP="$(mktemp -d)" # securely create a new empty tmp folder

chown "root:$TARGET_USER" "$NEWTMP"
chmod 770 "$NEWTMP"

unshare --mount -- /bin/bash -c "mount -o bind,noexec,nosuid,nodev '$NEWTMP' /tmp && sudo -u '$TARGET_USER' $TARGET_CMD"

A longer version with more explanations follow:

#
# setup operations done as "root"
#

root@vbox:~# TARGET_USER='www-data'
root@vbox:~# TARGET_CMD='/bin/bash' # but it can be any command
root@vbox:~# NEWTMP="$(mktemp -d)" # securely create a new empty tmp folder
root@vbox:~# chown "root:$TARGET_USER" "$NEWTMP"
root@vbox:~# chmod 770 "$NEWTMP"

#
# review the result in the real file-system "/tmp"
#

root@vbox:~# echo $NEWTMP
/tmp/tmp.IyoUhputAW

root@vbox:~# ls -la /tmp
total 60
drwxrwxrwt 12 root   root     12288 Jun  4 13:53 .
drwxr-xr-x 23 root   root      4096 Jan 24 15:31 ..
drwxrwxrwt  2 root   root      4096 Jun  1 22:54 .ICE-unix
drwxrwx---  2 root   www-data  4096 Jun  4 13:53 tmp.IyoUhputAW

root@vbox:~# ls -la "$NEWTMP"
total 16
drwxrwx---  2 root www-data  4096 Jun  4 13:53 .
drwxrwxrwt 12 root root     12288 Jun  4 13:53 ..

#
# start the non-privileged process with a private "/tmp" mount
#

root@vbox:~# unshare --mount -- /bin/bash -c "mount -o bind,noexec,nosuid,nodev '$NEWTMP' /tmp && sudo -u '$TARGET_USER' $TARGET_CMD"

#
# sample operations done inside the non-privileged process
#

www-data@vbox:~$ ls -la / | grep tmp
drwxrwx---   2 root www-data  4096 Jun  4 13:53 tmp

www-data@vbox:~$ touch /tmp/test-www-data-file

www-data@vbox:~$ ls -la /tmp # the process has a private "/tmp" mount
total 8
drwxrwx---  2 root     www-data 4096 Jun  4 13:55 .
drwxr-xr-x 23 root     root     4096 Jan 24 15:31 ..
-rw-r--r--  1 www-data www-data    0 Jun  4 13:55 test-www-data-file

#
# see the result in the real file-system "/tmp"
#

root@vbox:~# ls -la /tmp
total 60
drwxrwxrwt 12 root   root     12288 Jun  4 13:53 .
drwxr-xr-x 23 root   root      4096 Jan 24 15:31 ..
drwxrwxrwt  2 root   root      4096 Jun  1 22:54 .ICE-unix
drwxrwx---  2 root   www-data  4096 Jun  4 13:55 tmp.IyoUhputAW

root@vbox:~# echo "$NEWTMP"
/tmp/tmp.IyoUhputAW

root@vbox:~# ls -la "$NEWTMP"
total 16
drwxrwx---  2 root     www-data  4096 Jun  4 13:55 .
drwxrwxrwt 12 root     root     12288 Jun  4 13:53 ..
-rw-r--r--  1 www-data www-data     0 Jun  4 13:55 test-www-data-file

Note that we are mounting a directory inside another directory using the “bind” mount feature of Linux.

Resources: