Linux | /contrib/famzah

June 25, 2015
by Ivan Zahariev Leave a comment

“iperf” and “iftop” accuracy

While working on my latest pet project which involved 10 GigE transfers, I noticed a significant difference between the results shown by “iperf” and “iftop“. A fellow blogger also noticed this discrepancy. In order to get to the bottom of this, I did some additional tests using different MTU sizes, and observing the output of “iperf”, “iftop”, “iptraf”, and the raw Linux network device counters as seen by “ifconfig”.

The tests results are summarized in an online spreadsheet: https://goo.gl/MvJC8K

Some notes about each application:

iperf – this tool measures the TCP performance, as per documentation; therefore it counts the useful payload in a TCP/IP transfer; this is layer4 in the OSI model
iftop – this tool counts all IP packets, as per documentation; my tests show that it also operates on layer4, just as “iperf”, because ARP traffic (on layer3) is not counted at all; the fact that “iftop” cares about connections+ports also suggests that it operates at layer4
iptraf – this tool seems to be too old now, and its results were off by a multiple of 4 to 5
ifconfig – shows the most low-level statistics, namely bytes that passed as RX or TX through the network device; the most trusted source of performance data

We notice that both “iperf” and “iftop” measure the useful payload data that we can transfer per second. Since all OSI layers have some overhead, let’s take a look at what theory says about bandwidth efficiency in Ethernet:

with a standard MTU frame of 1500 bytes, we get 94.93% efficiency (5.07% overhead)
with a jumbo MTU frame of 9000 bytes, we get 99.14% efficiency (0.86% overhead)

Those numbers correspond very closely with the results shown by “iperf”.

It’s only “iftop” which differs a lot. Analysis of its source code reveals the reason for this and how we must interpret the displayed results:

#
# ui.c
#

void ui_print() {
...
    mvaddstr(y, COLS - 8 * HISTORY_DIVISIONS - 8, "rates:");

    draw_totals(&totals);
}

void draw_totals(host_pair_line* totals) {
    for(j = 0; j < HISTORY_DIVISIONS; j++) {
        readable_size((totals->sent[j] + totals->recv[j]) , buf, 10, 1024, options.bandwidth_in_bytes);
...
}

#
# ui_common.c
#

/*
 * Format a data size in human-readable format
 */
void readable_size(float n, char* buf, int bsize, int ksize, int bytes) {
    float size = 1;
...
    while(1) {
      size *= ksize;
...
        snprintf(buf, bsize, " %4.2f%s", n / size, bytes ? unit_bytes[i] : unit_bits[i]);

The authors of “iftop” decided to round to Gigibit (multiple of 1024), instead of the more common Gigabit (multiple of 1000). This makes the difference by “iftop” bigger as the transfer rate gets higher. For Gigabit the difference is 7%.

Once the “iftop” values are converted from Gigibit to Gigabit, they also match the results by “iperf” and the raw Linux network device counters.

June 25, 2015
by Ivan Zahariev 5 Comments

Linux md-RAID scalability on a 10 Gigabit network

The question for today is – does Linux md-RAID scale to 10 Gbit/s?

I wanted to build a proof of concept for a scalable, highly available, fault tolerant, distributed block storage, which utilizes commodity hardware, runs on a 10 Gigabit Ethernet network, and uses well-tested open-source technologies. This is a simplified version of Ceph. The only single point of failure in this cluster is the client itself, which is inevitable in any solution.

Here is an overview diagram of the setup:

My test lab is hosted on AWS:

3x “c4.8xlarge” storage servers
- each of them has 5x 50 GB General Purpose (SSD) EBS attached volumes which provide up to 160 MiB/s and 3000 IOPS for extended periods of time; practical tests shown 100 MB/s sustained sequential read/write performance per volume
- each EBS volume is managed via LVM and there is one logical volume with size 15 GB
- each 15 GB logical volume is being exported by iSCSI to the client machine
1x “c4.8xlarge” client machine
- the client machine initiates an iSCSI connection to each single 15 GB logical volume, and thus has 15 identical iSCSI block devices (3 storage servers x 5 block devices = 15 block devices)
- to achieve a 3x replication factor, the block devices from each storage server are grouped into 5x mdadm software RAID-1 (mirror) devices; each RAID-1 device “md1” to “md5” contains three disks from a different storage server, so that if one or two of the storage servers fail, this won’t affect the operation of the whole RAID-1 device
- all RAID-1 devices “md1” to “md5” are grouped into a single RAID-0 (stripe), in order to utilize the full bandwidth of all devices into a single block device, namely the “md99” RAID-0 device, which also combines the size capacity of all “md1” to “md5” devices and it equals to 75 GB
10 Gigabit network in a VPC using Jumbo frames
- “iperf” measured a 10.07 Gbits/s maximum throughput between the servers; at the same time, “iftop” displayed 9.20 Gbit/s
- VPC does not support network broadcasts, so ATA over Ethernet was not an option
the storage servers and the client machine were limited on boot to 4 CPUs and 2 GB RAM, in order to minimize the effect of the Linux disk cache
only sequential and random reading were benchmarked
Linux md RAID-1 (mirror) does not read from all underlying disks by default, so I had to create a RAID-1E (mirror) configuration; more info here and here; the “mdadm create” options follow: --level=10 --raid-devices=3 --layout=o3 Continue reading →

June 5, 2014
by Ivan Zahariev Leave a comment

Private networking per-process in Linux

This is a follow-up of the Private /tmp mount per-process in Linux. As already stated there, Linux namespaces offer great options for security.

In this article we will demonstrate the use of the “network” namespace which enables a process to have independent IPv4 and IPv6 stacks, network interfaces, IP routing tables, iptables firewall rules, the /proc/net and /sys/class/net directory trees, sockets, etc.

Here is a diagram to illustrate the concept:

First we start by creating a pair of “veth” network interfaces:

ip link add v-eth1 type veth peer name v-peer1
ip link set v-eth1 up
ip link set v-peer1 up

One of those interfaces will be used as a communication point from the side of the original default network namespace. We will assign “10.200.1.1” for IP address:

ifconfig v-eth1 10.200.1.1 netmask 255.255.255.0 up

It is time to enter the new network namespace. Once we have created the new namespace, we will associate the second interface “v-peer1” with it, then we will configure an IP address “10.200.1.2” and add a default route through the first interface which will act as a router:

export MAIN_NS_PID="$$"
unshare -n /bin/bash

#
# We are in a "/bin/bash" session in the NEW network namespace now.
#

ip link set lo up # activate the "loopback" interface

nsenter --net="/proc/$MAIN_NS_PID/ns/net" ip link set v-peer1 netns "$$" # join "v-peer1" into this namespace
ifconfig v-peer1 10.200.1.2 netmask 255.255.255.0 up
route add default gw 10.200.1.1 dev v-peer1

#
# Setup is done.
# You can now drop privileges and launch a daemon which will use this confined network namespace.
#

sudo -u www-data /etc/init.d/my-net-daemon start

The original default namespace, our original Linux installation, must be configured to act as a router. Otherwise the processes inside the new network namespace won’t have any Internet access. Configuring a Linux network router is a straightforward task:

echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -P FORWARD DROP
iptables -F FORWARD

iptables -t nat -F
iptables -t nat -A POSTROUTING -s 10.200.1.0/255.255.255.0 -o eth0 -j MASQUERADE

iptables -A FORWARD -i eth0 -o v-eth1 -j ACCEPT
iptables -A FORWARD -o eth0 -i v-eth1 -j ACCEPT

Finally, you can enable inbound connections to the processes in the confined new network namespace. Let’s assume that you have a daemon listening on TCP port 10105. Here is how you can forward any new incoming connections to the processes inside the new network namespace:

iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 10105 -j DNAT --to-destination 10.200.1.2

Pros: Using separate network namespaces gives us full network isolation and control over a group of processes. Additionally, we can match incoming packets against a process which is not possible in a standard “iptables” setup using the “-m owner” match extension. These are huge security benefits.

Cons: The technical implications are that the Linux host has to do (a lot) more work because of the DNAT/SNAT operations and their related connection tracking overhead. If you are running a high traffic server, you should plan and test accordingly. Furthermore, one additional network interfaces pair is created for each new network namespace. Linux can handle hundreds of network devices but still this is a factor to be considered.

The better security features outweigh the drawbacks in most use-cases though. Last but not least, it is very easily to run a process with completely detached network and this won’t cost us anything on the Linux host.

Resources:

Some notes on veth interfaces [main source of technical inspiration]
Block network access of a process? [lists all different approaches]
Command to run a child process “offline” (no external network) on Linux
Introducing Linux Network Namespaces
man page of “nsenter” [run program with namespaces of other processes]

June 4, 2014
by Ivan Zahariev 1 Comment

Private /tmp mount per-process in Linux

I’ve been playing with Linux namespaces and the results are very satisfying. This process isolation has several benefits:

The setup is automatically destroyed when the process and its children exit — easy maintenance.
Non-privileged processes cannot alter the setup — great security.
The isolated resource type is completely invisible by processes in other namespaces — great security.
The setup is inherited by any forked children — great for security and maintenance.

If you review the man page of the “unshare” command or syscall, you will see that currently we can have the following private namespaces:

mount namespace — mounting and unmounting filesystems will not affect rest of the system, except for filesystems which are explicitly marked as shared
UTS namespace — setting hostname, domainname will not affect rest of the system
IPC namespace — the process will have independent namespace for System V message queues, semaphore sets and shared memory segments
network namespace — the process will have independent IPv4 and IPv6 stacks, IP routing tables, iptables firewall rules, the /proc/net and /sys/class/net directory trees, sockets, etc.
pid namespace (new) — children will have a distinct set of PID to process mappings from their parent
user namespace (new) — the process will have a distinct set of UIDs, GIDs and capabilities

In this article we will demonstrate the use of the “mount” namespace which lets us mount a filesystem per-process without affecting the rest of the system. Using such a private mount for “/tmp” has mainly security but also usability benefits.

Here are all the commands which you need, in order to start a process with a private “/tmp” directory:

TARGET_USER='www-data'
TARGET_CMD='/bin/bash' # but it can be any command
NEWTMP="$(mktemp -d)" # securely create a new empty tmp folder

chown "root:$TARGET_USER" "$NEWTMP"
chmod 770 "$NEWTMP"

unshare --mount -- /bin/bash -c "mount -o bind,noexec,nosuid,nodev '$NEWTMP' /tmp && sudo -u '$TARGET_USER' $TARGET_CMD"

A longer version with more explanations follow:

#
# setup operations done as "root"
#

root@vbox:~# TARGET_USER='www-data'
root@vbox:~# TARGET_CMD='/bin/bash' # but it can be any command
root@vbox:~# NEWTMP="$(mktemp -d)" # securely create a new empty tmp folder
root@vbox:~# chown "root:$TARGET_USER" "$NEWTMP"
root@vbox:~# chmod 770 "$NEWTMP"

#
# review the result in the real file-system "/tmp"
#

root@vbox:~# echo $NEWTMP
/tmp/tmp.IyoUhputAW

root@vbox:~# ls -la /tmp
total 60
drwxrwxrwt 12 root   root     12288 Jun  4 13:53 .
drwxr-xr-x 23 root   root      4096 Jan 24 15:31 ..
drwxrwxrwt  2 root   root      4096 Jun  1 22:54 .ICE-unix
drwxrwx---  2 root   www-data  4096 Jun  4 13:53 tmp.IyoUhputAW

root@vbox:~# ls -la "$NEWTMP"
total 16
drwxrwx---  2 root www-data  4096 Jun  4 13:53 .
drwxrwxrwt 12 root root     12288 Jun  4 13:53 ..

#
# start the non-privileged process with a private "/tmp" mount
#

root@vbox:~# unshare --mount -- /bin/bash -c "mount -o bind,noexec,nosuid,nodev '$NEWTMP' /tmp && sudo -u '$TARGET_USER' $TARGET_CMD"

#
# sample operations done inside the non-privileged process
#

www-data@vbox:~$ ls -la / | grep tmp
drwxrwx---   2 root www-data  4096 Jun  4 13:53 tmp

www-data@vbox:~$ touch /tmp/test-www-data-file

www-data@vbox:~$ ls -la /tmp # the process has a private "/tmp" mount
total 8
drwxrwx---  2 root     www-data 4096 Jun  4 13:55 .
drwxr-xr-x 23 root     root     4096 Jan 24 15:31 ..
-rw-r--r--  1 www-data www-data    0 Jun  4 13:55 test-www-data-file

#
# see the result in the real file-system "/tmp"
#

root@vbox:~# ls -la /tmp
total 60
drwxrwxrwt 12 root   root     12288 Jun  4 13:53 .
drwxr-xr-x 23 root   root      4096 Jan 24 15:31 ..
drwxrwxrwt  2 root   root      4096 Jun  1 22:54 .ICE-unix
drwxrwx---  2 root   www-data  4096 Jun  4 13:55 tmp.IyoUhputAW

root@vbox:~# echo "$NEWTMP"
/tmp/tmp.IyoUhputAW

root@vbox:~# ls -la "$NEWTMP"
total 16
drwxrwx---  2 root     www-data  4096 Jun  4 13:55 .
drwxrwxrwt 12 root     root     12288 Jun  4 13:53 ..
-rw-r--r--  1 www-data www-data     0 Jun  4 13:55 test-www-data-file

Note that we are mounting a directory inside another directory using the “bind” mount feature of Linux.

Resources:

Namespaces overview (LWN.net)

July 31, 2013
by Ivan Zahariev 13 Comments

Using flock() in Bash without invoking a subshell

The flock(1) utility on Linux manages flock(2) advisory locks from within shell scripts or the command line. This lets you synchronize your Bash scripts with all your other applications written in Perl, Python, C, etc.

I’ll focus on the third usage form where flock() is used inside a Bash script. Here is what the man page suggests:

#!/bin/bash

(
flock -s 200

# ... commands executed under lock ...

) 200>/var/lock/mylockfile

Unfortunately, this invokes a subshell which has the following drawbacks:

You cannot pass values to variables from the subshell in the main shell script.
There is a performance penalty.
The syntax coloring in “vim” does not work properly. 🙂

This motivated my colleague zImage to come up with a usage form which does not invoke a subshell in Bash:

#!/bin/bash

exec {lock_fd}>/var/lock/mylockfile || exit 1
flock -n "$lock_fd" || { echo "ERROR: flock() failed." >&2; exit 1; }

# ... commands executed under lock ...

flock -u "$lock_fd"

Note that you can skip the “flock -u “$lock_fd” unlock command if it is at the very end of your script. In such a case, your lock file will be unlocked once your process terminates.

May 28, 2013
by Ivan Zahariev Leave a comment

Nagios: Improve CPU performance with popen_noshell()

Today I’ll share my real-world experience with popen_noshell() on the Nagios monitoring server which we run at work. We are actively monitoring 1166 hosts and 14250 services. The machine has 6 GB RAM and a single Intel Core i7-950 CPU with enabled multi-threading (8 total threads) and slight overclock. Besides running Nagios, this machine also handles the incoming data from our custom monitoring systems, processes RRD database storage, and generates web interface status + charts output. So it’s a pretty busy machine which does a lot of network activity and where the Nagios daemon is just a part of the CPU load. For example, since boot the main “nagios3” process has used only 20% of the CPU. The other part has been used by the fork()’ed Perl scripts (we use a lot of them for the active checks), the Nagios standard network checks, and the Apache/PHP web server handling the incoming data.

Recently the machine started to exhaust its CPU resources. First we overclocked it a bit which gave us 10% more CPU idle time. Then we decided to try to compile Nagios with the popen-noshell library. This gave us another 10% CPU idle and now the machine is working great again.

I’ll focus on the popen-noshell integration and results, since CPU overclocking is a well-known topic. Here is the chart which shows the CPU usage before and after we re-compiled Nagios with the popen-noshell library:

As we can see, the system-CPU usage dropped from 38% to 31%, which is an 18% improvement. The user-CPU usage dropped from 44% to 41%, which is a 7% improvement. Overall, we gained a 12% speed-up for our workload by just re-compiling Nagios with the popen-noshell library. I’m stressing out that the speed-up depends a lot on your workload. If this machine was busy only with Nagios and the active checks were more CPU efficient (i.e. not written in Perl but in C), then the speed-up could have been much higher, since popen_noshell() is about 10 times faster than the standard popen().

A list with the other machine metrics which were also affected by the workload change:

Used memory: 39% => 24% (38% less)
Load average: 39 => 46 (18% higher)
Forks rates: 8*61 => 8*61 (created processes/second – no change)

Here are the steps that you need to perform, in order to re-compile the Nagios Debian package by integrating it with the popen-noshell library:

apt-get install devscripts

apt-get build-dep nagios3-core
# No need to run as "root" from here on
apt-get source nagios3-core

svn checkout http://popen-noshell.googlecode.com/svn/trunk/ popen-noshell

cd nagios3-3.2.1/

# BEGIN: patch Nagios to use popen_noshell_compat()

cp ../popen-noshell/popen_noshell.* base/
vi base/Makefile.in
	OBJS=$(BROKER_O) popen_noshell.o 

vi base/utils.c
	#include "popen_noshell.h"
	
        /* run the command */
        struct popen_noshell_pass_to_pclose pclose_arg;
        fp=(FILE *)popen_noshell_compat(cmd,"r",&pclose_arg);

            /* close the command and get termination status */
            status=pclose_noshell(&pclose_arg);

vi base/checks.c
	2x the same as above

# END: patch Nagios to use popen_noshell_compat()

EDITOR=vim dch -i
	# 3.2.1-2+squeeze1 -> 3.2.1-2+squeeze1-noshell1
	# you must have a trailing number in the added version name
	# after exit, this renames the original directory name

cd ..
mv nagios3_3.2.1.orig.tar.gz nagios3_3.2.1-2+squeeze1.orig.tar.gz

# the source directory was renamed by "dch"
cd nagios3-3.2.1-2+squeeze1/
DEB_BUILD_OPTIONS=nocheck debuild -us -uc

cd ..
sudo dpkg -i nagios3-core_3.2.1-2+squeeze1-noshell1_i386.deb \
	nagios3-common_3.2.1-2+squeeze1-noshell1_all.deb \
	nagios3-cgi_3.2.1-2+squeeze1-noshell1_i386.deb \
	nagios3-doc_3.2.1-2+squeeze1-noshell1_all.deb \
	nagios3_3.2.1-2+squeeze1-noshell1_i386.deb

April 17, 2012
by Ivan Zahariev Leave a comment

Locally encrypted secure remote backup over Internet on Linux (iSCSI / TrueCrypt)

Recently I decided to start using Amazon AWS as my backup storage but my paranoid soul wasn’t satisfied until I figured it out how to secure my private data. It’s not that I don’t trust Amazon but a lot of bad things could happen if I decided that I just copy my data to a remote server on Amazon:

Amazon staff would have access to my data.
A breach in Amazon’s systems would expose my data.
A breach in my remote server OS would expose my data.

One of the solutions which I considered was to encrypt my local file-system with eCryptfs but it has some issues with relatively long file names.

Finally I came out with the following working backup solution which I currently use to backup both my Windows and Linux partitions. I share the Windows root directory with the VirtualBox Linux machine and run the backup scripts from there. Here is a short explanation of the properties and features of the backup setup:

Locally encrypted — all files which I store on the iSCSI volume are encrypted on my personal desktop, before being sent to the remote server. This ensures that the files cannot be read by anyone else.
Secure — besides the local volume encryption, the whole communication is done over an SSH tunnel which secures the Internet point-to-point client-to-server communication.
Remote — having a remote backup ensures that even if someone breached in my house and steals my laptop and my offline backup, I can still recover my data from the remote server. Furthermore, it is more convenient to frequently backup on a remote machine, because we have Internet access everywhere now. Note that remote backups are not a substitution for offline backups.
Over Internet — very convenient. Of course, this backup scheme can be used in any TCP/IP network — private LAN, WAN, VPN networks, etc.

The following two articles provide detailed instructions on how to setup the backup solution:

Daily usage example

Here are the commands which I execute, in order to make a backup of my laptop. Those can be further scripted and automated if a daily or more frequent backup is required:

IP=23.21.98.10 # the public DNS IP address of the EC2 instance / server

## Execute the following, in order to mount the remote encrypted iSCSI volume:

sudo -E \
  ssh -F /dev/null \
  -o PermitLocalCommand=yes \
  -o LocalCommand="ifconfig tun0 172.18.0.2 pointopoint 172.18.0.1 netmask 255.255.255.0" \
  -o ServerAliveInterval=60 \
  -w 0:0 root@"$IP" \
  'sudo ifconfig tun0 172.18.0.1 pointopoint 172.18.0.2 netmask 255.255.255.0; hostname; echo tun0 ready'

sudo iscsiadm -m node --targetname "iqn.2012-03.net.famzah:storage.backup" --portal "172.18.0.1:3260" --login
sudo truecrypt --filesystem=none -k "" --protect-hidden=no /dev/sdb
sudo mount /dev/mapper/truecrypt1 /mnt

## You can now work on /mnt -- make a backup, copy files, etc.

ls -la /mnt

## Execute the following, in order to unmount the encrypted iSCSI volume:

sync
sudo umount /mnt
sudo truecrypt -d /dev/sdb
sudo iscsiadm -m node --targetname "iqn.2012-03.net.famzah:storage.backup" --portal "172.18.0.1:3260" --logout
# stop the SSH tunnel

Disaster recovery plan

Any backup is useless if you cannot restore your data. If your main computer is totally out, you would need the following, in order to access your backed up data:

The Amazon AWS login credentials to start the EC2 backup instance.
The root password for the EC2 backup instance, so that you can log in there and then access your backup data by mounting the encrypted iSCSI volume locally on the remote server.

In order to be able to log in to the remote server via SSH, you need to set up the following:

vi /etc/ssh/sshd_config # PasswordAuthentication yes
/etc/init.d/ssh restart
passwd root # set a very long password which you CAN remember

Make sure that you test if you can log in using an SSH client which does not have your SSH key and thus requires you to enter the root password manually.

I do not consider password authentication for the root account to be a security threat here. The backup server is online only during the time a backup is being made, after which I shut it down in order to save money from Amazon AWS. Furthermore, the backup has a new IP address on each new EC2 machine start, so an attacker cannot continue a brute-force attack easily, even if they started it.

April 17, 2012
by Ivan Zahariev Leave a comment

Locally encrypt an iSCSI volume with TrueCrypt on Linux

While this article focuses on iSCSI volumes, it also applies for regular directly attached block devices. If you are in doubt on how to export and attach an iSCSI volume over Internet, you can review the “Secure iSCSI setup via an SSH tunnel on Linux” article.

Locally encrypting a remote iSCSI volume with TrueCrypt has the following advantages:

You don’t need to trust the administrators of the remote machine — they cannot see your files because you are using their storage in a locally encrypted format. Thus your private data is completely safe, as long as your encryption password/key is strong enough.
You have the option to temporarily mount the exported iSCSI volume on the remote server, if you are the owner of the remote server and know the encryption password/key. This is handy if you want to make a local copy of a file from the backup volume without storing the encryption password on the remote server.
TrueCrypt is cross-platform (Windows / Mac OS X / Linux), fast, free, and open-source.

Download and install TrueCrypt

You need to install TrueCrypt wherever you are going to use it — on the client machine and optionally on the server.

# Download the distribution file from the official page:
#   http://www.truecrypt.org/downloads
# Linux -> Console-only (choose 32-bit or 64-bit depending on your local Linux installation)

tar -zxf truecrypt-7.1a-linux-console-x86.tar.gz # 32-bit in this example
sudo ./truecrypt-7.1a-setup-console-x86

truecrypt --version
#>> TrueCrypt 7.1a

Encrypt an iSCSI volume with TrueCrypt

The instructions below assume that the iSCSI volume is attached under “/dev/sdb“. The output of the commands is quoted with “#>>”.

# Encrypt the iSCSI volume
sudo truecrypt -t --create /dev/sdb --volume-type=normal --encryption=AES --hash=RIPEMD-160 --filesystem=ext4 --quick -k ""

# Mount the *volume* (there is no file-system, yet)
sudo truecrypt --filesystem=none -k "" --protect-hidden=no /dev/sdb

# Check that a new "dm-0" device with the same size appeared
cat /proc/partitions
#>> major minor  #blocks  name
#>> ...
#>> 8        16  83886080 sdb
#>> 252       0  83885824 dm-0

# Double-check that this is a TrueCrypt volume
ls -la /dev/mapper/truecrypt1
# /dev/mapper/truecrypt1 -> ../dm-0

# Create a file-system.
# This takes about 30 min for a 80 GB volume @ 1 MBit Internet connection.
sudo mkfs.ext4 /dev/mapper/truecrypt1

# You can now mount and use /dev/mapper/truecrypt1 in any mount-point, as 
# this is a regular block device with an ext4 file-system.
# Remember to unmount it when you are done.
mount /dev/mapper/truecrypt1 /mnt
ls -la /mnt
umount /mnt

# Unmount the encrypted *volume*.
# Make sure that you have ALREADY unmounted the file-system!
sync
sudo truecrypt -d /dev/xvdf

Mount an encrypted iSCSI volume locally on the remote server

The output of the commands is quoted with “#>>”.

# The local block device is "/dev/xvdf"
cat /proc/partitions 
#>> major minor  #blocks  name
#>> ...
#>>   202    80  83886080 xvdf

#
# MAKE SURE that no iSCSI clients are using the volume now
#

# Mount an encrypted volume (/dev/xvdf).
# The unencrypted volume will be presented under a different device name (/dev/mapper/truecrypt1).
sudo truecrypt --filesystem=none -k "" --protect-hidden=no /dev/xvdf

# Mount the file-system
sudo mount /dev/mapper/truecrypt1 /mnt
# Access the encrypted files
ls -la /mnt
# Unmount the file-system
sudo umount /mnt

# Unmount the encrypted volume (/dev/mapper/truecrypt1 -> /dev/xvdf).
# Make sure that you have ALREADY unmounted the file-system!
sudo truecrypt -d /dev/xvdf

April 17, 2012
by Ivan Zahariev Leave a comment

Secure iSCSI setup via an SSH tunnel on Linux

This article will demonstrate how to export a raw block storage device over Internet in a secure manner. Re-phrased this means that you can export a hard disk from a remote machine and use it on your local computer as it was a directly attached disk, thanks to iSCSI. Authentication and secure transport channel is provided by an SSH tunnel (more info). The setup has been tested on Ubuntu 11.10 Oneiric.

Server provisioning

Amazon AWS made it really simple to deploy a server setup in a minute:

Launch a Micro EC2 instance and then install Ubuntu server by clicking on the links in the Ubuntu EC2StartersGuide, section “Official Ubuntu Cloud Guest Amazon Machine Images (AMIs)”.
Create an EBS volume in the same availability zone. Attach it to the EC2 instance as “/dev/sdf” (seen as “/dev/xvdf” in latest Ubuntu versions).
(optionally) Allocate an Elastic IP address and associate it with the EC2 instance.

Note that you can lower your AWS bill by buying a Reserved instance slot. Those slots are non-refundable and non-transferrable, so shop wisely. You can also stop the EC2 instance when you’re not using it and you won’t be billed for it but only for the allocated EBS volume storage.

You can use any other dedicated or virtual server which you own and can access by IP. An Amazon AWS EC2 instance is given here only as an example.

iSCSI server-side setup

Execute the following on your server (iSCSI target):

IP=23.21.98.10 # the public DNS IP address of the EC2 instance / server

# Log in to the server
ssh ubuntu@$IP
# Update your SSH key in ".ssh/authorized_keys", if needed.
sudo bash
cp /home/ubuntu/.ssh/authorized_keys /root/.ssh/ # so that we can log in directly as root

apt-get update
apt-get upgrade

apt-get install linux-headers-virtual # virtual because we're running an EC2 instance
apt-get install iscsitarget iscsitarget-dkms
perl -pi -e 's/^ISCSITARGET_ENABLE=.*$/ISCSITARGET_ENABLE=true/' /etc/default/iscsitarget

# We won't use any iSCSI authentication because the server is totally firewalled
# and we access it only using an SSH tunnel.
# NOTE: If you don't use Amazon EC2, make sure that you firewall this machine completely,
# leaving only SSH access (TCP port 22).

# update your block device location in "Path", if needed
cat >> /etc/iet/ietd.conf <<EOF
Target iqn.2012-03.net.famzah:storage.backup
   Lun 0 Path=/dev/xvdf,Type=fileio
EOF

/etc/init.d/iscsitarget restart

echo 'PermitTunnel yes' >> /etc/ssh/sshd_config
/etc/init.d/ssh restart

iSCSI client-side setup

Execute the following on your client / desktop machine (iSCSI initiator):

# Install the iSCSI client
sudo apt-get install open-iscsi

How to attach an iSCSI volume on the client

The following commands show how to attach and detach a remote iSCSI volume on the client machine. The output of the commands is quoted with “#>>”.

IP=23.21.98.10 # the public DNS IP address of the EC2 instance / server

# Establish the secure SSH tunnel to the remote server
sudo -E \
  ssh -F /dev/null \
  -o PermitLocalCommand=yes \
  -o LocalCommand="ifconfig tun0 172.18.0.2 pointopoint 172.18.0.1 netmask 255.255.255.0" \
  -o ServerAliveInterval=60 \
  -w 0:0 root@"$IP" \
  'sudo ifconfig tun0 172.18.0.1 pointopoint 172.18.0.2 netmask 255.255.255.0; hostname; echo tun0 ready'

# Make sure that we can reach the remote server via the SSH tunnel
ping 172.18.0.1

# Execute this one-time; it discovers the available iSCSI volumes
sudo iscsiadm -m discovery -t st -p 172.18.0.1
#>> 172.18.0.1:3260,1 iqn.2012-03.net.famzah:storage.backup

# Attach the remote iSCSI volume on the local machine
sudo iscsiadm -m node --targetname "iqn.2012-03.net.famzah:storage.backup" --portal "172.18.0.1:3260" --login
#>> Logging in to [iface: default, target: iqn.2012-03.net.famzah:storage.backup, portal: 172.18.0.1,3260]
#>> Login to [iface: default, target: iqn.2012-03.net.famzah:storage.backup, portal: 172.18.0.1,3260]: successful

# Check the kernel log
dmesg
#>> [ 1237.538172] scsi3 : iSCSI Initiator over TCP/IP
#>> [ 1238.657846] scsi 3:0:0:0: Direct-Access     IET      VIRTUAL-DISK     0    PQ: 0 ANSI: 4
#>> [ 1238.662985] sd 3:0:0:0: Attached scsi generic sg2 type 0
#>> [ 1239.578079] sd 3:0:0:0: [sdb] 167772160 512-byte logical blocks: (85.8 GB/80.0 GiB)
#>> [ 1239.751271] sd 3:0:0:0: [sdb] Write Protect is off
#>> [ 1239.751279] sd 3:0:0:0: [sdb] Mode Sense: 77 00 00 08
#>> [ 1240.099649] sd 3:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
#>> [ 1241.962729]  sdb: unknown partition table
#>> [ 1243.568470] sd 3:0:0:0: [sdb] Attached SCSI disk

# Double-check that the iSCSI volume is with the expected size (80 GB in our case)
cat /proc/partitions
#>> major minor  #blocks  name
#>> ...
#>> 8       16   83886080 sdb

# The remote iSCSI volume is now available under /dev/sdb on our local machine.
# You can use it as any other locally attached hard disk (block device).

# Detach the iSCSI volume from the local machine
sync
sudo iscsiadm -m node --targetname "iqn.2012-03.net.famzah:storage.backup" --portal "172.18.0.1:3260" --logout
#>> Logging out of session [sid: 1, target: iqn.2012-03.net.famzah:storage.backup, portal: 172.18.0.1,3260]
#>> Logout of [sid: 1, target: iqn.2012-03.net.famzah:storage.backup, portal: 172.18.0.1,3260]: successful

# Check the kernel log
dmesg
#>> [ 1438.942277]  connection1:0: detected conn error (1020)

# Double-check that the iSCSI volume is no longer available on the local machine
cat /proc/partitions
#>> no "sdb"

Once you have the iSCSI block device volume attached on your local computer, you can use it as you need, just like it was a normal hard disk. Only it will be slower because each I/O operation takes place over Internet. For example, you can locally encrypt the iSCSI volume with TrueCrypt, in order to prevent the administrators of the remote machine to be able to see your files.

References:

Setting Up An iSCSI Environment On Linux

January 6, 2012
by Ivan Zahariev 6 Comments

Bind a shell on Linux and reverse-connect to it through a firewall

There are situations when a friend is in need of Linux help, and the only way for you to help them is to log in to their machine and fix the problem yourself, instead of trying to explain over the phone all the steps to your friend.

Such a problem has two sub-problems:

The remote machine must accept incoming connections and provide you with shell access. The obvious way to achieve this is an SSH daemon. Many Desktop Linux distributions don’t install an SSH server by default though, for security reasons. Setting up an SSH server in this moment is slow, and could even not be possible, if your friend messed up with the packaging system, for example. So we need to find an easy way to bind a network shell on the remote machine.
We must be able to connect to the remote machine. Usually desktop machines are protected behind a firewall or NAT, and we cannot connect to them directly. If this is not the case for you, you can skip this step and just connect to the remote machine IP address. A common approach to overcome this problem is that the remote machine connects to a machine of yours, which has an accessible real IP address and has a running SSH server. Most Desktop Linux distributions have an SSH client installed by default. So all you need to do is quickly and temporarily set up an account with password authentication for your friend on your machine. Then let them log in there which will create a reverse tunnel back to their machine.

Bind a shell

Another useful tool which is usually available on Linux is the Netcat, the Swiss-army knife for TCP/IP. In order to bind a shell using the Netcat version available on Ubuntu/Debian, you need to execute the following:

mkfifo /tmp/mypipe
# user shell
cat /tmp/mypipe|/bin/bash 2>&1|nc -l 6000 >/tmp/mypipe

I got this awesome idea from a user comment. I only extended it a bit by adding “2>&1” which redirects the STDERR error messages to the remote network client too.

Once the above has been executed on the remote machine, anyone can connect on TCP port 6000, assuming that there is no firewall. Note that you have to connect via Netcat again. A connection via Telnet adds an additional “\r” at every line end, which confuses Bash. If you need to perform actions as “root” on the remote machine, the shell needs to be executed as “root”:

mkfifo /tmp/mypipe
# root shell
cat /tmp/mypipe|sudo /bin/bash 2>&1|nc -l 6000 >/tmp/mypipe

If you are worried that your friend will mistype something, save the commands to a text file on a web server, and let them download it using “wget” or “curl”. Example:

wget http://www.famzah.net/download/bind-shell.txt
# or
curl http://www.famzah.net/download/bind-shell.txt > bind-shell.txt

chmod +x bind-shell.txt
./bind-shell.txt

Reverse connect using an SSH tunnel

The ssh client has the ability to forward a local port (review Reversing an ssh connection for a detailed example). Once you’ve set up an account for your friend, you ask them to connect to your machine:

ssh -R 6000:127.0.0.1:6000 $IP_OF_YOUR_MACHINE

Once your friend has connected to your machine, you can connect to theirs using the reverse SSH tunnel by executing the following:

nc 127.0.0.1 6000

The connection to 127.0.0.1 on TCP port 6000 is actually forwarded by SSH to the remote machine of your friend on their TCP port 6000.

Note that once you disconnect from the “nc” session, the Netcat server on the remote machine exists and needs to be restarted if you need to connect again.

/contrib/famzah

Enthusiasm never stops

Tag Archives: Linux

“iperf” and “iftop” accuracy

Linux md-RAID scalability on a 10 Gigabit network

Private networking per-process in Linux

Private /tmp mount per-process in Linux

Using flock() in Bash without invoking a subshell

Nagios: Improve CPU performance with popen_noshell()

Locally encrypted secure remote backup over Internet on Linux (iSCSI / TrueCrypt)

Daily usage example

Disaster recovery plan

Locally encrypt an iSCSI volume with TrueCrypt on Linux

Download and install TrueCrypt

Encrypt an iSCSI volume with TrueCrypt

Mount an encrypted iSCSI volume locally on the remote server

Secure iSCSI setup via an SSH tunnel on Linux

Server provisioning

iSCSI server-side setup

iSCSI client-side setup

How to attach an iSCSI volume on the client

Bind a shell on Linux and reverse-connect to it through a firewall