
Unexpected issues with the AWS opt-in regions

AWS cloud services operate from many different regions (18 as of today).

It wasn’t long before I stumbled across the first problem — not all of them are enabled by default. The documentation says “Regions introduced after March 20, 2019, such as Asia Pacific (Hong Kong) and Middle East (Bahrain), are disabled by default. You must enable these Regions before you can use them.”

Enabling the Hong Kong (ap-east-1) and Bahrain (me-south-1) regions was as easy as following the documentation. I could then manage all resources from the AWS web console.

Today I tried some operations from the AWS Command Line Interface (CLI) and got the following errors:

An error occurred (IllegalLocationConstraintException) when calling the DeleteBucket operation: The ap-east-1 location constraint is incompatible for the region specific endpoint this request was sent to.

An error occurred (IllegalLocationConstraintException) when calling the DeleteBucket operation: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.

fatal error: An error occurred (IllegalLocationConstraintException) when calling the ListObjects operation: The ap-east-1 location constraint is incompatible for the region specific endpoint this request was sent to.

fatal error: An error occurred (IllegalLocationConstraintException) when calling the ListObjects operation: The me-south-1 location constraint is incompatible for the region specific endpoint this request was sent to.

fatal error: An error occurred (InvalidToken) when calling the ListObjectsV2 operation: The provided token is malformed or otherwise invalid.

It turns out that the CLI authenticates using the “global” endpoint of the AWS Security Token Service (AWS STS). And by default, the “global” STS endpoint does not work with the two new regions: Hong Kong (ap-east-1) and Bahrain (me-south-1). There is official documentation on how to fix this compatibility issue by making the “global” STS endpoint “Valid in all AWS Regions”.

If you do a lot of AWS API calls, it’s probably worth considering AWS’s new default and trying the “regional” STS endpoints: “AWS recommends using Regional AWS STS endpoints instead of the global endpoint to reduce latency, build in redundancy, and increase session token validity.” This is already supported in the CLI, too.
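For reference, here is a minimal sketch of how the switch could look, assuming a CLI version which already supports the sts_regional_endpoints setting and its AWS_STS_REGIONAL_ENDPOINTS environment variable:

# persist the preference in ~/.aws/config
aws configure set sts_regional_endpoints regional

# or set it only for a single invocation
AWS_STS_REGIONAL_ENDPOINTS=regional aws s3 ls --region ap-east-1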



Amazon EFS benchmarks

The Amazon Elastic File System (EFS) is a very intriguing storage product. It provides simple, scalable, elastic file storage for use with EC2 virtual machines. The file system can be mounted over NFS on one or more EC2 machines simultaneously, and it also supports file locking.

Here are some important facts which I found out while doing my tests:

  • I/O operations per second (IOPS) are not the same metric that we’re used to measuring when dealing with block devices like HDD or SSD disks. When working with EFS, we measure the NFS I/O operations per second. These correspond 1:1 to the read() or write() system calls that your applications make.
  • The size of the issued I/O requests is another very important metric for EFS. This is the actual number of bytes transferred between your EC2 instance and the NFS server.
  • Therefore, we’re limited both by the NFS I/O requests per second and by the total bytes per second transferred by those requests.
  • The EFS performance and EFS limits documentation pages give a lot of insight. You have to monitor your EFS metrics using CloudWatch; a sample query is shown after this list.
  • NFS I/O requests smaller than 4096 bytes are accounted as 4096 bytes. Regardless of whether you request 1 byte, 1000 bytes, or 4096 bytes, 4096 bytes will be accounted. Once you request more than 4096 bytes, they are accounted correctly.
  • You need more than one reader/writer thread or program in order to achieve the full IOPS potential. For example, in my tests one writer thread did 130 op/s, while 20 writer threads did 1500 op/s.
  • The documentation says: “In General Purpose mode, there is a limit of 7000 file system operations per second. This operations limit is calculated for all clients connected to a single file system”. Our tests confirm this — we could do 3500 reading or 3000 writing operations per second.
  • CloudWatch has different aggregation functions for the *IOBytes metrics: min/max/average; sum; count. They represent different aspects of your EFS metrics, namely: the min/max/average I/O operation size in bytes; the total transferred bytes in a minute (divide by 60 to get the “per second” value); the total operations in a minute (divide by 60 to get the “per second” value).
  • The CloudWatch EFS metrics “DataReadIOBytes” and “DataWriteIOBytes” reflect exactly the “kB/s” and “ops/s” values reported on the Linux system by the nfsiostat program. The transferred bytes also match exactly the used bandwidth on the Linux network interfaces.
  • The “Metered size” in the AWS Console, which is the same value that the “df” command shows, is not updated in real time. It could take more than an hour to reflect the real disk usage.
  • There is plenty of initial burst credit balance which lets you do some heavy I/O on your freshly created EFS file system. Our benchmark tests ran for hours with block sizes between 1 byte and 10k bytes, and we still had some positive burst credit balance left at the end.
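For illustration, such a CloudWatch query could look like this with the AWS CLI (the time range below is arbitrary):

# sum, count and average of the write operations for one hour, in 60-second buckets;
# divide the "Sum" and "SampleCount" values by 60 to get the "per second" numbers
aws cloudwatch get-metric-statistics \
    --namespace AWS/EFS \
    --metric-name DataWriteIOBytes \
    --dimensions Name=FileSystemId,Value=fs-7513e02c \
    --start-time 2019-08-27T10:00:00Z \
    --end-time 2019-08-27T11:00:00Z \
    --period 60 \
    --statistics Sum SampleCount Average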

I’m using the default NFS settings of the NFS mount helper provided with the “Amazon Linux 2” OS:

[root@ip-172-31-11-75 ~]# mount -t efs fs-7513e02c:/ /efs

[root@ip-172-31-11-75 ~]# mount
fs-7513e02c.efs.eu-central-1.amazonaws.com:/ on /efs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.31.11.75,local_lock=none,addr=172.31.15.76)

The tests were performed using two “m4.xlarge” EC2 instances in the “eu-central” AWS region. This EC2 instance type provides “High” network performance.

The NFS I/O operations per second limits were tested using a simple C program which basically does the following:

#define _GNU_SOURCE             /* needed for O_DIRECT on Linux */
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
  char buf[4096];               /* request size; the tests used sizes between 1 byte and 10k */
  int fd = open(argv[1], O_RDWR|O_DIRECT|O_SYNC);  /* the test file path is the only argument */

  while (1) {
    lseek(fd, 0, SEEK_SET);     /* rewind: offset comes first, then whence */

    read(fd, buf, sizeof(buf));
    // or
    write(fd, buf, sizeof(buf));
  }
}

I created 40 different files so that I could run 40 separate instances of the benchmark program on an EC2 instance, one for each file. This increases concurrency and lets the total throughput scale better.
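Starting all 40 of them is a trivial shell loop along these lines (the benchmark binary name is made up):

# start one benchmark process per test file, then wait for all of them
for i in $(seq 1 40); do
    ./efs-bench "/efs/testfile.$i" &
done
wait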

Sequential writing and reading

Sequential writing and reading performed as expected – up to the “PermittedThroughput” limit shown in the CloudWatch metrics. In my case, for such a small EFS file system, the limit was 105 MB/s.

Writing: NFS I/O operations per second

Here are the results:

  • Writing from one EC2 instance using 1 byte, 1k bytes, or 10k bytes: regardless of the request size, we get up to 2000 IOPS. Typically the IOPS are between 1400 and 1700.
  • Writing from two EC2 instances using 1 byte, 1k bytes, or 10k bytes: regardless of the request size, we get up to 3000 IOPS in total which are equally spread across the two EC2 instances.
  • The “PercentIOLimit” CloudWatch metric shows 84% when we do 2880 ops/s, for example. Therefore, the total IOPS limit for writing is about 3500 ops/s.
  • When doing only write() system calls with 1 byte of data, only “DataWriteIOBytes” is accounted by EFS, which is an advantage for us. A real block file system needs to read the block (usually 4k bytes), update 1 byte in it, and then write it back to disk. I feel like this needs additional testing with more random data, so test for yourself, too. Note that the minimum accounted request size in EFS is 4 kB.

Reading: NFS I/O operations per second

Here are the results:

  • Reading from one EC2 instance using 1 byte or 10k bytes: regardless of the request size, we get up to 3500 IOPS. One EC2 instance is enough to saturate the EFS limit.
  • Reading from two EC2 instances using 1 byte or 10k bytes: regardless of the request size, we get up to 3500 IOPS in total which are equally spread across the two EC2 instances.
  • The “PercentIOLimit” CloudWatch metric shows 100% when we do 3500 ops/s. Therefore, the total IOPS limit for reading is 3500 ops/s.



Two AWS CLI tips for S3 — UTF-8 when piping, and migrating the Storage Class

While working on the “youtube-mp3-archive” project, I stumbled across two issues which are worth documenting for future use.

“aws s3 ls” shows “???” instead of the UTF-8 key names of the S3 objects

On my machine this happens when I pipe the output of “aws s3 ls” to another program. Here is an example:

$ aws s3 ls --recursive s3://youtube-mp3.famzah/ | tee | grep 4185710
2016-10-30 08:08:49    4185710 mp3/Youtube/??????? - ?? ???? ?????-BF6KuR8vWN0.mp3

There is already a discussion about this at the AWS CLI project. The solution in my case was to set the PYTHONIOENCODING environment variable and force UTF-8:

$ PYTHONIOENCODING=utf8 aws s3 ls --recursive s3://youtube-mp3.famzah/ | tee | grep 4185710
2016-10-30 08:08:49    4185710 mp3/Youtube/Аналгин - Тя беше ангел-BF6KuR8vWN0.mp3
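If you pipe S3 listings often, the variable can also be exported once per shell session (or from your shell profile) instead of being prepended to every command:

$ export PYTHONIOENCODING=utf8
$ aws s3 ls --recursive s3://youtube-mp3.famzah/ | tee | grep 4185710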

How to convert all stored S3 objects to another Storage Class

As already explained, the Storage Class cannot be set on a per-bucket basis. It must be specified with each upload operation in your client.
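In other words, every new upload has to pass the desired Storage Class explicitly. For example (the local file name is made up):

aws s3 cp song.mp3 s3://youtube-mp3.famzah/mp3/ --storage-class STANDARD_IA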

The migration procedure is already documented at the AWS CLI project. Here are the commands to check the current Storage Class of all objects in an S3 bucket, and how to convert them to a different Storage Class:

# all our S3 objects are using the "Standard" Storage Class
$ aws s3api list-objects --bucket youtube-mp3.famzah | grep StorageClass | sort | uniq -c
749  "StorageClass": "STANDARD"

# convert without re-uploading the objects from your computer
aws s3 cp --recursive --storage-class STANDARD_IA s3://youtube-mp3.famzah/ s3://youtube-mp3.famzah/

# all our S3 objects are now using the "Standard-Infrequent" Storage Class
$ aws s3api list-objects --bucket youtube-mp3.famzah | grep StorageClass | sort | uniq -c
749  "StorageClass": "STANDARD_IA"

The reason to use a different Storage Class is pricing.




Dynamic DNS using AWS Route 53

The Internet ecosystem and technologies have advanced so much lately that you can rebuild an entire business from scratch in a few hours of coding and at pretty acceptable costs. I’m referring to the dynamic DNS (a.k.a. DDNS or DynDNS) service which was a hit a few years back. It took me less than a hundred lines of code to create a simple dynamic DNS using AWS Route 53. The AWS API and backend provide the DNS service, while the free service “ipify” lets you look up your real remote IP address. While this solution is not free as in speech, it’s almost free as in beer, costing less than a dollar per month.
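Here is a rough sketch of the core idea using the AWS CLI (a simplified illustration, not the actual code; the hosted zone ID and hostname are placeholders):

#!/bin/bash
set -e

ZONE_ID="Z1234567890ABC"      # placeholder Route 53 hosted zone ID
RECORD="home.example.com."    # placeholder DNS name to keep updated

# look up our real remote IP address via the free "ipify" service
IP="$(curl -s https://api.ipify.org)"

# create or update the A record so that it points to our current IP
aws route53 change-resource-record-sets \
    --hosted-zone-id "$ZONE_ID" \
    --change-batch "{
        \"Changes\": [{
            \"Action\": \"UPSERT\",
            \"ResourceRecordSet\": {
                \"Name\": \"$RECORD\",
                \"Type\": \"A\",
                \"TTL\": 60,
                \"ResourceRecords\": [{\"Value\": \"$IP\"}]
            }
        }]
    }"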




Goodbye Acronis cloud — Hello Encrypted S3 backup!

Over time, the backup strategy for my personal laptop has been changing in the search for the most cost-effective, robust and secure solution. It must be able to back up both my Windows host and my Linux virtual machine.

  • I tried a backup to an AWS EC2 instance for a while but this was expensive.
  • I then changed to Acronis Cloud backup because I’m very satisfied with their local hard disk backups. But their online cloud backup was an unpleasant experience: the cloud backup failed without any indication in the taskbar; when I clicked for more info, the cryptic “error(0x49052524) in lib; please contact support” was displayed; I contacted support to no avail, as they only wanted me to reinstall; it fixed itself after a dozen days; this happened twice in a few months; last but not least, when I wanted to browse my online backup, the web interface was really slow. Sorry Acronis, but you really disappointed me.

Now I’ve come to an open-source solution for my backup needs — the Encrypted S3 Backup written in Bash and based on the official Amazon Command-Line Interface (CLI). This simple backup system leaves control and visibility in your hands. Additionally, the backup scripts are very small and you can easily audit them. The README provides all the information about the design, security, usage, disaster recovery, etc. More or less, it’s a solution for technical Linux folks, and not really suited for end-users, who should try Duplicati instead. And it doesn’t back up an “image” of your system; it is file-based. Only the file data is archived, so you can’t restore the file owners, permissions and other meta info.
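Conceptually, the heart of such a file-based backup is an AWS CLI sync with server-side encryption enabled. Here is a simplified sketch of the idea, not the project’s actual code (the local path and bucket name are placeholders):

# mirror the local data to S3, let S3 encrypt it at rest (SSE),
# and delete the remote copies of files that no longer exist locally
aws s3 sync /home/user s3://my-encrypted-backup-bucket/home/user \
    --sse AES256 --delete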

Let’s review the pricing side. In my case I’m doing a daily backup of 125 GB of data in 320,000 files.

  • The incremental daily backup costs me $2.73 per month. 89% is the cost for S3 (mainly the GB-storage cost) and the rest is for bandwidth.
  • The initial one-time upload of 70 GB cost me $3.43. Expect about double for 125 GB.
  • The projected cost for a full restore is $11.59, where 96% is the price of the bandwidth used from S3 to the Internet.
  • All prices are without taxes.

As far as performance is concerned, S3 is great!

  • Browsing my backup versions in the online S3 explorer is lightning fast.
  • The daily sync of 125 GB of data in 320,000 files takes 23 minutes. I don’t change a lot of files on my laptop during my daily activities.
  • My initial upload performed with a speed of 10 MBytes/s, and it could have been faster if I had more than 80 Mbit/s Internet at my disposal.

Note that in the end you need to trust AWS S3 to encrypt your data server-side, and then to completely forget your original data.




Securely avoid SSH warnings for changing IP addresses

If you have servers that change their IP address, you’ve probably gotten used to the following SSH warning:

The authenticity of host '176.34.91.245 (176.34.91.245)' can't be established.
...
Are you sure you want to continue connecting (yes/no)? yes

Besides being annoying, it is also a security risk to blindly accept this warning and continue connecting. And be honest — almost none of us checks the fingerprint in advance every time.

A common scenario for this use case is when you have an EC2 server in Amazon AWS which you temporarily stop and then start, in order to cut costs. I have a backup server which I use in this way.

In order to securely avoid this SSH warning and still be sure that you connect to your trusted server, you have to save the fingerprint in a separate file and update the IP address in it every time before you connect. Here are the connect commands, which you can also encapsulate in a Bash wrapper script:

IP=176.34.91.245 # use an IP address here, not a hostname
FPFILE=~/.ssh/aws-backup-server.fingerprint

test -e "$FPFILE" && perl -pi -e "s/^\S+ /$IP /" "$FPFILE"
ssh -o StrictHostKeyChecking=ask -o UserKnownHostsFile="$FPFILE" root@$IP

Note that the FPFILE is not required to exist on the first SSH connect. The first time you connect to the server, the FPFILE will be created when you accept the SSH warning. Further connects will not show an SSH warning or ask you to accept the fingerprint again.
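And here is what such a Bash wrapper script could look like (a sketch; the script takes the server’s current IP address as its only argument):

#!/bin/bash
# usage: ./ssh-backup.sh <current-IP-of-the-server>
set -u

IP="$1"   # use an IP address here, not a hostname
FPFILE=~/.ssh/aws-backup-server.fingerprint

# rewrite the stored fingerprint line so that it matches the new IP address
test -e "$FPFILE" && perl -pi -e "s/^\S+ /$IP /" "$FPFILE"

exec ssh -o StrictHostKeyChecking=ask -o UserKnownHostsFile="$FPFILE" "root@$IP"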