Enthusiasm never stops

Leave a comment

ESP32 custom NodeMCU firmware build and flash

Here is my success story on how to compile and flash your own firmware for NodeMCU running on the ESP32 microcontroller. Although there is official documentation on how to achieve this, there were some minor issues that I stumbled across.

Using the Docker image is the recommended and most easy way to proceed. I didn’t have a Docker server so I started a cheap VPS from Digital Ocean. They have a one-click installer for Docker. Once the VPS machine is running, execute the following which covers the step “Clone the NodeMCU firmware repository” from the docs:

### log in via SSH to the Docker host machine

cd /opt

git clone --recurse-submodules https://github.com/nodemcu/nodemcu-firmware.git

cd nodemcu-firmware

git checkout dev-esp32
git submodule update --recursive
# https://github.com/marcelstoer/docker-nodemcu-build/issues/58
git submodule update --init

Then follow the instructions for the step “Build for ESP32” from the docs:

docker run --rm -ti -v `pwd`:/opt/nodemcu-firmware marcelstoer/nodemcu-build configure-esp32
docker run --rm -ti -v `pwd`:/opt/nodemcu-firmware marcelstoer/nodemcu-build build

Note: If you use the Heltec ESP32 WiFi Kit 32 with OLED, when configuring the firmware options you have to change the Main XTAL frequency from the default 40 Mhz to either “Autodetect” or 26 MHz, as explained here. I selected “Autodetect” (26 MHz) under “Component config” -> “ESP32-Specific” -> “Main XTAL frequency”.

For the impatient — most probably the only configuration you need to do is in the section “Component config” -> “NodeMCU modules”.

I encountered a compilation error which was caused by a defined-but-not-used function:

make[2]: Entering directory '/opt/nodemcu-firmware/build/modules'
CC build/modules/mqtt.o
CC build/modules/ucg.o
/opt/nodemcu-firmware/components/modules/ucg.c:598:12: error: 'ldisplay_hw_spi' defined but not used [-Werror=unused-function]
 static int ldisplay_hw_spi( lua_State *L, ucg_dev_fnptr device, ucg_dev_fnptr extension )
cc1: some warnings being treated as errors
/opt/nodemcu-firmware/sdk/esp32-esp-idf/make/component_wrapper.mk:285: recipe for target 'ucg.o' failed
make[2]: *** [ucg.o] Error 1
make[2]: Leaving directory '/opt/nodemcu-firmware/build/modules'
/opt/nodemcu-firmware/sdk/esp32-esp-idf/make/project.mk:505: recipe for target 'component-modules-build' failed
make[1]: *** [component-modules-build] Error 2

You can simply comment out the function in the firmware sources or configure to ignore the warning. Start a shell into the Docker image and edit the sources:

docker run --rm -ti -v `pwd`:/opt/nodemcu-firmware marcelstoer/nodemcu-build bash
# you are in the Docker image now; fix the source code

vi /opt/nodemcu-firmware/components/modules/ucg.c # add the following (including the starting shebang sign):
#pragma GCC diagnostic ignored "-Wunused-function"


# you are in the Docker host; redo the build as usual
docker run --rm -ti -v `pwd`:/opt/nodemcu-firmware marcelstoer/nodemcu-build build

Because it took me a few attempts to configure everything properly and to finally build the image, I used the following alternative commands to make the process more easy:

docker run --rm -ti -v `pwd`:/opt/nodemcu-firmware marcelstoer/nodemcu-build bash
# you are in the Docker image now
cd /opt/nodemcu-firmware
make menuconfig

Once the build completes it produces three files on the Docker host:


Copy those three files on your local development machine where you are about to flash the NodeMCU controller. My first attempt was rather naive and the chip wouldn’t boot. It issued the following error on the USB console:

Early flash read error: "flash read err, 1000"

It turned out that I accidentally overwrote the bootloader and the partitions. Fortunately, the ESP32 official docs explain this in details here and here.

Here is what worked for me to flash my custom NodeMCU firmware into ESP32 from my Windows workstation:

esptool.py.exe --chip esp32 -p COM5 erase_flash

esptool.py.exe --chip esp32 -p COM5 write_flash 0x1000 .\bootloader.bin 0x8000 .\partitions_singleapp.bin 0x10000 .\NodeMCU.bin



posix_spawn() performance benchmarks and usage examples

The glibc library has an efficient posix_spawn() implementation since glibc version 2.24 (2016-08-05). I have awaited this feature for a long time.

TL;DR: posix_spawn() in glibc 2.24+ is really fast. You should replace the old system() and popen() calls with posix_spawn().

Today I ran all benchmarks of the popen_noshell() library, which basically emulates posix_spawn(). Here are the results:

Test Uses pipes User CPU System CPU Total CPU Slower with
vfork() + exec(), standard Libc No 7.4 1.6 9.0
the new noshell, default clone(), compat=1 Yes 7.7 2.1 9.7 8%
the new noshell, default clone(), compat=0 Yes 7.8 2.0 9.9 9%
posix_spawn() + exec() no pipes, standard Libc No 9.4 2.0 11.5 27%
the new noshell, posix_spawn(), compat=0 Yes 9.6 2.7 12.3 36%
the new noshell, posix_spawn(), compat=1 Yes 9.6 2.7 12.3 37%
fork() + exec(), standard Libc No 40.5 43.8 84.3 836%
the new noshell, debug fork(), compat=1 No 41.6 45.2 86.8 863%
the new noshell, debug fork(), compat=0 No 41.6 45.3 86.9 865%
system(), standard Libc No 67.3 48.1 115.4 1180%
popen(), standard Libc Yes 70.4 47.1 117.5 1204%

The fastest way to run something externally is to call vfork() and immediately exec() after it. This is the best solution if you don’t need to capture the output of the command, nor you need to supply any data to its standard input. As you can see, the standard system() call is about 12 times slower in performing the same operation. The good news is that posix_spawn() + exec() is almost as fast as vfork() + exec(). If we don’t care about the 27% slowdown, we can use the standard posix_spawn() interface.

It gets more complicated and slower if you want to capture the output or send data to stdin. In such a case you have to duplicate stdin/stdout descriptors, close one of the pipe ends, etc. The popen_noshell.c source code gives a full example of all this work.

We can see that the popen_noshell() library is still the fastest option to run an external process and be able to communicate with it. The command popen_noshell() is just 8% slower than the absolute ideal result of a simple vfork() + exec().

There is another good news — posix_spawn() is also very efficient! It’s a fact that it lags with 36% behind the vfork() + exec() marker, but still it’s 12 times faster than the popen() old-school glibc alternative. Using the standard posix_spawn() makes your source code easier to read, better supported for bugs by the mainstream glibc library, and you have no external library dependencies.

The replacement of system() using posix_spawn() is rather easy as we can see in the “popen-noshell/performance_tests/fork-performance.c” function posix_spawn_test():

# the same as system() but using posix_spawn() which is 12 times faster
void posix_spawn_test() {
	pid_t pid;
	char * const argv[] = { "./tiny2" , NULL };

	if (posix_spawn(&pid, "./tiny2", NULL, NULL, argv, environ) != 0) {
		err(EXIT_FAILURE, "posix_spawn()");


If you want to communicate with the external process, there are a few more steps which you need to perform like creating pipes, etc. Have a look at the source code of “popen_noshell.c“. If you search for the string “POPEN_NOSHELL_MODE”, you will find two alternative blocks of code — one for the standard way to start a process and manage pipes in C, and the other block will show how to perform the same steps using the posix_spawn() family functions.

Please note that posix_spawn() is a completely different implementation than system() or popen(). If it’s not safe to use the faster way, posix_spawn() may fall back to the slow fork().

Leave a comment

Amazon EFS benchmarks

The Amazon Elastic File System (EFS) is a very intriguing storage product. It provides simple, scalable, elastic file storage for use on an EC2 virtual machine. The file system can be mounted over NFS at one or more EC2 machines simultaneously, and it also supports file locking.

Here are some important facts which I found out while doing my tests:

  • I/O operations per second (IOPS) are not the same metric that we’re used to measure when dealing with block devices like HDD or SSD disks. When working with EFS, we measure the NFS I/O operations per second. These correspond 1:1 to the read() or write() system calls that your applications make.
  • The size of the issued I/O requests are another very important metric for EFS. This is the real bytes transfer between your EC2 instance and the NFS server.
  • Therefore, we’re limited by both the NFS I/O requests per second, and the total transferred bytes for those NFS I/O per second.
  • The EFS performance and EFS limits documentation pages give a lot of insight. You have to monitor your EFS metrics using CloudWatch.
  • NFS I/O requests smaller than 4096 bytes are accounted as 4096 bytes. Regardless if you request 1 bytes, 1000 bytes, or 4096 bytes, you will get 4096 bytes accounted. Once you request more than 4096 bytes, they are accounted correctly.
  • You need more than one reader/writer thread or program, in order to achieve the full IOPS potential. One writer thread in my tests did 130 op/s, while 20 writer threads did 1500 op/s, for example.
  • The documentation says: “In General Purpose mode, there is a limit of 7000 file system operations per second. This operations limit is calculated for all clients connected to a single file system”. Our tests confirm this — we could do 3500 reading or 3000 writing operations per second.
  • CloudWatch has different aggregation functions for the *IOBytes metrics: min/max/average; sum; count. They represent different aspects of your EFS metrics, namely: the min/max/average IO operation size in bytes; the total transferred bytes in a minute (you need to divide to 60 to get the “per second” value); the total operations in a minute (you need to divide to 60 to get the “per second” value).
  • The CloudWatch EFS metrics “DataReadIOBytes” and “DataWriteIOBytes” reflect exactly what we see on the Linux system for “kB/s” and “ops/s” by the nfsiostat program. The transferred bytes reflect exactly the used bandwidth on the Linux network interfaces.
  • The “Metered size” in the AWS Console which is the same value as what you see by the “df” command is not updated in real-time. It could take more than an hour to reflect the real disk usage.
  • There is plenty of initial burst credit balance which lets you do some heavy I/O on your freshly created EFS file system. Our benchmark tests ran for hours with block sizes between 1 byte and 10k bytes, and we still had some positive burst credit balance left at the end.

I’m using the default NFS settings by the NFS mount helper provided in the “Amazon Linux 2” OS:

[root@ip-172-31-11-75 ~]# mount -t efs fs-7513e02c:/ /efs

[root@ip-172-31-11-75 ~]# mount
fs-7513e02c.efs.eu-central-1.amazonaws.com:/ on /efs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=,local_lock=none,addr=

The tests were performed using two “m4.xlarge” EC2 instances in the “eu-central” AWS region. This EC2 instance type provides “High” network performance.

The NFS I/O operations per second limits were tested using a simple C program which basically does the following:

fd = open(testfile, O_RDWR|O_DIRECT|O_SYNC);

while (1) {
  lseek(fd, SEEK_SET, 0);

  read(fd, buf, sizeof(buf));
  // or
  write(fd, buf, sizeof(buf));

I created 40 different files, so that I can run 40 different single benchmark programs on an EC2 instance – one for each file. This increases concurrency and lets the total throughput scale better.

Sequential writing and reading

Sequential writing and reading performed as expected – up to the “PermittedThroughput” limit shown in the CloudWatch metrics. In my case, for such a small EFS file system, the limit was 105 MB/s.

Writing: NFS I/O operations per second

Here are the results:

  • Writing from one EC2 instance using 1 byte, 1k bytes, or 10k bytes: regardless of the request size, we get up to 2000 IOPS. Typically the IOPS are between 1400 and 1700.
  • Writing from two EC2 instances using 1 byte, 1k bytes, or 10k bytes: regardless of the request size, we get up to 3000 IOPS in total which are equally spread across the two EC2 instances.
  • The “PercentIOLimit” CloudWatch metric shows 84% when we do 2880 ops/s, for example. Therefore, the total IOPS limit for writing is about 3500 ops/s.
  • When doing only write() system calls with 1 byte data, only “DataWriteIOBytes” is accounted by EFS which is an advantage for us. A real block file system needs to read the block (usually 4k bytes), update 1 byte in it, and then write it back on disk. I feel like this needs additional testing with more random data, so test for yourself, too. Note that the minimum accounted request size in EFS is 4kB.

Reading: NFS I/O operations per second

Here are the results:

  • Reading from one EC2 instance using 1 byte or 10k bytes: regardless of the request size, we get up to 3500 IOPS. One EC2 instance is enough to saturate the EFS limit.
  • Reading from two EC2 instances using 1 byte or 10k bytes: regardless of the request size, we get up to 3500 IOPS in total which are equally spread across the two EC2 instances.
  • The “PercentIOLimit” CloudWatch metric shows 100% when we do 3500 ops/s. Therefore, the total IOPS limit for reading is 3500 ops/s.

Leave a comment

HTTP Keep-Alive timeout values used by popular websites

Here is the command to test the Keep-Alive timeout of a website:

VHOST="www.google.bg"; time openssl s_client -ign_eof -connect "$VHOST:443" <<< "$( echo -ne "GET / HTTP/1.1\r\nHost: $VHOST\r\nConnection: Keep-Alive\r\n\r\n" )"

And here are the today’s results for some popular websites:

slashdot.org: 0s
LWN.net: 5s
snag.gy: 5s
yahoo.com: 10s
readthedocs.org: 10s
www.superhosting.bg: 15s
httpd.apache.org: 30s
nginx.org: 1m
en.wikipedia.org: 1m
famzah.wordpress.com: 1m15s
aws.amazon.com: 2m50s
www.facebook.com: 3m
www.google.bg: 4m
cloudplatform.googleblog.com: 4m
www.cloudflare.com: 6m40s
www.mozilla.org: 6m40s
www.tagesschau.de: 8m
access.redhat.com: 8m20s
stackoverflow.com: 10m
www.timeanddate.com: 10m
www.dreamhost.com: 10m
www.reddit.com: 10m
twitter.com: 15m

Leave a comment

Find the repository of all installed packages on Debian or Ubuntu

It turns out that there is no standard “apt” command which lists where a package was installed from. You may need this information if you have added additional APT repositories to your Debian/Ubuntu installation. I see a lot of questions at the forums (1, 2, 3, 4) and the proper solution tends to be “parse apt-cache output yourself”. Here is my solution which is very similar to this one:

set -u


for PKGNAME in $(dpkg -l|grep ^i|awk '{print $2}'); do
        INFO="$(apt-cache policy "$PKGNAME")"
        IVER="$(echo "$INFO" | grep Installed: | awk '{print $2}')"
        IPRIO="$(echo "$INFO" | fgrep "*** $IVER" | awk '{print $3}')"
        REPO="$(echo "$INFO" | fgrep -A1 "*** $IVER" | tail -n+2 | head -n1 | awk '{print $2 " " $3}')"

        echo "$PKGNAME repo=$REPO"

        if [ "$REPO" == '' ]; then
                errors=$(( $errors + 1 ))
                echo "ERROR: Unable to find the repo for package \"$PKGNAME\"" >&2

if [ "$errors" -ne 0 ]; then
        echo "ERROR: $errors errors encountered" >&2
        exit 1
        exit 0

Leave a comment

Postfix: Redirect all local and remote mails to a single email address

Virtual servers like EC2 usually get a random external IP address which is not suitable for outgoing SMTP. That’s because these “pool” IP addresses lack reverse DNS resolving, and their spam reputation is unknown because somebody before you may have used them to send out spam.

Still you need to be able to get email notifications from these machines because many vital services like the crontab, for example, send diagnostic emails to “root” or other local mailboxes, depending on the user that a cron job is being executed with.

One possible solution is to catch all mail sent to any email address (local or remote), forward it to Amazon Simple Email Service (SES), and let SES do the actual SMTP delivery for you.

Open the file “/etc/postfix/main.cf” and add the following two statements there:

smtp_generic_maps = regexp:/etc/postfix/email_rewrites
alias_maps = regexp:/etc/postfix/email_rewrites

The first directive ensures that the “From” address is being rewritten to your single external destination email (read the docs), while the second directive forwards all locally delivered mail to the same single external email address (SF article). Note that if “alias_maps” directive already exists in the “main.cf” file, you need to comment it out.

You can configure the single external email address to forward to by creating the file “/etc/postfix/email_rewrites” and then putting the following in it:

/.+/ mailbox@example.com

Finally, execute the following commands, so that Postfix picks up the new configuration:

postmap /etc/postfix/email_rewrites
/etc/init.d/postfix restart

If you decided to use Amazon SES for email delivery, there are a few additional steps to do:

If you are not using Postfix, then review the Amazon SES documentation about integration with other mail servers like Exim, Sendmail, Microsoft Exchange, etc.

Leave a comment

JIRA notify on create but only for parent issues

Here is a quick tutorial on how to create a custom notification in JIRA. It triggers when new issues are created but only if the issue is a parent, excluding sub-tasks.

You need to have the ScriptRunner for JIRA plugin.

First navigate to “Administration -> System -> Events -> Add New Event”:

  • Name: Issue Created (only parents)
  • Description: Triggers only for parent issues, not for sub-tasks
  • Template: Issue Created

Then navigate to “Administration -> Add-ons -> Script Listeners -> Add New Item -> Fires an event when condition is true”:

  • Note: Fire an “Issue Created (only parents)” event
  • Project Key: leave empty or enter a project name to limit the scope of this notification only for it
  • Events: Issue Created
  • Condition: !issue.isSubTask()
  • Event: Issue Created (only parents)

Finally, navigate to “Project settings -> Your project -> Notifications” and then “Actions -> Edit”. At the bottom there is the newly created custom event “Issue Created (only parents)”. Add one or more recipients and enjoy.

A quick note when testing the custom event. If you added only yourself as the recipient of the new event, JIRA won’t send you an email when the event fired. JIRA correctly thinks that it doesn’t make sense to notify the creator of the issue, as he or she already knows what they did. Therefore, in order to test this properly, you need to add someone else’s account or group, and then create a new test issue. Or do it vice versa by adding your account and let someone else create a new test issue.