One of my first experiences with low-level system software was building my own operating system for the x86 from scratch. Putting my code, and nothing but my code, on a floppy disk and booting from it made me incredibly excited. I was slightly less excited whenever I inadvertently introduced a bug that caused a triple fault and made my computer reboot. The trial-and-error approach to addressing such bugs was interesting detective work, but it wasn’t very productive. I felt that the primary purpose of my operating system, and indeed of any other, was to provide meaningful, actionable feedback about misbehaving software, so I spent most of my time fiddling with page and descriptor tables to use the hardware-provided means for catching and reporting such issues.
I know now that enabling software developers is definitely my thing, so I have been thrilled to see how the close cousin of the subject of my master’s thesis — the internet of things (IoT) — has been going through a phase where the focus on connectivity and hardware has been matched with a focus on functionality provided by software. Today, lots of developers come to IoT from software backgrounds and they pick tools that make them productive. They choose hardware like the Raspberry Pi instead of a power-efficient microcontroller, because coupled with Linux, it gives them a high-level and robust foundation for their work.
If you want to build for microcontrollers, you’re forced to use a different, much less refined development workflow. Soldering irons, C compilers, and flashing via USB connectors are on the menu if you go that route. It can be a rather painful experience, so brace yourself. In a nutshell, the problem is that on microcontrollers everything is firmware that is compiled, linked, and deployed together using really old-fashioned tools. Changing anything means changing everything. All the different pieces of your codebase also crash together, so if you introduce a bug in one part of your code, chances are that it will ruin the entire functionality of your device, rendering it useless on a good day and completely bricked on a rainy day.
Modern computers have grown operating systems that support developers by giving them safety rails and fault isolation, but this level of support and security hasn’t yet reached microcontrollers. It is time for a change!
Microcontrollers usually run so-called real-time operating systems (RTOS). The reason they run these stripped-down operating systems that deemphasize security and robustness is that a typical operating system relies on the hardware to provide multiple protection domains and memory isolation. Without that hardware support, the operating system only deals with simpler things like scheduling, synchronization, and memory allocation.
It is possible to provide fault tolerance and isolation through software, but it requires a software layer that shields the applications that run on top of it from the underlying hardware. This layer is typically called a virtual machine.
You may have heard about two different kinds of virtual machines: the ones that emulate a concrete computing system and the ones that provide a platform-independent managed environment for a specific class of high-level languages to run in. Because of its significantly lower system requirements, we have focused on the latter kind and designed an embedded virtual machine for microcontrollers in the tradition of Java, not Docker. It runs on the ESP32 chips from Espressif Systems, and it augments the primitive FreeRTOS operating system that is bundled in Espressif’s IoT Development Framework (ESP-IDF) with the capabilities for safely running platform-independent software applications side-by-side.
The concepts of operating system, virtual machine, and programming language are closely related. From the perspective of a developer, we have hidden the operating system and the physical hardware and given them a high-level language and environment to work in. It is just like how modern web browsers let developers run their applications across Windows, Linux, and macOS on different kinds of processors, but this time for microcontrollers. The Toit virtual machine implementation is our secret sauce, and no, it isn’t just another operating system.
An operating system is a collection of things that don’t fit into a language. There shouldn’t be one. — Dan Ingalls
The Toit platform allows you to install independently developed applications side-by-side on a small microcontroller like the ESP32. The virtual machine has built-in support for constructing application images in flash, based on a stream of bits and relocation information. The relocation information is crucial, because it allows the device to freely pick the location in flash where it installs the application. We do not have the luxury of using virtual memory to let the system believe an application always runs from a particular location in memory, so we have to adapt the application image to the actual location in flash that it ends up being stored in.
The Toit platform streams the application images via CoAP over TLS and the device receives 32 words at a time and relocates them before writing them into flash. We have designed it so we never have to keep the full image in RAM, because that is a fairly limited resource. Once we’re done with all the application image bits, we validate them using a checksum mechanism and finally commit the header, turning the application into a valid and runnable piece of functionality.
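The streaming install described above can be sketched in a few lines. This is only an illustration under stated assumptions: the one-bit-per-word relocation mask, the CRC32 checksum, and the names `install_image`, `reloc_bits`, and `expected_crc` are hypothetical and not taken from the Toit implementation. It assumes the checksum covers the words as sent, since the sender cannot know where the image will land.

```python
import zlib

CHUNK_WORDS = 32  # the text says the device receives 32 words at a time
MASK32 = 0xFFFFFFFF


def install_image(chunks, expected_crc, base_address):
    """Relocate a streamed image chunk by chunk; return (flash_words, committed).

    chunks yields (words, reloc_bits) pairs. Bit i of reloc_bits marks word i
    as an image-relative address (an assumed encoding, for illustration only).
    """
    flash = []  # stands in for the flash region starting at base_address
    crc = 0
    for words, reloc_bits in chunks:
        assert len(words) <= CHUNK_WORDS
        for i, word in enumerate(words):
            # The checksum covers the words as sent, before relocation.
            crc = zlib.crc32(word.to_bytes(4, "little"), crc)
            if reloc_bits & (1 << i):
                # Rebase the address to where the image actually landed.
                word = (word + base_address) & MASK32
            flash.append(word)  # only one chunk is ever held in RAM
    committed = crc == expected_crc  # commit the header only on a match
    return flash, committed


# Example: a three-word image in which word 1 is an image-relative address.
image = [0x10, 0x00000004, 0x2A]
expected = zlib.crc32(b"".join(w.to_bytes(4, "little") for w in image))
flash, ok = install_image([(image, 0b010)], expected, base_address=0x3F400000)
_, bad = install_image([(image, 0b010)], expected ^ 1, base_address=0x3F400000)
```

Because each chunk is relocated and written before the next one arrives, the peak RAM use is one chunk, regardless of the image size.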
A typical Toit application image is around 30KB in total. The vast majority of that is the bytecodes that describe the behavior of individual methods in an easily interpretable form. We extract the essential information from the program’s hierarchy, classes, and interfaces, and store it in a compact form. Similarly, we save space by collectively storing methods as one flattened sequence of bytes in something that resembles the .text segment of an ELF file. The only structured objects in the images are the compile-time constants that go with the application.
The Toit virtual machine ends up acting like a flash-based filesystem with a dynamic relocating linker for installing, upgrading, and uninstalling application images that can run directly from flash. The applications are completely separate and only share what is provided by the virtual machine on the device.
For robustness and security reasons, applications need a safe environment on your microcontroller to run within. When an application starts up, the Toit virtual machine allocates a new process structure in memory. The structure includes an object heap that is isolated from the rest of the system, so we have a place to keep all objects allocated by code running in that particular process. The footprint of a minimal process is 4KB of RAM and that includes the object heap. Start small and grow only when necessary!
Once the process has been set up, the virtual machine tells the scheduler that the process is ready to run. The processes are scheduled on top of a fixed number of FreeRTOS threads — one for each core of the CPU — but you can easily have more processes than threads. In most cases, we run with two threads on the ESP32, because it is a dual-core processor. The threads have a small, fixed execution stack associated with them and they service the individual processes one at a time by running the applications associated with them for a while until the threads are preempted by the scheduler and move on to another process. Multiple preemptively scheduled processes with completely isolated address spaces. Check.
Inside the processes, the application consists of multiple light-weight tasks that each have their own execution stack. These stacks are allocated in the object heap and they grow on demand, so you don’t have to preallocate stacks or pick the right stack sizes for your tasks. The system takes care of that for you. The tasks are cooperatively scheduled, so it is only when one task cannot make progress that we pick another task in the same process and continue running its code.
This setup gives your applications their own isolated memory areas. This is important for security and robustness, but it is also the perfect way to support decomposing your device functionality into meaningful and decoupled modular applications. Internally, applications can be composed of multiple task-based activities and it is extremely cheap to do blocking operations like waiting for events, because you are only blocking a light-weight task, not an entire process or the FreeRTOS thread. This leads to understandable code that is easy to write, and it avoids the need for most asynchronous operations.
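The cooperative task model can be illustrated with Python generators. This is a minimal sketch and not Toit code: the `run_process` scheduler and the `sensor` task are hypothetical names, and each `yield` stands in for a blocking operation such as waiting for an event.

```python
from collections import deque


def run_process(tasks):
    """Round-robin over generator-based tasks; each yield is a blocking point."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run until the task cannot make progress
            ready.append(task)  # still alive: reschedule it for later
        except StopIteration:
            pass                # the task has finished


log = []


def sensor(name, samples):
    # Straight-line code: no callbacks, just a loop that blocks between samples.
    for value in samples:
        log.append((name, value))
        yield  # stands in for a blocking wait on the next event


run_process([sensor("temp", [21, 22]), sensor("door", ["open"])])
```

Each task reads like sequential code, yet the two interleave: blocking suspends only the task, so the other one in the same process keeps running.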
Once you have a good handful of processes running on your microcontroller, you want to make sure they don’t consume more of your most precious resource, RAM, than necessary. The best way to achieve that is to make sure that the virtual machine itself doesn’t come with too much overhead and stays lean when executing your code.
For dynamic and flexible programming languages, a common source of overhead is actually the optimization techniques used to make method calls fast. When a virtual machine executes a call to append on an object named months, it needs to determine which append method to invoke based on the runtime type of months. Historically, this has been optimized by having a method lookup table that uses extremely fast hashing to find and validate the right append method, but the hash table is updated at runtime and must be stored in RAM. To avoid too many collisions, it also needs to have a pretty decent size, so this adds quite a bit of overhead for the sole purpose of making method calls fast.
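To make the RAM cost concrete, here is a generic sketch of such a hashed lookup cache. It is an assumption-laden illustration of the classic technique, not the implementation of any particular virtual machine; `LookupCache`, `CACHE_SIZE`, and the example classes are hypothetical.

```python
CACHE_SIZE = 8  # a power of two, so a mask can replace the modulo; real tables are larger


class LookupCache:
    def __init__(self):
        self.entries = [None] * CACHE_SIZE  # mutable, so it must live in RAM
        self.misses = 0

    def lookup(self, cls, selector):
        index = hash((cls, selector)) & (CACHE_SIZE - 1)
        entry = self.entries[index]
        if entry is not None and entry[0] is cls and entry[1] == selector:
            return entry[2]  # fast path: the hit is validated against class and name
        self.misses += 1
        method = self._slow_lookup(cls, selector)
        self.entries[index] = (cls, selector, method)  # may evict a colliding entry
        return method

    @staticmethod
    def _slow_lookup(cls, selector):
        for klass in cls.__mro__:  # walk up the inheritance chain
            if selector in vars(klass):
                return vars(klass)[selector]
        raise AttributeError(selector)


class Collection:
    def append(self, item):
        return "Collection.append"


class List(Collection):
    pass


cache = LookupCache()
months = List()
first = cache.lookup(type(months), "append")   # miss: fills the cache
second = cache.lookup(type(months), "append")  # hit: served from the table
```

The table is written on every miss, which is why it cannot be placed in read-only flash.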
For Toit, we knew that we needed something better that would allow us to perform efficient method calls directly from flash with no RAM overhead. We started by doing a depth-first numbering of all the Toit classes in the system and noted how that trivially led to consecutive numbers for all subclasses of a given class.
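A small sketch shows why depth-first numbering gives every class a contiguous range that covers itself and all of its subclasses. The six-class hierarchy and the `number_classes` helper are assumptions made up for illustration.

```python
def number_classes(hierarchy, root):
    """Return {class_name: (first, last)}: own number and last subclass number."""
    ranges = {}
    counter = 0

    def visit(name):
        nonlocal counter
        first = counter  # a class is numbered before any of its subclasses
        counter += 1
        for child in hierarchy.get(name, []):
            visit(child)
        ranges[name] = (first, counter - 1)  # last number handed out below it

    visit(root)
    return ranges


# Hypothetical hierarchy: A is the root, B has subclasses C and D, E has subclass F.
HIERARCHY = {"A": ["B", "E"], "B": ["C", "D"], "E": ["F"]}
RANGES = number_classes(HIERARCHY, "A")
```

With this numbering, checking whether a class is B or one of its subclasses is a single range check against `RANGES["B"]`.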
That means methods are inherited by classes within a specific class number range. As an example, if class B defines a method called append, the method B.append is inherited by all classes with numbers in the [1, 3] range. If class F also defines an append method, the F.append method is only applicable for instances of F. You can easily associate a method name with all the class number ranges its different implementations are inherited by. Through that you can form a one-dimensional dispatch table specific for the method name append that looks like this:
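As a rough sketch of such a table, assume a six-class hierarchy numbered depth first as A=0, B=1, C=2, D=3, E=4, F=5, where C and D are subclasses of B and F is a subclass of E. The numbering, the table layout, and the `dispatch_append` helper are illustrative assumptions and not the actual Toit data structures.

```python
import bisect

# Assumed numbering: A=0, B=1, C=2, D=3, E=4, F=5.
# Each row is (first_class_number, last_class_number, method), sorted by first number.
APPEND_TABLE = [
    (1, 3, "B.append"),  # B and its subclasses C and D
    (5, 5, "F.append"),  # F only
]
APPEND_STARTS = [row[0] for row in APPEND_TABLE]


def dispatch_append(class_number):
    """Find the append implementation for a receiver with the given class number."""
    i = bisect.bisect_right(APPEND_STARTS, class_number) - 1
    if i >= 0:
        first, last, method = APPEND_TABLE[i]
        if first <= class_number <= last:
            return method
    return None  # no append method applies to this class
```

For example, `dispatch_append(2)` finds `B.append` for an instance of C, while `dispatch_append(4)` returns `None` because E neither defines nor inherits append in this assumed hierarchy. The table is never modified after it is built, so it can be stored in flash.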