Did you ever wonder where
As an example, consider the extremely common
IsUndefined API function. Instead of having to look up the address of the
undefined object for reference, what if we could simply check if an object's pointer ends in, say,
0x61 to know if it is undefined. This is exactly what the V8’s static roots feature achieves. This post explores the hurdles we had to take to get there. The feature landed in Chrome 111 and brought performance benefits across the whole VM, particularly speeding up C++ code and builtin functions.
Bootstrapping the Read-Only Heap #
Creating the read-only objects takes some time, so V8 creates them at compile time. To compile V8, first a minimal proto-V8 binary called
mksnapshot is compiled. This one creates all the shared read-only objects as well as the native code of builtin functions and writes them into a snapshot. Then, the actual V8 binary is compiled and bundled with the snapshot. To start V8 the snapshot is loaded into memory and we can immediately start using its content. The following diagram shows the simplified build process for the standalone
d8 is up and running all the read-only objects have their fixed place in memory and never move. When we JIT code, we can e.g., directly refer to
undefined by its address. However, when building the snapshot and when compiling the C++ for libv8 the address is not known yet. It depends on two things unknown at build time. First, the binary layout of the read-only heap and second, where in the memory space that read-only heap is located.
How to Predict Addresses? #
V8 uses pointer compression. Instead of full 64 bit addresses we refer to objects by a 32 bit offset into a 4GB region of memory. For many operations such as property loads or comparisons, the 32 bit offset into that cage is all that is needed to uniquely identify an object. Therefore our second problem — not knowing where in the memory space the read-only heap is placed — is not actually a problem. We simply place the read-only heap at the start of every pointer compression cage thus giving it a known location. For instance of all objects in V8’s heap,
undefined always has the smallest compressed address, starting at 0x61 bytes. That’s how we know that if the lower 32 bits of any JS object’s full address are 0x61, then it must be
This is already useful, but we want to be able to use this address in the snapshot and in libv8 – a seemingly circular problem. However, if we ensure that
mksnapshot deterministically creates a bit identical read-only heap, then we can re-use these addresses across builds. To use them in libv8 itself, we basically build V8 twice:
The first time round calling
mksnapshot the only artifact produced is a file that contains the addresses relative to the cage base of every object in the read-only heap. In the second stage of the build we compile libv8 again and a flag ensures that whenever we refer to
undefined we literally use
cage_base + StaticRoot::kUndefined instead; the static offset of
undefined of course being defined in the static-roots.h file. In many cases this will allow the C++ compiler creating libv8 and the builtins compiler in
mksnapshot to create much more efficient code as the alternative is to always load the address from a global array of root objects. We end up with a
d8 binary where the compressed address of
undefined is hardcoded to be
Well, morally this is how everything works, but practically we only build V8 once – ain’t nobody got time for this. The generated static-roots.h file is cached in the source repository and only needs to be recreated if we change the layout of the read-only heap.
Further Applications #
Speaking of practicalities, static roots enable even more optimizations. For instance we have since grouped common objects together allowing us to implement some operations as range checks over their addresses. For instance all string maps (i.e., the hidden-class meta objects describing the layout of different string types) are next to each other, hence an object is a string if its map has a compressed address between
0x49d. Or, truthy objects must have an address that is at least
Not everything is about the performance of JITed code in V8. As this project has shown, a relatively small change to the C++ code can have significant impact too. For instance Speedometer 2, a benchmark which exercises the V8 API and the interaction between V8 and its embedder, gained about 1% in score on an M1 CPU thanks to static roots.