As part of the work for fex/hex I implemeneted a multi-threaded, multi-process, just-in-time object code cache.

There were several challenges involved

  • Unlike aotir, the index cannot be sorted, so it got stored as a BST
  • Similarly, the format had to be crash / atomic safe
  • AOTOBJ was the goal, so relocations were needed. This was done seperatelly iirc.

Key features include

  • Reading from the caches requires a shared lock.
  • Writing to the caches necessitates a unique lock and an advisory lock when resizing the index or data files.
  • Index files are re-mapped as they grow in 64kb chunks.
  • Data files are mapped in chunks of 16 Megabytes.
  • Settings have been refined, merging AOT[IR/OBJ]Load and AOT[IR/OBJ]Capture to [IR/OBJ]Cache. The AOTIRGenerate has been removed, and a separate executable, FEXAOTGen, is now used.

How does it perform?

Some performance numbers from Orion

No Cache

skmp@ornio:~/projects/FEX/build$ time Bin/FEXLoader /bin/ls > /dev/null

real	0m0,277s
user	0m0,249s
sys	0m0,025s

OBJCache

skmp@ornio:~/projects/FEX/build$ time Bin/FEXLoader /bin/ls > /dev/null
[Info] Warning: OBJ/IR Caches are experimental, and might lead to crashes.

real	0m0,029s
user	0m0,012s
sys	0m0,015s

Native

skmp@ornio:~/projects/FEX/build$ time /bin/ls > /dev/null

real	0m0,007s
user	0m0,000s
sys	0m0,007s

Best OBJCache result so far

skmp@ornio:~/projects/FEX/build$ time Bin/FEXLoader /bin/true > /dev/null
[Info] Warning: OBJ/IR Caches are experimental, and might lead to crashes.

real	0m0,014s
user	0m0,009s
sys	0m0,005s