- The kernel of “Translation Ranger” is here.
- A newer version of “Translation Ranger” rebased on Linux kenrel v5.0-rc5 is here and patches. You will need to apply the patch (
git am <patch file>) on top of Linux v5.0-rc5.
- Its companion userspace applications can be find here.
make menuconfig and select
TRANSLATION_RANGER option to make sure the kernel can be compiled correctly. (Use
/ to search for the option)
To get an initial kernel configuration file, you can copy a config file like
config-4.4.0-154-generic from your
/boot folder to the kernel source folder and rename it to
To reduce kernel compilation time, you could run
make localmodconfig, which only selects the modules currently loaded in your system.
##Run your application with Translation Ranger
To launch benchmarks in Translation Ranger, you need the scripts and the launcher from the userspace GitHub repo. The application will always run on Node 1 in the system, which can be changed by assigning the NUMA node id to
FAST_NUMA_NODE in run_bench.sh.
In the repo, run
make to create launcher binary from launcher.c file.
- setup_thp.sh changes kernel khugepaged running configurations. The script will disable khugepaged except for Linux Default. In Enhanced khugepaged case, khugepaged is invoked via my customized system call from launcher.
- run_bench.sh uses launcher to run your applications and performs different memory defragmentation actions, like Linux Default, Large Max Order, and Translation Ranger.
- Each of your applications should be in a separate folder like 503.postencil in the GitHub repo. You need a
bench_run.shfile in the folder to tell
run_bench.shthe name of your appilcation (
BENCH_NAME) and how to run it (
BENCH_SIZEwill be exported in
run_all.sh(see next bullet point), so you can launch your appilcations with different memory footprints by just changing
- run_all.sh loops through
CONFIG_LISTfor each benchmark in
BENCH_LISTwith the number of threads used in
THREAD_NUM_LISTand memory footprint from
BENCH_SIZEand used by
bench_run.shabove). Based on the values from
DEFRAG_FRAQ_FACTOR_LIST, like N, Translation Ranger will run every
STATS_PERIOD * Nseconds (
STATS_PERIODis explained below). Other MACRO explanations:
PREFER_MEM_MODE: if an application footprint is bigger than the assigned NUMA node, the kernel will spill the extra data to other NUMA nodes or not. Yes means extra data will be allocated on other NUMA nodes; no means swapping will happen.
STATS_PERIOD: how frequenct the launcher will scan application memory and generate contiguity stats. Every 5 seconds is the default value.
CONTIG_STATS: dumping contiguity stats or not.
ONLINE_STATS: debug information, do not change.
PROMOTE_1GB_MAP: if yes, Translation Ranger will try to do in-place 1GB THP promotion.
NO_REPEAT_DEFRAG: each VMA is augmented with timestamps telling when the VMA is changed and defragmented. If yes, Translation Ranger will compare the two timestamps and performs a memory defragmentation when the VMA is changed after last memory defragmentation. (This configuration is not presented in the paper)
CHILD_PROC_STAT: debug information, do not change.
The scripts use different configuration names (the ones showing in
CONFIG_LIST) than the ones (like Linux Default, Large Max Order, Translation Ranger) from the paper. Here is the one-to-one mapping:
- Linux Default:
- Large Max Order:
- Enhanced khugepaged:
- Translation Ranger:
###Contiguity stats format
After running, a result folder will be created for each benchmark with a distinct configuration, like
result-<benchmark name>-<thread number>-cpu-<memory footprint size>-mem-defrag-scan-period-<the value of STATS_PERIOD>-defrag-period-<the value of DEFRAG_FRAQ_FACTOR*STATS_PERIOD>-<configuration>-<date and time of the run>. In the folder,
<benchmark name>_mem_frag_contig_stats_0 stores the memory contiguity stats. It contains multiple memory scans, which are separated by
----. Each scan contains contiguity stats for all VMAs in the application. Here is a sample for two VMAs:
00000000a8b6e356, 0xffff9b36d7c7cb0b, 0x7f1325294000, 0x7f1325295000, 1, -1 00000000a8b6e356, 0xffff9b35d7817550, 0x7ffc42439000, 0x7ffc4245a000, -29, 1, 1, 1, 1, -1
The fields are explained below:
00000000a8b6e356: kernel mm_struct virtual address, use it only if you have multiple application running.
0xffff9b36d7c7cb0b: kernel vm_area virtual address, distinguish different VMAs.
0x7f1325294000: the start virutal address of the VMA.
0x7f1325295000: the end virtual address of the VMA.
- all following fields represent all contiguous regions in the VMA, beginning from the start virtual address of the VMA. Positive number
N4KB contiguous physical pages are in this contiguous region and negative number
M4KB non-present physical hole.
By scanning the stats, you should be able to get contiguity statistics for each scan.