Concurrent Modification and Execution of Instructions (CMODX) for RISC-V Linux
CMODX is a programming technique where a program executes instructions that were modified by the program itself. Instruction storage and the instruction cache (icache) are not guaranteed to be synchronized on RISC-V hardware. Therefore, the program must enforce its own synchronization with the unprivileged fence.i instruction.
CMODX in the Kernel Space
Dynamic ftrace
Essentially, dynamic ftrace directs the control flow by inserting a function call at each patchable function entry, and patches it dynamically at runtime to enable or disable the redirection. In the case of RISC-V, two instructions, AUIPC + JALR, are required to compose a function call. However, it is impossible to patch two instructions and expect that a concurrent read-side executes them without a race condition. This series makes atomic code patching possible in RISC-V ftrace. Kernel preemption makes things even worse, as it allows the old state to persist across the patching process with stop_machine().
In order to get rid of stop_machine() and run dynamic ftrace with full kernel preemption, we partially initialize each patchable function entry at boot time, setting the first instruction to AUIPC and the second to NOP. Now, atomic patching is possible because the kernel only has to update one instruction. According to Ziccif, as long as an instruction is naturally aligned, the ISA guarantees an atomic update.
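The two states of a patchable entry can be modelled with plain instruction encodings. The snippet below is a minimal sketch, not the kernel's implementation: the helper names are hypothetical, and the use of t0 (x5) as the link register is an assumption made for illustration. It shows that the first word (AUIPC) is identical in both states, so enabling or disabling the call only rewrites the second 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

#define RV_NOP    0x00000013u /* addi x0, x0, 0 */
#define RV_REG_T0 5u          /* assumed link register for the patched call */

/* Split a signed 32-bit PC-relative offset into AUIPC/JALR immediates. */
static void rv_split_offset(int32_t offset, uint32_t *hi20, int32_t *lo12)
{
	/* Round to the nearest 4 KiB so the remainder fits a signed 12-bit field. */
	*hi20 = (((uint32_t)offset + 0x800u) >> 12) & 0xfffffu;
	*lo12 = (int32_t)((uint32_t)offset - (*hi20 << 12));
}

/* AUIPC rd, hi20: opcode 0x17, upper immediate in bits 31:12. */
static uint32_t rv_auipc(uint32_t rd, uint32_t hi20)
{
	return (hi20 << 12) | (rd << 7) | 0x17u;
}

/* JALR rd, lo12(rs1): opcode 0x67, signed 12-bit immediate in bits 31:20. */
static uint32_t rv_jalr(uint32_t rd, uint32_t rs1, int32_t lo12)
{
	assert(lo12 >= -2048 && lo12 <= 2047);
	return (((uint32_t)lo12 & 0xfffu) << 20) | (rs1 << 15) | (rd << 7) | 0x67u;
}

/* Fill a two-word call site: word 0 is fixed at boot, word 1 is patched. */
static void rv_make_call_site(uint32_t site[2], int32_t offset, int enabled)
{
	uint32_t hi20;
	int32_t lo12;

	rv_split_offset(offset, &hi20, &lo12);
	site[0] = rv_auipc(RV_REG_T0, hi20);
	site[1] = enabled ? rv_jalr(RV_REG_T0, RV_REG_T0, lo12) : RV_NOP;
}
```

Building the site once with `enabled` set to 0 and once with 1 for the same offset yields arrays that differ only in `site[1]`, which is the single naturally aligned word that needs to be updated.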
By fixing the first instruction, AUIPC, the range of the ftrace trampoline is limited to ±2 KiB from the predetermined target, ftrace_caller, due to the lack of immediate encoding space in RISC-V. To address the issue, we introduce CALL_OPS, where 8 bytes of naturally aligned metadata are added in front of each patchable function. The metadata is resolved at the first trampoline, then execution can be directed to another custom trampoline.
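The range limit follows from JALR's signed 12-bit immediate. The following is a rough sketch with hypothetical helper names (none of them are kernel functions): once the AUIPC result is fixed, only targets within the JALR immediate range of that value can be reached by patching the second instruction alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* JALR carries a signed 12-bit immediate: [-2048, 2047]. */
#define JALR_IMM_MIN (-2048)
#define JALR_IMM_MAX 2047

static_assert(JALR_IMM_MAX - JALR_IMM_MIN + 1 == (1 << 12),
	      "a 12-bit immediate covers 4096 byte offsets");

/*
 * Hypothetical helper: can a JALR reach `target` if the preceding AUIPC,
 * which is no longer patched, already produced `auipc_result`?
 */
static bool jalr_can_reach(uint64_t auipc_result, uint64_t target)
{
	int64_t delta = (int64_t)(target - auipc_result);

	return delta >= JALR_IMM_MIN && delta <= JALR_IMM_MAX;
}

/* Hypothetical helper: check the 8-byte natural alignment of the metadata. */
static bool is_naturally_aligned_8(uint64_t addr)
{
	return (addr & 7u) == 0;
}
```

A custom trampoline that fails `jalr_can_reach()` cannot be called directly from the patched site, which is the case the CALL_OPS metadata is meant to cover.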
CMODX in the User Space
Though fence.i is an unprivileged instruction, the default Linux ABI prohibits the use of fence.i in userspace applications. At any point the scheduler may migrate a task onto a new hart. If migration occurs after userspace has synchronized the icache and instruction storage with fence.i, the icache on the new hart will no longer be clean. This is because fence.i only affects the hart that it is called on. Thus, the hart that the task has been migrated to may not have synchronized instruction storage and icache.
There are two ways to solve this problem: use the riscv_flush_icache() syscall, or use the PR_RISCV_SET_ICACHE_FLUSH_CTX prctl() and emit fence.i in userspace. The syscall performs a one-off icache flushing operation. The prctl changes the Linux ABI to allow userspace to emit icache flushing operations.
As an aside, “deferred” icache flushes can sometimes be triggered in the kernel. At the time of writing, this only occurs during the riscv_flush_icache() syscall and when the kernel uses copy_to_user_page(). These deferred flushes happen only when the memory map being used by a hart changes. If the prctl() context caused an icache flush, this deferred icache flush will be skipped as it is redundant. Therefore, there will be no additional flush when using the riscv_flush_icache() syscall inside of the prctl() context.
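For the one-off syscall route described above, a minimal sketch might look like the following. The wrapper name flush_icache_range() is hypothetical. The sketch assumes that `__NR_riscv_flush_icache` is provided by the system headers on RISC-V and that a flags value of 0 requests a flush for all threads in the process. On other architectures it falls back to the compiler builtin so that the sketch still builds.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: make instructions written to [start, end) visible
 * to instruction fetch. Returns 0 on success, -1 on error.
 */
static int flush_icache_range(void *start, void *end)
{
	assert((char *)start <= (char *)end);
#if defined(__riscv) && defined(__NR_riscv_flush_icache)
	/* Assumption: flags == 0 applies the flush to every thread. */
	return (int)syscall(__NR_riscv_flush_icache, start, end, 0UL);
#else
	/* Non-RISC-V fallback so the example compiles elsewhere. */
	__builtin___clear_cache((char *)start, (char *)end);
	return 0;
#endif
}
```

A program would call this once after writing new instructions and before jumping to them. Unlike the prctl() route, no fence.i is emitted by the program itself.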
prctl() Interface
Call prctl() with PR_RISCV_SET_ICACHE_FLUSH_CTX as the first argument. The remaining arguments will be delegated to the riscv_set_icache_flush_ctx function detailed below.
- int riscv_set_icache_flush_ctx(unsigned long ctx, unsigned long scope)
Enable/disable icache flushing instructions in userspace.
Parameters
unsigned long ctx
    Set the type of icache flushing instructions permitted/prohibited in userspace. Supported values described below.
unsigned long scope
    Set scope of where icache flushing instructions are allowed to be emitted. Supported values described below.
Description
Supported values for ctx:
PR_RISCV_CTX_SW_FENCEI_ON: Allow fence.i in user space.
PR_RISCV_CTX_SW_FENCEI_OFF: Disallow fence.i in user space. All threads in a process will be affected when scope == PR_RISCV_SCOPE_PER_PROCESS. Therefore, caution must be taken; use this flag only when you can guarantee that no thread in the process will emit fence.i from this point onward.
Supported values for scope:
PR_RISCV_SCOPE_PER_PROCESS: Ensure the icache of any thread in this process is coherent with instruction storage upon migration.
PR_RISCV_SCOPE_PER_THREAD: Ensure the icache of the current thread is coherent with instruction storage upon migration.
When scope == PR_RISCV_SCOPE_PER_PROCESS, all threads in the process are permitted to emit icache flushing instructions. Whenever any thread in the process is migrated, the corresponding hart’s icache will be guaranteed to be consistent with instruction storage. This does not enforce any guarantees outside of migration. If a thread modifies an instruction that another thread may attempt to execute, the other thread must still emit an icache flushing instruction before attempting to execute the potentially modified instruction. This must be performed by the user-space program.
In per-thread context (i.e. scope == PR_RISCV_SCOPE_PER_THREAD), only the thread calling this function is permitted to emit icache flushing instructions. When the thread is migrated, the corresponding hart’s icache will be guaranteed to be consistent with instruction storage.
On kernels configured without SMP, this function is a nop as migrations across harts will not occur.
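As a complement to the per-process case, a per-thread request with basic error handling could be sketched as below. The function name allow_fencei_this_thread() is hypothetical. The sketch assumes the PR_RISCV_* constants are exposed through <sys/prctl.h> on the build system, and it reports ENOSYS when they are not.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: ask the kernel to permit fence.i in the calling
 * thread only. Returns 0 on success, -1 with errno set on failure.
 */
static int allow_fencei_this_thread(void)
{
#ifdef PR_RISCV_SET_ICACHE_FLUSH_CTX
	return prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX,
		     (unsigned long)PR_RISCV_CTX_SW_FENCEI_ON,
		     (unsigned long)PR_RISCV_SCOPE_PER_THREAD);
#else
	/* Headers predate the interface; treat it as unavailable. */
	errno = ENOSYS;
	return -1;
#endif
}
```

A thread would check the return value before emitting its first fence.i, and fall back to the riscv_flush_icache() syscall if the request fails.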
Example usage:
The following files are meant to be compiled and linked with each other. The modify_instruction() function replaces an add with 0 with an add with one, causing the instruction sequence in get_value() to change from returning a zero to returning a one.
cmodx.c:
    #include <stdio.h>
    #include <sys/prctl.h>

    extern int get_value();
    extern void modify_instruction();

    int main()
    {
        int value = get_value();
        printf("Value before cmodx: %d\n", value);

        // Call prctl before first fence.i is called inside modify_instruction
        prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_ON, PR_RISCV_SCOPE_PER_PROCESS);
        modify_instruction();
        // Call prctl after final fence.i is called in process
        prctl(PR_RISCV_SET_ICACHE_FLUSH_CTX, PR_RISCV_CTX_SW_FENCEI_OFF, PR_RISCV_SCOPE_PER_PROCESS);

        value = get_value();
        printf("Value after cmodx: %d\n", value);
        return 0;
    }

cmodx.S:

    .option norvc

    .text
    .global modify_instruction
    modify_instruction:
        lw a0, new_insn
        lui a5,%hi(old_insn)
        sw a0,%lo(old_insn)(a5)
        fence.i
        ret

    .section modifiable, "awx"
    .global get_value
    get_value:
        li a0, 0
    old_insn:
        addi a0, a0, 0
        ret

    .data
    new_insn:
        addi a0, a0, 1