The Linux Kernel Tracepoint API

Author:

Jason Baron

Author:

William Cohen

Introduction

Tracepoints are static probe points that are located in strategic points throughout the kernel. ‘Probes’ register/unregister with tracepoints via a callback mechanism. The ‘probes’ are strictly typed functions that are passed a unique set of parameters defined by each tracepoint.

From this simple callback mechanism, ‘probes’ can be used to profile, debug, and understand kernel behavior. There are a number of tools that provide a framework for using ‘probes’. These tools include Systemtap, ftrace, and LTTng.

Tracepoints are defined in a number of header files via various macros. Thus, the purpose of this document is to provide a clear accounting of the available tracepoints. The intention is to understand not only what tracepoints are available but also to understand where future tracepoints might be added.

The API presented has functions of the form: trace_tracepointname(function parameters). These are the tracepoint callbacks that are found throughout the code. Registering and unregistering probes with these callback sites is covered in the Documentation/trace/* directory.
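The callback mechanism described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: tp_register, tp_unregister, MAX_PROBES, and the demo_event tracepoint are hypothetical names invented for illustration and are not kernel interfaces. The real registration interfaces are the ones covered in Documentation/trace/.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of one tracepoint; not the kernel implementation. */
#define MAX_PROBES 4

/* Probes are strictly typed: every probe for this tracepoint takes one int. */
typedef void (*demo_probe_t)(int value);

static demo_probe_t demo_probes[MAX_PROBES];

/* Attach a probe; returns 0 on success, -1 if every slot is taken. */
static int tp_register(demo_probe_t probe)
{
    assert(probe != NULL);
    for (size_t i = 0; i < MAX_PROBES; i++) {
        if (demo_probes[i] == NULL) {
            demo_probes[i] = probe;
            return 0;
        }
    }
    return -1;
}

/* Detach a probe; returns 0 on success, -1 if it was not registered. */
static int tp_unregister(demo_probe_t probe)
{
    assert(probe != NULL);
    for (size_t i = 0; i < MAX_PROBES; i++) {
        if (demo_probes[i] == probe) {
            demo_probes[i] = NULL;
            return 0;
        }
    }
    return -1;
}

/* The callback site: mirrors the trace_tracepointname(parameters) form. */
static void trace_demo_event(int value)
{
    for (size_t i = 0; i < MAX_PROBES; i++) {
        if (demo_probes[i] != NULL)
            demo_probes[i](value);
    }
}

/* Example probe: accumulates the values it has been passed. */
static int demo_total;

static void demo_sum_probe(int value)
{
    demo_total += value;
}
```

In this model, a call to trace_demo_event() does nothing until a probe such as demo_sum_probe has been registered, which is the property that lets callback sites stay in the code permanently.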

IRQ

void trace_irq_handler_entry(int irq, struct irqaction *action)

called immediately before the irq action handler

Parameters

int irq

irq number

struct irqaction *action

pointer to struct irqaction

Description

The struct irqaction pointed to by action contains various information about the handler, including the device name, action->name, and the device id, action->dev_id. When used in conjunction with the irq_handler_exit tracepoint, we can figure out irq handler latencies.

void trace_irq_handler_exit(int irq, struct irqaction *action, int ret)

called immediately after the irq action handler returns

Parameters

int irq

irq number

struct irqaction *action

pointer to struct irqaction

int ret

return value

Description

If the ret value is set to IRQ_HANDLED, then we know that the corresponding action->handler successfully handled this irq. Otherwise, the irq might be a shared irq line, or the irq was not handled successfully. Can be used in conjunction with the irq_handler_entry to understand irq handler latencies.
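Pairing the two events to obtain a handler latency might look roughly like the following userspace model. Everything here is an assumption made for illustration: on_irq_entry, on_irq_exit, and NR_TRACKED_IRQS are hypothetical names, and timestamps are passed in by the caller instead of being read from a kernel clock. The sketch also ignores per-CPU bookkeeping, which a real probe would need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for an entry/exit probe pair; not kernel code. */
#define NR_TRACKED_IRQS 16

static uint64_t entry_ns[NR_TRACKED_IRQS];  /* timestamp of the last entry per irq */
static uint64_t total_ns[NR_TRACKED_IRQS];  /* accumulated handler time per irq */
static unsigned int pairs[NR_TRACKED_IRQS]; /* completed entry/exit pairs per irq */

/* Models a probe on irq_handler_entry: remember when the handler started. */
static void on_irq_entry(int irq, uint64_t now_ns)
{
    assert(irq >= 0 && irq < NR_TRACKED_IRQS);
    entry_ns[irq] = now_ns;
}

/* Models a probe on irq_handler_exit: returns the latency of this invocation. */
static uint64_t on_irq_exit(int irq, uint64_t now_ns)
{
    assert(irq >= 0 && irq < NR_TRACKED_IRQS);
    uint64_t delta = now_ns - entry_ns[irq];

    total_ns[irq] += delta;
    pairs[irq]++;
    return delta;
}
```

Dividing total_ns[irq] by pairs[irq] would then give an average handler latency for that irq number.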

void trace_softirq_entry(unsigned int vec_nr)

called immediately before the softirq handler

Parameters

unsigned int vec_nr

softirq vector number

Description

When used in combination with the softirq_exit tracepoint we can determine the softirq handler routine.

void trace_softirq_exit(unsigned int vec_nr)

called immediately after the softirq handler returns

Parameters

unsigned int vec_nr

softirq vector number

Description

When used in combination with the softirq_entry tracepoint we can determine the softirq handler routine.

void trace_softirq_raise(unsigned int vec_nr)

called immediately when a softirq is raised

Parameters

unsigned int vec_nr

softirq vector number

Description

When used in combination with the softirq_entry tracepoint we can determine the softirq raise to run latency.
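One way to derive the raise to run latency is to remember the earliest raise that has not yet been followed by an entry. The following is a userspace sketch under assumptions: on_softirq_raise, on_softirq_entry, and NR_VECS are hypothetical names, the vector count of 10 is assumed for the example, and timestamps are supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECS 10 /* assumed number of softirq vectors for this sketch */

static bool pending[NR_VECS];       /* a raise was seen and has not run yet */
static uint64_t raised_ns[NR_VECS]; /* time of the earliest pending raise */

/* Models a probe on softirq_raise: keep the earliest raise that is still pending. */
static void on_softirq_raise(unsigned int vec_nr, uint64_t now_ns)
{
    assert(vec_nr < NR_VECS);
    if (!pending[vec_nr]) {
        pending[vec_nr] = true;
        raised_ns[vec_nr] = now_ns;
    }
}

/* Models a probe on softirq_entry: returns the raise to run latency,
 * or 0 if no raise was recorded for this vector. */
static uint64_t on_softirq_entry(unsigned int vec_nr, uint64_t now_ns)
{
    assert(vec_nr < NR_VECS);
    if (!pending[vec_nr])
        return 0;
    pending[vec_nr] = false;
    return now_ns - raised_ns[vec_nr];
}
```

Keeping only the first pending raise is a design choice of this sketch: a second raise before the handler runs does not restart the clock.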

void trace_tasklet_entry(struct tasklet_struct *t, void *func)

called immediately before the tasklet is run

Parameters

struct tasklet_struct *t

tasklet pointer

void *func

tasklet callback or function being run

Description

Used to find individual tasklet execution time

void trace_tasklet_exit(struct tasklet_struct *t, void *func)

called immediately after the tasklet is run

Parameters

struct tasklet_struct *t

tasklet pointer

void *func

tasklet callback or function being run

Description

Used to find individual tasklet execution time

SIGNAL

void trace_signal_generate(int sig, struct kernel_siginfo *info, struct task_struct *task, int group, int result)

called when a signal is generated

Parameters

int sig

signal number

struct kernel_siginfo *info

pointer to struct siginfo

struct task_struct *task

pointer to struct task_struct

int group

shared or private

int result

TRACE_SIGNAL_*

Description

Current process sends a ‘sig’ signal to ‘task’ process with ‘info’ siginfo. If ‘info’ is SEND_SIG_NOINFO or SEND_SIG_PRIV, ‘info’ is not a pointer and you can’t access its fields. Instead, SEND_SIG_NOINFO means that si_code is SI_USER, and SEND_SIG_PRIV means that si_code is SI_KERNEL.
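A probe therefore has to test for the two special values before touching ‘info’. The sketch below models that check in userspace. All DEMO_ and demo_ names are hypothetical stand-ins defined locally so the example compiles on its own, and their numeric values are assumptions for illustration, not definitions copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in definitions for a userspace sketch; values are assumptions. */
struct demo_siginfo {
    int si_code;
};

#define DEMO_SEND_SIG_NOINFO ((struct demo_siginfo *)0)
#define DEMO_SEND_SIG_PRIV   ((struct demo_siginfo *)1)
#define DEMO_SI_USER   0
#define DEMO_SI_KERNEL 0x80

/* Returns the si_code to report, never dereferencing a special value. */
static int demo_si_code(const struct demo_siginfo *info)
{
    if (info == DEMO_SEND_SIG_NOINFO)
        return DEMO_SI_USER;
    if (info == DEMO_SEND_SIG_PRIV)
        return DEMO_SI_KERNEL;
    return info->si_code; /* a real pointer: safe to read */
}
```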

void trace_signal_deliver(int sig, struct kernel_siginfo *info, struct k_sigaction *ka)

called when a signal is delivered

Parameters

int sig

signal number

struct kernel_siginfo *info

pointer to struct siginfo

struct k_sigaction *ka

pointer to struct k_sigaction

Description

A ‘sig’ signal is delivered to current process with ‘info’ siginfo, and it will be handled by ‘ka’. ka->sa.sa_handler can be SIG_IGN or SIG_DFL. Note that some signals reported by signal_generate tracepoint can be lost, ignored or modified (by debugger) before hitting this tracepoint. This means that this tracepoint can show which signals are actually delivered, but matching generated signals and delivered signals may not be correct.

Block IO

void trace_block_touch_buffer(struct buffer_head *bh)

mark a buffer accessed

Parameters

struct buffer_head *bh

buffer_head being touched

Description

Called from touch_buffer().

void trace_block_dirty_buffer(struct buffer_head *bh)

mark a buffer dirty

Parameters

struct buffer_head *bh

buffer_head being dirtied

Description

Called from mark_buffer_dirty().

void trace_block_rq_requeue(struct request *rq)

place block IO request back on a queue

Parameters

struct request *rq

block IO operation request

Description

The block operation request rq is being placed back into queue q. For some reason the request was not completed and needs to be put back in the queue.

void trace_block_rq_complete(struct request *rq, blk_status_t error, unsigned int nr_bytes)

block IO operation completed by device driver

Parameters

struct request *rq

block operations request

blk_status_t error

status code

unsigned int nr_bytes

number of completed bytes

Description

The block_rq_complete tracepoint event indicates that some portion of the operation request has been completed by the device driver. If rq->bio is NULL, then there is absolutely no additional work to do for the request. If rq->bio is non-NULL then there is additional work required to complete the request.

void trace_block_rq_error(struct request *rq, blk_status_t error, unsigned int nr_bytes)

block IO operation error reported by device driver

Parameters

struct request *rq

block operations request

blk_status_t error

status code

unsigned int nr_bytes

number of completed bytes

Description

The block_rq_error tracepoint event indicates that some portion of the operation request has failed as reported by the device driver.

void trace_block_rq_insert(struct request *rq)

insert block operation request into queue

Parameters

struct request *rq

block IO operation request

Description

Called immediately before block operation request rq is inserted into queue q. The fields in the operation request rq struct can be examined to determine which device and sectors the pending operation would access.

void trace_block_rq_issue(struct request *rq)

issue pending block IO request operation to device driver

Parameters

struct request *rq

block IO operation request

Description

Called when block operation request rq from queue q is sent to a device driver for processing.

void trace_block_rq_merge(struct request *rq)

merge request with another one in the elevator

Parameters

struct request *rq

block IO operation request

Description

Called when block operation request rq from queue q is merged into another request queued in the elevator.

void trace_block_io_start(struct request *rq)

insert a request for execution

Parameters

struct request *rq

block IO operation request

Description

Called when block operation request rq is queued for execution

void trace_block_io_done(struct request *rq)

block IO operation request completed

Parameters

struct request *rq

block IO operation request

Description

Called when block operation request rq is completed

void trace_block_bio_complete(struct request_queue *q, struct bio *bio)

completed all work on the block operation

Parameters

struct request_queue *q

queue holding the block operation

struct bio *bio

block operation completed

Description

This tracepoint indicates there is no further work to do on this block IO operation bio.

void trace_block_bio_backmerge(struct bio *bio)

merging block operation to the end of an existing operation

Parameters

struct bio *bio

new block operation to merge

Description

Merging block request bio to the end of an existing block request.

void trace_block_bio_frontmerge(struct bio *bio)

merging block operation to the beginning of an existing operation

Parameters

struct bio *bio

new block operation to merge

Description

Merging block IO operation bio to the beginning of an existing block request.

void trace_block_bio_queue(struct bio *bio)

putting new block IO operation in queue

Parameters

struct bio *bio

new block operation

Description

About to place the block IO operation bio into queue q.

void trace_block_getrq(struct bio *bio)

get a free request entry in queue for block IO operations

Parameters

struct bio *bio

pending block IO operation (can be NULL)

Description

A request struct has been allocated to handle the block IO operation bio.

void trace_blk_zone_append_update_request_bio(struct request *rq)

update bio sector after zone append

Parameters

struct request *rq

the completed request that sets the bio sector

Description

Update the bio’s bi_sector after a zone append command has been completed.

void trace_block_plug(struct request_queue *q)

keep operations requests in request queue

Parameters

struct request_queue *q

request queue to plug

Description

Plug the request queue q. Do not allow block operation requests to be sent to the device driver. Instead, accumulate requests in the queue to improve throughput performance of the block device.

void trace_block_unplug(struct request_queue *q, unsigned int depth, bool explicit)

release of operations requests in request queue

Parameters

struct request_queue *q

request queue to unplug

unsigned int depth

number of requests just added to the queue

bool explicit

whether this was an explicit unplug, or one from schedule()

Description

Unplug request queue q because the device driver is scheduled to work on elements in the request queue.

void trace_block_split(struct bio *bio, unsigned int new_sector)

split a single bio struct into two bio structs

Parameters

struct bio *bio

block operation being split

unsigned int new_sector

The starting sector for the new bio

Description

The bio request bio needs to be split into two bio requests. The newly created bio request starts at new_sector. This split may be required due to hardware limitations such as an operation crossing device boundaries in a RAID system.

void trace_block_bio_remap(struct bio *bio, dev_t dev, sector_t from)

map request for a logical device to the raw device

Parameters

struct bio *bio

revised operation

dev_t dev

original device for the operation

sector_t from

original sector for the operation

Description

An operation for a logical device has been mapped to the raw block device.

void trace_block_rq_remap(struct request *rq, dev_t dev, sector_t from)

map request for a block operation request

Parameters

struct request *rq

block IO operation request

dev_t dev

device for the operation

sector_t from

original sector for the operation

Description

The block operation request rq in q has been remapped. The block operation request rq holds the current information and from holds the original sector.

void trace_blkdev_zone_mgmt(struct bio *bio, sector_t nr_sectors)

Execute a zone management operation on a range of zones

Parameters

struct bio *bio

The block IO operation sent down to the device

sector_t nr_sectors

The number of sectors affected by this operation

Description

Execute a zone management operation on a specified range of zones. This range is encoded in nr_sectors, which has to be a multiple of the zone size.
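The multiple-of-zone-size constraint can be illustrated with a short helper. This is a sketch only: demo_zones_covered and its zone_sectors parameter are hypothetical, and the zone size is assumed to be supplied by the caller in sectors.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns how many zones a management operation covers, or -1 when
 * nr_sectors is not a whole multiple of the zone size.
 * zone_sectors is a hypothetical parameter supplied by the caller.
 */
static int64_t demo_zones_covered(uint64_t nr_sectors, uint64_t zone_sectors)
{
    assert(zone_sectors != 0);
    if (nr_sectors % zone_sectors != 0)
        return -1;
    return (int64_t)(nr_sectors / zone_sectors);
}
```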

Workqueue

void trace_workqueue_queue_work(int req_cpu, struct pool_workqueue *pwq, struct work_struct *work)

called when a work gets queued

Parameters

int req_cpu

the requested cpu

struct pool_workqueue *pwq

pointer to struct pool_workqueue

struct work_struct *work

pointer to struct work_struct

Description

This event occurs when a work is queued immediately or once a delayed work is actually queued on a workqueue (i.e. once the delay has been reached).

void trace_workqueue_activate_work(struct work_struct *work)

called when a work gets activated

Parameters

struct work_struct *work

pointer to struct work_struct

Description

This event occurs when a queued work is put on the active queue, which happens immediately after queueing unless the max_active limit is reached.

void trace_workqueue_execute_start(struct work_struct *work)

called immediately before the workqueue callback

Parameters

struct work_struct *work

pointer to struct work_struct

Description

Allows tracking of workqueue execution.

void trace_workqueue_execute_end(struct work_struct *work, work_func_t function)

called immediately after the workqueue callback

Parameters

struct work_struct *work

pointer to struct work_struct

work_func_t function

pointer to worker function

Description

Allows tracking of workqueue execution.