Movatterモバイル変換


[0]ホーム

URL:


man7.org > Linux >man-pages

Linux/UNIX system programming training


tracefs_sql(3) — Linux manual page

NAME |SYNOPSIS |DESCRIPTION |KEYWORDS AS EVENT FIELDS |HISTOGRAMS |RETURN VALUE |CREATE A TOOL |EXAMPLE |FILES |SEE ALSO |AUTHOR |REPORTING BUGS |LICENSE |RESOURCES |COPYING |NOTES |COLOPHON

LIBTRACEFS(3)               libtracefs ManualLIBTRACEFS(3)

NAME        top

       tracefs_sql - Create a synthetic event via an SQL statement

SYNOPSIS        top

#include <tracefs.h>       struct tracefs_synth *tracefs_sql(struct tep_handle *tep, const char *name,                                               const char *sql_buffer, char **err);

DESCRIPTION        top

       Synthetic events are dynamically created events that attach two       existing events together via one or more matching fields between       the two events. It can be used to find the latency between the       events, or to simply pass fields of the first event on to the       second event to display as one event.       The Linux kernel interface to create synthetic events is complex,       and there needs to be a better way to create synthetic events that       is easy and can be understood via existing technology.       If you think of each event as a table, where the fields are the       column of the table and each instance of the event as a row, you       can understand how SQL can be used to attach two events together       and form another event (table). Utilizing the SQLSELECT FROM JOINON[WHERE] syntax, a synthetic event can easily be created from       two different events.       For simple SQL queries to make a histogram instead of a synthetic       event, see HISTOGRAMS below.tracefs_sql() takes in atep handler (Seetep_local_events(3))       that is used to verify the events within thesql_buffer       expression. Thename is the name of the synthetic event to create.       Iferr points to an address of a string, it will be filled with a       detailed message on any type of parsing error, including fields       that do not belong to an event, or if the events or fields are not       properly compared.       The example program below is a fully functional parser where it       will create a synthetic event from a SQL syntax passed in via the       command line or a file.       The SQL format is as follows:SELECT<fields>FROM<start-event>JOIN<end-event>ON       <matching-fields>WHERE<filter>       Note, although the examples show the SQL commands in uppercase,       they are not required to be so. That is, you can use "SELECT" or       "select" or "sElEct".       For example:           SELECT syscalls.sys_enter_read.fd, syscalls.sys_exit_read.ret FROM syscalls.sys_enter_read              JOIN syscalls.sys_exit_read              ON syscalls.sys_enter_read.common_pid = syscalls.sys_exit_write.common_pid       Will create a synthetic event that with the fields:           u64 fd; s64 ret;       Because the function takes atep handle, and usually all event       names are unique, you can leave off the system (group) name of the       event, andtracefs_sql() will discover the system for you.       That is, the above statement would work with:           SELECT sys_enter_read.fd, sys_exit_read.ret FROM sys_enter_read JOIN sys_exit_read              ON sys_enter_read.common_pid = sys_exit_write.common_pid       TheASkeyword can be used to name the fields as well as to give       an alias to the events, such that the above can be simplified even       more as:           SELECT start.fd, end.ret FROM sys_enter_read AS start JOIN sys_exit_read AS end ON start.common_pid = end.common_pid       The above aliasessys_enter_read asstartandsys_exit_read asend       and uses those aliases to reference the event throughout the       statement.       Using theASkeyword in the selection portion of the SQL statement       will define what those fields will be called in the synthetic       event.           SELECT start.fd AS filed, end.ret AS return FROM sys_enter_read AS start JOIN sys_exit_read AS end              ON start.common_pid = end.common_pid       The above labels thefd ofstart asfiledand theret ofend asreturnwhere the synthetic event that is created will now have the       fields:           u64 filed; s64 return;       The fields can also be calculated with results passed to the       synthetic event:           select start.truesize, end.len, (start.truesize - end.len) as diff from napi_gro_receive_entry as start              JOIN netif_receive_skb as end ON start.skbaddr = end.skbaddr       Which would show thetruesizeof thenapi_gro_receive_entry event,       the actuallen of the content, shown by thenetif_receive_skb, and       the delta between the two and expressed by the fielddiff.       The code also supports recording the timestamps at either event,       and performing calculations on them. For wakeup latency, you have:           select start.pid, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as lat from sched_waking as start              JOIN sched_switch as end ON start.pid = end.next_pid       The above will create a synthetic event that records thepid of       the task being woken up, and the time difference between thesched_waking event and thesched_switch event. TheTIMESTAMP_USECS       will truncate the time down to microseconds as the timestamp       usually recorded in the tracing buffer has nanosecond resolution.       If you do not want that truncation, useTIMESTAMPinstead ofTIMESTAMP_USECS.       Because it is so common to have:              (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS)       The above can be represented withTIMESTAMP_DELTA_USECSor if       nanoseconds are OK, you can useTIMESTAMP_DELTA. That is, the       previous select can also be represented by:           select start.pid, TIMESTAMP_DELTA_USECS as lat from sched_waking as start JOIN sched_switch as end ON start.pid = end.next_pid       Finally, theWHEREclause can be added, that will let you add       filters on either or both events.           select start.pid, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as lat from sched_waking as start              JOIN sched_switch as end ON start.pid = end.next_pid              WHERE start.prio < 100 && (!(end.prev_pid < 1 || end.prev_prio > 100) || end.prev_pid == 0)NOTE       Although both events can be used together in theWHEREclause,       they must not be mixed outside the top most "&&" statements. You       can not OR (||) the events together, where a filter of one event       is OR’d to a filter of the other event. This does not make sense,       as the synthetic event requires both events to take place to be       recorded. If one is filtered out, then the synthetic event does       not execute.           select start.pid, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as lat from sched_waking as start              JOIN sched_switch as end ON start.pid = end.next_pid              WHERE start.prio < 100 && end.prev_prio < 100       The above is valid.       Where as the below is not.           select start.pid, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as lat from sched_waking as start              JOIN sched_switch as end ON start.pid = end.next_pid              WHERE start.prio < 100 || end.prev_prio < 100       If the kernel supports it, you can pass around a stacktrace       between events.           select start.prev_pid as pid, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as delta, start.STACKTRACE as stack              FROM sched_switch as start JOIN sched_switch as end ON start.prev_pid = end.next_pid              WHERE start.prev_state == 2       The above will record a stacktrace when a task is in the       UNINTERRUPTIBLE (blocked) state, and trigger the synthetic event       when it is scheduled back in, recording the time delta that it was       blocked for. It will record the stacktrace of where it was when it       scheduled out along with the delta.

KEYWORDS AS EVENT FIELDS        top

       In some cases, an event may have a keyword. For example,       regcache_drop_region has "from" as a field and the following will       not work             select from from regcache_drop_region       In such cases, add a backslash to the conflicting field, and this       will tell the parser that the "from" is a field and not a keyword:             select \from from regcache_drop_region

HISTOGRAMS        top

       Simple SQL statements without theJOIN ONmay also be used, which       will create a histogram instead. When doing this, the struct       tracefs_hist descriptor can be retrieved from the returned       synthetic event descriptor via thetracefs_synth_get_start_hist(3).       In order to utilize the histogram types (see xxx) the CAST command       of SQL can be used.       That is:             select CAST(common_pid AS comm), CAST(id AS syscall) FROM sys_enter       Which produces:            # echo 'hist:keys=common_pid.execname,id.syscall' > events/raw_syscalls/sys_enter/trigger            # cat events/raw_syscalls/sys_enter/hist           { common_pid: bash            [     18248], id: sys_setpgid                   [109] } hitcount:          1           { common_pid: sendmail        [      1812], id: sys_read                      [  0] } hitcount:          1           { common_pid: bash            [     18247], id: sys_getpid                    [ 39] } hitcount:          1           { common_pid: bash            [     18247], id: sys_dup2                      [ 33] } hitcount:          1           { common_pid: gmain           [     13684], id: sys_inotify_add_watch         [254] } hitcount:          1           { common_pid: cat             [     18247], id: sys_access                    [ 21] } hitcount:          1           { common_pid: bash            [     18248], id: sys_getpid                    [ 39] } hitcount:          1           { common_pid: cat             [     18247], id: sys_fadvise64                 [221] } hitcount:          1           { common_pid: sendmail        [      1812], id: sys_openat                    [257] } hitcount:          1           { common_pid: less            [     18248], id: sys_munmap                    [ 11] } hitcount:          1           { common_pid: sendmail        [      1812], id: sys_close                     [  3] } hitcount:          1           { common_pid: gmain           [      1534], id: sys_poll                      [  7] } hitcount:          1           { common_pid: bash            [     18247], id: sys_execve                    [ 59] } hitcount:          1       Note, string fields may not be cast.       The possible types to cast to are:HEX- convert the value to use hex and not decimalSYM- convert a pointer to symbolic (kallsyms values)SYM-OFFSET- convert a pointer to symbolic and include the offset.SYSCALL- convert the number to the mapped system call nameEXECNAMEorCOMM- can only be used with the common_pid field.       Will show the task name of the process.LOGorLOG2- bucket the key values in a log 2 values (1, 2, 3-4,       5-8, 9-16, 17-32, ...)       The above fields are not case sensitive, and "LOG2" works as good       as "log".       A special CAST toCOUNTER orCOUNTER will make the field a value       and not a key. For example:             SELECT common_pid, CAST(bytes_req AS _COUNTER_) FROM kmalloc       Which will create             echo 'hist:keys=common_pid:vals=bytes_req' > events/kmem/kmalloc/trigger             cat events/kmem/kmalloc/hist           { common_pid:       1812 } hitcount:          1  bytes_req:         32           { common_pid:       9111 } hitcount:          2  bytes_req:        272           { common_pid:       1768 } hitcount:          3  bytes_req:       1112           { common_pid:          0 } hitcount:          4  bytes_req:        512           { common_pid:      18297 } hitcount:         11  bytes_req:       2004

RETURN VALUE        top

       Returns 0 on success and -1 on failure. On failure, iferr is       defined, it will be allocated to hold a detailed description of       what went wrong if it the error was caused by a parsing error, or       that an event, field does not exist or is not compatible with what       it was combined with.

CREATE A TOOL        top

       The below example is a functional program that can be used to       parse SQL commands into synthetic events.              man tracefs_sql | sed -ne '/^EXAMPLE/,/FILES/ { /EXAMPLE/d ; /FILES/d ; p}' > sqlhist.c              gcc -o sqlhist sqlhist.c `pkg-config --cflags --libs libtracefs`       Then you can run the above examples:             sudo ./sqlhist 'select start.pid, (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as lat from sched_waking as start                             JOIN sched_switch as end ON start.pid = end.next_pid                             WHERE start.prio < 100 || end.prev_prio < 100'

EXAMPLE        top

           #include <stdio.h>           #include <stdlib.h>           #include <stdarg.h>           #include <string.h>           #include <errno.h>           #include <unistd.h>           #include <tracefs.h>           static void usage(char **argv)           {                   fprintf(stderr, "usage: %s [-ed][-n name][-s][-S fields][-m var][-c var][-T][-t dir][-f file | sql-command-line]\n"                           "  -n name - name of synthetic event 'Anonymous' if left off\n"                           "  -t dir - use dir instead of /sys/kernel/tracing\n"                           "  -e - execute the commands to create the synthetic event\n"                           "  -m - trigger the action when var is a new max.\n"                           "  -c - trigger the action when var changes.\n"                           "  -s - used with -m or -c to do a snapshot of the tracing buffer\n"                           "  -S - used with -m or -c to save fields of the end event (comma deliminated)\n"                           "  -T - used with -m or -c to do both a snapshot and a trace\n"                           "  -f file - read sql lines from file otherwise from the command line\n"                           "            if file is '-' then read from standard input.\n",                           argv[0]);                   exit(-1);           }           enum action {                   ACTION_DEFAULT          = 0,                   ACTION_SNAPSHOT         = (1 << 0),                   ACTION_TRACE            = (1 << 1),                   ACTION_SAVE             = (1 << 2),                   ACTION_MAX              = (1 << 3),                   ACTION_CHANGE           = (1 << 4),           };           #define ACTIONS ((ACTION_MAX - 1))           static int do_sql(const char *instance_name,                             const char *buffer, const char *name, const char *var,                             const char *trace_dir, bool execute, int action,                             char **save_fields)           {                   struct tracefs_synth *synth;                   struct tep_handle *tep;                   struct trace_seq seq;                   enum tracefs_synth_handler handler;                   char *err;                   int ret;                   if ((action & ACTIONS) && !var) {                           fprintf(stderr, "Error: -s, -S and -T not supported without -m or -c");                           exit(-1);                   }                   if (!name)                           name = "Anonymous";                   trace_seq_init(&seq);                   tep = tracefs_local_events(trace_dir);                   if (!tep) {                           if (!trace_dir)                                   trace_dir = "tracefs directory";                           perror(trace_dir);                           exit(-1);                   }                   synth = tracefs_sql(tep, name, buffer, &err);                   if (!synth) {                           perror("Failed creating synthetic event!");                           if (err)                                   fprintf(stderr, "%s", err);                           free(err);                           exit(-1);                   }                   if (tracefs_synth_complete(synth)) {                           if (var) {                                   if (action & ACTION_MAX)                                           handler = TRACEFS_SYNTH_HANDLE_MAX;                                   else                                           handler = TRACEFS_SYNTH_HANDLE_CHANGE;                                   if (action & ACTION_SAVE) {                                           ret = tracefs_synth_save(synth, handler, var, save_fields);                                           if (ret < 0) {                                                   err = "adding save";                                                   goto failed_action;                                           }                                   }                                   if (action & ACTION_TRACE) {                                           /*                                            * By doing the trace before snapshot, it will be included                                            * in the snapshot.                                            */                                           ret = tracefs_synth_trace(synth, handler, var);                                           if (ret < 0) {                                                   err = "adding trace";                                                   goto failed_action;                                           }                                   }                                   if (action & ACTION_SNAPSHOT) {                                           ret = tracefs_synth_snapshot(synth, handler, var);                                           if (ret < 0) {                                                   err = "adding snapshot";            failed_action:                                                   perror(err);                                                   if (errno == ENODEV)                                                           fprintf(stderr, "ERROR: '%s' is not a variable\n",                                                                   var);                                                   exit(-1);                                           }                                   }                           }                           tracefs_synth_echo_cmd(&seq, synth);                           if (execute) {                                   ret = tracefs_synth_create(synth);                                   if (ret < 0) {                                           fprintf(stderr, "%s\n", tracefs_error_last(NULL));                                           exit(-1);                                   }                           }                   } else {                           struct tracefs_instance *instance = NULL;                           struct tracefs_hist *hist;                           hist = tracefs_synth_get_start_hist(synth);                           if (!hist) {                                   perror("get_start_hist");                                   exit(-1);                           }                           if (instance_name) {                                   if (execute)                                           instance = tracefs_instance_create(instance_name);                                   else                                           instance = tracefs_instance_alloc(trace_dir,                                                                             instance_name);                                   if (!instance) {                                           perror("Failed to create instance");                                           exit(-1);                                   }                           }                           tracefs_hist_echo_cmd(&seq, instance, hist, 0);                           if (execute) {                                   ret = tracefs_hist_start(instance, hist);                                   if (ret < 0) {                                           fprintf(stderr, "%s\n", tracefs_error_last(instance));                                           exit(-1);                                   }                           }                   }                   tracefs_synth_free(synth);                   trace_seq_do_printf(&seq);                   trace_seq_destroy(&seq);                   return 0;           }           int main (int argc, char **argv)           {                   char *trace_dir = NULL;                   char *buffer = NULL;                   char buf[BUFSIZ];                   int buffer_size = 0;                   const char *file = NULL;                   const char *instance = NULL;                   bool execute = false;                   char **save_fields = NULL;                   const char *name;                   const char *var;                   int action = 0;                   char *tok;                   FILE *fp;                   size_t r;                   int c;                   int i;                   for (;;) {                           c = getopt(argc, argv, "ht:f:en:m:c:sS:TB:");                           if (c == -1)                                   break;                           switch(c) {                           case 'h':                                   usage(argv);                           case 't':                                   trace_dir = optarg;                                   break;                           case 'f':                                   file = optarg;                                   break;                           case 'e':                                   execute = true;                                   break;                           case 'm':                                   action |= ACTION_MAX;                                   var = optarg;                                   break;                           case 'c':                                   action |= ACTION_CHANGE;                                   var = optarg;                                   break;                           case 's':                                   action |= ACTION_SNAPSHOT;                                   break;                           case 'S':                                   action |= ACTION_SAVE;                                   tok = strtok(optarg, ",");                                   while (tok) {                                           save_fields = tracefs_list_add(save_fields, tok);                                           tok = strtok(NULL, ",");                                   }                                   if (!save_fields) {                                           perror(optarg);                                           exit(-1);                                   }                                   break;                           case 'T':                                   action |= ACTION_TRACE | ACTION_SNAPSHOT;                                   break;                           case 'B':                                   instance = optarg;                                   break;                           case 'n':                                   name = optarg;                                   break;                           }                   }                   if ((action & (ACTION_MAX|ACTION_CHANGE)) == (ACTION_MAX|ACTION_CHANGE)) {                           fprintf(stderr, "Can not use both -m and -c together\n");                           exit(-1);                   }                   if (file) {                           if (!strcmp(file, "-"))                                   fp = stdin;                           else                                   fp = fopen(file, "r");                           if (!fp) {                                   perror(file);                                   exit(-1);                           }                           while ((r = fread(buf, 1, BUFSIZ, fp)) > 0) {                                   buffer = realloc(buffer, buffer_size + r + 1);                                   strncpy(buffer + buffer_size, buf, r);                                   buffer_size += r;                           }                           fclose(fp);                           if (buffer_size)                                   buffer[buffer_size] = '\0';                   } else if (argc == optind) {                           usage(argv);                   } else {                           for (i = optind; i < argc; i++) {                                   r = strlen(argv[i]);                                   buffer = realloc(buffer, buffer_size + r + 2);                                   if (i != optind)                                           buffer[buffer_size++] = ' ';                                   strcpy(buffer + buffer_size, argv[i]);                                   buffer_size += r;                           }                   }                   do_sql(instance, buffer, name, var, trace_dir, execute, action, save_fields);                   free(buffer);                   return 0;           }

FILES        top

tracefs.h                   Header file to include in order to have access to the library APIs.-ltracefs                   Linker switch to add when building a program that uses the library.

SEE ALSO        top

sqlhist(1),libtracefs(3),libtraceevent(3),trace-cmd(1),tracefs_synth_init(3),tracefs_synth_add_match_field(3),tracefs_synth_add_compare_field(3),tracefs_synth_add_start_field(3),tracefs_synth_add_end_field(3),tracefs_synth_append_start_filter(3),tracefs_synth_append_end_filter(3),tracefs_synth_create(3),tracefs_synth_destroy(3),tracefs_synth_free(3),tracefs_synth_echo_cmd(3),tracefs_hist_alloc(3),tracefs_hist_alloc_2d(3),tracefs_hist_alloc_nd(3),tracefs_hist_free(3),tracefs_hist_add_key(3),tracefs_hist_add_value(3),tracefs_hist_add_name(3),tracefs_hist_start(3),tracefs_hist_destory(3),tracefs_hist_add_sort_key(3),tracefs_hist_sort_key_direction(3)

AUTHOR        top

Steven Rostedt<rostedt@goodmis.org[1]>Tzvetomir Stoyanov<tz.stoyanov@gmail.com[2]>sameeruddin shaik<sameeruddin.shaik8@gmail.com[3]>

REPORTING BUGS        top

       Report bugs to <linux-trace-devel@vger.kernel.org[4]>

LICENSE        top

       libtracefs is Free Software licensed under the GNU LGPL 2.1

RESOURCES        top

https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/

COPYING        top

       Copyright (C) 2020 VMware, Inc. Free use of this software is       granted under the terms of the GNU Public License (GPL).

NOTES        top

        1. rostedt@goodmis.org           mailto:rostedt@goodmis.org        2. tz.stoyanov@gmail.com           mailto:tz.stoyanov@gmail.com        3. sameeruddin.shaik8@gmail.com           mailto:sameeruddin.shaik8@gmail.com        4. linux-trace-devel@vger.kernel.org           mailto:linux-trace-devel@vger.kernel.org

COLOPHON        top

       This page is part of thelibtracefs (Linux kernel trace file       system library) project.  Information about the project can be       found at ⟨https://www.trace-cmd.org/⟩.  If you have a bug report       for this manual page, see ⟨https://www.trace-cmd.org/⟩.  This page       was obtained from the project's upstream Git repository       ⟨https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git⟩ on       2025-08-11.  (At that time, the date of the most recent commit       that was found in the repository was 2025-06-02.)  If you discover       any rendering problems in this HTML version of the page, or you       believe there is a better or more up-to-date source for the page,       or you have corrections or improvements to the information in this       COLOPHON (which isnot part of the original manual page), send a       mail to man-pages@man7.orglibtracefs 1.8.1                01/02/2025LIBTRACEFS(3)

Pages that refer to this page:sqlhist(1)tracefs_synth_create(3)tracefs_synth_echo_cmd(3)



HTML rendering created 2025-09-06 byMichael Kerrisk, author ofThe Linux Programming Interface.

For details of in-depthLinux/UNIX system programming training courses that I teach, lookhere.

Hosting byjambit GmbH.

Cover of TLPI


[8]ページ先頭

©2009-2025 Movatter.jp