Stream Parser (strparser)

Introduction

The stream parser (strparser) is a utility that parses messages of an application layer protocol running over a data stream. The stream parser works in conjunction with an upper layer in the kernel to provide kernel support for application layer messages. For instance, Kernel Connection Multiplexor (KCM) uses the stream parser to parse messages using a BPF program.

The strparser works in one of two modes: receive callback or general mode.

In receive callback mode, the strparser is called from the data_ready callback of a TCP socket. Messages are parsed and delivered as they are received on the socket.

In general mode, a sequence of skbs is fed to strparser from an outside source. Messages are parsed and delivered as the sequence is processed. This mode allows strparser to be applied to arbitrary streams of data.
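The idea of feeding a sequence of chunks and delivering messages as they complete can be sketched in userspace. This is only a conceptual model, not kernel code: struct model_parser, model_feed, and the one-byte length framing are all hypothetical, and the sketch copies bytes into a flat buffer, which the real strparser does not necessarily do with skbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BUF_MAX 256

/* Hypothetical stand-in for a parser context (not struct strparser). */
struct model_parser {
    uint8_t buf[MODEL_BUF_MAX]; /* bytes accumulated so far */
    size_t used;                /* number of valid bytes in buf */
    int delivered;              /* count of complete messages seen */
};

/*
 * Feed one chunk of the stream. Assumed framing: one length byte
 * followed by that many body bytes.
 * Returns the number of bytes accepted, or -1 if the chunk does not fit.
 */
int model_feed(struct model_parser *p, const uint8_t *chunk, size_t len)
{
    assert(p != NULL && chunk != NULL);

    if (len > MODEL_BUF_MAX - p->used)
        return -1;
    memcpy(p->buf + p->used, chunk, len);
    p->used += len;

    /* Deliver every complete message now held in the buffer. */
    while (p->used >= 1 && p->used >= 1 + (size_t)p->buf[0]) {
        size_t msg_len = 1 + (size_t)p->buf[0];

        p->delivered++;
        memmove(p->buf, p->buf + msg_len, p->used - msg_len);
        p->used -= msg_len;
    }
    return (int)len;
}
```

A message split across two chunks is delivered only once the second chunk arrives, and bytes belonging to a following, still incomplete message stay buffered.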

Interface

The API includes a context structure, a set of callbacks, utility functions, and a data_ready function for receive callback mode. The callbacks include a parse_msg function that is called to perform parsing (e.g. BPF parsing in case of KCM), and a rcv_msg function that is called when a full message has been completed.

Functions

int strp_init(struct strparser *strp, struct sock *sk, const struct strp_callbacks *cb)

Called to initialize a stream parser. strp is a struct of type strparser that is allocated by the upper layer. sk is the TCP socket associated with the stream parser for use with receive callback mode; in general mode this is set to NULL. cb points to the callbacks that the stream parser calls (the callbacks are listed below).

void strp_pause(struct strparser *strp)

Temporarily pause a stream parser. Message parsing is suspended and no new messages are delivered to the upper layer.

void strp_unpause(struct strparser *strp)

Unpause a paused stream parser.

void strp_stop(struct strparser *strp);

strp_stop is called to completely stop stream parser operations. This is called internally when the stream parser encounters an error, and it is called from the upper layer to stop parsing operations.

void strp_done(struct strparser *strp);

strp_done is called to release any resources held by the stream parser instance. This must be called after the stream parser has been stopped.

int strp_process(struct strparser *strp, struct sk_buff *orig_skb, unsigned int orig_offset, size_t orig_len, size_t max_msg_size, long timeo)

strp_process is called in general mode for a stream parser to parse an sk_buff. The number of bytes processed or a negative error number is returned. Note that strp_process does not consume the sk_buff. max_msg_size is the maximum message size the stream parser will parse. timeo is the timeout for completing a message.

void strp_data_ready(struct strparser *strp);

The upper layer calls strp_data_ready when data is ready on the lower socket for strparser to process. This should be called from a data_ready callback that is set on the socket. Note that the maximum message size is the limit of the receive socket buffer and the message timeout is the receive timeout for the socket.

void strp_check_rcv(struct strparser *strp);

strp_check_rcv is called to check for new messages on the socket. This is normally called at initialization of a stream parser instance or after strp_unpause.

Callbacks

There are six callbacks:

int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);

parse_msg is called to determine the length of the next message in the stream. The upper layer must implement this function. It should parse the sk_buff as containing the headers for the next application layer message in the stream.

The skb->cb in the input skb is a struct strp_msg. Only the offset field is relevant in parse_msg and gives the offset where the message starts in the skb.

The return values of this function are:

>0          Length of the successfully parsed message.
0           More data must be received to parse the message.
-ESTRPIPE   The current message should not be processed by the kernel; control of the socket returns to userspace, which can proceed to read the messages itself.
other < 0   Error in parsing; control returns to userspace on the assumption that synchronization is lost and the stream is unrecoverable (the application is expected to close the TCP socket).

In the case that an error is returned (return value is less than zero) and the parser is in receive callback mode, then it will set the error on the TCP socket and wake it up. If parse_msg returned -ESTRPIPE and the stream parser had previously read some bytes for the current message, then the error set on the attached socket is ENODATA since the stream is unrecoverable in that case.
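The return convention above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: model_parse_msg, the two-byte big-endian length header, and the rule that empty bodies are invalid are all hypothetical, and a plain byte buffer stands in for the skb. The -ESTRPIPE case is not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed framing: 2-byte big-endian body length, then the body. */
#define MODEL_HDR_LEN 2

/*
 * buf/len stand in for the skb data; offset plays the role of the
 * offset field of struct strp_msg.
 * Returns >0 (length of the whole message), 0 (more data needed to
 * read the header), or a negative errno value on a parse error.
 */
int model_parse_msg(const uint8_t *buf, size_t len, size_t offset)
{
    assert(buf != NULL);

    if (offset > len)
        return -EINVAL;
    if (len - offset < MODEL_HDR_LEN)
        return 0; /* header incomplete: ask for more data */

    size_t body = ((size_t)buf[offset] << 8) | buf[offset + 1];

    if (body == 0)
        return -EPROTO; /* hypothetical rule: empty bodies are invalid */

    return (int)(MODEL_HDR_LEN + body);
}
```

Note that the model returns the full message length as soon as the header is readable, even if the body has not arrived yet; collecting the remaining bytes is left to the caller.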

void (*lock)(struct strparser *strp)

The lock callback is called to lock the strp structure when the strparser is performing an asynchronous operation (such as processing a timeout). In receive callback mode the default function is to lock_sock for the associated socket. In general mode the callback must be set appropriately.

void (*unlock)(struct strparser *strp)

The unlock callback is called to release the lock obtained by the lock callback. In receive callback mode the default function is release_sock for the associated socket. In general mode the callback must be set appropriately.

void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);

rcv_msg is called when a full message has been received and is queued. The callee must consume the sk_buff; it can call strp_pause to prevent any further messages from being received in rcv_msg (see strp_pause above). This callback must be set.

The skb->cb in the input skb is a struct strp_msg. This struct contains two fields: offset and full_len. Offset is where the message starts in the skb, and full_len is the length of the message. skb->len - offset may be greater than full_len since strparser does not trim the skb.
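Because the buffer is not trimmed, a receiver has to use both fields to pick out exactly one message. The following userspace sketch models that; struct model_strp_msg and model_copy_msg are hypothetical names, and a flat byte array stands in for the skb.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the two fields described for struct strp_msg. */
struct model_strp_msg {
    int full_len; /* length of the message */
    int offset;   /* where the message starts in the buffer */
};

/*
 * Copy exactly one message out of a buffer that may hold extra bytes
 * before and after it.
 * Returns the number of bytes copied, or -1 if the descriptor does not
 * fit inside the data or the output buffer is too small.
 */
int model_copy_msg(const unsigned char *data, size_t data_len,
                   const struct model_strp_msg *m,
                   unsigned char *out, size_t out_len)
{
    assert(data != NULL && m != NULL && out != NULL);

    if (m->offset < 0 || m->full_len < 0)
        return -1;
    if ((size_t)m->offset > data_len ||
        (size_t)m->full_len > data_len - (size_t)m->offset)
        return -1;
    if ((size_t)m->full_len > out_len)
        return -1;

    memcpy(out, data + m->offset, (size_t)m->full_len);
    return m->full_len;
}
```

With data "xxHELLOtrailing", offset 2, and full_len 5, only "HELLO" is copied; the trailing bytes are ignored, mirroring the case where skb->len - offset exceeds full_len.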

int (*read_sock_done)(struct strparser *strp, int err);

read_sock_done is called when the stream parser is done reading the TCP socket in receive callback mode. The stream parser may read multiple messages in a loop and this function allows cleanup to occur when exiting the loop. If the callback is not set (NULL in strp_init) a default function is used.

void (*abort_parser)(struct strparser *strp, int err);

This function is called when the stream parser encounters an error in parsing. The default function stops the stream parser and sets the error in the socket if the parser is in receive callback mode. The default function can be changed by setting the callback to non-NULL in strp_init.

Statistics

Various counters are kept for each stream parser instance. These are in the strp_stats structure. strp_aggr_stats is a convenience structure for accumulating statistics for multiple stream parser instances. save_strp_stats and aggregate_strp_stats are helper functions to save and aggregate statistics.

Message assembly limits

The stream parser provides mechanisms to limit the resources consumed by message assembly.

A timer is set when assembly starts for a new message. In receive callback mode the message timeout is taken from the receive timeout (sk_rcvtimeo) of the associated TCP socket. In general mode, the timeout is passed as an argument in strp_process. If the timer fires before assembly completes, the stream parser is aborted and the ETIMEDOUT error is set on the TCP socket if in receive callback mode.

In receive callback mode, message length is limited to the receive buffer size of the associated TCP socket. If the length returned by parse_msg is greater than the socket buffer size then the stream parser is aborted with the EMSGSIZE error set on the TCP socket. Note that this makes the maximum size of receive skbuffs for a socket with a stream parser 2*sk_rcvbuf of the TCP socket.

In general mode the message length limit is passed in as an argument to strp_process.
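The two limits described in this section can be modeled as a single userspace check. This is a sketch, not the kernel's logic: model_check_limits is a hypothetical name, elapsed and timeo are in arbitrary but matching time units, and treating a timeout of 0 as "no timer" is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * parsed_len:   length reported by a parse_msg-style function
 * max_msg_size: length limit (receive buffer size in receive callback
 *               mode, caller-supplied in general mode)
 * elapsed:      time since assembly of the current message started
 * timeo:        assembly timeout; 0 is assumed to mean "no timer"
 *
 * Returns 0 if assembly may continue, -EMSGSIZE if the message is too
 * long, or -ETIMEDOUT if the timer would have fired.
 */
int model_check_limits(size_t parsed_len, size_t max_msg_size,
                       long elapsed, long timeo)
{
    assert(elapsed >= 0);

    if (parsed_len > max_msg_size)
        return -EMSGSIZE;
    if (timeo > 0 && elapsed >= timeo)
        return -ETIMEDOUT;
    return 0;
}
```

In this model a message whose length equals the limit is still accepted; only a strictly greater length is rejected, matching the "greater than" wording above.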

Author

Tom Herbert (tom@quantonium.net)