Note: This is the first time I've written anything whatsoever in straight C. In other words:I have no idea what I'm doing.
I recently had a task that involved temporarily relaying responses from one server, through a web-facing server, and then on to the client. Along the way, the relay server had to dynamically add an HTTP header of its own to the relayed response. Came up with a quick and dirty scripted solution, but it worked for the duration it had to.
Still, I thought it might be a good exercise for getting to know a little C, just for fun, since something like this would benefit from being written closer to the metal.
So below is a program (I call itheadshape for lack of a better name) where you pipe in (say, from a curl command) an HTTP response including headers, and the program will add/remove/modify one of those headers, and pass on the rest on as-is. Given no arguments, stdin will just pass to stdout.
E.g.
$ curl -i example.com | headshape # pass through$ curl -i example.com | headshape Server # remove the Server header$ curl -i example.com | headshape Via 'foobar' # add/modify the Via headerSo it does a bit of HTTP parsing to find the right header - or add it if it's not there - and to figure out what the response code is. Only responses with a 2xx status will be altered, 1xx responses will be ignored, and 3xx and above will cause it to just forward all remaining data unaltered (this partly an arbitrary choice; it's just a code exercise after all). It also defaults to forwarding everything as-is if it sees something it can't make sense of.
Below is the code. Much to my surprise it seems to work exactly as intended but, again,I've never written C code, so it's no doubt horrible. Any and all tips are welcome.
#include <stdio.h>#include <stdlib.h>#include <string.h>#include <unistd.h>#define CRLF "\r\n" // HTTP line ending#define BLOCK_SIZE 1024 // Any reason to make this larger/smaller?#define HEADER_DELIM ": " // Used to spot header lines#define MAX_HEADER_LENGTH 100 // max length of a header *name*; not the whole header linestatic int parseStatus(char* line);static char* getHeaderName(char* line);static int isDelimited(char* line, int continuation);static int isBoundary(char* line);void writeLine(char* line);void passthru();int main(int argc, char const *argv[]) { // states int should_parse = 0; int effective_status = 0; // options/ char* target_header = NULL; char* payload_line = NULL; // loop variables int line_continued = 0; int current_status = 0; char* current_header = NULL; char* line_out = NULL; char line[BLOCK_SIZE]; // exit with >0 if nothing's being piped if(isatty(fileno(stdin))) { return 1; // TODO: is there a better code for this? } // Get the target header argument, if any if(argc > 1) { should_parse = 1; // could just check target_header, but this seems more descriptive target_header = (char*) argv[1]; } // Get the target value argument, if any if(argc > 2) { // build the payload header line (i.e. "Header: value") size_t payload_length = 1; payload_length += strlen(target_header); payload_length += strlen(HEADER_DELIM); payload_length += strlen(argv[2]); payload_length += strlen(CRLF); payload_line = (char *) malloc(payload_length); strcpy(payload_line, target_header); strcat(payload_line, HEADER_DELIM); strcat(payload_line, argv[2]); strcat(payload_line, CRLF); } // heading parsing loop while(should_parse && fgets(line, BLOCK_SIZE, stdin)) { // if previous fgets didn't get a complete, CRLF-delimited line // (i.e. a line exceeded BLOCK_SIZE), assume the current chunk // is the continuation, and output it without parsing if(line_continued) { writeLine(line_out); line_continued = !isDelimited(line, line_continued); continue; } // check if this is a "complete" CRLF-delimited line line_continued = !isDelimited(line, line_continued); // reference our input for later line_out = line; if((current_status = parseStatus(line))) { // set the status code effective_status = current_status; // we're not interested in rewriting 3xx and higher responses if(effective_status >= 300) { // pass the remaining data directly to stdout should_parse = 0; } } else if((current_header = getHeaderName(line))) { // check if this is the header we're looking for if(strcmp(current_header, target_header) == 0) { // replace or remove the header line_out = payload_line; // pass the remaining data directly to stdout should_parse = 0; } } else if(isBoundary(line)) { // We've reached the end of the HTTP header section. // If effective_status is 1xx, parsing continues but // 2xx will stop the parsing and insert the header // if need be if(effective_status >= 200) { // add the header, if this is a 2xx response if(payload_line && effective_status < 300) { writeLine(payload_line); } // pass the remaining data directly to stdout should_parse = 0; } } else { // stop parsing if the line isn't recognized and // pass the remaining data directly to stdout should_parse = 0; } // output the line (unless it's NULL) if(line_out) writeLine(line_out); } // pass the remaining data directly to stdout passthru(); // free the payload line, if necessary if(payload_line) free(payload_line); return 0;}// Gets the header name in the line, if any.static char* getHeaderName(char* line) { static char header[MAX_HEADER_LENGTH]; char* delim = strstr(line, HEADER_DELIM); if(delim) { size_t count = delim - line; count = count >= MAX_HEADER_LENGTH ? MAX_HEADER_LENGTH - 1 : count; strncpy(header, line, count); header[count] = '\0'; return header; } return NULL;}// Match a line like "HTTP/1.x xxx..." and extract the status code// Not exactly regex-like precision here, though. Maybe use strtok()?static int parseStatus(char* line) { size_t code_length = 4; char* http = strstr(line, "HTTP/1."); char* delim = strchr(line, ' '); size_t remaining = strlen(line) - (delim - line) + 1; if(http && delim && remaining > code_length) { char code[remaining]; strncpy(code, delim, remaining); return atoi(code); } return 0;}// Check whether a string ends with CRLF// Note: This function keeps the last char of the line from its previous// invocation in memory in order to spot the 2-character CRLF even if it's// been split into separate fgets lines.static int isDelimited(char* line, int continuation) { static char prev = 0; char tail[3] = {'\0', '\0', '\0'}; int offset = strlen(line) - 2; if(offset >= 0) { // grab last 2 chars tail[0] = line[offset]; tail[1] = line[offset + 1]; } else if(continuation && offset == -1) { // grab char carried over from earlier, and the line's single char tail[0] = prev; tail[1] = line[0]; } prev = tail[1]; return !strcmp(tail, CRLF);}// is this line a "blank" CRLF line?static int isBoundary(char* line) { return strlen(line) == strlen(CRLF) && strcmp(line, CRLF) == 0;}// write to stdout & flushvoid writeLine(char* line) { fwrite(line, sizeof(char), strlen(line), stdout); fflush(stdout);}// pass stdin through to stdoutvoid passthru() { char buffer[BLOCK_SIZE]; while(1) { size_t bytes = fread(buffer, sizeof(char), BLOCK_SIZE, stdin); fwrite(buffer, sizeof(char), bytes, stdout); fflush(stdout); if(bytes < BLOCK_SIZE && feof(stdin)) break; }}1 Answer1
A few items:
UseSTDIN_FILENO
Instead offileno(stdin) you can use the preprocessor symbolSTDIN_FILENO, defined in<unistd.h> which you have already included.
should_parse should be declaredbool
Assuming you're using a compiler that isn't decades old (bool was added in the c99 standard), you can include<stdbool.h> and usebool for the type ofshould_parse. It also allows you to use the values oftrue andfalse within the code to make the intention of that flag a little more explicit.
The same is true forline_continued, the return value ofisBoundary, andisDelimited and the second argument ofisDelimited.
target_header should beconst andargv should not
Yourtarget_header variable should be declaredconst because its contents are not altered by the program. If you make that change, you can also remove the cast from the line
target_header = (const char *)argv[1];However, you wouldn't actually need that cast anyway if you had declaredmain as:
int main(int argc, char *argv[])It doesseem like that ought to beconst *argv but that's counter to what the C standard says. In section 5.1.2.2.1, it says that:
the strings pointed to by the
argvarray shall be modifiable by the program
which implies that they're notconst.
Don't abusefflush
You probably don't need to callfflush after everyfwrite. The stream will automatically flush when the file is closed, which also happens automatically whenmain ends.
Make loop exits explicit
Inpassthru(), instead of usingwhile(1) and then using abreak it would be more clear to write it like this:
size_t bytes;do { bytes = fread(buffer, sizeof(char), BLOCK_SIZE, stdin); fwrite(buffer, sizeof(char), bytes, stdout);} while (bytes == BLOCK_SIZE && !feof(stdin));SimplifyisBoundary
TheisBoundary routine can be simplified as this:
// is this line a "blank" CRLF line?static bool isBoundary(char* line) { return strcmp(line, CRLF) == 0;}There is no need to also compare their lengths, since that's already implicit instrcmp.
Calculation ofpayload_line is complex
Thepayload_line variable size is calculated, and thenmalloc'd and then copied, but it's only used a few other places. What you've got isn't wrong, but it might be worthwhile instead allocating a fixed size and then usingsnprintf to populate the string. That would collapse the dozen lines used for that calculation to the much simpler single line:
snprintf(payload_line, BLOCK_SIZE, "%s" HEADER_DELIM "%s" CRLF, target_header, argv[2]);I'm sure there's more but I lack the time at the moment.
Update
I found some time and a few other things in the code.
RefactorparseStatus
The currentparseStatus routine does more work than needed to extract the status code. You don't need to copy the string to pass it toatoi so it can be greatly simplified like so:
static int parseStatus(char* line) { char* http = strstr(line, "HTTP/1."); if (http) { http = strchr(http, ' '); if (http) { return atoi(http); } } return 0;}RefactorgetHeaderName
Right now, thegetHeaderName function is outlined something like this:
static char* getHeaderName(char* line) { static char header[MAX_HEADER_LENGTH]; /* ... */ return header;}This works (and even some ancient library functions were done this way) but it's not thread-safe and the design can easily be improved. Instead, it's better to have the calling function allocate a buffer and pass it in, along with the length rather than pointing to an internal static buffer.
RefactorisDelimited
In a similar vein, yourisDelimited function is really only checking for two particular bytes, so it would be more clear and slightly faster to check for them directly instead of constructing a copy in memory and then usingstrcmp.
- \$\begingroup\$Great tips, thanks! Re
bool: was trying some things out using ints, and it stuck; will definitely fix that.argv: I used my editor's C template and didn't ask questions. Good to know theconstisn't required.fflush: I smelled that myself, but I also figured I'd have many Mbits/s streaming through, so maybe better to get stuff flushed asap? Loop exit: Guilty of googling and copy/pasting without thinking!isBoundary: Ah, of course! Should've seen that myself.payload_line: I could smell that code from a mile away, but I didn't know a better way to do it. Now I do - thanks!\$\endgroup\$Flambino– Flambino2014-05-04 23:55:48 +00:00CommentedMay 4, 2014 at 23:55 - 1\$\begingroup\$It's a pretty good attempt for a first foray into C. You did well, despite all the nit picking!\$\endgroup\$Edward– Edward2014-05-04 23:58:06 +00:00CommentedMay 4, 2014 at 23:58
- 1\$\begingroup\$Oh, and in my defense, using
mallocandfreemade me feel like I wasreally writing C code after all these years of languages with GC ;)\$\endgroup\$Flambino– Flambino2014-05-04 23:58:39 +00:00CommentedMay 4, 2014 at 23:58 - \$\begingroup\$Thanks! And hey, nit-picking is what I want in a review! I mean, it'd be pretty cool if my very first stab at C was a gleaming example of perfection, but let's just say I didn't expect that to be the case :)\$\endgroup\$Flambino– Flambino2014-05-05 00:00:37 +00:00CommentedMay 5, 2014 at 0:00
- \$\begingroup\$Just noticed your update - thanks again! Good point about the construction/signature of
getHeaderNamein particular. I definitely see your point (and I recognize the pass-in-a-buffer approach, now that you mention it). Great stuff.\$\endgroup\$Flambino– Flambino2014-05-05 15:01:21 +00:00CommentedMay 5, 2014 at 15:01
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.
