I love C. I could give you rational reasons for that. Like the fact that it works everywhere and how it goes fast. But mostly I love C because when I write C I feel an intimate connection with my computer. To me, C has soul. In fact all my personal project are written in C, like the graphics rendering engines that I'm currently writing. The problem is that C is dangerous and sharing new C projects to wider audiences is borderline malicious.
Today on hacker news a cool little web framework written in C was posted. As a security researcher I'm always trying to sharpen my skills so I thought I'd given the app a look. And yup sure enough, there's memory safety issues.
HttpParser parseRequest(char *request) {
HttpParser parser = {
.isValid = true,
.requestBuffer = strdup(request), // [0]
.requestLength = strlen(request),
.position = 0,
};
// ... irrelevant stuff
for (int i = 0; i < parser.request.headerCount; i++) {
if (strcasecmp(parser.request.headers[i].name, "Content-Length") == 0) {
parser.request.bodyLength = atoi(parser.request.headers[i].value); // [1]
break;
}
}
parser.request.body = malloc(parser.request.bodyLength + 1); // [2]
for (int i = parser.position; i < parser.position + parser.request.bodyLength; i++) {
parser.request.body[i - parser.position] = parser.requestBuffer[i]; // [3]
}
line [1] takes Content-Length off the http packet. This is a non validated value basically straight from the socket. line [2] allocates based on that size. Line [3] copies data into that buffer based on that size. But it's copying out of a buffer of any size. So passing a Content-Length
Larger than the request
sent in will start copying heap data into the parser.request.body
.
Another interesting choice in this project is to make lengths signed:
typedef struct {
//...
int headerCount;
int headerCapacity;
//...
int bodyLength;
} HttpRequest;
typedef struct {
// ...
int position;
//...
int requestLength;
} HttpParser;
What does it mean to have a negative headerCount
or negative of anything here? Maybe there is a valid meaning, but we need to ask ourselves that question. Going back to the original code sample. A malicious user can pass Content-Length
of 4294967295
. malloc(parser.request.bodyLength + 1)
which becomes malloc(0)
actually returns a valid pointer in glibc. Now you won't immediately buffer overflow here because i < parser.position + parser.request.bodyLength
where i is initialized to parser.position
and it's basically parser.position-1
so i
will always be greater. But these mangled request body and length values get routed to the webdevs App. You'd better hope they catch the issue.
I love C it's simple, elegant but also so dang annoying.
Comments to be found on this hackernews thread. Ping me there to talk security, C, or graphics and rendering haha :)