Last Updated: 2009-07-17 14:45:10 UTC
by Bojan Zdrnja (Version: 2)
Source code for a exploit of a Linux kernel vulnerability has been posted by Brad Spengler (Brad is the author of grsecurity). I have to tell you right now – this was one of the most fascinating bugs I've read about lately.
Why is it so fascinating? Because a source code audit of the vulnerable code would never find this vulnerability (well, actually, it is possible but I assure you that almost everyone would miss it). However, when you add some other variables into the game, the whole landscape changes.
While technical details about this are a bit complex, generally what's happening can be easily explained. The vulnerable code is located in the net/tun implementation. Basically, what happens here is that the developer initialized a variable (sk in the code snippet below) to a certain value that can be NULL. The developer correctly checked the value of this new variable couple of lines later and, if it is 0 (NULL), he just returns back an error. The code looks like this:
struct sock *sk = tun->sk; // initialize sk with tun->sk
return POLLERR; // if tun is NULL return error
This code looks perfectly ok, right? Well, it is, until the compiler takes this into its hands. While optimizing the code, the compiler will see that the variable has already been assigned and will actually remove the if block (the check if tun is NULL) completely from the resulting compiled code. In other words, the compiler will introduce the vulnerability to the binary code, which didn't exist in the source code. This will cause the kernel to try to read/write data from 0x00000000, which the attacker can map to userland – and this finally pwns the box. There are some other highly technical details here so you can check your favorite mailing list for details, or see a video with this exploit on YouTube at http://www.youtube.com/watch?v=UdkpJ13e6Z0. Brad was able to even bypass SELinux protections with this and LSM.
The fix for this is relatively easy, the check has to be done before assigning the value to the sk structure.
Fascinating research that again shows how security depends on every layer, and how even very expensive source code audit can result in missed vulnerabilities.
It looks like this generated a lot of interest, judging by the number of e-mails we received and posted comments.
Brad was nice to contact us as well and send some additional clarifications about things that are confusing people, I'm pasting this below:
The thing that is NULL is the tun ptr. Without any exploit (just calling the poll function on the device) this would cause a crash. As I noted in my timeline in the exploit (which also explains how the exploit works) the bug was fixed but no one understood the security implications of it (so without my exploit, or vendor-sec seeing my videos for a week prior to its release) none of this stuff would have been discovered.
Because tun is dereferenced (to use tun->sk) the compiler assumes that tun is non-null, so it removes the check for tun against NULL.
By mmaping at 0 (and first bypassing restrictions on doing so either by just having SELinux enabled -- their default policies allow anyone to mmap at 0, overriding any mmap_min_addr restriction provided by the kernel, due to the fact that both SELinux and mmap_min_addr both contend for the same LSM hook, but only one can be used (SELinux's in this case)) I can avoid the initial crash that would occur upon the initializing of sk from tun->sk, and since the tun NULL check does not exist, proceed with exploiting the kernel.