Analysing Linux Kernel Commits

It's been a while, hasn't it? This post is going to be a bit of a change of pace from usual, as it's actually covering some research from last year that I ended up dropping.

The plan was to do some analysis of Linux kernel commits, to determine the feasibility of automating the process of finding interesting and potentially exploitable vulnerabilities, hopefully putting a novel poc or two together.

However, between IRL circumstances and simply underestimating the time involved, this has dragged on longer than I'd like for a blog post and I'm eager to move on to new things. But instead of putting it on the back burner, AKA never to see the light of day again, I thought I'd share the tool I ended up writing and discuss some of the background behind it, as well as my own takeaways from my time working on this stuff.

So in this post I'll talk a little about the motivations for looking into this and why kernel security fixes are an interesting topic. Then I'll do a quick tl;dr on the tool I wrote, Lica (Linux Commit Analyser), and share some takeaways.

Disclaimer

Before we dive into things, some of the topics and issues I cover in this post are both complex and contentious. I want to highlight that I am by no means an expert on these things, and my thoughts here are from the experiences (and biases) of a security researcher.

Where there are gaps in my understanding or knowledge, I'll try to highlight them, and if anyone has any corrections or additional info please let me know, thank you!

Background

The original motivation behind this research stems from a somewhat contentious and longstanding topic of discussion amongst the Linux kernel community regarding the handling of security fixes, such as instances of "silent security fixes".

First of all, to give some context to what we're talking about, let's do a quick tl;dr on kernel development and some of the terms mentioned so we're all up to speed! (feel free to skip)

kernel dev tl;dr

"The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel [...] Day-to-day development discussions take place on the Linux kernel mailing list (LKML). Changes are tracked using the version control system git" [1]

Specifically for a project using git, we can track the changes made by looking at the commits. A commit describes a set of changes made to the project by an author. If we look at projects on GitHub for example, we can see this. As of writing, the Linux kernel source tree mirror on GitHub has 1,154,596 commits that we can peruse!

That's a lot of changes, right? The Linux kernel has guidelines and rules about submitting patches[2], but typically a commit is a logically cohesive set of changes (i.e. you won't see a bunch of different fixes for different parts of the kernel in one commit, I hope anyway).

All these changes are organised into releases, which you can read about over at kernel.org[3], with new mainline kernels being released every 9-10 weeks.

Important to note is the concept of backporting, whereby bug fixes introduced in latest releases are applied to older kernel releases as well. There are several long-term maintenance (aka LTS) kernel releases, to designate support for older kernels.

on (silent) security fixes

There's been lots of discussion surrounding security fixes and how they should be handled in relation to non-security fixes in the kernel, and this dialogue has understandably evolved over the years as our concept and understanding of security has too.

It's a complex topic and, to oversimplify the arguments, at one extreme you may have folks saying all fixes should be treated equally, while at the other folks would argue security fixes need to be dealt with in a specific way, highlighting the impact etc.

A recurring topic in this space is the concept of "silent security fixes", where a commit fixing a potentially exploitable vulnerability intentionally omits information regarding the security implications/reasons behind the fix.

This has been up for debate within the community since at least 2008, as we can see from a post on the Full Disclosure mailing list that year, titled "Linux's unofficial security-through-coverup policy", by @spendergrsec.

Now as I mentioned earlier, a lot has changed and our perception of security has come a long way since then. However, over the years there have still been cases of, at worst, silent security fixes or, at best, inconsistency in the handling of security fixes[5][6][7][8].

the plan

Putting this all together, I was interested in analysing Linux kernel commits in a somewhat automated way such that I could filter for security fixes and explore trends.

With full understanding that I'm no data scientist or software engineer, I whipped up a quick (and very hacky) tool to delve around a bit and have some fun.


  1. https://en.wikipedia.org/wiki/Linux_kernel
  2. https://www.kernel.org/doc/html/latest/process/submitting-patches.html
  3. https://www.kernel.org/category/releases.html
  4. https://github.com/hardenedlinux/grsecurity-101-tutorials/blob/master/kernel_vuln_exp.md#silent-fixes-from-linux-kernel-community--welcome-to-add-more-for-fun
  5. https://arstechnica.com/information-technology/2013/05/critical-linux-vulnerability-imperils-users-even-after-silent-fix/
  6. CVE-2022-1786 was UAF leading to LPE, with no mention in the fix commit
  7. CVE-2022-2602 was a UAF leading to LPE, with no mention in the fix commit
  8. CVE-2021-41073 was disclosed by @chompie1337, although the fix commit has no mention of the exploitability and they also asked her to use a non-security related email for the "Reported-by" ack (as mentioned in @chompie1337's article here)

Lica

get ready for some peak xdev-ctf-poc-tier code

Let's talk about the tool! I'll try to keep this brief, both for my dignity and your sanity. I put together this tool using Python to parse kernel commits and try to filter them for interesting security related fixes, as well as any interesting stats along the way.

sam4k/lica (GitHub): a hacky tool for analysing Linux kernel commits.

Thanks to the kernel patch submission guidelines[1], there's some level of consistency in what to expect a commit to contain, which helps us filter down the 34000 or so commits in the last 6 months to around 135 possible security fixes - neat!

Commit...... | Subsystem......... | Hits.................................... | CVE............. | Reporter.......................................... | Coverage.......
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
331cd9461412 | btrfs              | use-after-free                           |                  | Ye Bin <yebin10@huawei.com>                        | linux-5.15.90, linux-5.10.165, linux-5.4.230
...
cf6531d98190 | ksmbd              | use-after-free                           |                  | zdi-disclosures@trendmicro.com # ZDI-CAN-17816     | N/A            
...          
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Now For The Stats...
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
[+] 133 commits where matched from 2448 fixes, over 33487 commits.
[+] 36 / 133 listed a reporter.
[+] 2 / 133 mentioned a CVE.
[+] Breakdown by category:
|---- UAF: 95
|---- Races: 22
|---- Generic: 15
|---- Info Leak: 10
|---- Stack Overflow: 2
...
[+] Breakdown by module:
|---- mm: 11
|---- wifi: 10
|---- drm: 8
|---- media: 6
|---- net: 5
|---- cifs: 4
|---- io_uring: 4
...
output for the last 6 months or so, checking for coverage in latest 5.15, 5.10, 5.4 at the time

Above is a sample output from Lica, analysing kernel commits over the past 180 days. Here I've used a really basic approach of looking for fixes via keyword in the commit summary phrase and then further filtering those fixes by looking for hits in a dictionary of common bug classes/terminology, grouped by category.
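To make the two-stage filter described above a bit more concrete, here's a minimal sketch of the idea. It is not Lica's actual code: the keyword list, the category dictionary and the function names (`is_fix`, `categorise`, `triage`) are all assumptions made up for illustration.

```python
import re

# Hypothetical fix keywords, matched against the commit summary line only.
FIX_KEYWORDS = re.compile(r"\b(fix|fixes|fixed|prevent|avoid)\b", re.IGNORECASE)

# Hypothetical dictionary of bug-class terminology, grouped by category.
BUG_CLASSES = {
    "UAF": [r"use[- ]after[- ]free", r"\buaf\b"],
    "Races": [r"\brace\b", r"race condition"],
    "Info Leak": [r"info(rmation)? leak"],
    "Stack Overflow": [r"stack overflow"],
}


def is_fix(summary):
    """Stage 1: does the summary phrase look like a fix?"""
    return FIX_KEYWORDS.search(summary) is not None


def categorise(message):
    """Stage 2: return {category: [matched terms]} for the full message."""
    hits = {}
    for category, patterns in BUG_CLASSES.items():
        matched = []
        for pattern in patterns:
            m = re.search(pattern, message, re.IGNORECASE)
            if m:
                matched.append(m.group(0).lower())
        if matched:
            hits[category] = matched
    return hits


def triage(commits):
    """commits: iterable of (sha, full_commit_message) tuples."""
    results = []
    for sha, message in commits:
        lines = message.splitlines()
        summary = lines[0] if lines else ""
        if not is_fix(summary):
            continue  # not a fix, skip before the more expensive stage
        hits = categorise(message)
        if hits:
            results.append((sha, summary, hits))
    return results
```

Feeding this a list of `(sha, message)` pairs (however you pull them out of git) gives back only the fixes that also hit the dictionary, which is roughly the shape of the table above.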

A (slightly) more nuanced approach, looking at some of the "silent fixes" from earlier, would be to grep for typical causes for bug classes + the omission of bug classes. A simple example might be check.*len for missing length checks.
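As a rough sketch of that cause-focused idea: flag commits that describe a typical bug cause but never use security terminology. Only `check.*len` comes from the example above; every other pattern, and the `quiet_cause_hits` name, is a hypothetical I've picked for illustration, so expect plenty of false positives.

```python
import re

# Hypothetical cause-focused patterns; none of them use security terms.
CAUSE_PATTERNS = {
    "missing length check": re.compile(r"check.*len", re.IGNORECASE),
    "missing bounds check": re.compile(r"(bounds?|range) check", re.IGNORECASE),
    "missing NULL check": re.compile(r"null (pointer )?check", re.IGNORECASE),
    "refcount imbalance": re.compile(r"ref(erence )?count", re.IGNORECASE),
}

# Hypothetical list of security-centric terms whose *absence* we require.
SECURITY_TERMS = re.compile(
    r"cve-\d{4}-\d+|use[- ]after[- ]free|overflow|exploit|security",
    re.IGNORECASE,
)


def quiet_cause_hits(message):
    """Return matched cause names, but only if no security term appears."""
    if SECURITY_TERMS.search(message):
        return []  # already loud about security, not a "silent" candidate
    return [name for name, pat in CAUSE_PATTERNS.items() if pat.search(message)]
```

Anything this returns still needs a human to look at the diff; the point is just to shrink the haystack.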

It's worth noting that while we can use a basic dictionary or even filter by specific reporters (I'm looking at you ZDI), using a bug cause focused dictionary (that omits security-centric terms) yields just as many results.

While this means more false positives, I think it reiterates that a determined attacker doesn't need to just grep for "buffer overflow privesc" or a CVE to find potentially exploitable vulnerabilities, whether that's by manually enumerating commits or by using an approach like this, which takes a few hours to put together. It makes me wonder why we have cases such as a researcher being asked to use a non security related email for the "Reported-by" ack[2]??

Back to Lica, I also include a naive check to see if a particular kernel release has the patch, for checking older LTS kernels for backports (the Coverage column). There's no doubt an easier and more reliable way to do this, but hey-ho, this did the trick for now.
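For the curious, here's one way a coverage check like this could be sketched. To be clear, this is an assumption about a possible approach and not what Lica does: it relies on the stable-tree convention of backports referencing the mainline commit with a line like `commit <sha> upstream.` or `[ Upstream commit <sha> ]`. The function names and the repo path / revision range arguments are hypothetical.

```python
import re
import subprocess

# Matches the two common stable-tree markers that reference a mainline commit.
UPSTREAM_REF = re.compile(
    r"^\s*(?:\[ Upstream commit ([0-9a-f]{12,40}) \]"
    r"|commit ([0-9a-f]{12,40}) upstream\.?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def upstream_shas(stable_log):
    """Collect every mainline sha referenced in a stable branch's log text."""
    return {
        (m.group(1) or m.group(2)).lower()
        for m in UPSTREAM_REF.finditer(stable_log)
    }


def is_backported(upstream_sha, stable_log):
    """True if the (possibly abbreviated) mainline sha is referenced."""
    sha = upstream_sha.lower()
    return any(
        full.startswith(sha) or sha.startswith(full)
        for full in upstream_shas(stable_log)
    )


def read_stable_log(repo, rev_range):
    """Dump raw commit messages for rev_range from a local stable clone.

    Assumes `repo` is a path to a clone of the stable tree and `rev_range`
    is something like 'v5.15..v5.15.90' (both hypothetical inputs).
    """
    return subprocess.run(
        ["git", "-C", repo, "log", "--format=%B", rev_range],
        check=True, capture_output=True, text=True,
    ).stdout
```

This will miss backports that were reworked without the marker line, so treat a miss as "unknown" and not as "unpatched".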

Anyways, I tried to make this somewhat extensible and configurable, so I've chucked it up on GitHub in case anyone is interested in having a play with it. You've been warned about the quality!


  1. https://www.kernel.org/doc/html/latest/process/submitting-patches.html
  2. CVE-2021-41073 was disclosed by @chompie1337, although the fix commit has no mention of the exploitability and they also asked her to use a non-security related email for the "Reported-by" ack (as mentioned in @chompie1337's article here)

Takeaways

Despite not getting to spend much time fine-tuning or tweaking the tool to do some in-depth analysis, it's been a fun little project and broaches an important discussion.

It does feel like, as a security researcher, there is still a lack of transparency and consistency in the processes and handling of security disclosures and fixes in the kernel.

Whether there's intentional omission of security relevant information or just a difference in opinion on what constitutes relevant information, the end result is still a lack of consistency in how reported security issues are handled.

For example, I wrote about my experience disclosing a kernel vulnerability at the beginning of 2022[1]. While the process was a bit convoluted for me, after getting in touch with the right folks, I had no issues with communication and the commit referenced the reporter, CVE and vulnerability being fixed[2].

However, as I touched on earlier in the post, other researchers have had different experiences and the resulting patches can vary in their security relevant content.

On Disclosures

If you want to report a kernel vulnerability, you'll typically end up staring at two pages:

  1. The official kernel documentation on "Security Bugs"[3][4],
  2. The linux-distros mailing list wiki page[5]

The tl;dr here is that the kernel security team's focus is solely on finding and applying a fix for security bugs. To allocate a CVE or inform vendors of the security impact (LPE, RCE etc.), you need to coordinate with the linux-distros list too.

There's been a history of friction between the policies of the two bodies, with security researchers getting caught up between the two, the most recent instance being the public disclosure of CVE-2023-0179 over on oss-security[6].

Unfortunately I don't fully understand the root cause of the misunderstanding. As Solar Designer points out, this seems to stem from a policy change made to accommodate the kernel security team[8], as part of a wider discussion on linux-distros policy last year[9], but I'm not entirely sure what policy this disclosure broke on the kernel documentation for "Security Bugs"[3].

Beyond highlighting the work required on the part of the researcher to make sure they follow the right steps and policies, this instance also shows where this rift might end up if things carry on the way they are, with Solar Designer commenting:

It may well be the last straw that will result in Linux kernel documentation getting updated so that reporters would not be instructed to contact linux-distros anymore (or would even be instructed not to?)  On one hand, this is bad.  On the other, everyone is tired of the inconsistencies and the drama.

Solar Designer then goes on to explain a potential solution to ensure oss-security still keeps up-to-date with kernel security issues if things do go south:

I suppose we (oss-security community?) could want to setup a crawler detecting likely security issues on Linux kernel mailing lists and among Linux kernel commits (including branches).  This could detect even more issues than are being brought to linux-distros and oss-security now.

While somewhat ironic given the topic of this post (not that my code is fit for scale lol), it's a shame that there's still discord regarding the handling of kernel security issues when this is a debate that's been going on for so many years at this point.

I don't have all the information or experience to suggest any solutions for a decades-long pain point, but I do hope there's one out there and we can find it soon.

Transparency and consistency surrounding these processes helps to encourage researchers to participate in coordinated vulnerability disclosure for kernel vulns. Having more clarity around the handling and state of security fixes should also help vendors and such too, as well as help us as a community to continue to progress with regards to our attitude and approach to security.


  1. https://sam4k.com/a-dummys-guide-to-disclosing-linux-kernel-vulnerabilities/
  2. https://github.com/torvalds/linux/commit/9aa422ad326634b76309e8ff342c246800621216
  3. https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
  4. small note, the first result on google for me is actually an older copy, from the 4.14 kernel which omits some clarity found in the latest versions
  5. https://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists
  6. https://seclists.org/oss-sec/2023/q1/22
  7. https://www.openwall.com/lists/oss-security/2022/05/24/1
  8. https://seclists.org/oss-sec/2022/q2/99
  9. https://seclists.org/oss-sec/2022/q4/221

Conclusion

Well, this one was a bit of a change of pace for me and was a step out of my comfort zone, considering I normally focus on more objective, technical subjects. That probably explains why it took so much longer to write!

Hopefully I didn't stir the pot too much; my goals for this post were to share some takeaways from a project that otherwise would have been relegated to the recycling bin as well as shed some light on a relevant and important topic within the community.

Despite my criticism of the current status quo, I have a lot of respect for the time and effort put in by all of those involved in the Linux kernel community.

Fingers crossed this was interesting for those of you that made it this far, but don't fear, I've got some more technical posts lined up for both kernel exploitation and internals!

exit(0);
