Redscan’s Curious Comments About Vulnerabilities

As a connoisseur of vulnerability disclosures and avid vulnerability collector, I am always interested in analysis of the disclosure landscape. That typically comes in the form of reports that analyze a data set (e.g. CVE/NVD) and draw conclusions. This seems straightforward but it isn’t. I have written about the varied problems with such analysis many times in the past, and yet companies that don’t operate in the world of vulnerability databases still decide to play in our mud puddle. This time it is the company Redscan, which I don’t think I had heard of before, doing analysis on NVD data for 2020. Risk Based Security wrote a commentary on their analysis, to which I contributed, but I wanted to keep the party going over here with a few more personal comments. Just my opinions here, as a more outspoken critic on the topic, and where I break from the day job.

I am going to focus on one of my favorite topics: vulnerability tourists. People who may be in the realm of Information Security, but don’t specifically operate day-to-day in the world of vulnerability disclosures and, more specifically to me, vulnerability databases (VDBs). For this blog, I am just going to focus on a few select quotes that made me double-take. Read on after waving to Tourist Lazlo!

“The NVD tracks CVEs logged by NIST since 1988, although different iterations of the NVD account for some variation when comparing like-for-like results over time.”

There’s a lot to unpack here, most of it wrong. First, the NVD doesn’t track anything; they are spoon-fed that data from MITRE, who manages the CVE project. Second, NIST didn’t even create NVD until over five years after CVE started. Third, CVE didn’t track vulnerabilities “since 1988”; they cherry-picked some disclosures from before 1999, when the project actually started, which is why CVE IDs begin with ‘1999’. Fourth, there was only one different iteration of NVD: their ICAT “CVE Metabase”, which ran from basically the first year of CVE. According to Peter Mell, who created it, after starting as its own vulnerability website, “ICAT had become an archival tool for CVE standard vulnerabilities and was only updated every three or four weeks”. Then in 2005 the site relaunched with a new focus and timely updates from CVE. Despite this quote, later in their report they produce a chart that tries to show an even comparison from 1988 to 2020, despite saying the database went through iterations and despite not understanding CVSS.

“The growth is also likely attributable to an increase in the number of CVE Numbering Authorities (CNAs) – of which there are now more than 150 worldwide with the power to create and publish CVEs.”

The growth in disclosures aggregated by CVE is a lot more complicated than that, and I doubt the increase in CNAs is a big factor. Of course, they say this and don’t cite any evidence, despite CVE now showing who the assigning CNA was (e.g. CVE-2020-2000 is Palo Alto Networks). The data is there if you want to make that analysis, but it isn’t easy, since the assigning CNA isn’t included in the NVD exports or in CVE’s own exports; getting it requires some real work scraping the CVE website. Making claims without backing them up, when the data is public and might prove your argument, is not good.
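As a rough sketch of what that scraping might look like, assuming the legacy cve.mitre.org entry pages expose an “Assigning CNA” label (an assumption on my part; the actual markup may differ or change over time):

    import re
    import urllib.request

    def assigning_cna(cve_id):
        # Fetch the MITRE page for a single CVE ID and try to pull the
        # assigning CNA. The 'Assigning CNA' label and surrounding markup
        # are assumptions; treat this as a sketch, not a robust parser.
        url = "https://cve.mitre.org/cgi-bin/cvename.cgi?name=" + cve_id
        with urllib.request.urlopen(url) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        match = re.search(r"Assigning CNA.*?<td[^>]*>\s*([^<]+)", html, re.S)
        return match.group(1).strip() if match else None

    print(assigning_cna("CVE-2020-2000"))  # e.g. Palo Alto Networks, if the layout matches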

“Again, this is a number that will concern security teams, since zero interaction vulnerabilities are famously difficult to detect and have the potential to cause significant damage.”

This makes me think that Redscan should invent a wall, perhaps made of fire, that could detect and prevent these attacks! Or maybe a system that is designed to detect intrusions! Or even one that can prevent intrusions! This quote is one that is truly baffling because it doesn’t really come with an explanation as to what they mean, and I hope they mean something far different than what this sounds like. I hope this isn’t a fear tactic to make readers think that their managed detection service is needed. Quite the opposite; anyone who says the above probably should not be trusted to do your attack detection.

This chart heading is one of many signs that Redscan doesn’t understand CVSS at all. For a “worst of the worst” vulnerability they got several attributes right but ended up with only “Confidentiality [High]”. The vulnerability they describe would only be CVSSv2 7.8 (AV:N/AC:L/Au:N/C:C/I:N/A:N) and CVSSv3.1 7.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N). That is not the worst. If it ‘highly’ impacts confidentiality, integrity, and availability, then it becomes the worst of the worst: CVSSv2 10.0 and CVSSv3.1 9.8 or 10.0, depending on scope. It’s hard to understand how a security company gets this wrong until you read a bit further, where they say they “selected the very ‘worst’ option for every available metric.” My gut tells me they didn’t realize you could toggle ‘High’ for more than one impact, and confidentiality is the first listed.
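For the curious, the CVSSv3.1 base score equation from the specification reproduces that gap. Here is a minimal sketch (Scope: Unchanged only, with simplified rounding), not anything from Redscan’s report:

    import math

    def roundup(x):
        # CVSSv3.1 Roundup: smallest value, to one decimal place, >= x
        # (ignoring the spec's floating-point epsilon handling).
        return math.ceil(x * 10) / 10

    def cvss31_base(c, i, a, av=0.85, ac=0.77, pr=0.85, ui=0.85):
        # Base score for Scope: Unchanged. Impact weights: High=0.56,
        # Low=0.22, None=0.0. Defaults correspond to AV:N/AC:L/PR:N/UI:N.
        iss = 1 - (1 - c) * (1 - i) * (1 - a)
        impact = 6.42 * iss
        exploitability = 8.22 * av * ac * pr * ui
        return 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))

    HIGH, NONE = 0.56, 0.0
    print(cvss31_base(HIGH, NONE, NONE))  # 7.5 -- only Confidentiality set to High
    print(cvss31_base(HIGH, HIGH, HIGH))  # 9.8 -- the actual worst of the worst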

“It is also important to note that these numbers may have been artificially reduced. Tech giants such as Google and Microsoft have to do a lot to maintain their products and services day-to-day. It is common for them to discover vulnerabilities that are not being exploited in the wild and release a quick patch instead of assigning a CVE. This may account for fewer CVEs with a network attack vector in recent years.”

This is where general vulnerability tourism comes in, as there is a lot wrong here. Even if you don’t run a VDB you should be passingly familiar with Microsoft advisories, as an example. Ever notice how they don’t have an advisory with a low severity rating? That’s because they don’t publish them. Their advisories only cover vulnerabilities at a certain threshold of risk. So that means that the statement above is partially right, but for the wrong reason. It isn’t about assigning a CVE, it is about not even publishing the vulnerability in the first place. Because they only release advisories for more serious issues, it actually skews their numbers to include more remote vulnerabilities, not fewer, primarily on the back of “remote” issues that require user interaction, such as browser issues or file parsing vulnerabilities in Office.

This quote also suggests that exploitation in the wild is a bar for assigning a CVE, when it absolutely is not. It might also be a surprise to a company like Redscan, but there are vulnerabilities that are disclosed that never receive a CVE ID.

“Smart devices designed for the mass market often contain a worrying number of vulnerabilities due to manufacturer oversight. Firmware within devices is often used by multiple vendors, meaning that any vulnerabilities in this software has the potential to result in lots of CVEs.”

Wrong again, sometimes. If it is known to be the same firmware used in multiple devices, it gets one CVE ID. The only time additional IDs are assigned is when multiple disclosures don’t positively identify the root cause. When three disclosures attribute the same vulnerability to three different products, it stands to reason there will be three IDs. But that isn’t how CVE is designed to work, because it artificially inflates numbers, and that is the game of others.

“The prevalence of low complexity vulnerabilities in recent years means that sophisticated adversaries do not need to ‘burn’ their high complexity zero days on their targets and have the luxury of saving them for future attacks instead.”
-vs-
“It is also encouraging that the proportion of vulnerabilities requiring high-level privileges has been on the increase since 2016. This trend means that cybercriminals need to work harder to conduct their attacks.”

So which is it? When providing buzz-quote conclusions such as these, that are designed to support the data analysis, they shouldn’t contradict each other. This goes back to what I have been saying for a long time; vulnerability statistics need qualifications, caveats, and explanations.

“Just because a vulnerability is listed in the NVD as hard to exploit doesn’t mean that attackers aren’t developing PoC code to exploit it. The key is to keep up with what’s happening in the threat landscape and respond accordingly.”

I’ll end here since this is a glowing endorsement for why vulnerability intelligence has to be more evolved than what CVE and NVD are offering. Part of the CVSS specification is Temporal scoring, and one of those attributes is Exploit Code Maturity. This is designed to specifically address the problem above; knowing the capability of potential attackers matters. With over 21,000 vulnerabilities disclosed last year, organizations are finding that just patching based on the CVSSv3 base score isn’t enough. Sure, you patch the 10.0 / 9.8 since those are truly the worst-of-the-worst, and you patch the 9.3 / 8.8 since any random email might carry a payload. Then what? If all things are equal between vulnerabilities that impact your organization, you should look to see if a patch is available (also covered by Temporal score) and if an exploit is available.
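To illustrate how Temporal metrics separate otherwise identical scores, here is a minimal sketch of the CVSSv3.1 Temporal equation from the specification; the two vectors below are hypothetical examples, not taken from any report:

    import math

    def roundup(x):
        return math.ceil(x * 10) / 10  # simplified CVSSv3.1 rounding

    # Temporal metric weights from the CVSSv3.1 specification.
    E = {"U": 0.91, "P": 0.94, "F": 0.97, "H": 1.00}   # Exploit Code Maturity
    RL = {"O": 0.95, "T": 0.96, "W": 0.97, "U": 1.00}  # Remediation Level
    RC = {"U": 0.92, "R": 0.96, "C": 1.00}             # Report Confidence

    def temporal(base, e, rl, rc):
        return roundup(base * E[e] * RL[rl] * RC[rc])

    base = 9.8  # two hypothetical issues with identical base scores
    print(temporal(base, "U", "O", "C"))  # 8.5 -- no known exploit, official fix available
    print(temporal(base, "F", "U", "C"))  # 9.6 -- functional exploit public, no fix yet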

Numeric scores are not enough, you have to understand the context behind them. That CVSSv2 remote information disclosure that partially affects confidentiality by disclosing an admin password is only a 5.0. Score it under CVSSv3 and you are looking at a 9.8 because it immediately leads to privilege escalation which is factored in under that system. Heartbleed was a CVSSv2 5.0 with a functional exploit and available patch; look what hell that brought upon us. If you aren’t getting that type of metadata, reconsider your choice of vulnerability intelligence.

Before you publish your end-of-year vulnerability statistics…

TL;DR – The CVE dataset does not allow you to determine how many vulnerabilities were disclosed in 2017.


I’ll try to keep this fairly short and to the point, but who am I kidding? Every year for a decade or more, we see the same thing over and over: companies that do not track or aggregate vulnerabilities decide to do their own review and analysis of disclosures for the prior year. Invariably, most do it based on the publicly available CVE/NVD data, and they do it without understanding what the dataset really represents. I know, it seems simple on the surface, but the CVE dataset is not easily understood. Even if you understand the individual contents of the export, you may not understand how it was created, what shortcomings there are, what is missing, and what statistical traps you face in digesting the data. Just doing the basic parsing and automated ‘analysis’ of that data via your tool of choice (be it grep or something fancier) means very little unless you can disclaim and properly explain your results. Either way, follow along with the advice below before you publish your ‘vulnerability stats for 2017’ please!

So let’s start with the basics of CVE data analysis. Begin by grabbing the latest CVE dump, a gzipped CSV file that represents MITRE’s CVE dataset. Note, this is different from the exports NVD offers; welcome to the first hurdle. While the base vulnerability data is 100% equivalent between the two, NVD does additional analysis and creates metadata that is useful to many organizations. NVD provides CVSS scoring and CPE data, for example. The relationship between CVE and NVD is interesting if you observe it over time, where it used to be a clear ‘MITRE publishes, a day later NVD publishes’ relationship. For the last year or two, NVD will sometimes open up a CVE ID before MITRE does for various reasons. This also gave way to Bill Ladd observing and writing about how the Chinese National Vulnerability Database (CNNVD) is actually opening up CVE IDs faster than both NVD and MITRE. Consider that for a minute and understand that the relationship between these three entities is not straightforward. Then consider the relationship between many other entities in the bigger picture, and it gets even more convoluted.
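As a minimal sketch of that first step: the download URL below is the one MITRE published around this time and is an assumption on my part; verify it against the current CVE downloads page before relying on it.

    import gzip
    import shutil
    import urllib.request

    # Assumed location of MITRE's full CVE list (circa 2017); check the
    # CVE "Downloads" page if this path has moved.
    URL = "https://cve.mitre.org/data/downloads/allitems.csv.gz"

    urllib.request.urlretrieve(URL, "allitems.csv.gz")
    with gzip.open("allitems.csv.gz", "rb") as src, open("allitems.csv", "wb") as dst:
        shutil.copyfileobj(src, dst)  # decompress to a plain CSV for parsing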

See? You start by grabbing a data dump, and a paragraph later you have the start of disclaimers and oddities as pertains to the larger CVE ecosystem. Next, decompress the CVE dump so you have a CSV file to work with. Now, before you eagerly start to parse this data, stop for a moment. Did you do this same analysis last year? If so, great! Do you understand what has changed in the last 18 months with regards to CVE and, more specifically, MITRE? If you can’t quickly and definitively answer that question, about the kind of changes that are the first in almost 19 years of the program, reconsider whether you should be commenting on this data. In case you missed it, Steve Ragan published an article about MITRE / CVE’s shortcomings in September of 2016. The article pointed out that MITRE was severely deficient in vulnerability coverage, as it has been for a decade. Unlike other articles, or my repeated blogs, Ragan’s article, along with additional pressure from the industry, prompted the House Energy and Commerce Committee to write a letter to MITRE asking for answers on March 30, 2017. When a certain board member brought it up on the CVE Board list, and directly told MITRE that their response should be made public, MITRE did not respond to that mail in a meaningful manner and ultimately never shared their response to Congress with the CVE Board. It is important for you to understand that MITRE operates CVE as they wish and that any notion of oversight or ‘Board’ input is only as it is convenient to them. The board has little to no real influence over many aspects of MITRE’s operation of CVE other than when they set an official vote on a given policy. Additionally, if you point out how such a vote that impacts the industry still has not been adopted by certain entities such as CNAs, many years down the road? They don’t want to hear about that either. It’s up to the CNAs to actually care, and fortunately some of them care very much. Oh, you know what a CNA is, and why they matter, right? Good!

OK, so you have your data dump… you better understand the state of CVE and that it is so deficient that Congress is on MITRE’s case. Now, as experienced vulnerability professionals, you know what this means! The rubber-band effect, where MITRE responds quickly and disproportionately to Congress breathing down their neck, and their response impacts the entire CVE ecosystem… and not necessarily in a good way. So welcome to the second half of 2017! Because it took roughly a year for the Congressional oversight and subsequent fallout to strongly influence MITRE. What was their response? It certainly wasn’t to use their abundant taxpayer-funded money to directly improve their own processes. That isn’t how MITRE works, as far as I have seen in my career. Instead, MITRE decided to use their resources to create and enhance what they call a “federated” CNA system.

First, spend a minute looking at the ‘federated’ term in relation to CVE, then look at the use of that term in the recently edited CNA Rules. Notice how the use of ‘federated’ in their context appears to have grown exponentially? Now check the definition of ‘federated’ [dictionary.com, The Free Dictionary, Merriam Webster]. While sufficiently vague, there is a common theme among these definitions. In so many words, “enlist others to do the work for you”. That is, quite simply, what the CNA model is. That is how the CNA model has been meant to work from day one, but it has become both the saving grace and the crutch of MITRE, as well as the broader CVE ecosystem, in the last few months. On the surface this seems like a good plan, as more organizations and even independent researchers can do their own assignments. On the downside, if they don’t follow the CNA rules, assignments can get messy and not as helpful to organizations that rely on CVE data. One thing that you may conclude is that any increase in CVE assignments this year may be due, in part, to the increase of CNAs. Of course, it may be interesting to you that at least two of these CNAs have not made a single assignment, and did not disclose any vulnerabilities in prior years either. Curious why they would be tapped to become a CNA.

OK, so you have your data dump… you know of one potential reason that there may be an increase in vulnerabilities this year over last, but you also know that it doesn’t necessarily mean there were actually more disclosures. You only know that there are more CVE IDs being assigned than prior years. Next, you have to consider the simple numbers game when it comes to vulnerability statistics. All CVE IDs are created equal, right? Of course not. MITRE has rules for abstracting when it comes to disclosures. Certain criteria will mean a single ID can cover multiple distinct vulnerabilities, and other VDBs may do it differently. It is easy to argue the merit of both approaches, so I don’t believe one is necessarily right or wrong. Instead, different abstraction rules tend to help different types of users. That said, you will typically see MITRE assign a single CVE ID to a group of vulnerabilities where a) it is the same product and b) it is the same type of vulnerability (e.g. XSS). You can see an example in CVE-2017-16881, which covers XSS vulnerabilities in six different Java files. That is how they typically abstract. Search around for a couple minutes and you will find where they break from that abstraction rule. This may be due to the requesting party filling out separate requests and MITRE not adhering to their own rules, such as CVE-2017-15568, CVE-2017-15569, CVE-2017-15570, and CVE-2017-15571. Then you have to consider that while MITRE will largely assign a single ID to multiple scripts vulnerable to one class (e.g. CSRF, SQLi, XSS), their CNAs do not always follow these rules. You can see examples of this with IBM (CVE-2017-1632, CVE-2017-1549) and Cisco (CVE-2017-12356, CVE-2017-12358) who consistently assign in such a manner. If you think these are outliers that have minimal impact on the overall statistics you generate, reconsider that. In keeping with their abstraction policy, IBM issued two advisories [#1, #2] covering a total of nine CVE IDs for unspecified XSS issues. If MITRE had assigned per their usual abstraction rules, that would have been a single ID.

OK, so you have your data dump… and now you are aware that parsing that dump means very little. MITRE doesn’t follow their own abstraction rules and their CNAs largely follow different rules. So many hundreds, likely a thousand or more of the IDs you are about to parse, don’t mean the same thing when it comes to the number of distinct vulnerabilities. That is around 10% of the total public CVE IDs issued for 2017! OK, forgetting about that for a minute, now you need to consider what the first part of a CVE ID means. CVE-2017-1234 means what exactly? You might think that 2017 is the year the vulnerability was disclosed, and the 1234 is the unique identifier for that year. Perhaps. Or does 2017 mean the year the vulnerability was found and an ID requested? The answer is yes, to both, sometimes. This is another aspect where, historically, MITRE made an effort to assign based on when the vulnerability was discovered and/or disclosed to a vendor, not when it was published. Under the old guard, that was an important aspect of CVE, as that standard meant more reliable statistics. Under the new guard, basically in the last two years, that standard has disappeared. Not only do they assign a 2017 ID for a vulnerability discovered and disclosed to a vendor in 2016 but published in 2017, they also assign a 2017 ID for a vulnerability discovered and disclosed in 2017. Worse? They are also now assigning 2017 IDs to issues discovered and disclosed in previous years. If you need examples, here are MITRE-assigned (as opposed to CNAs that do the same sometimes) 2017 CVE IDs for vulnerabilities disclosed prior to this year: 2016, 2015, 2014, 2013, 2011, 2010, 2008, 2004, and 2002. Notice the missing years? Some of the CNAs cover those gaps! Note that there are over 200 cases like this, and that is important when you start your stats. And we won’t even get into the problem of duplicate CVE assignments that haven’t been rejected, like the first two assignments here (both are invalid assignments and that CNA should know better).

OK, so you have your data dump… you’re ready! Let loose the scripts and analysis! While you do that, I’ll save you some time and math. As of December 24, 2017, there are 18,251 CVE identifiers. 7,436 of them are in RESERVED status, and 133 are REJECTed. As mentioned above, 238 of them have a 2017 ID but were actually disclosed prior to 2017. So a quick bit of math means 18,251 – 7,436 – 133 – 238 = 10,444 entries with 2017 CVE IDs that were disclosed in 2017. This is an important number that will be a bit larger if you parse with Jan 1, 2018 data. This should be your starting point when you look to compare aggregated disclosures, as captured by CVE, to prior years. Based on all of the above, you also now have a considerable list of disclaimers that must be included and explained along with whatever statistics you generate. Because MITRE also stopped using (1) consistent (2) formatting to (3) designate (4) distinct (5) vulnerabilities in a CVE ID, you have no way to parse this data to actually count how many vulnerabilities are present. Finally, know that Risk Based Security’s VulnDB tracked 7,815 distinct vulnerabilities in 2017 that do not have CVE coverage.
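A minimal sketch of that math against the decompressed CSV follows; the column layout and the “** RESERVED **” / “** REJECT **” description markers are assumptions about the export format, and the 238 pre-2017 disclosures have to come from your own research since the dump does not record disclosure dates.

    import csv

    total = reserved = rejected = 0
    with open("allitems.csv", newline="", encoding="latin-1") as fh:
        for row in csv.reader(fh):
            # Assumed layout: first column is the CVE ID, third the description.
            if not row or not row[0].startswith("CVE-2017-"):
                continue
            total += 1
            desc = row[2] if len(row) > 2 else ""
            if desc.startswith("** RESERVED **"):
                reserved += 1
            elif desc.startswith("** REJECT **"):
                rejected += 1

    pre_2017 = 238  # disclosed before 2017; from research, not derivable from the dump
    print(total - reserved - rejected - pre_2017)  # 10,444 against the Dec 24, 2017 data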

Cliff notes? The CVE dataset does not allow you to determine how many vulnerabilities were disclosed in 2017. Hopefully this information helps with your article on vulnerability statistics!

A Note on the Verizon DBIR 2016 Vulnerabilities Claims

[This was originally published on the OSVDB blog.]

[Updated 4/28/2016]

Verizon released their yearly Data Breach Investigations Report (DBIR) and it wasn’t too long before I started getting asked about their “Vulnerabilities” section (page 13). After I brought up some highly questionable points about last year’s report regarding vulnerabilities, several people felt that the report did not stand up to scrutiny. With a few questions leveled at me, I was curious if Verizon and partners had learned from last year.

This year’s vulnerability data was provided by Kenna Security (formerly Risk I/O), and Verizon “also utilized vulnerability scan data provided by Beyond Trust, Qualys and Tripwire in support of this section.” So the data isn’t from a single vendor, but at least four vendors, giving the impression that the data should be well-rounded and raise fewer questions than last year.

From the report:

Secondly, attackers automate certain weaponized vulnerabilities and spray and pray them across the internet, sometimes yielding incredible success. The distribution is very similar to last year, with the top 10 vulnerabilities accounting for 85% of successful exploit traffic. While being aware of and fixing these mega-vulns is a solid first step, don’t forget that the other 15% consists of over 900 CVEs, which are also being actively exploited in the wild.

This is not encouraging, as they have 10 vulnerabilities that account for an incredible amount of traffic, and the footnoted list of CVE IDs suggests the same problems as last year. And just like last year, the report does not explain the methodology for detecting the vulnerabilities, does not include details about the generation of the statistics, and provides a loose definition of what “successfully exploited” means. Without more detail it is impossible for others to reproduce their results, and extremely difficult to explain or disclaim them as a third party reading the report. Going to the Kenna Security page about this report doesn’t really yield much clarity, but does highlight another potential flaw in the methodology:

Kenna’s Chief Data Scientist Michael Roytman was the primary author of this year’s “Vulnerabilities” chapter, analyzing a correlated threat data set that spans 200M+ successful exploitations across 500+ common vulnerabilities and exposures from over 20,000 enterprises in more than 150 countries.

It’s subtle, but notice they went through a data set that spans exploitations across “500+ common vulnerabilities and exposures”, also known as CVE. If the data is only looking for CVEs, then there is an incredibly large bias at play from the start, since they are missing at least half of the disclosed vulnerabilities. More importantly, this becomes a game of fractions that the industry is keen to overlook at every opportunity:

  • CVE represents approximately half of the disclosed vulnerabilities.
  • Vulnerability scanners and IPS/IDS don’t have signatures for all CVE IDs, so they look for some fraction of CVE.
  • Detection signatures are often flawed, leading to false positives and false negatives, meaning they are actually detecting a fraction of the CVE IDs they intend to.

Another crucial factor in how this data is generated is in the detection of the exploits. Of the four companies contributing data, one was founded in 2009 (Risk I/O / Kenna) and another in 2006 (Beyond Trust, in the context of this discussion). That leaves Qualys (founded 1999) and Tripwire (founded 1997), who are likely the sources of the signatures that detected the vulnerabilities. For those around in the late 90s, the vulnerability landscape was very different than today, and security products based on signatures back then are in some ways considered rudimentary compared to today’s. Over time, most security products do not revisit older signatures to improve them unless they have to, often due to customer demand. Newly formed companies basically never go back and write signatures for vulnerabilities from 1999. So it stands to reason that the detection of these issues is based on Qualys and/or Tripwire’s detection capabilities, and the signatures detecting these vulnerabilities are likely outdated and not as well-constructed as their more recent signatures.

That leads us to ask, how many vulnerabilities are these companies really looking for? Where did the detection signatures originate and how accurate are they? While the DBIR does disclaim that the data used is a sample, they also admit “bias undoubtedly exists”. However, they don’t warn the reader of these extremely limiting caveats that put the entire data set into a perspective clearly showing strong bias. This, combined with the lack of detailed methodology for how these vulnerabilities are detected and correlated to measure ‘success’, ultimately means this data has little value other than for inclusion in pedestrian reports on vulnerabilities.

With that in mind, I can only go by what information is available. We’ll start with the concise list of the top 10 CVE IDs these four vulnerability intelligence providers say are being exploited the most, and Verizon labels as “successfully exploited”:

  1. 2015-03-05 – CVE-2015-1637 – Microsoft Windows Secure Channel (Schannel) RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
  2. 2015-01-06 – CVE-2015-0204 – OpenSSL RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
  3. 2012-02-25 – CVE-2012-1054 – Puppet k5login File Symlink File Overwrite Local Privilege Escalation
  4. 2011-07-19 – CVE-2011-0877 – Oracle Enterprise Manager Grid Control Instance Management Unspecified Remote Issue (2011-0877)
  5. 2004-02-10 – CVE-2003-0818 – Microsoft Windows ASN.1 Library (MSASN1.DLL) BER Encoding Handling Remote Integer Overflows
  6. 2002-01-15 – CVE-2002-0126 – BlackMoon FTP Server Multiple Command Remote Overflow
  7. 2001-12-26 – CVE-2002-0953 – PHPAddress globals.php LangCookie Parameter Remote File Inclusion
  8. 2001-12-20 – CVE-2001-0876 – Microsoft Windows Universal Plug and Play NOTIFY Directive URL Handling Remote Overflow
  9. 2001-04-13 – CVE-2001-0680 – QVT/Net / Term FTP Server LIST Command Traversal Remote File Access
  10. 1999-11-22 – CVE-1999-1058 – Vermillion FTPD Long CWD Command Handling Remote Overflow DoS

This list should raise serious red flags for anyone passingly familiar with vulnerabilities. Not only do we have very odd ‘top 10’ lists from last year and this year, but there is little overlap between them. How does 2015 show a top 10 list exploiting eight vulnerabilities with CVE identifiers between 1999 and 2002, meaning they were still being heavily exploited as many as thirteen years later, only to see them all drop off the list this year, replaced by new 15+ year old vulnerabilities? In addition to this oddity, there are more considerations leading to my top 10 list of questions about their list:

  1. How does a local vulnerability based on a symlink overwrite flaw (CVE-2012-1054) make it into a top 10 list of “85% of successful exploit traffic“?
  2. How does a local vulnerability in Puppet rank #3 on this list, given the install base of Puppet as compared to Adobe or Java?
  3. If they are detecting exploits on the wire, shouldn’t we see Java, Adobe Reader, and Adobe Flash somewhere on the list? The “Slow and steady—but how slow?” section even talks about time-to-exploit for Adobe.
  4. Why doesn’t this list remotely match US-CERT’s “Top 30 Targeted High Risk Vulnerabilities” that includes vulnerabilities back to 2006, but not a single one listed above?
  5. How does a vulnerability that by all accounts is so vague, that it has to be distinguished by the vendor issued CVE ID (CVE-2011-0877), have a signature and get exploited so much?
  6. How does a vulnerability in Oracle Enterprise Manager Grid Control show up as #4, when no Oracle Database vulnerabilities appear?
  7. How do you distinguish an FTP LIST command exploit from one vendor to another? (e.g. CVE-2005-2726, CVE-2002-0558, CVE-2001-0933, CVE-2001-0680) According to the one-liner methodology, this is done via pairing SIEM data, suggesting that BlackMoon and Vermillion are that popular today.
  8. Yet, how does a remote DoS in a Windows-based FTP program that doesn’t appear to have been distributed for a decade make it on this list? Are people really conducting targeted DoS attacks against this software?
  9. Is BlackMoon FTP Server really that prevalent to be exploited so often?
  10. Or is there a problem in generating this data, which would be more easily attributed to loose signatures detecting FTP attacks regardless of vendor?

Figure 12 in the report, which is described as “Count of CVEs exploited in 2015 by CVE publication date” is a curious thing to include, as the CVE publication date is very distinct from the vulnerability disclosure date. While a large percentage of CVE publication dates are within seven days of disclosure, many are not (e.g. CVE-2015-8852 disclosed 2015-03-23 and CVE publication on 2016-04-26). Enough to make this chart questionable as far as the insight it provides. Taking the data as presented, are they really saying that only ~ 73 vulnerabilities with a 2015 ID were successfully exploited in 2015 across “millions of successful real-world exploitations“? Given that 40 vulnerabilities were discovered in the wild, 33 of which have 2015 CVE IDs, that means that only ~40 other 2015 vulnerabilities were successfully exploited? If that is the takeaway, how is the security industry unable to stop the increasing wave of data breaches, the same kinds that led to this report? Something doesn’t add up here.

While people are cheering about the DBIR disclaiming there is sample bias (and not really enumerating it), they ignore the measurement bias, don’t speak to publication bias, don’t explain the attrition bias between 2015 and 2016, or potential chaining bias. As usual, the media is happy to glom onto such reports without asking any of these questions or providing critical analysis. As an industry, we need to keep challenging metrics and statistics to ensure they are not only accurate, but provide meaning that benefits us.


4/28/2016: According to Gabe (@gdbassett), the list of CVEs in the DBIR is incorrect. He posted a new list of CVEs (mostly the same) via Twitter in a reply to Andreas Lindh, who was surprised at the top ten list of vulnerabilities as well. Gabe also confirmed that afterwards they “compared the figure CVEs (listed above) against the raw data. After removing non-confirmed breaches, they match.” He went on to link to another source showing “data” about one of the CVEs, which really doesn’t mean anything without more context. Meanwhile, Michael Roytman, who did the vulnerability section of the report, confirmed that he/Kenna would be responding to this blog with one of their own.

I hate to harp on simple transposition-style mistakes in a report, but given how much weight numeric identifiers carry for vulnerabilities, it seems like that should have been double- and triple-checked. Even then, I don’t understand how someone familiar with vulnerabilities could see either list and not ask many of the same questions I did, or provide more information in the report to back the claims. That said, let’s look at Gabe’s new list of CVEs. Bold and links are used to highlight the new ones:

  1. 2015-03-05 – 2015-1637 – Microsoft Windows Secure Channel (Schannel) RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
  2. 2015-01-06 – 2015-0204 – OpenSSL RSA Temporary Key Handling EXPORT_RSA Ciphers Downgrade MitM (FREAK)
  3. 2004-02-10 – 2003-0818 – Microsoft Windows ASN.1 Library (MSASN1.DLL) BER Encoding Handling Remote Integer Overflows
  4. 2002-07-22 – 2002-1054 – Pablo FTP Server LIST Command Arbitrary Directory Listing Remote Information Disclosure
  5. 2002-01-15 – 2002-0126 – BlackMoon FTP Server Multiple Command Remote Overflow
  6. 2001-12-26 – 2002-0953 – PHPAddress globals.php LangCookie Parameter Remote File Inclusion
  7. 2001-12-20 – 2001-0877 – Microsoft Windows Universal Plug and Play NOTIFY Request Remote DoS
  8. 2001-12-20 – 2001-0876 – Microsoft Windows Universal Plug and Play NOTIFY Directive URL Handling Remote Overflow
  9. 2001-04-13 – 2001-0680 – QVT/Net / Term FTP Server LIST Command Traversal Remote File Access
  10. 1999-11-22 – 1999-1058 – Vermillion FTPD Long CWD Command Handling Remote Overflow DoS

The list gained another FTP server issue that doesn’t necessarily lead to privileges, and another remote denial of service attack, while losing the Puppet symlink (CVE-2012-1054) and the vague Oracle Enterprise Manager issue (CVE-2011-0877). All said and done, the list is just as confusing as before, perhaps more so. That gives us four FTP vulnerabilities, only one of which leads to remote code execution, and two denial of service attacks, which gain no real privileges for an attacker. As Andreas Lindh points out (something I failed to highlight), having a man-in-the-middle vulnerability occupy two spots on this list is also baffling in the context of the volume of attacks stated. Also note that with the addition of CVE-2002-1054 (Pablo FTP), there are now two vulnerabilities that appear on both the DBIR 2015 and DBIR 2016 top ten CVE lists.

Hopefully the forthcoming blog from Michael Roytman will shed some light on these issues.

Reviewing the Secunia 2015 Vulnerability Review (A Redux)

It’s that time of year again! Vulnerability databases whip up reports touting statistics and observations based on their last year of collecting data. It’s understandable, especially for a commercial database, to show why your data source is the best. In the past, we haven’t had a strong desire to whip up a flashy PDF with our take on vulnerability disclosure statistics. Instead, we’ve focused more on educating others on why their statistics are often so horribly wrong or misleading. It should come as no surprise, then, that we find ourselves forced to point out the errors in Secunia’s ways. We did the same thing last year in a long blog, giving more perspective and our own numbers for balance.

After reading this year’s report from Secunia, we could basically use last year’s blog as a template and just plug in new numbers. Instead, it is important to point out that after last year’s review, Secunia opted not to revise their methodology. Worse, they did not take any pointers from a presentation on vulnerability statistics Steve Christey (CVE / MITRE) and I did at BlackHat before last year’s report, pointing out the most common flaws in statistics and where the bias comes from. It is easy to chalk up last year’s report as being naive and not fully appreciating or understanding such statistics. This year though, they have no excuse. The numbers they report seem purposefully misleading and are a disservice to their customers and the community.

The headline of their press release and crux of their report is that they identified 15,435 vulnerabilities in 2014. This is entirely inaccurate, and completely misleading. This is fundamentally due to the same very flawed methodology they used last year, where they count a single vulnerability as many as 210 times (e.g. this year CVE-2014-0160, aka Heartbleed). Then factor in all of the other high profile “logo” vulnerabilities (e.g. FREAK, ShellShock), especially in OpenSSL, and the same handful of vulnerabilities will get counted hundreds or even a thousand times. To be abundantly clear, a vulnerability in a third-party library such as OpenSSL is one vulnerability. It doesn’t matter how many other products use and integrate that code, the fundamental flaw is in the library. Counting each product that implements OpenSSL as a distinct vulnerability, rather than a distinct occurrence of a vulnerability, is wrong. Worse, even if you do accept their flawed methodology, it actually highlights just how poor their statistics are, because OpenSSL is heavily used among thousands of applications that Secunia doesn’t cover, even when a vendor like IBM issues numerous advisories that they miss. No matter how you cut it, their numbers are invalid.

So poking at our database, understanding we don’t quite have a 100% mapping to Secunia, but feeling it is pretty close, we see that around 5,280 of our entries have a Secunia reference and 4,530 of those also have a CVE reference (compared to more than 8,500 entries we have with a CVE reference). Even accounting for a good margin of error, it still appears that Secunia does not cover almost 3,000 vulnerabilities in 2014 that are covered by CVE entries. Those familiar with this blog know that we are more critical of CVE than any other vulnerability database. If a commercial VDB with hundreds of high-dollar clients can’t even keep a 100% mapping with an over-funded, U.S. government-run VDB that is considered the “lowest common denominator” by many, that speaks volumes. I could end the blog on that note alone, because anything less than mediocrity is a disgrace. How do you trust, and pay for, a vulnerability intelligence service missing thousands of publicly disclosed vulnerabilities that are served up on a platter and available to anyone for free use and inclusion in their own database? You don’t. After all, remember this quote from the Secunia report last year?

CVE has become a de facto industry standard used to uniquely identify vulnerabilities, which have achieved wide acceptance in the security industry.

All of the above is not only important, it is absolutely critical to appropriately frame their statistics. Not only does Secunia avoid using the minimum industry standard for vulnerability aggregation, they opt to use their own methodology, which they now know beyond doubt seriously inflates their ‘vulnerability’ count. Last year, it was by almost three times more than reality. This year, it was also by about three times more than reality, as they only covered approximately 5,500 unique vulnerabilities. That is almost 3,000 fewer than CVE and 8,000 fewer than OSVDB. Any excuse for Secunia’s lacking coverage based on their claims of verifying vulnerabilities and “not covering vulnerabilities in beta software” is also invalid, as a majority of vulnerabilities reported are in stable software, and regularly come from trusted researchers or the vendors themselves. Regardless, it should be clear to anyone passingly familiar with the aggregation of vulnerabilities that Secunia is playing well outside the bounds of reality. Of course, every provider of vulnerability intelligence has some motive, or at least desire, to see the numbers go up year to year. It further justifies their customers spending money on the solution.

“Every year, we see an increase in the number of vulnerabilities discovered, emphasizing the need for organizations to stay on top of their environment. IT teams need to have complete visibility of the applications that are in use, and they need firm policies and procedures in place, in order to deal with the vulnerabilities as they are disclosed.” — Kasper Lindgaard, Director of Research and Security at Secunia.

While Mr. Lindgaard’s quote is flashy and compelling, it is also completely wrong, even according to their own historical reports. If Mr. Lindgaard had read Secunia’s own reports from 2010 and 2011, he might have avoided this blunder.

This report presents global vulnerability data from the last five years and identifies trends found in 2010. The total number of vulnerabilities disclosed in 2010 shows a slight decrease of 3% compared to 2009. – Yearly Report 2010 (Secunia)

Analysing the long-term and short-term trends of all products from all vendors in the Secunia database over the last six years reveals that the total number of vulnerabilities decreased slightly in 2011 compared to 2010. – Vulnerabilities are Resilient (2011, Secunia Report)

In 2007 OSVDB cataloged 9,574 vulnerabilities as compared to 11,050 in 2006. In 2009 they cataloged 8,175 vulnerabilities as compared to 9,807 in 2008. In 2011 they cataloged 7,913 vulnerabilities compared to 9,161 in 2010. That is three distinct years that were not an increase at all. Taking Mr. Lindgaard’s quote into account, and comparing it to our own data that is based on a more standard method of abstraction, we also show that it has not been a steady rise every year:

[Chart: OSVDB vulnerability disclosures by year]

With highly questionable statistics, based on a flawed methodology, Secunia also lucked out this year, as vendors are more prudent about reporting vulnerabilities in used third-party libraries (e.g. OpenSSL, PHP). This plays out nicely for a company that incorrectly counts each vendor’s use of vulnerable software as a distinct issue.

Getting down to the statistics they share, we’ll give a few additional perspectives. As with all vulnerability statistics, they should be properly explained and disclaimed, or they are essentially meaningless.

In 2014, a total of 15,435 vulnerabilities were discovered in 3,870 products from 500 vendors.

As stated previously, this is absolutely false. Rather than track unique vulnerabilities, Secunia’s model and methodology is to focus more on products and operating systems. For example, they will issue one advisory for a vulnerability in OpenSSL, then dozens of additional advisories for the exact same vulnerability. The only difference is that each subsequent advisory covers a product or operating system that uses it. Above I mentioned the redundant ‘HeartBleed’ entries, but to further illustrate the point and demonstrate how this practice can lead to wildly inaccurate statistics, let’s examine the OpenSSL ‘FREAK’ vulnerability as well (29 Secunia advisories). The flaw is in the OpenSSL code, not the implementation from each vendor, so it is still that same one vulnerability. Depending on how good their coverage is, this could be more problematic. Looking at IBM, who uses OpenSSL in a wide variety of their products, we know that at least 56 are impacted by the vulnerability, which may mean that many more Secunia advisories cover them. If we had a perfect glimpse into what products use OpenSSL, it would likely be thousands. Trying to suggest that is ‘thousands of unique vulnerabilities’ is simply ignorant.

83% of vulnerabilities in all products had patches available on the day of disclosure in 2014.

We assume that by ‘patches’, they mean a viable vendor-provided solution (e.g. upgrade) as well. Based on our aggregation, it is closer to 64% that have a solution available, and that is not counting “the day of disclosure”. That includes vendors who have patched since the issue was disclosed. This big gap comes from the number of vulnerabilities aggregated by each database since the start of 2014.

25 zero-day vulnerabilities were discovered in total in 2014, compared to 14 the year before.

Assuming we use the same definition of “zero-day” as Secunia, this is considerably under what we tracked. We flag such issues “discovered in the wild”, meaning the vulnerability was being actively exploited by attackers when it was discovered and/or disclosed. In 2014, we show 43 entries with this designation, compared to 73 in 2013. That is a sharp decrease, not increase. Part of that large swing for us comes from a rash of attacker-trojaned software distributions that were used to target mobile users in 2013. Since trojaned software from a legitimate vendor meets the criteria to be called a vulnerability (it can be abused by a bad actor to cross privilege boundaries), we include it where Secunia and other databases generally ignore it.

20 of the 25 zero-day vulnerabilities were discovered in the 25 most popular products – 7 of these in operating systems.

Since we got into the “popular products” bit in last year’s rebuttal to their statistics, we’ll skip that this year. However, in 2014 we show that 16 of those “zero day” were in Microsoft products (two in Windows, six in the bundled MSIE), and four were in Adobe products.

In 2014, 1,035 vulnerabilities were discovered in the 5 most popular browsers: Google Chrome, Mozilla Firefox, Internet Explorer, Opera and Safari. That is a 42% increase from 2013.

Without searching our data, it is likely very safe to say these numbers are way off, and that Secunia’s coverage of browser security is woefully behind. In addition to not aggregating as much data, they likely don’t properly take into account vulnerabilities in WebKit, which potentially affects Chrome, Safari, and Opera. Are WebKit vulnerabilities attributed to one, the other, or all three? With that in mind, let’s compare numbers. We see ~200 for Mozilla, ~250 for Microsoft IE, and not even 10 between Safari and Opera. In reality, Opera is not forthcoming in their information about vulnerabilities they patch, so their actual vulnerability count is likely much higher. That leaves us with Chrome, which had over 300 vulnerabilities in their own code. As mentioned, factor in another 100+ vulnerabilities in WebKit that may impact up to three browsers. That comes to a total of 889, far less than their claimed 1,035. That makes us curious if they are using the same flawed methodology with browser vulnerabilities, as they do with libraries (i.e. counting WebKit / Blink issues multiple times for each browser using it). Some vulnerabilities in these browsers are also due to flaws in other 3rd party libraries and not in the browsers’ own code (e.g. ICU4C and OpenJPEG). Finally, Google Chrome provides Adobe Flash Player, so is Secunia adding the number of vulnerabilities addressed in Flash Player to their Google Chrome count as well, further bloating it? So many unknowns and another reminder of why vulnerability statistics are often misleading or serve no value. Without explaining the methodology, we’re left with guesses and assumptions that strongly suggest this statistic can’t be trusted.

Even more offensive is that the “vulnerability intelligence experts at Secunia Research” don’t demonstrate any expertise in informing readers why Google Chrome numbers are so high. Instead, they publish the information, withhold details of the methodology, and let the media run wild parroting their pedestrian statistics and understanding. As a lesson to the media, and Secunia, consider these additional pieces of information that help explain those numbers, and why you simply can’t compare disclosed vulnerability totals:

  1. Google releases details about the vulnerabilities fixed in Chrome, unlike some other browser vendors. For example, Opera releases almost no information. Microsoft typically releases information on the more serious vulnerabilities, and none from their internal auditing. Of the 310 Chrome vulnerabilities we tracked, 66 were CVSSv2 5.0 or less.
  2. Google has perhaps the most aggressive bug bounty program in the industry, which encourages more researchers to audit their code. How aggressive? They paid out more than US$550,000 in 2014, just for Chrome vulnerabilities.
  3. Google’s turnaround in fixing Chrome vulnerabilities is quite respectable. Since many of the vulnerabilities were reported to them and fixed promptly, it is easy to argue with the claim it is the “least secure” browser.
  4. Google fields several teams of security experts that audit their code, including Chrome. They find and fix potential issues before disclosing them. By doing that, they are ensuring the security of the browser, yet effectively getting punished by Secunia for doing so.
  5. Of all of the issues disclosed, only a tiny fraction were actually proven to be exploitable. Like many software vendors, they find it is easier and more efficient to spend a small amount of time to fix the possible vulnerability rather than the larger amount of time required to prove it is exploitable.

There are many pitfalls of using simple disclosed vulnerability totals to compare products. This is explained over and over by those familiar with the vulnerability disclosure world, and should be common knowledge to ‘experts’ in vulnerability research. Instead, Secunia not only fuels media articles that aren’t critical of their report, they Tweet little snippets without any context. They also have their partners repeat their statistics, giving them more exposure, misleading the customers of other companies.

Vulnerability statistics can be useful. But they must be presented in a responsible and educated manner to serve their purpose. Those not familiar with vulnerability aggregation and generating the resulting statistics typically do a poor job, and add confusion to the matter. We like to call such people “vulnerability tourists”, and we can now add Secunia to the growing list of them.

[Update: 2015-04-01. One additional quote from the 2010 Secunia Report was added, and clarification of wording regarding the yearly vulnerability total chart was made.]

Reviewing the Secunia 2013 Vulnerability Review

[This was originally published on the OSVDB blog.]

On February 26, Secunia released their annual vulnerability report (link to report PDF) summarizing the computer security vulnerabilities they had cataloged over the 2013 calendar year. For those not familiar with their vulnerability database (VDB), we consider them a ‘specialty’ VDB rather than a ‘comprehensive’ VDB (e.g. OSVDB, X-Force). They may disagree on that point, but it is a simple matter of numbers that leads us to designate them as such. That also tends to explain why some of our conclusions and numbers are considerably different from, and more complete than, theirs.

In past years this type of blog post would not need a disclaimer, but it does now. OSVDB, while the website is mostly open to the public, is also the foundation of the VulnDB offering from our commercial partner and sponsor Risk Based Security (RBS). As such, we are now a direct competitor to Secunia, so any criticism leveled at them or their report may be biased. On the other hand, many people know that I am consistently critical of just about any vulnerability statistics published. Poor vulnerability statistics have plagued our industry for a long time. So much so that Steven Christey from CVE and I gave a presentation last year at the BlackHat briefings in Las Vegas on the topic.

One of the most important messages and take-aways from that talk is that all vulnerability statistics should be disclaimed and explained in advance. That means that a vulnerability report should start out by explaining where the data came from, applicable definitions, and the methodology of generating the statistics. This puts the subsequent statistics in context to better explain and disclaim them, as a level of bias enters any set of vulnerability statistics. Rather than follow the Secunia report in the order they publish them, I feel it is important to skip to the very end first. For that is where they finally explain their methodology to some degree, which is absolutely critical in understanding how their statistics were derived.

On page 16 (out of 20) of the report, in the Appendix “Secunia Vulnerability Tracking Process”, Secunia qualifies their methodology for counting vulnerabilities.

A vulnerability count is added to each Secunia Advisory to indicate the number of vulnerabilities covered by the Secunia Advisory. Using this count for statistical purposes is more accurate than counting CVE identifiers. Using vulnerability counts is, however, also not ideal as this is assigned per advisory. This means that one advisory may cover multiple products, but multiple advisories may also cover the same vulnerabilities in the same code-base shared across different programs and even different vendors.

First, the ‘vulnerability count’ referenced is not part of a public Secunia advisory, so their results cannot be realistically duplicated. The next few lines are important, as they invalidate the Secunia data set for drawing any type of real conclusion on the state of vulnerabilities. Not only can one advisory cover multiple products, multiple advisories can also cover the same single vulnerability, just across different major versions. This high rate of duplicates and lack of unique identifiers make the data set too convoluted for meaningful statistics.

CVE has become a de facto industry standard used to uniquely identify vulnerabilities which have achieved wide acceptance in the security industry.

This is interesting to us because Secunia is not fully mapped to CVE historically. Meaning, there are thousands of vulnerabilities that CVE has cataloged, that Secunia has not included. CVE is a de facto industry standard, but also a drastically incomplete one. At the bare minimum, Secunia should have a 100% mapping to them and they do not. This further calls into question any statistics generated off this data set, when they knowingly ignore such a large number of vulnerabilities.

From Remote
From remote describes other vulnerabilities where the attacker is not required to have access to the system or a local network in order to exploit the vulnerability. This category covers services that are acceptable to be exposed and reachable to the Internet (e.g. HTTP, HTTPS, SMTP). It also covers client applications used on the Internet and certain vulnerabilities where it is reasonable to assume that a security conscious user can be tricked into performing certain actions.

Classification for the location of vulnerability exploitation is important, as this heavily factors into criticality, either via common usage or through scoring systems such as CVSS. In their methodology, we see that Secunia does not make a distinction between ‘remote’ and ‘context-dependent’ (or ‘user-assisted’ by some). This means that the need for user interaction is not factored into an issue and, ultimately, scoring and statistics are based only on network, local (adjacent) network, or local vectors.

Secunia further breaks down their classification in the appendix under “Secunia Vulnerability Criticality Classification“. However, it is important to note that their breakdown does not really jibe with any other scoring system. Looking past the flaw of using the word ‘critical’ in all five classifications, the distinction between ‘Extremely Critical’ and ‘Highly Critical’ is minor; based on their descriptions, it appears to rest solely on whether Secunia is aware of exploit code existing for that issue. This mindset is straight out of the mid 90s in regards to threat modeling. In today’s landscape, if details are available about a vulnerability, then it is a given that a skilled attacker can either write or purchase an exploit for the issue within a few days, for a majority of disclosed issues. In many cases, even when details aren’t public but a patch is, that is enough to reliably reverse it and leverage it for working exploit code in a short amount of time. Finally, both of these designations still do not abstract on whether user interaction is required. In each case, it may or may not be. In reality, I imagine that the difference between ‘Extremely’ and ‘Highly’ is supposed to be based on whether exploits are happening in the wild at time of disclosure (i.e. zero day).

Now that we have determined their statistics cannot be reproduced, use a flawed methodology, and are based on drastically incomplete data, let’s examine their conclusions anyway!


The blog announcing the report is titled “1,208 vulnerabilities in the 50 most popular programs – 76% from third-party programs” and immediately calls into question their perspective. Reading down a bit, we find out what they mean by “third-party programs”:

“And the findings in the Secunia Vulnerability Review 2014 support that, once again, the biggest vulnerability threat to corporate and private security comes from third-party – i.e. non-Microsoft – programs.”

Unfortunately, this is not how most in our industry define a third-party program. On a higher, more general level, a "third-party software component" is "a reusable software component developed to be either freely distributed or sold by an entity other than the original vendor of the development platform" (Wikipedia). In the world of VDBs, we frequently refer to a third-party component as a 'library' that is integrated into a bigger package. For example, Adobe Reader 10, found on many desktop computers, is built on Adobe's own code but also incorporates as many as 212 other pieces of software. The notion that "non-Microsoft" software is "third-party" is very weird, for lack of a better word, and shows the mindset and perspective of Secunia. This completely discounts users of Apple, Linux, VMs (e.g. Oracle, VMware, Citrix), and mobile devices, among others. Such a Microsoft-centric report should clearly be labeled as such, not as a general vulnerability report.

In the Top 50 programs, a total of 1,208 vulnerabilities were discovered in 2013. Third-party programs were responsible for 76% of those vulnerabilities, although these programs only account for 34% of the 50 most popular programs on private PCs. The share of Microsoft programs (including the Windows 7 operating system) in the Top 50 is a prominent 33 products – 66%. Even so, Microsoft programs are only responsible for 24% of the vulnerabilities in the Top 50 programs in 2013.

This is aiming for the most convoluted summary award apparently. I really can’t begin to describe how poorly this comes across. If you want to know the ‘Top 50 programs’, you have to read down to page 18 of the PDF and then resolve a lot of questions, some of which will be touched on below. When you read the list, and see that several ‘Microsoft’ programs actually had 0 vulnerabilities, it will call into question the “prominent 33 products” and show how the 66% is incorrectly weighted.

“However, another very important security factor is how easy it is to update Microsoft programs compared to third-party programs.” — Secunia CTO, Morten R. Stengaard.

When debunking vulnerability statistics, I tend to focus on the actual numbers. This is a case where I simply have to branch out and question how a ‘CTO’ could make this absurd statement. In one sentence, he implies that updating Microsoft is easy, while third-party programs (i.e. non-Microsoft programs per their definition) are not. Apparently Mr. Stengaard does not use Oracle Java, Adobe Flash player, Adobe Air, Adobe Reader, Mozilla Firefox, Mozilla Thunderbird, Google Chrome, Opera, or a wide range of other non-Microsoft desktop software, all of which have the same one-click patching/upgrade ability. Either Mr. Stengaard is not qualified to speak on this topic, or he is being extremely disingenuous in his characterization of non-Microsoft products to suit the needs of supporting this report and patch management business model. If he means that patching Windows is easier on an enterprise scale (e.g. via SCCM or WSUS), then that is frequently true, but such qualifications should be clear.

This is a case where using a valid and accepted definition of ‘third-party programs’ (e.g. a computing library) would make this quote more reasonable. Trying to upgrade ffmpeg, libav, or WebKit in the context of the programs that rely on them as libraries is not something that can be done by the average user. The problem is further compounded when portions of desktop software are used as a library in another program, such as AutoCad which appears in the Adobe Reader third-party license document linked above. However, these are the kinds of distinctions that any VDB should be fully aware of, and be able to disclaim and explain more readily.


Moving on to the actual ‘Secunia Vulnerability Review 2014‘ report, the very first line opens up a huge can of worms as the number is incorrect and entirely misleading. The flawed methodology used to generate the statistic cascades down into a wide variety of other incorrect conclusions.

The absolute number of vulnerabilities detected was 13,073, discovered in 2,289 products from 539 vendors.

It is clear that a significant number of vulnerabilities are being counted multiple times. While this number is generated from Secunia's internal 'vulnerability count' associated with each advisory, they miss the most obvious flaw: many of their advisories cover the exact same vulnerability. Rather than abstracting so that one advisory is updated to reflect additional impacted products, Secunia releases additional advisories. This is immediately visible in cases where a protocol is found to have a vulnerability, such as the "TLS / DTLS Protocol CBC-mode Ciphersuite Timing Analysis Plaintext Recovery Cryptanalysis Attack" (OSVDB 89848). This one vulnerability impacts any product that implements the protocol, so it is expected to be widespread. As such, that one vulnerability tracks to 175 different Secunia advisories. This is not a case where 175 different vendors coded the same vulnerability, or the issue is distinct in their products. This is a case of a handful of base products (e.g. OpenSSL, GnuTLS, PolarSSL) implementing the flawed protocol, and hundreds of vendors bundling that software as part of their own.

While that is an extreme example, the problem is certainly front-and-center due to their frequent multi-advisory coverage of the same issue. Consider that one OpenSSL vulnerability may be covered in 11 Secunia advisories. Then look at other products that are frequently used as libraries or found on multiple Linux distributions, each of which gets its own advisory. Below is a quick chart showing examples of a single vulnerability in one of several products, along with the number of Secunia advisories that reference that one vulnerability:

Example w/ 1 Vuln               # of Secunia Adv
CVE-2013-6367 Linux Kernel      15
CVE-2013-6644 Google Chrome     5
CVE-2013-5609 Mozilla           14
CVE-2013-4408 Samba             10
CVE-2013-6415 Ruby on Rails     10
CVE-2014-0368 Oracle Java       27

This problem is further compounded when you consider the number of vulnerabilities in those products in 2013, where each one received multiple Secunia advisories. This table shows the products from above, and the number of unique vulnerabilities as tracked by OSVDB for that product in 2013 that had at least one associated Secunia advisory:

Software          # of Vulns in Product in 2013 w/ Secunia Ref
Linux Kernel      131
Google Chrome     267
Mozilla           148
Samba             10
Ruby on Rails     14
Oracle Java       194

It is easy to see how Secunia quickly jumped to 13,073 vulnerabilities while only issuing 3,327 advisories in 2013. If there is any doubt about vulnerability count inflation, consider these four Secunia advisories that cover the same set of vulnerabilities, each titled “WebSphere Application Server Multiple Java Vulnerabilities“. Secunia created four advisories for the same vulnerabilities simply to abstract based on the major versions affected, as seen in this table:

Secunia Advisory    Affected Versions
56778               reported in versions 8.5.0.0 through 8.5.5.1
56852               reported in versions 8.0.0.0 through 8.0.0.8
56891               reported in versions 7.0.0.0 through 7.0.0.31
56897               reported in versions 6.1.0.0 through 6.1.0.47

The internal 'vulnerability counts' for these advisories are very likely 25, 25, 25, and 27, adding up to 102. Applied against IBM, you have 27 vulnerabilities greatly inflated and counted as 102 instead. Then consider that IBM has several hundred products that use Java, OpenSSL, and other common software. It is easy to see how Secunia could jump to erroneous conclusions:

The 32% year-on-year increase in the total number of vulnerabilities from 2012 to 2013 is mainly due to a vulnerability increase in IBM products of 442% (from 772 vulnerabilities in 2012 to 4,181 in 2013).

This unfair and potentially libelous statement was picked up by InformationWeek in an article titled “IBM Software Vulnerabilities Spiked In 2013“, which Secunia was quick to Tweet about.
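
To make the inflation mechanics concrete, here is a minimal Python sketch (the advisory contents are hypothetical stand-ins modeled on the WebSphere example above) contrasting per-advisory counting with counting unique vulnerabilities:

    # Hypothetical advisories modeled on the four WebSphere examples above;
    # each maps to the set of vulnerability identifiers it covers.
    advisories = {
        "SA56778": {f"JAVA-{i}" for i in range(1, 26)},   # 25 vulns
        "SA56852": {f"JAVA-{i}" for i in range(1, 26)},   # the same 25 vulns
        "SA56891": {f"JAVA-{i}" for i in range(1, 26)},   # the same 25 vulns
        "SA56897": {f"JAVA-{i}" for i in range(1, 28)},   # 27 vulns, superset of the above
    }

    # Per-advisory counting: sum each advisory's own vulnerability count.
    inflated = sum(len(vulns) for vulns in advisories.values())   # 102

    # Abstracting on unique vulnerabilities: take the union first.
    unique = len(set().union(*advisories.values()))                # 27

    print(inflated, unique)

The same 27 issues count four times over, and the gap only widens once hundreds of products bundling Java or OpenSSL each receive their own advisory.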


The next set of statistics is convoluted on the surface, but even more confusing when you read the details and explanations for how they were derived:

Numbers – Top 50 portfolio
The number of vulnerabilities in the Top 50 portfolio was 1,208, discovered in 27 products from 7 vendors plus the most used operating system, Microsoft Windows 7.

To assess how exposed endpoints are, we analyze the types of products typically found on an endpoint. Throughout 2013, anonymous data has been gathered from scans of the millions of private computers which have the Secunia Personal Software Inspector (PSI) installed. Secunia data shows that the computer of a typical PSI user has an average of 75 programs installed on it. Naturally, there are country- and region-based variations regarding which programs are installed. Therefore, for the sake of clarity, we chose to focus on a representative portfolio of the 50 most common products found on a typical computer and the most used operating system, and analyze the state of this portfolio and operating system throughout the course of 2013. These 50 programs are comprised of 33 Microsoft programs and 17 non-Microsoft (third-party) programs.

Reading down to page 18 of the full report, you see the table listing the "Top 50" software installed, as determined by their PSI software. The list contains a wide variety of software that are either components of Windows (meaning they come installed by default, but show up in the "Programs" list, e.g. Microsoft Visual C++ Redistributable) or, in a few cases, third-party software (e.g. Google Toolbar), many of which have 0 associated vulnerabilities. In other cases they include product driver support tools (e.g. Realtek AC 97 Update and Remove Driver Tool) or ActiveX components that are generally not installed via traditional means (e.g. comdlg32 ActiveX Control). With only approximately half of the Top 50 software having vulnerabilities, and with very different types of software components mixed together, the summary put forth by Secunia is misleading. Since they include Google Chrome on the list, by their own logic they should also include WebKit, a third-party library wrapped into Chrome, just as they include 'Microsoft Powerpoint Viewer' (33), which is a component of 'Microsoft Powerpoint' (14) and does not install separately.

Perhaps the most disturbing thing about this Top 50 summary is that Secunia only counts 7 vendors in their list. Reading through the list carefully, you see that there are actually 10 vendors represented: Microsoft, Adobe, Oracle, Mozilla, Google, Realtek, Apple, Piriform (CCleaner), VideoLAN, and Flexera (InstallShield). This seriously calls into question any conclusions put forth by Secunia regarding their Top 50 list and challenges their convoluted and irreproducible methodology.


Rather than offer a rebuttal line by line for the rest of the report and blog, we’ll just look at some of the included statistics that are questionable, wrong, or just further highlight that Secunia has missed some vulnerabilities.

In 2013, 727 vulnerabilities were discovered in the 5 most popular browsers: Google Chrome, Mozilla Firefox, Internet Explorer, Opera and Safari.

By our count, there were at least 756 vulnerabilities in these browsers: Google Chrome (295), Mozilla Firefox (155), Internet Explorer (138), Opera (9), Apple Safari (8 on desktop, 4 on mobile), and WebKit (component of Chrome and Safari, 147). The count for Opera is likely very low though. In July 2013, Opera released its first browser based on Blink, so it has very likely been affected by the vast majority of the Blink vulnerability fixes made by Google. However, Opera is not very good at clearly reporting vulnerabilities, so this likely accounts for the very low count that both we and Secunia have; something they should have clearly disclaimed.
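
For what it's worth, the arithmetic behind our 756 is simple once WebKit is abstracted out as a shared component; a quick sketch using the counts above:

    # Browser counts cited above; WebKit is tallied once as a shared component
    # of Chrome and Safari rather than being folded into each browser's total.
    per_browser = {
        "Google Chrome": 295,
        "Mozilla Firefox": 155,
        "Internet Explorer": 138,
        "Opera": 9,
        "Apple Safari": 8 + 4,   # desktop + mobile
    }
    webkit = 147

    print(sum(per_browser.values()))            # 609 browser-specific issues
    print(sum(per_browser.values()) + webkit)   # 756 total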

In 2013, 70 vulnerabilities were discovered in the 5 most popular PDF readers: Adobe Reader, Foxit Reader, PDF-XChange Viewer, Sumatra PDF and Nitro PDF Reader.

By our count, there were at least 76 vulnerabilities in these PDF readers: Adobe Reader (69), Foxit (2), PDF-XChange (1), Sumatra (0), and Nitro (4).

The actual vulnerability count in Microsoft programs was 192 in 2013; 128.6% higher than in 2012.

Based on our data, there were 363 vulnerabilities in Microsoft software in 2013, not 192. This is up from 207 in 2012, a 75.4% increase.

As in 2012, not many zero-day vulnerabilities were identified in 2013: 10 in total in the Top 50 software portfolio, and 14 in All products.
A zero-day vulnerability is a vulnerability that is actively exploited by hackers before it is publicly known, and before the vendor has published a patch for it.

By that definition, which we share, we tracked 72 vulnerabilities that were “discovered in the wild” in 2013. To be fair, our number is considerably higher because we actually track mobile vulnerabilities, something Secunia typically ignores. More curious is that based on a cursory search, we find 17 of their advisories that qualify as 0-day by their definition, suggesting they do not have a method for accurately counting them: SA51820 (1), SA52064 (1), SA52116 (2), SA52196 (2), SA52374 (2), SA52451 (1), SA53314 (1), SA54060 (1), SA54274 (1), SA54884 (2), SA55584 (1), SA55611 (1), and SA55809 (1).

Find out how quickly software vendors issue fixes – so-called patches – when vulnerabilities are discovered in All products.

This comes from their "Time to Patch for all products" summary page. The statement seems pretty clear: how fast do vendors issue fixes when vulnerabilities are discovered? However, Secunia does not track that specifically! The more appropriate question that can be answered by their data is "When are patches available at, or after, the time of public disclosure?" These are two very different metrics. The information on this page is generated using PSI/CSI statistics, so if a vulnerability is disclosed and a fix is already available at that time, it counts as within 24 hours. It doesn't factor in that the vendor may have spent months fixing the issue before the disclosure and patch.
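
The distinction is easy to show. A minimal sketch, with hypothetical dates, of the metric their data can answer versus the one their wording claims:

    from datetime import date

    # Hypothetical case: the vendor is notified privately, spends months on a
    # fix, and publishes the patch on the same day the issue is disclosed.
    vendor_notified = date(2013, 1, 10)
    patch_released  = date(2013, 6, 3)
    disclosed       = date(2013, 6, 3)

    # What the Secunia page actually measures: patch availability relative to
    # public disclosure. A same-day patch counts as "within 24 hours".
    patch_lag = (patch_released - disclosed).days               # 0 days

    # What their wording claims: how quickly the vendor issued a fix once the
    # vulnerability was discovered or reported to them.
    vendor_fix_time = (patch_released - vendor_notified).days   # 144 days

    print(patch_lag, vendor_fix_time)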


In conclusion, while we appreciate companies sharing vulnerability intelligence, the Secunia 2013 vulnerability report is ultimately fluff that provides no benefit to organizations. The flawed methodology and their inability to parse their own data mean that the conclusions cannot be relied upon for making business decisions. When generating vulnerability statistics, a wide variety of biases will always be present. It is absolutely critical that your vulnerability aggregation methodology be clearly explained so that results are qualified and have more meaning.

I could do this all day… (Poor vuln stats from @GFISoftware)

[This was originally published on the OSVDB blog.]

Despite the talk given at BlackHat 2013 by Steve Christey and myself, companies continue to produce pedestrian and inaccurate statistics. This batch comes from Cristian Florian at GFI Software and offers little more than confusing and misleading statistics. Florian falls into many of the traps and pitfalls outlined previously.

These are compiled from data from the National Vulnerability Database (NVD).

There's your first problem: using a data set drastically inferior to what is available. The next bit really invalidates the rest of the article:

On average, 13 new vulnerabilities per day were reported in 2013, for a total of 4,794 security vulnerabilities: the highest number in the last five years.

This is laughable. OSVDB cataloged 10,472 disclosed vulnerabilities for 2013 (an average of 28 a day), meaning these statistics were generated with less than half of the known vulnerabilities. 2013 was our third year of breaking 10,000 vulnerabilities, where the rest have at most a single year (2006), if any at all. Seriously; what is the point of generating statistics when you knowingly use a data set lacking so much? Given that 2012 was another '10k' year, the statement about 2013 being the highest number in the last five years is also wrong.

Around one-third of these vulnerabilities were classified ‘high severity’, meaning that an exploit for these vulnerabilities would have a high impact on the attacked systems.

By who? Who generated these CVSS scores exactly, and why isn't that disclaimed in the article? Why is there no mention of the 'CVSS 10' scoring problem, where VDBs must default to that score for a completely unspecified issue? With a significant number of vulnerabilities either scored by vendors with a history of incorrect scoring, or scored '10' by VDBs forced to assume the worst for unspecified issues, these numbers are completely meaningless and skewed.

The vulnerabilities were discovered in software provided by 760 different vendors, but the top 10 vendors were found to have 50% of the vulnerabilities:

I would imagine Oracle is accurate on this table, as we have cataloged 570 vulnerabilities in 2013 from them. However, the rest of the table is inaccurate because #2 is wrong. You say Cisco with 373, I say ffmpeg with 490. You say #10 is HP with 112, and I counter that WebKit had 139 (which in turn adds to Apple and Google among others). You do factor in that whole "software library" thing, right? For example, what products incorporate ffmpeg that have their own vulnerabilities? These are contenders for taking the #1 and #2 spots on the table.

Most Targeted Operating Systems in 2013

As we frequently see, no mention of severity here. Of the 363 Microsoft vulnerabilities in 2013, compared to the 161 Linux Kernel issues, impact and severity are important to look at. Privilege escalation and code execution are typical in Microsoft, while authenticated local denial of service accounts for 22% of the Linux issues (and only 1% for Microsoft).

In 2013 web browsers continued to justle – as in previous years – for first place on the list of third-party applications with the most security vulnerabilities. If Mozilla Firefox had the most security vulnerabilities reported last year and in 2009, Google Chrome had the “honor” in 2010 and 2011, it is now the turn of Microsoft Internet Explorer to lead with 128 vulnerabilities, 117 of them ‘critical’.

We already know your numbers are horribly wrong, as you don’t factor in WebKit vulnerabilities that affect multiple browsers. Further, what is with the sorting of this table putting MSIE up top despite it not being reported with the most vulnerabilities?

Sticking to just the browsers, Google Chrome had 297 reported vulnerabilities in 2013 and that does not count additional WebKit issues that very likely affect it. Next is Mozilla and then Microsoft IE with Safari at the lowest (again, ignoring the WebKit issue).

An open letter to Ashley Carman, @SCMagazine, and @SkyboxSecurity

[This was originally published on the OSVDB blog.]

[Sent to Ashley directly via email. Posting for the rest of the world as yet another example of how vulnerability statistics are typically done poorly. In this case, a company that does not aggregate vulnerabilities themselves, and has no particular expertise in vulnerability metrics, weighs in on 2013 "statistics". They obviously did not attend the talk Steve Christey and I gave at BlackHat last year, titled "Buying Into the Bias: Why Vulnerability Statistics Suck". If we do this talk again, we have a fresh example to use courtesy of Skybox.]

[Update: SkyboxSecurity has quickly written a second blog in response to this one, clarifying a lot of their methodology. No word from Carman or SC Magazine. Not surprised; they have a dismal history as far as printing corrections, retractions, or even addressing criticism.]


Ashley,

In your recent article “Microsoft leads vendors with most critical vulnerabilities“, you cite research that is factually incorrect, and I fully expect a retraction to be printed. In fact, the list of errata in this article is considerably longer than the article itself. Some of this may seem to be semantics to you, but I assure you that in our industry they are anything but. Read down, where I show you how their research is *entirely wrong* and Microsoft is not ‘number one’ here.

1. If Skybox is only comparing vendors based on their database, as maps to CVE identifiers, then their database for this purpose is nothing but a copy of CVE. It is important to note this because aggregating vulnerability information is considerably more demanding than aggregating a few databases that do that work for you.

2. You say “More than half of the company’s 414 vulnerabilities were critical.” First, you do not disclaim that this number is limited to 2013 until your last paragraph. Second, Microsoft had 490 disclosed vulnerabilities in 2013 according to OSVDB.org, apparently not one of the “20” sources Skybox checked. And we don’t claim to have all of the disclosed vulnerabilities.

3. You cite "critical vulnerability" and refer to Microsoft's definition of that as "one that allows code execution without user interaction." Yet Skybox did not define 'critical'. This is amateur hour in the world of vulnerabilities. For example, if Microsoft's definition were taken at face value, then code execution in a sandbox would still qualify, while being considerably less severe than execution without one. If you go for what I believe is the 'spirit' of the research, meaning vulnerabilities with a CVSS score of 10.0 (network, no user interaction, no authentication, full code execution to completely impact confidentiality / integrity / availability), then Microsoft had 10 vulnerabilities. Yes, only 10. If you add the 'user interaction' component, giving it a CVSS score of 9.3, they had 176. That is closer to the '216' Skybox is claiming. So again, how can you cite their research when they don't define what 'critical' is exactly? As we frequently see, companies like to throw around vulnerability statistics but give no way to reproduce their findings.

4. You say, "The lab's findings weren't particularly surprising, considering the vendors' market shares. Microsoft, for instance, is the largest company and its products are the most widely used." This is completely subjective and arbitrary. While Microsoft captures the desktop OS market share, they do not capture the browser share for example. Further, like all of the vendors in this study, they use third-party code from other people. I point this line out because when you consider that another vendor/software is really 'number one', it makes this line seem to be the basis of an anecdotal fallacy.

5. You finish by largely parroting Skybox, "Skybox analyzed more than 20 sources of data to determine the number of vulnerabilities that occurred in 2013. The lab found that about 700 critical vulnerabilities occurred in 2013, and more than 500 of them were from four vendors." We've covered the 'critical' fallacy already, as they never define what that means. I mentioned the "CVE" angle above. Now, I question why you didn't challenge them further on this. As a security writer, you should be suspicious of the notion that "20" sources has any meaning in that context. Did they simply look to 20 other vulnerability databases (that do all the initial data aggregation) and then aggregate them? Did they look at 20 unique sources of vulnerability information themselves (e.g. the MS / Adobe / Oracle advisory pages)? This matters greatly. Why? OSVDB.org monitors over 1,500 sources for vulnerability information. Monitoring CVE, BID, Secunia, and X-Force (other large vulnerability databases) is considered to be 4 of those sources. So what does 20 mean exactly? To me, it means they are amateurs at best.

6. Jumping to the Skybox blog, "Oracle had the highest total number of vulnerabilities at 568, but only 18 percent of their total vulnerabilities were deemed critical." This is nothing short of a big red warning flag to anyone familiar with vulnerabilities. This line alone should have made you steer clear from their 'research' and demanded you challenge them. It is well known that Oracle does not follow the CVSS standards when scoring a majority of their vulnerabilities. It has been shown time and time again that what they scored is not grounded in reality, when compared to the researcher report that is eventually released. Every aspect of a CVSS score is frequently botched. Microsoft and Adobe do not have that reputation; they are known for generally providing accurate scoring. Since that scoring is the quickest way to determine criticality, it is important to note here.

7. Now for what you are likely waiting for. If not Microsoft, who? Before I answer that, let me qualify my statements since no one else at this table did. Based on vulnerabilities initially disclosed in 2013, that have a CVSS score of 10.0 (meaning full remote code execution without user interaction), we get this:

Oracle: 48
Adobe: 29
Microsoft: 10

Two vendors place higher than Microsoft based on this. Now, if we consider "context-dependent code execution", meaning that user interaction is required but it leads to full code execution (e.g. clicking a malicious PDF/DOC/GIF), scored at CVSS 9.3 (CVSS2#AV:N/AC:M/Au:N/C:C/I:C/A:C), along with full remote code execution (CVSS2#AV:N/AC:L/Au:N/C:C/I:C/A:C), we get the following:

Microsoft: 176
Adobe: 132
Oracle: 122

I know, Microsoft is back on top. But wait…

ffmpeg: 326
libav: 286

Do you like apples?

Brian Martin
OSF / OSVDB.org

Android versus iOS Security – Not Again…

[This was originally published on the OSVDB blog.]

About two weeks ago, another round of vulnerability stats got passed around. Like others before, it claims to use CVE to compare Apple iOS versus Android in an attempt to establish which is more secure based on “vulnerability counts”. The statistics put forth are basically meaningless, because like most people using a VDB to generate stats, they don’t fully understand their data source. This is one type of bias that enters the picture when generating statistics, and one of many points Steve Christey (MITRE/CVE) and I will be making next week at BlackHat (Wednesday afternoon).

As with other vulnerability statistics, I will debunk the latest by showing why the conclusions are not based on a solid understanding of vulnerabilities, or vulnerability data sources. The post is published on The Verge, written by ‘Mechanicix’. The results match last year’s Symantec Internet Security Threat Report (as mentioned in the comments), as well as the results published this year by Sourcefire in their paper titled “25 Years of Security Vulns“. In all three cases, they use the same data set (CVE), and do the same rudimentary counting to reach their results.

The gist of the finding is that Apple iOS is considerably less secure than Android, as iOS had 238 reported vulnerabilities versus the 27 reported in Android, based on CVE and illustrated through CVEdetails.com.

Total iOS Vulnerabilities 2007-2013: 238
Total Android Vulnerabilities 2009-2013: 27

Keeping in mind those numbers, if you look at the CVE entries that are included, a number of problems are obvious:

  1. We see that the comparison timeframes differ by two years. There are at least 3 vulnerabilities in Android SDK reported before 2009, two of which have CVEs (CVE-2008-0985 and CVE-2008-0986).
  2. These totals are based on CVE identifiers, which do not necessarily reflect a 1-to-1 vulnerability mapping, as CVE itself documents. You absolutely cannot count CVE identifiers as a substitute for vulnerabilities; they are not the same.
  3. The vulnerability totals are incorrect due to using CVE, a data source that has serious gaps in coverage. For example, OSVDB has 71 documented vulnerabilities for Android, and we do not make any claims that our coverage is complete.
  4. The iOS results include vulnerabilities in WebKit, the framework iOS Safari uses. This is problematic for several reasons.
    1. First, that means Mechanicix is now comparing the Android OS to the iOS operating system and applications.
    2. Second, WebKit vulnerabilities account for 109 of the CVE results, almost half of the total reported.
    3. Third, if they did count WebKit intentionally then the numbers are way off as there were around 700 WebKit vulnerabilities reported in that time frame.
    4. Fourth, the default browser in Android uses WebKit, yet they weren’t counted against that platform.
  5. The results include 16 vulnerabilities in Safari itself (or in WebKit and just not diagnosed as such), the default browser.
  6. At least 4 of the 238 are vulnerabilities in Google Chrome (as opposed to WebKit) with no mention of iOS in the CVE.
  7. A wide variety of iOS applications are included in the list including Office Viewer, iMessage, Mail, Broadcom BCM4325 and BCM4329 Wi-Fi chips, Calendar, FreeType, libxslt, and more.

When you factor in all of the above, Android likely comes out on top for the number of vulnerabilities when comparing the two operating systems. Once again, vulnerability statistics seem simple on the surface. When you consider the above, and further consider that there are likely more points that influence vulnerability counts, we see that it is anything but simple.
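
To make the adjustment concrete, here is a minimal sketch using the component counts cited in the list above (the remainder bucket is simply 238 minus the WebKit, Safari, and Chrome entries):

    # Rough split of the 238 iOS CVE results cited above, by the component the
    # issue was actually reported against.
    ios_results = {
        "WebKit": 109,
        "Safari": 16,
        "Google Chrome": 4,
        "iOS and bundled apps": 238 - (109 + 16 + 4),   # 109 remaining entries
    }

    browser_stack = {"WebKit", "Safari", "Google Chrome"}
    os_and_apps = sum(n for component, n in ios_results.items()
                      if component not in browser_stack)

    print(sum(ios_results.values()), os_and_apps)   # 238 vs 109

Even this rough cut shows how much of the headline number is browser-stack churn rather than operating system flaws, before you even add the WebKit issues that should have counted against Android's default browser.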

CVE Vulnerabilities: How Your Dataset Influences Statistics

[This was originally published on the OSVDB blog.]

Readers may recall that I blogged about a similar topic just over a month ago, in an article titled Advisories != Vulnerabilities, and How It Affects Statistics. In this installment, instead of “advisories”, we have “CVEs” and the inherent problems when using CVE identifiers in the place of “vulnerabilities”. Doing so is technically inaccurate, and it negatively influences statistics, ultimately leading to bad conclusions.

NSS Labs just released an extensive report titled "Vulnerability Threat Trends; A Decade in Review, Transition on the Way", by Stefan Frei. While the report is interesting, and the fundamental methodology is sound, Frei uses a dataset that is not designed for true vulnerability statistics. Additionally, I believe that some factors Frei attributes to trends are incorrect. I offer this blog as open feedback to bring additional perspective to the realm of vulnerability stats, which is a long way from maturity.

Vulnerabilities versus CVE

In the NSS Labs paper, they define a vulnerability as "a weakness in software that enables an attacker to compromise the integrity, availability, or confidentiality of the software or the data that it processes." This is as good a definition as any. The key point here is a weakness, singular. What Frei fails to point out is that the CVE dictionary is not a vulnerability database in the same sense as many others. It is a specialty database designed primarily to assign a unique identifier to a vulnerability, or a group of vulnerabilities, to coordinate tracking and discussion. While CVE says "CVE Identifiers are unique, common identifiers for publicly known information security vulnerabilities", it is more important to note the way CVE abstracts, which is covered in great detail. From the CVE page on abstraction:

CVE Abstraction Content Decisions (CDs) provide guidelines about when to combine multiple reports, bugs, and/or attack vectors into a single CVE name (“MERGE”), and when to create separate CVE names (“SPLIT”).

This clearly denotes that a single CVE may represent multiple vulnerabilities. With that in mind, every statistic generated by NSS Labs for this report is inaccurate, and their numbers are not reproducible using any other vulnerability dataset (unless it too is based only on CVE data and does not abstract differently, e.g. NVD). This distinction puts the report's statements and conclusions in a different light:

As of January 2013 the NVD listed 53,489 vulnerabilities ..
In the last ten years on average 4,660 vulnerabilities were disclosed per year ..
.. with an all-time high of 6,462 vulnerabilities counted in 2006 ..

The abstraction distinction means that these numbers aren’t just technically inaccurate (i.e. terminology), they are factually inaccurate (i.e. actual stats when abstracting on a per-vulnerability basis). In each case where Frei uses the term “vulnerability”, he really means “CVE”. When you consider that a single CVE may cover as many as 66 or more distinct vulnerabilities, it really invalidates any statistic generated using this dataset as he did. For example:

However, in 2012 alone the number of vulnerabilities increased again to a considerable 5,225 (80% of the all-time high), which is 12% above the ten-year average. This is the largest increase observed in the past six years and ends the trend of moderate declines since 2006.

Based on my explanation, what does 5,225 really mean? If we agree, for the sake of argument, that CVE averages two distinct vulnerabilities per assignment, that is now over 10,000 vulnerabilities. How does that in turn change any observations on trending?
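
The counting difference is trivial to express. A minimal sketch with a hypothetical CVE-to-vulnerability mapping, illustrating how MERGE decisions make the two totals diverge:

    # Hypothetical mapping from CVE identifiers to the distinct vulnerabilities
    # each one covers after a MERGE decision (e.g. as tracked per-issue by a VDB).
    cve_to_vulns = {
        "CVE-2012-AAAA": ["vuln-1"],                       # 1-to-1
        "CVE-2012-BBBB": ["vuln-2", "vuln-3", "vuln-4"],   # three issues MERGEd
        "CVE-2012-CCCC": ["vuln-5", "vuln-6"],             # two issues MERGEd
    }

    cve_count  = len(cve_to_vulns)                           # 3 "vulnerabilities" per the report
    vuln_count = sum(len(v) for v in cve_to_vulns.values())  # 6 actual vulnerabilities

    print(cve_count, vuln_count)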

The report’s key findings offer 7 high-level conclusions based on the CVE data. To put all of the above in more perspective, I will examine a few of them and use an alternate dataset, OSVDB, that abstracts entries on a per-vulnerability basis. With those numbers, we can see how the findings stand. NSS Labs report text is quoted below.

The five year long trend in decreasing vulnerability disclosures ended abruptly in 2012 with a +12% increase

Based on OSVDB data, this is incorrect. Both 2009 (7,879) -> 2010 (8,835) as well as 2011 (7,565) -> 2012 (8,919) showed an upward trend.

More than 90 percent of the vulnerabilities disclosed are moderately or highly critical – and therefore relevant

If we assume "moderately" means "Medium" criticality, defined later in the report as a CVSS score of 4.0 – 6.9, then OSVDB shows 57,373 entries with CVSSv2 scores of 4.0 – 10.0, out of 82,123 total. That is roughly 70%, so the report's 90% figure is considerably higher than our data shows. Note: we do not have complete CVSSv2 data for 100% of our entries, but we do have it for all entries affiliated with the ones Frei examined, and more. If "moderately critical" and "highly critical" refer to different ranges, then they should be more clearly defined.

It is also important to note that this finding is a red herring, due to the way CVSS scoring works. A remote path disclosure in a web application scores a 5.0 base score (CVSS2#AV:N/AC:L/Au:N/C:P/I:N/A:N). This skews the scoring data considerably higher than many in the industry would agree with, as 5.0 is the same score you get for many XSS vulnerabilities that can have more serious impact.
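
For reference, the CVSSv2 base equation makes the overlap obvious. A small sketch using the metric values from the CVSSv2 specification:

    # CVSSv2 base-score arithmetic (metric weights from the CVSSv2 specification).
    AV  = {"L": 0.395, "A": 0.646, "N": 1.0}     # access vector
    AC  = {"H": 0.35, "M": 0.61, "L": 0.71}      # access complexity
    AU  = {"M": 0.45, "S": 0.56, "N": 0.704}     # authentication
    CIA = {"N": 0.0, "P": 0.275, "C": 0.660}     # impact values for C, I, A

    def cvss2_base(av, ac, au, c, i, a):
        impact = 10.41 * (1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a]))
        exploitability = 20 * AV[av] * AC[ac] * AU[au]
        f = 0 if impact == 0 else 1.176
        return round((0.6 * impact + 0.4 * exploitability - 1.5) * f, 1)

    # Remote path disclosure: AV:N/AC:L/Au:N/C:P/I:N/A:N
    print(cvss2_base("N", "L", "N", "P", "N", "N"))   # 5.0
    # A typical reflected XSS scored AV:N/AC:L/Au:N/C:N/I:P/A:N
    print(cvss2_base("N", "L", "N", "N", "P", "N"))   # 5.0

Any single 'Partial' impact paired with the easiest exploitability lands on 5.0, regardless of how different the real-world consequences are.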

9 percent of vulnerabilities disclosed in 2012 are extremely critical (with CVSS score>9.9) paired with low attack/exploitation complexity

This is another red herring, because any CVSS 10.0 score means that “low complexity” was factored in. The wording in the report implies that a > 9.9 score could be paired with higher complexity, which isn’t possible. Further, CVSS is scored for the worst case scenario when details are not available (e.g. CVE-2012-5895). Given the number of “unspecified” issues, this may seriously skew the number of CVSSv2 10.0 scores.

Finally, there was one other element to this report that was used in the overview, and later in the document, that is used to attribute a shift in disclosure trends. From the overview:

The parallel and massive drop of vulnerability disclosures by the two long established purchase programs iDefense VCP and TippingPoint ZDI indicate a transition in the way vulnerability and exploit information is handled in the industry.

I believe this is a case of “correlation does not mean causation“. While these are the two most recognized third-party bug bounty programs around, there are many variables at play here. In the bigger picture, shifts in these programs do not necessarily mean anything. Some of the factors that may have influenced disclosure numbers for those two programs include:

  • There are more bug bounty programs available. Some may offer better prices or incentives for disclosing through them, taking business away from iDefense/ZDI.
  • Both companies have enjoyed their share of internal politics that affected at least one program. In 2012, several people involved in the ZDI program left the company to form their own startup. It has been theorized that since their departure, ZDI has not built the team back up, and that disclosures were affected as a result.
  • ZDI had a small bout of external politics, in which one of their most prolific bounty collectors (Luigi Auriemma) had a serious disagreement about ZDI's handling of a vulnerability, as it relates to Portnoy and Exodus. Auriemma's shift to disclosing via his own company would dramatically affect ZDI disclosure totals alone.
  • Both of these companies have a moving list of software that they offer a bounty on. As it changes, it may result in spikes of disclosures via their programs.

Regardless, iDefense and ZDI represent a small percentage of overall disclosures, so it is curious that Frei opted to focus on this so prominently as a reason for changing vulnerability trends without considering other influencing factors. Even during a good year, 2011 for example, iDefense (42) and ZDI (297) together accounted for 339 out of 7,565 vulnerabilities, only ~4.5% of the overall disclosures. There are many other trends that could just as easily explain relatively small shifts in disclosure totals. Making statements about trends in vulnerability disclosure, and how they affect statistics, is not something that should be done by casual observers; they simply miss a lot of the low-level details gleaned from day-to-day vulnerability handling and cataloging.

To be clear, I am not against using CVE/NVD data to generate statistics. However, when doing so, it is important that the dataset be explained and qualified before going into analysis. The perception and definition of what “a vulnerability” is changes based on the person or VDB. In vulnerability statistics, not all vulnerabilities are created equal.

Advisories != Vulnerabilities, and How It Affects Statistics

[This was originally published on the OSVDB blog.]

I've written about the various problems with generating vulnerability statistics in the past. There are countless factors that contribute to, or skew, vulnerability stats. This is an ongoing problem for many reasons. First, important numbers are thrown around in the media and taken as gospel, creating varying degrees of bias in administrators and owners. Second, these stats are rarely explained to show how they were derived. In short, no one shows their work, potential bias, caveats, or other issues that a responsible security professional should include. A recent article has highlighted this problem again. To better show why vulnerability stats are messy, but important, I will show you how trivial it is to skew numbers simply by using different criteria, along with several pitfalls that must be factored into any set of stats you generate. The fun part is that the words used to describe the differences can be equally nebulous, and they are all valid, if properly disclaimed!

I noticed a Tweet from SC Magazine about an article titled “The ghosts of Microsoft: Patch, present and future”. The article is by Alex Horan, security strategist, CORE Security and discusses Microsoft’s vulnerabilities this year. Reading down, the first line of the second paragraph immediately struck me as being incorrect.

Based on my count, there were 83 vulnerabilities announced by Microsoft over the past year. This averages out to a little more than six per month, a reasonable number of patches (and reboots) to apply to your systems over the course of a year.

It is difficult to tell if Horan means “vulnerabilities” or “patches”, as he appears to use the same word to mean both, when they are quite different. The use of ’83’ makes it very clear, Horan is referencing Microsoft advisories, not vulnerabilities. This is an important distinction as a single advisory can contain multiple vulnerabilities.

A cursory look at the data in OSVDB showed there were closer to 170 vulnerabilities verified by Microsoft in 2012. A search for references to "MS12" (used in their advisory designations) returns 160 results. This made it easy to determine that the number Horan used was inaccurate, or at least that his wording was. If you generate statistics based on advisories versus independent vulnerabilities, results will vary greatly. To add a third perspective, we must also consider the total number of disclosed vulnerabilities in Microsoft products. This includes ones that did not correspond to a Microsoft advisory (e.g. perhaps a KB only), did not receive a CVE designation, or were missed completely by the company. On Twitter, Space Rogue (@spacerog) asked about severity breakdowns over the last few years. Since that would take considerable time to generate, I am going to stay focused on 2012 as it demonstrates the issues. Hopefully this will give him a few numbers though!

If we look at the 2012 Microsoft advisories versus 2012 Microsoft CVE versus 2012 Microsoft total vulnerabilities, and do a percentage breakdown by severity, you can see heavy bias. We will use the following breakdown of CVSS scores to determine severity: 9 – 10 = critical, 7 – 8.9 = important, 4 – 6.9 = moderate, 0 – 3.9 = low.

Base Source            Critical       Important     Moderate      Low
2012 Advisories (83)   35 (42.2%)     46 (55.4%)    2 (2.4%)      0 (0.0%)
2012 CVE (160)         100 (62.5%)    18 (11.3%)    39 (24.4%)    3 (1.8%)
2012 Total (176)       101 (57.4%)    19 (10.8%)    41 (23.3%)    15 (8.5%)
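
A minimal sketch of the bucketing behind those percentages (the input scores here are placeholders, not the actual 2012 data):

    from collections import Counter

    def severity(score):
        if score >= 9.0:
            return "critical"
        if score >= 7.0:
            return "important"
        if score >= 4.0:
            return "moderate"
        return "low"

    def breakdown(scores):
        counts = Counter(severity(s) for s in scores)
        total = len(scores)
        return {sev: (n, round(100.0 * n / total, 1)) for sev, n in counts.items()}

    # Placeholder scores for illustration only.
    print(breakdown([10.0, 9.3, 7.2, 5.0, 5.0, 2.1]))
    # {'critical': (2, 33.3), 'important': (1, 16.7), 'moderate': (2, 33.3), 'low': (1, 16.7)}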

It isn’t easy to see the big shifts in totals in a chart, but it is important to establish the numbers involved when displaying any type of chart or visual representation. If we look at those three breakdowns using simple pie charts, the shifts become much more apparent:

The visual jump in critical vulnerabilities from the first to the second two charts is distinct. In addition, notice the jump from the first two charts to the third in regards to the low severity vulnerabilities and that they didn’t even make an appearance on the first chart. This is a simple example of how the “same” vulnerabilities can be represented, based on terminology and the source of data. If you want to get pedantic, there are additional considerations that must be factored into these vulnerabilities.

In no particular order, these are other points that should not only be considered, but disclaimed in any presentation of the data above. While it may seem minor, at least one of these points could further skew vulnerability counts and severity distribution.

  • MS12-080 only contains 1 CVE if you look at the immediate identifiers, but it also contains 2 more CVEs in the fine print related to Oracle Outside In, which is used by the products listed in the advisory.
  • MS12-058 has no immediate CVEs at all! If you read the fine print, it actually covers 13 vulnerabilities. Again, these are vulnerabilities in Oracle Outside In, which is used in some Microsoft products.
  • Of the 176 Microsoft vulnerabilities in 2012, as tracked by OSVDB, 10 do not have CVE identifiers assigned.
  • OSVDB 83750 may or may not be a vulnerability, as it is based on a Microsoft KB with uncertain wording. Vague vulnerability disclosures can skew statistics.
  • Most of these CVSS scores are taken from the National Vulnerability Database (NVD). NVD outsources CVSS score generation to junior analysts from a large consulting firm. Just as we occasionally have mistakes in our CVSS scores, so does NVD. Overall, the number of scores with serious errors is low, but they can still introduce a level of error into statistics.
  • One of the vulnerabilities (OSVDB 88774 / CVE-2012-4792) has no formal Microsoft advisory, because it is a 0-day that was just discovered two days ago. There will almost certainly be a formal Microsoft advisory in January 2013 that covers it. This highlights a big problem with using vendor advisories for any statistic generation. Vendors generally release advisories when their investigation of the issue has completed, and a formal solution is made available. Generating statistics or graphics off the same vulnerabilities, but using disclosure versus solution date will give two different results.

These are just a few ways that statistics can be manipulated, often by accident, and why presenting as much data and explanation as possible is beneficial to everyone. I certainly hope that SC Magazine and/or CORE will issue a small correction or explanation as to what the "83" number really represents.