[This was originally published on RiskBasedSecurity.com.]
Shortly after a year closes out, the industry is treated to dozens of security companies that want to tell you all about vulnerability totals and trends from the previous year. In many cases, the companies offering the predictions are armchair experts of sorts, who do not aggregate vulnerability intelligence on their own. Instead, they simply download a set of vulnerability data, do some rudimentary analysis, compare it to prior years, and give their opinion on what will happen in the coming year.
The first and most important problem with these companies is that they all tend to use the same data set from the MITRE-curated CVE project, which is basically echoed at NIST’s National Vulnerability Database (NVD). For the purposes of counting vulnerabilities, the data sets are basically identical (NIST adds two primary types of metadata to the entries from CVE). The second problem, prevalent in our industry since CVE’s creation, is the incorrect notion that the CVE database is comprehensive. While it certainly appears more comprehensive in some years, it most certainly has not been comprehensive in the last few years. Since most of the vulnerability statistic fortune tellers don’t aggregate the data, or follow CVE’s procedures in any way, they don’t tend to think about why the data might not be complete and what kind of disclaimers their ‘analysis’ should carry. When working with any given data set, it is crucial that you fully understand how it was created, what the limitations were, and what the intent of the collection was. Not understanding such data typically leads to predictions that are inaccurate and a disservice to the industry and to these companies’ customers.
Steve Christey, one of the founders of CVE, and our own Brian Martin gave a presentation at the BlackHat Briefings (PPT) several years ago in which they outlined many of the ways bias enters vulnerability metrics. One of the biases that Steve specifically addressed is how the staffing and resources of a vulnerability database can directly influence how many vulnerabilities are actually aggregated. For example, in 2015 there are a number of things going on at MITRE that directly influence CVE’s performance. Many people, including our team at RBS, have observed that the wait to receive a CVE assignment from MITRE can take as many as 45 days, sometimes much longer. This delay does not coincide with an uptick in assignments compared to previous years. This relatively new delay in assignments, along with MITRE being honest about internal issues, makes it clear that 2015 CVE data may be heavily influenced by those resource constraints.
Instead of looking beyond a blob of data or consulting those more familiar with the topic, many companies tend to do very simple counting based on the CVE ID, using it as a way to come up with a total for the number of vulnerabilities disclosed in a given year. Then they look at the number of disclosures per year and come up with their observations and the trends they perceive. This typically revolves around claims such as “vulnerabilities are up last year!” or “vulnerability disclosures dropped last year!” or “last year was a record year for whatever!” Which leads to our own prediction!
Early next year, at least a dozen companies will write about how vulnerabilities are down in 2015.
They will do this while ignoring plenty of criticism of the same type of predictions in the past. As of December 6, 2015, there are 8,465 CVE identifiers for the year. However, 3,391 of them are RESERVED and may or may not be used, with another 43 flagged as REJECTED. That gives us 5,031 live IDs with just under a month to go. Looking at the CVEdetails site, which actually uses NVD data instead of CVE exports, we see they have 5,906 entries open. That is still over 2,000 short of last year but ahead of 2013.
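The arithmetic behind that live-ID figure is worth spelling out, since it is exactly the step that naive ID counting skips. The following is a minimal sketch in Python using only the December 6, 2015 figures quoted above; the function name `live_id_count` is our own illustration and not part of any CVE or NVD tooling.

```python
def live_id_count(total: int, reserved: int, rejected: int) -> int:
    """Return the number of IDs that are neither RESERVED nor REJECTED.

    Hypothetical helper for illustration only; the inputs are plain
    counts, not a parse of any real CVE export format.
    """
    return total - reserved - rejected


# Figures quoted in the text for December 6, 2015.
TOTAL_IDS = 8465
RESERVED = 3391
REJECTED = 43

live = live_id_count(TOTAL_IDS, RESERVED, REJECTED)
print(live)  # 5031

# Share of assigned IDs that carry no public details yet.
reserved_share = RESERVED / TOTAL_IDS
print(f"{reserved_share:.0%} of IDs are RESERVED")  # 40% of IDs are RESERVED
```

Roughly four in ten identifiers were still RESERVED at that point, so a total built by counting IDs says as much about assignment and publication timing as it does about disclosures.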
For the newcomers to vulnerability statistics looking at the data compared to last year, the predictions will say vulnerabilities are down. Those looking at the last several years might say that 2014 was an abnormal spike in disclosures. And indeed it was, in one specific way: Will Dormann’s “Tapioca” project increased CVE’s count by approximately 1,380 vulnerabilities and is heavily responsible for the 2014 uptick. Looking at the historical breakdown of CVE/NVD data according to CVEdetails:
That chart can only legitimately be used to observe a single trend: how many vulnerabilities CVE aggregated over the years. It absolutely cannot be used to speak to the number of vulnerabilities found or disclosed in any given year. Next, compare those numbers to VulnDB’s coverage:
There are clearly dips and trends for any given vulnerability database, and some of the dips and peaks may follow each other in some years. While that is interesting and may speak to events that influenced vulnerability disclosures, it isn’t a certainty. Looking at the VulnDB data, one might be inclined to say that a second data source indicates vulnerabilities are down. However, with almost a month to go, VulnDB is on pace to have a year equivalent to the last, while the CVE data shows it is not likely to come close. Comparing these two sources of vulnerability intelligence side by side, you see several years that don’t follow the same trend as far as totals:
With this, we implore companies not to put forth ‘their’ statistics and observations using third-party vulnerability data without fully understanding the data and how it was derived. Many companies operate in the realm of security and vulnerabilities, but aggregation and analysis of vulnerability disclosures is a very different expertise than finding or patching them. Just because your company does ‘security’ doesn’t mean it has expertise in this realm. Rather than try to present yourself as an expert in this part of your world, if you have thoughts on vulnerability statistics, or want to better visualize trends, then invest the time to better understand the data before you make your own predictions. Failing that, reach out to those who work in this little slice of the vulnerability landscape and tap their expertise (contact us!). Their knowledge of aggregation and trends, along with your own expertise in other facets of security, can make a complementary offering and analysis.
Remember the wise words of Lao Tzu, who said:
Those who have knowledge, don’t predict. Those who predict, don’t have knowledge.