During the process of assigning a CVE ID, there is a time period between the assignment and the disclosure, and again between the disclosure and it becoming available on MITRE’s CVE site or NIST’s National Vulnerability Database (NVD). During this period, the ID will be shown as RESERVED.
First, it is important to note that when an ID is part of a CVE Numbering Authority (CNA) pool of IDs to potentially be assigned, it is shown in RESERVED status. If an ID is not assigned that year, it is then supposed to be moved to REJECT status the following year per CVE rules. It is a bit odd that they say the reason for the rejection “will most often be stated”; most often and not always? If a CNA other than MITRE assigns an ID and the researcher and/or vendor later publicly discloses the vulnerability, it may still show as RESERVED. This happens when the CNA fails to notify MITRE, despite notification being stipulated in the CNA rules. It can also happen if the CNA notifies MITRE but it slips through the cracks. Pretty simple, right?
If MITRE assigns the ID to a researcher, it is a lot more likely to stay in RESERVED status after disclosure because the researcher who publicly discloses the vulnerability doesn’t always notify MITRE. You may ask why MITRE doesn’t open the CVE with details themselves if it is public, and that is a great question! The simple answer is, MITRE does not really monitor public sources for disclosures any longer. Back in the day they would monitor Bugtraq and NTBugtraq and encouraged researchers to just disclose directly to those mail lists. During that time, they also said they monitored four sources for new vulnerability information but notably did not include either mail list, instead including four different summaries being published. I think we can chalk that up to an error in documentation.
For those not familiar with MITRE’s coverage for CVE, consider that they no longer publish three lists of interest. As late as March 6, 2016, MITRE maintained lists of what they considered:
“Full Coverage Sources” – “For nearly all issues disclosed by the source that could be associated with a CVE entry, there will be an associated CVE entry, regardless of the criticality of the issue. Although a source is named as Full Coverage, we purposely use the phrasing “nearly all issues disclosed” to allow the flexibility to potentially postpone coverage of minor issues.”
“Partial Coverage Sources” – “The source will be actively monitored but issues will be processed and associated with CVE entries based on a variety of editorial judgments.”
“Must-Have Products” – “All products listed are considered to be “must have.” This means that we will ensure that a CVE-ID is issued for any public disclosure for the product provided that the following to provisions are met…”
By the end of 2016, that page maintained the same URL but changed content to become what would be their CNA coverage page. By early 2017, the old URL redirected to a new one about requesting a CVE ID and CNA coverage, which is roughly the same as currently available. This is an important shift in how CVE operates as MITRE basically threw in the towel trying to actively monitor disclosures and moved to relying almost entirely on CNAs and researchers coming to them.
The part that is truly baffling to me is that this taxpayer-funded project, costing us millions a year, thought that monitoring 48 sources for “full” coverage, 45 sources for “partial” coverage, and guaranteeing 45 products was ever adequate to begin with, and somehow a burden at that point. They also disclaimed that they “actively [monitor] many sources beyond this list. These sources include things like blogs from vulnerability researchers, conference proceedings, and media outlets.” Despite that claim and coverage, MITRE was already missing thousands of vulnerability disclosures a year including ones from sources on their list.
What should worry consumers of CVE is that other vulnerability databases monitor a lot more sources than that for a lot less money. Any claims of it being more complicated or the issue being due to their processes mean there is an incredible amount of red tape or horribly outdated technical processes that were never updated. If another database can monitor literally several thousand sources a week for a fraction of the price, it speaks to MITRE not evolving over the years. Whew, glad that wraps it up!
Sorry, one last thing. Like entries in REJECT status, we can’t trust entries in RESERVED status either. Based on the above and how MITRE operates, we know there are bound to be quite a few vulnerabilities where a researcher requested an ID, published details, and did not notify MITRE. Their backwards choice of not monitoring sources for disclosures means a disclosure may sit in RESERVED status for some time. How long? I went poking around a bit for fun and found this one. At the time of this blog, CVE-2000-1253 is still in RESERVED status (archive).
The issue? That was disclosed in 2015, and likely earlier. The actual vulnerability details were public as far back as 2003, maybe earlier. The good news? If you aren’t worried about remote root on a medical device, no need to be worried about this one.
For analysts and practitioners that digest CVE regularly, you will likely be familiar with CVEs that are in REJECT status. If you are new to CVE or not familiar with some of the more gritty details, a CVE assignment may be rejected for various reasons. When that happens, it will receive a capitalized REJECT status:
The REJECT links to a page that offers more information but, as of April 2021, it links to the correct page with the wrong anchor. I’ll link to the correct anchor for your reference, which gives us several reasons an ID might be rejected including “it being a duplicate CVE Record, it being withdrawn by the original requester, it being assigned incorrectly, or some other administrative reason.”
At the time of this blog, there are almost 9,500 IDs that have been rejected. A significant portion of those come from MITRE being more proactive and enforcing that CVE Numbering Authorities (CNA) reject unused IDs from prior years, along with a general increase in the total CVEs assigned per year.
The process of rejecting, and the presence of REJECT entries, is straightforward.
That’s right, if I am taking the time to blog about a topic seemingly so easy, there’s probably more to it. In this case, I wanted to point out a couple examples of CVE IDs that are in REJECT status, but highlight issues. The first is a simple one that underscores that the process of CNAs rejecting CVE IDs may have a problem, or that MITRE has an issue in the way they described the rejected ID. We’ll take CVE-2018-1226 (archive), that was rejected because “The CNA or individual who requested this candidate did not associate it with any vulnerability during 2018. Notes: none.” That is easy enough, right? The problem is that it was rejected by March 19, 2018. Not even a quarter of the way through 2018 and it was rejected because it was not associated with a vulnerability that year? That seems problematic. I’m sure there is a good explanation for this, but the description sure doesn’t do it.
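That date mismatch is the kind of thing that can be checked mechanically. Below is a minimal sketch, assuming the rejection reason always cites its year with the phrase "during YYYY" as in the quoted description; the function name and the idea of supplying the rejection date yourself are illustrative assumptions, not anything CVE provides.

```python
import re
from datetime import date


def rejected_before_year_end(reason: str, rejected_on: date) -> bool:
    """Return True when a reason cites 'during YYYY' but the rejection
    happened on or before December 31 of that same year."""
    match = re.search(r"during (\d{4})", reason)
    if match is None:
        return False  # the reason does not cite a year, nothing to compare
    cited_year = int(match.group(1))
    return rejected_on <= date(cited_year, 12, 31)


# Description text and date for CVE-2018-1226 as quoted above.
reason = (
    "The CNA or individual who requested this candidate did not "
    "associate it with any vulnerability during 2018. Notes: none."
)
print(rejected_before_year_end(reason, date(2018, 3, 19)))
```

A flagged entry is not proof of an error, only a prompt to go look at why the description and the timeline disagree.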
You may think that pointing that out is pedantic, and you are right. However, there is an important reason we need to be pedantic and expect accurate descriptions from CVE, even for a rejected entry. What if the REJECT message was factually incorrect? What if that CVE ID represented a valid vulnerability that impacted your organization? If you rely on CVE/NVD you would have a blind spot as a result of errors in their process, which are critical to you. Looking at two older rejected CVE IDs as an example, CVE-2015-0788 (archive) and CVE-2015-0789 (archive), we see that both are in REJECT status because they were not associated with a vulnerability in 2015.
Looking closer, we can see that the assigning CNA was Micro Focus International. As such they should be the single source of truth and provenance for any vulnerability information associated with those CVEs. MITRE would be secondary and should not necessarily be trusted if there is a dispute. In this case, there is a dispute in the form of NetIQ Identity Manager release notes listing both CVEs as fixed issues in version 4.5 Service Pack 2.
NetIQ was founded in 1995, acquired by Attachmate in 2006, and then acquired by Micro Focus in 2014. With this document we see the conflict: Micro Focus says the IDs were assigned, represented legitimate vulnerabilities, and were fixed, while CVE says they are rejected.
The takeaway here is that while a CVE may be listed as REJECTed, trust, but verify.
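One way to put "trust, but verify" into practice is to cross-check rejected IDs against what the vendor itself has published. The sketch below assumes you already have a set of rejected IDs and the plain text of a vendor document in hand; the function name and the sample release-note wording are hypothetical and only stand in for the real NetIQ notes.

```python
import re

# Four-digit year, then four or more digits, per the 2014 syntax change.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def rejected_but_referenced(rejected_ids, vendor_text):
    """Return rejected CVE IDs that the vendor text still mentions."""
    mentioned = set(CVE_PATTERN.findall(vendor_text))
    return sorted(mentioned & set(rejected_ids))


# Hypothetical wording; only the two IDs and the version come from the post.
release_notes = (
    "Identity Manager 4.5 Service Pack 2 resolves the following issues: "
    "CVE-2015-0788, CVE-2015-0789."
)
rejected = {"CVE-2015-0788", "CVE-2015-0789", "CVE-2018-1226"}
print(rejected_but_referenced(rejected, release_notes))
```

Any ID that comes back is a candidate for a closer look, since the assigning CNA and the CVE entry disagree about it.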
For almost two decades, CVE has been considered an industry standard for vulnerability tracking. A CVE ID can be affiliated with many vulnerabilities, in a format like CVE-2014-54321. Note my choice in ID, from 2014 with a consecutive set of numbers. That is because I specifically chose a ‘sample’ CVE that was set aside as an example of the CVE ID Syntax Change in 2014. This change occurred when it was determined that 9,999 IDs for a single year was not going to be sufficient. Technical guidance on this is available, as well as more basic information and the announcement about the change. Starting out with this hopefully demonstrates that there may be more to an ID than meets the eye.
Fundamentally, the ID is simple; you have the CVE prefix, followed by a year identifier, and a numeric identifier. In the CVE used above, it would represent ID 54321 with a 2014 year identifier. Fairly simple! But you are reading an entire blog on these IDs by me so the spoiler is here. It isn’t so simple unfortunately. I want to give a rundown of what a CVE ID really is, and set the record straight. Why? Because I don’t think MITRE has done a good job with that, and worse, actively works against what could be a clear and simple policy. We’ll use CVE-YEAR-12345 as a representative example for the purpose of discussing these IDs to be clear about which part of an ID we’re talking about.
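To make the three parts concrete, here is a small sketch that splits an ID into its year and numeric components. It assumes the post-2014 syntax of a four-digit year followed by four or more digits; the `CveId` type and `parse_cve_id` name are made up for illustration.

```python
import re
from typing import NamedTuple


class CveId(NamedTuple):
    year: int    # the YEAR identifier, not necessarily the disclosure year
    number: int  # the numeric identifier after the year


# CVE prefix, four-digit year, then at least four digits.
_CVE_RE = re.compile(r"^CVE-(\d{4})-(\d{4,})$")


def parse_cve_id(text: str) -> CveId:
    """Split a CVE ID string into its year and numeric parts."""
    match = _CVE_RE.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a valid CVE ID: {text!r}")
    return CveId(year=int(match.group(1)), number=int(match.group(2)))


print(parse_cve_id("CVE-2014-54321"))
```

Note that the parser only tells you what the ID says, which is a different question from what the year or number actually means.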
When CVE was started in 1999, assignments were made based on a public disclosure. However, from the beginning, the YEAR portion almost immediately was not guaranteed to represent the year of disclosure. This was because MITRE’s policy was to assign an ID for a pre-1999 vulnerability using a CVE-1999 ID. We can see this with CVE-1999-0145 which was assigned for the infamous Sendmail WIZ command, allowing remote root access. This feature was publicly disclosed as a vulnerability on November 26, 1983 as best I have determined (the Sendmail changelog). While it was a known vulnerability and used before that, it was privately shared. If there is a public reference to this vulnerability before that date, leave a comment please!
The takeaway is that a vulnerability from 1983 has a CVE-1999 identifier. So from the very first year, MITRE set a clear precedent that the YEAR portion of an ID does not represent the year of discovery or disclosure. You may think this only happened for vulnerabilities prior to 1999, but that isn’t the case. In the big picture, meaning the 22 years of CVE running, an ID typically does represent the disclosure year. However, per one of CVE’s founders, “because of CVE reservation, sometimes it aligned with year of discovery“. That is entirely logical and expected as a CVE ID could be used to track a vulnerability internally at a company before it was disclosed. For example, BigVendor could use the CVE ID not only for their internal teams, such as communicating between security and engineering, but when discussing a vulnerability with the researcher. If a researcher reported several vulnerabilities, using an ID to refer to one of them was much easier than the file/function/vector.
For the early CVE Numbering Authorities (CNA), companies that were authorized to assign a CVE without going through MITRE, this was a common side effect of assigning. If a researcher discovered a vulnerability on December 25 and immediately reported it to the vendor, it may be given e.g. a CVE-2020 ID. When the vendor fixed the vulnerability and the disclosure was coordinated, that might happen in 2021. The founder of CVE I spoke to told me there “weren’t any hard and fast rules for CNAs” even at the start. So one CNA might assign upon learning of the vulnerability while another might assign on public disclosure.
Not convinced for some reason? Let’s check the CVE FAQ about “year portion of a CVE ID”!
What is the significance and meaning of the YEAR portion of a CVE ID? CVE IDs have the format CVE-YYYY-NNNNN. The YYYY portion is the year that the CVE ID was assigned OR the year the vulnerability was made public (if before the CVE ID was assigned).
The year portion is not used to indicate when the vulnerability was discovered, but only when it was made public or assigned.
A vulnerability is discovered in 2016, and a CVE ID is requested for that vulnerability in 2016. The CVE ID would be of the form “CVE-2016-NNNN”. A vulnerability is discovered in 2015 and made public in 2016. If the CVE ID is requested in 2016, the CVE ID would be of the form “CVE-2016-NNNNN”.
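Read literally, the FAQ rule fits in a few lines of code. This is only a sketch of one literal reading, not how MITRE or any CNA actually assigns IDs; the function name and the specific dates are assumptions for illustration.

```python
from datetime import date
from typing import Optional


def faq_year(assigned: date, made_public: Optional[date] = None) -> int:
    """YEAR per a literal reading of the FAQ: the year of public disclosure
    if that came before assignment, otherwise the year of assignment."""
    if made_public is not None and made_public < assigned:
        return made_public.year
    return assigned.year


# FAQ example: made public in 2016, ID requested in 2016 (dates are placeholders).
print(faq_year(assigned=date(2016, 6, 1), made_public=date(2016, 6, 15)))

# Sendmail WIZ: public on November 26, 1983; the 1999 assignment date is a placeholder.
print(faq_year(assigned=date(1999, 1, 1), made_public=date(1983, 11, 26)))
```

The second call returns 1983 under this reading, yet the real ID is CVE-1999-0145, which is exactly the kind of gap between stated policy and practice at issue here.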
All clear, no doubts, case closed!
That clear policy is conflicting or may introduce confusion in places. Looking at MITRE’s page on CVE Identifiers, we see that “The process of creating a CVE Record begins with the discovery of a potential cybersecurity vulnerability.” My emphasis on ‘discovery’ as that means the ID would reflect when it was discovered, and not necessarily even when it was reported to the vendor. There are many cases where a researcher finds a vulnerability but may wait days, weeks, months, or even years before reporting it to the vendor for different reasons. So it is more applicable that the ID will be assigned based on when the vendor learns of the vulnerability in cases of coordinated disclosure with a CNA. Otherwise, a bulk of CVEs are assigned based on the disclosure year.
It gets messier. At the beginning of each year, each CNA will get a pool of CVE IDs assigned. The size of the pool varies by CNA and is roughly based on the prior year of assignments. A CNA that disclosed 10 vulnerabilities in the prior year is likely to get 10 – 15 IDs the subsequent year. Per section 5.1.4 of the CNA rules, any IDs that are not used in a calendar year should be REJECTed if they were not assigned to an issue. “Those CVE IDs that were unused would be rejected.” But then, it stipulates that “CVE IDs for previous calendar years can always be requested if necessary.” So per current rules, a CNA can request a new ID from a prior year despite REJECTing IDs that were previously included in their pool. That means it is entirely optional, up to each CNA, on how they assign.
[Update: Note that the pool of IDs a CNA gets one year may not be the same the next. Not only in regards to the size of the pool, but the first ID may be in an entirely different range. e.g. 2019-1000 vs 2020-8000.]
The take-away from all this is that we now have many reasons why a CVE ID YEAR component does not necessarily tie to when it was disclosed. The more important take-away? If you are generating statistics based on the YEAR component, you are doing it wrong. Any statistics you generate are immediately inaccurate and cannot be trusted. So please don’t do it!
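To see how quickly YEAR-based counts drift, here is a tiny sketch that tallies the same IDs two ways. The three samples and their disclosure years are the ones mentioned across these posts; treating 2015 as the disclosure year for CVE-2000-1253 is a simplification, since it was "likely earlier".

```python
from collections import Counter

# (CVE ID, disclosure year as stated in these posts)
SAMPLES = [
    ("CVE-1999-0145", 1983),   # Sendmail WIZ
    ("CVE-2000-1253", 2015),   # "disclosed in 2015, and likely earlier"
    ("CVE-2021-30030", 2021),  # disclosed March 28, 2021
]


def id_year(cve_id: str) -> int:
    """Return the YEAR component of an ID such as CVE-2021-30030."""
    return int(cve_id.split("-")[1])


by_id_year = Counter(id_year(cve_id) for cve_id, _ in SAMPLES)
by_disclosure_year = Counter(year for _, year in SAMPLES)
mismatched = [cve_id for cve_id, year in SAMPLES if id_year(cve_id) != year]

print(dict(by_id_year))
print(dict(by_disclosure_year))
print(mismatched)
```

With only three entries, two land in the wrong year when counted by ID. Real disclosure dates have to come from a source that tracks them, not from the identifier.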
Finally, a brief overview of the numeric string used after the YEAR. Going back to our example, CVE-YEAR-12345, it is easy to start to make assumptions about 12345. The most prevalent assumption, and completely incorrect, is that IDs are issued in a sequential order. This is not true! Covered above, CNAs are given pools of IDs at the beginning of each year. Oracle and IBM assign over 700 vulnerabilities a year, so the pool of IDs they receive is substantial. There are over 160 participating CNAs currently, and if each only received 100 IDs, that is over 16,000 IDs that are assigned before January 1st.
In 2021, the effect of this can be seen very clearly. Halfway through April and we’re already seeing public IDs in the 30k range. For example, CVE-2021-30030 is open and represents a vulnerability first disclosed on March 28th. According to VulnDB, there are only 7,074 total vulnerabilities disclosed this year so far. That means we can clearly see that CVE IDs are not assigned in order.
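The arithmetic behind that conclusion is short enough to write out. This is a back-of-the-envelope sketch using the figures quoted here (160 CNAs, a hypothetical 100 IDs each, 7,074 disclosures per VulnDB); the function name is made up for illustration.

```python
CNA_COUNT = 160    # "over 160 participating CNAs"
IDS_PER_CNA = 100  # hypothetical pool size used in the post

# IDs handed out to pools before a single vulnerability is disclosed.
pre_allocated = CNA_COUNT * IDS_PER_CNA


def could_be_sequential(id_number: int, disclosed_so_far: int) -> bool:
    """If IDs were issued one per disclosure, in order, the numeric part
    could never exceed the number of disclosures so far."""
    return id_number <= disclosed_so_far


print(pre_allocated)
# CVE-2021-30030 against 7,074 disclosures so far that year.
print(could_be_sequential(30030, 7074))
```

The check fails for CVE-2021-30030, so the numeric portion says nothing reliable about how many vulnerabilities came before it.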
In July of 2019, many noticed that the Bugtraq mail list stopped having posts approved, including Art Manion at CERT. Since there are many other outlets for vulnerability disclosure, such as the Full-Disclosure mail list, Packetstorm, Exploit Database, and increasingly on GitHub, it didn’t receive much attention. It wasn’t like the days when the list was created when there were very few places to disclose that would be seen by other security professionals and hackers. Despite that, the list has a long history and many came up in their respective scene recognizing what it represented.
The last post to the list, as of now, is dated July 26, 2019. In December of 2019 I tweeted to SecurityFocus and Symantec, its parent company at the time, asking if they had killed the list. I received no reply. Months passed and I got curious again, so I reached out to both companies via email trying to ascertain the status of the list. Two of the three public addresses I found on their website immediately bounced as undeliverable while the third went unanswered. I shared that update with Twitter as well in May.
After the last Tweet, someone reached out and offered to help me figure out the disposition of Bugtraq. We started chatting in May and they went to work. You may think this would be a relatively simple task and it would be for smaller companies. However, remember that while Symantec acquired SecurityFocus back in 2002, Broadcom acquired Symantec in November, 2019, and then Accenture acquired Symantec from Broadcom in April, 2020. You can imagine the amount of chaos going on in that organization and the layers of management along with the vast number of departments.
During that initial chat, I said “a lot of people don’t want to see the Bugtraq list just vanish given its history. We’re hoping Broadcom starts it back up or will pass it off to someone else in the industry to run.” That same day my contact figured out the general org structure involved and where to start asking around. A couple weeks later they reported back that it fell under an Accenture business unit, that there was discussion going as to the disposition, and that the pandemic was slowing things down. Jump to August, 2020, and they reported back that they were still working on it and that “breathing life into [the list] might be possible”.
In August, before they checked in with that update, I had decided to update the Wikipedia entry for the Bugtraq list as it was pretty sparse originally. I added a significant amount to better document the history around the list as well as some highlights like some controversies. My contact said those updates were actually helpful in gaining traction which I thought was cool. Nay-sayers about Wikipedia take notes! A few months passed and my contact reached back out in November with an update saying “I have a bit of an uphill battle here”. Giving it back to the community was being discussed, but we both immediately realized the next challenge if that happened; who would run it. No chance in hell I would. I said that if they posted to the list asking, I am sure they would get many volunteers. Vetting those and figuring out a viable long-term option might be tricky though.
I sent another message to my contact thanking them for fighting for it, and saying that I was happy a clear message had been sent finishing off the list. There was some concern about the possible fallout for the team, and that they “still remember hiring folks who said their goal was to be referenced on the Bugtraq list.” That was a great reminder that today claiming CVEs is some misguided notion of skill while back then, getting a post to Bugtraq approved meant a lot.
We continued the conversation on January 16 and they cryptically told me “there may be hope yet”, citing a ZDnet article that apparently got the attention of some executives. I joked with them saying “that would be an epic unintentional troll, if it got resurrected weeks later” to which they agreed. The list admins allowed one response to the post through that day, a shout-out from the old-school hackers of UPT. A day after that, the list admin sent a mail titled “On Second Thought…” stating that based on feedback, they have “decided to keep the Bugtraq list running. We’ll be working in the coming weeks to ensure that it can remain a valuable asset to the community for years to come.” Accenture followed that up with a more lengthy blog about the list revival.
Here we are months later and I thought back to part of our conversation in which I told them I “wish people knew just how long you had to fight on this and how much of a hurdle it was just to post the list.” They said it was probably a story best told over a beer, or something harder, suggesting I probably don’t know the half of what they went through to get all of this rolling. So I wrote this blog in the hope that they would say it was OK to post, to share the story. InfoSec is full of heroes that work behind the scenes, fighting the good fight, and trying to make things a bit better. They deserve more recognition than they get. This is just one example of that. So thank you, so much, for your work in helping keep the historic list alive.
I am currently reading “This Is How They Tell Me The World Ends” by Nicole Perlroth, only on page 60 in Chapter 5, so a long ways to go before completing the 471 page tome. I hit chapter 4, titled “The First Broker” and it was of specific interest to me for sure, prompting this (second) blog on the book. A broker is defined as “a person who buys and sells goods or assets for others” so I was never a vulnerability broker by that definition. I am not trying to claim to be the actual first broker of zero-days in that context at all. Instead, I would like to share a couple of my own stories that are adjacent to the topic. This is all to the best of my recollection, but my memory isn’t the best due to being a diabetic and not having it under control for several years. If anyone involved in any of these stories has a different memory please feel free to comment or reach out directly and I will update this blog accordingly.
First, I was someone who ‘brokered’ deals in the sense of trading zero-day vulnerabilities for a few years in the mid-90s. As a member of multiple hacking groups, some an actual member and some an honorary member, one of my roles in several of those groups was not writing the zero-days because I simply wasn’t a coder and did not have that skill. Instead, it was to barter and try to gain access to specific zero-days one group or member wanted and my currency was other zero-days we had. While I couldn’t code, my social network of hackers was sizable.
Some of what I was authorized to trade for was toward the goal of obtaining e.g. “any remote zero-day in $target operating system” while in other cases it was “trade anything and everything we have for $specific-zero-day“. I acted as a go-between for the groups I was in and a liaison to the general hacker scene. Many knew me to have a well-rounded vulnerability collection and we already traded more pedestrian exploits, some of which weren’t public, but definitely more circulated in such groups.
Back then it was just hackers and groups, not companies, so we didn’t have “duffel bags stuffed full of half a million dollars in cash to buy zero-day bugs” (p.49). Instead we had other zero-day bugs which were just as valuable between groups and acted as the ideal currency. Just like Perlroth describes in her book relating the story of “Jimmy Sabien” (p.43), not his real name, the vulnerabilities had serious value back then too. Some were very closely guarded, to the point of not being shared with their group. For example, Sally may have shared 99% of her exploits and zero-days with her group but held one back because it was so valuable. That one she would use sparingly herself so as not to burn it or authorize it to be traded for a vulnerability of equal value. In those rare cases I would know just enough about the vulnerability to try to arrange a trade on her behalf, sometimes never seeing the vulnerability myself.
There were rumors at the time that some hackers had sold vulnerabilities to specific agencies in European governments. There were also rumors that some were trading zero-day exploits to a European law enforcement agency as a proffer or part of a plea to avoid being charged for hacking activity. But those were just rumors at that point. To me, that was the precursor to the more financial based zero-day market.
Later in the 90s, I was one of the two founders of a startup called Repent Security Inc. (RSI or RepSec). We were three people and started trying to be a penetration testing shop. This was still early in the world of commercial penetration testing and we were going up against companies that either had an established business reputation like a couple of the ‘Big 5’ at the time, or companies that were pioneers in the game like The Wheel Group. We also created software for securely streaming logs over an encrypted tunnel so if a system was popped, you had the logs on a remote host with timestamps including your shell histories (which didn’t have timestamps natively). That software was partially outsourced to a renowned “InfoSec luminary” who had it developed by one of his interns on a compromised .edu machine and later essentially stole the software after RSI imploded. But that story is for another day because it isn’t part of the zero-day world, it’s part of the Charlatan and Errata world.
One thing RSI had of real value was the vulnerability database that I had been maintaining since 1993. It was first maintained for the hacker group I was part of (TNo) where it was originated by other members. When I took over maintaining it I worked on further organizing it, adding several points of metadata, and expanding it. After that group drifted apart I kept maintaining it while a member of w00w00 and honorary member of ADM, where I brokered some trades. I did not maintain the databases for either of those groups which were separate from mine, but I was privy to some of their exploits and shared some of what I had. Members from both groups would frequently ask me to check my database for exploits specific to an operating system or service they were targeting, as this was before Google, and Yahoo! didn’t aggregate much in the big picture. Even though a majority of vulnerabilities were posted to Bugtraq, you couldn’t just skim it quickly to determine what was there that you could use for your purpose. Someone that had them all sorted in a database with metadata was fairly valuable. To this day, many friends and colleagues still ask me to do vulnerability lookups, now with VulnDB.
Throughout my hacker days I maintained that database, and then continued to as I transitioned into a career doing penetration testing. Like Perlroth documents in her book about the early days of iDefense and the outfit that “Sabien” worked for, we all scoured Bugtraq for our information primarily. I had the benefit of several circles of hackers and hackers-turned-legit that still traded vulnerability intelligence (vuln intel). Essentially the grey market back when the currency was still vuln intel not those duffels of cash. By that point, the database that RSI had was unparalleled in the commercial world. This was initially created before and maintained during Fyodor’s Exploit World and Ken Williams’ Packetstorm. The RSI database came before the ISS XForce database, before BID, before NIST’s ICAT Metabase, and before MITRE’s CVE. More importantly, it was heavy on exploit code but light on proper descriptions or solutions, so it was geared toward penetration testing and compromising machines rather than mature vulnerability intelligence.
As RSI struggled to get penetration testing gigs and opted to work on the “Secure Remote Streaming” (SRS) product, we had taken a trip to Atlanta to talk to ISS about selling a copy of our database to their relatively new X-Force penetration testing team (I forgot who we met there, but I would love to remember!). That deal did not happen and we soon found ourselves in talks with George Kurtz at Ernst & Young, one of the ‘Big 5’. While most or all of the ‘Big 5’ had penetration testing teams, their reputation wasn’t the best at the time. That was primarily due to their testers frequently being traditional auditors turned penetration testers, rather than being a ‘real’ tester; someone that came up through the hacking ranks.
It is also important to remind everyone that back then these companies “did not hire hackers“. They literally printed it in advertisements as a selling point that they did not hire and would not consort with so-called black hats. This was almost always an outright lie. Either the company knew the background of their team and lied, or they did not know the background and conveniently overlooked that their employees had zero experience on their resume around that skillset, yet magically were badass testers. Years of companies claiming this also led to what we see now, where many security professionals from that time still refuse to admit they used to hack illegally even 25 years later.
Anyway, back to George and E&Y. It made sense that a shop like that would want to get their hands on RSI’s database. If their testers were primarily from the auditor / bean-counter side of things they would not have had their own solid database. Even if they had hackers it didn’t mean they came with the same vuln intel we had. As best I recall, the negotiations went back and forth for a couple weeks and we settled on a one-time sale of the RSI database for $75,000 with the option to revisit selling ‘updates’ to it as we continued to maintain it. This would have become the first commercial vulnerability intelligence feed at the time I believe, in early 1999. Then, disaster.
The FBI raided the offices of RSI, which was my apartment. At the time that was a death sentence to a penetration tester’s career. Regardless of guilt, the optics were one of black hat / criminal hacking, and finding someone to trust you to break into their systems was not happening. RSI dissolved and I found myself struggling to find work of any kind. So I reached back out to George about the deal we had on the table that we were close to signing and said I was fine with the price, let’s do it. Suddenly, Kurtz had a change of heart.
He didn’t have a change of heart as far as doing the deal, his change was in the price. Instead of $75,000 he came back and said we could do the deal for $25,000 instead, just a third of what we had agreed to. He knew I was in a tight spot and needed the money and he took full advantage of that. This is someone who had a reputation of being a friend to hackers, someone that had bridged the gap between the business world and hackers to put together a reputable team at E&Y. He even had his name on a book about penetration testing, co-authored with names other hackers recognized. He was also very explicit that he knew I had no real power at that point and refused to budge on his one-third offer.
So when he had a chance to honor the deal we originally worked on, a chance to be a friend to a hacker, at no expense of his own? He opted to screw me. Since I was out of options and my limited savings were dwindling I had to accept the offer. That takes me full circle, via a meandering path I know, to likely making one of the largest vulnerability sales at the time. While it wasn’t a single exploit, a $25k deal that was originally set to be $75k is pretty impressive for the time. If RSI had made it, odds are we would have become a software (SRS) and vulnerability intelligence shop rather than a penetration testing shop.
We were already doing many of the things Perlroth describes from the early days of iDefense and “Sabien’s” shop. We had a lot fewer people than they claimed, but we were aggregating information from Bugtraq and other sources, writing exploits for some of the vulnerabilities, and then we began to try to sell that information. I guess it isn’t a big surprise I ended up in the vulnerability intelligence business eventually.
I am currently reading “This Is How They Tell Me The World Ends” by Nicole Perlroth, only on page 17 in Chapter 2, so a long ways to go before completing the 471 page tome. While only 17 pages in, there are already some annoyances to be sure, but the tone, scope, and feel of the book is enjoyable so far. I am not sure if I will do a full review at the end or perhaps write some blogs specific to topics like this one. It obviously didn’t take long at all to get to the point where I thought a quick blog with my perspective might be interesting to some.
At the end of Chapter 1, Perlroth summarizes what she sees as the long road ahead for her to tackle the subject of zero-day exploits. This follows her describing one dinner with a variety of security folks from all sides of the topic but seems to center around two zero-day exploit writers not answering some ‘basic’ questions like “who do you sell to?” She uses this to enumerate a list of questions around the topic of zero-day exploits that she would have to face to cover the topic thoroughly. Of the 28 questions she posed to herself, two stood out to me but require two more to better set the stage:
Who did they sell their zero-days to? To whom would they not? How did they rationalize the sale of a zero-day to a foreign enemy? Or to governments with gross human rights violations?
Depending on who you ask, or when you ask them, you may be told these are simple questions and answers, very complex, or like an onion.
When you ask if an exploit broker will sell to governments with “gross human rights violations“, that gets complicated in today’s world of geopolitics while remaining much more simple as far as morals and ethics go. If gross human rights violations are the line in the sand, meaning regular human rights violations are acceptable (?), then it cuts out all of the biggest players in the game; United States, China, Russia, North Korea, and Iran. Before any of my European friends head straight to the comment section, I am not forgetting or neglecting you. Some of the European countries maintain teams that are extremely accomplished and arguably better than the countries I listed. Why? You don’t see their names being splashed in every other headline and attribution claim. Further, some of the most elite zero-day writers from the late 80’s and early 90’s were European. I used to be privy to a handful of some of those exploits and on occasion, brokered (traded, not sold) them between groups. Further, I don’t associate most European countries with the other five as far as gross human rights violations, at least not in recent history.
Since zero-day exploit writers do sell to some of those countries at least (US, CN, RU), and presumably some sell to the other two (IR, KP), now we’re talking shades of grey or onions, depending on your favorite analogy. You’re left trying to draw a line in the sand as to which human rights violations you can accept and at that point, does the question even have relevance? I don’t want to get into a pissing war over who is holier or more evil than the other because each of the five countries above has their long list of sordid atrocities.
Let’s jump back to the third question there, the notion of “foreign enemy”. This is peculiar since the book had already thrown around the term “mercenary” several times in the prologue, and that scenario answers the question simply. A mercenary sells their services to the highest bidder typically, ethics takes a seat in the trunk if it even comes along for the ride. So a simple summary is that some will sell to the highest bidder, end of story.
But does any of the above really matter? Long ago I heard a great quote that is both funny and sardonic, that I think has relevance to the other question:
“We refuse to join any organization that would have us as a member.”
If we’re discussing the notion of being involved with another group (country in this case), isn’t the ethics of selling a zero-day that you know will potentially be used against your own country a lesson in abject self examination? If you are willing to sell to such an organization, one that might cause a power outage, risk human life, or undermine security and privacy as only a nation-state can, is that the kind of organization you want to be a part of? If such an organization or country is willing to buy zero-day exploits from you to use for those purposes, is that the type of organization you want to be affiliated with?
If the answer is no, then Perlroth has the beginning of her answer. If the answer is yes, then we’re back to square mercenary. Pretty simple maybe?
As a connoisseur of vulnerability disclosures and avid vulnerability collector, I am always interested in analysis of the disclosure landscape. That typically comes in the form of reports that analyze a data set (e.g. CVE/NVD) and draw conclusions. This seems straight-forward but it isn’t. I have written about the varied problems with such analysis many times in the past and yet, companies that don’t operate in the world of vulnerability databases still decide to play in our mud puddle. This time it is the company Redscan, who I don’t think I had heard of, doing analysis on NVD data for 2020. Risk Based Security wrote a commentary on their analysis, to which I contributed, but I wanted to keep the party going over here with a few more personal comments. Just my opinions here, as a more outspoken critic on the topic, and where I break from the day job.
I am going to focus on one of my favorite topics; vulnerability tourists. People that may be in the realm of Information Security, but don’t specifically operate day-to-day in the world of vulnerability disclosures, and more specifically to me, vulnerability databases (VDBs). For this blog, I am just going to focus on a few select quotes that made me double-take. Read on after waving to Tourist Lazlo!
“The NVD tracks CVEs logged by NIST since 1988, although different iterations of the NVD account for some variation when comparing like-for-like results over time.”
There’s a lot to unpack here, most of it wrong. First, the NVD doesn’t track anything; they are spoon-fed that data from MITRE, who manages the CVE project. Second, NIST didn’t even create NVD until over five years after CVE started. Third, CVE didn’t track vulnerabilities “since 1988”; they cherry-picked some disclosures from before 1999, when they started, which is why CVE IDs start with ‘1999’. Fourth, there was only one different iteration of NVD, that was their ICAT “CVE Metabase” that ran the first year of CVE basically. Peter Mell, who created it, said that after starting as its own vulnerability website, “ICAT had become an archival tool for CVE standard vulnerabilities and was only updated every three or four weeks”. Then in 2005 the site relaunched with a new focus and timely updates from CVE. Despite this quote, later in their report they produce a chart that tries to show an even comparison from 1988 to 2020 despite saying it went through iterations and despite not understanding CVSS.
“The growth is also likely attributable to an increase in the number of CVE Numbering Authorities (CNAs) – of which there are now more than 150 worldwide with the power to create and publish CVEs.”
The growth in disclosures aggregated by CVE is a lot more complicated than that, and the increase in CNAs I doubt is a big factor. Of course, they say this and don’t cite any evidence despite CVE now showing who the assigning CNA was (e.g. CVE-2020-2000 is Palo Alto Networks). The data is there if you want to make that analysis but it isn’t that easy since it isn’t included in the NVD exports. That means it requires some real work scraping the CVE website since they don’t include it in their exports either. Making claims without backing them up when the data is public and might prove your argument is not good.
“Again, this is a number that will concern security teams, since zero interaction vulnerabilities are famously difficult to detect and have the potential to cause significant damage.”
This makes me think that Redscan should invent a wall, perhaps made of fire, that could detect and prevent these attacks! Or maybe a system that is designed to detect intrusions! Or even one that can prevent intrusions! This quote is one that is truly baffling because it doesn’t really come with an explanation as to what they mean, and I hope they mean something far different than what this sounds like. I hope this isn’t a fear tactic to make readers think that their managed detection service is needed. Quite the opposite; anyone who says the above probably should not be trusted to do your attack detection.
This chart heading is one of many signs that Redscan doesn’t understand CVSS at all. For a “worst of the worst” vulnerability they got several attributes right but end up with “Confidentiality [High]”. The vulnerability they describe would only be CVSSv2 7.8 (AV:N/AC:L/Au:N/C:C/I:N/A:N) and CVSSv3.1 7.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N). That is not the worst. If it ‘highly’ impacts confidentiality, integrity, and availability then that becomes the worst of the worst, becoming CVSSv2 10.0 and CVSSv3.1 9.8 or 10.0 depending on scope. It’s hard to understand how a security company gets this wrong until you read a bit further where they say they “selected the very ‘worst’ option for every available metric.” My gut tells me they didn’t realize you could toggle ‘High’ for more than one impact and confidentiality is the first listed.
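The CVSSv3.1 numbers above can be checked with a minimal sketch of the base score equations. The weights and the rounding rule are taken from the published CVSSv3.1 specification as I understand it; the function names (`roundup`, `cvss31_base`) are my own and this is an illustration, not a replacement for an official calculator.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required is weighted differently when scope changes.
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_base(vector):
    """Base score for a vector such as 'AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N'."""
    m = dict(part.split(":") for part in vector.split("/"))
    scope = m["S"]
    # Impact sub-score: combine the three impact metrics.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[scope][m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


# Confidentiality only, as in the report's "worst" case.
print(cvss31_base("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"))  # 7.5
# High impact on all three, scope unchanged and scope changed.
print(cvss31_base("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
print(cvss31_base("AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))  # 10.0
```

Toggling Integrity and Availability from None to High is the entire difference between 7.5 and the 9.8 / 10.0 that actually sits at the top of the scale.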
“It is also important to note that these numbers may have been artificially reduced. Tech giants such as Google and Microsoft have to do a lot to maintain their products and services day-to-day. It is common for them to discover vulnerabilities that are not being exploited in the wild and release a quick patch instead of assigning a CVE. This may account for fewer CVEs with a network attack vector in recent years.”
This is where general vulnerability tourism comes in as there is a lot wrong here. Even if you don’t run a VDB you should be passingly familiar with Microsoft advisories, as an example. Ever notice how they don’t have an advisory with a low severity rating? That’s because they don’t publish them. Their advisories only cover vulnerabilities at a certain threshold of risk. So that means that the statement above is partially right, but for the wrong reason. It isn’t about assigning a CVE, it is about not even publishing the vulnerability in the first place. Because they only release advisories for more serious issues, it actually skews their numbers to include more remote vulnerabilities, not less, primarily on the back of “remote” issues that require user interaction such as browser issues or file parsing vulnerabilities in Office.
This quote also suggests that exploitation in the wild is a bar for assigning a CVE, when it absolutely is not. It might also be a surprise to a company like Redscan, but there are vulnerabilities that are disclosed that never receive a CVE ID.
“Smart devices designed for the mass market often contain a worrying number of vulnerabilities due to manufacturer oversight. Firmware within devices is often used by multiple vendors, meaning that any vulnerabilities in this software has the potential to result in lots of CVEs.”
Wrong again, sometimes. If it is known to be the same firmware used in multiple devices, it gets one CVE ID. The only time there are additional IDs assigned is when multiple disclosures don’t positively ID the root cause. When three disclosures attribute the same vulnerability to three different products, it stands to reason there will be three IDs. But it isn’t how CVE is designed because it artificially inflates numbers, and that is the game of others.
“The prevalence of low complexity vulnerabilities in recent years means that sophisticated adversaries do not need to ‘burn’ their high complexity zero days on their targets and have the luxury of saving them for future attacks instead.” -vs- “It is also encouraging that the proportion of vulnerabilities requiring high-level privileges has been on the increase since 2016. This trend means that cybercriminals need to work harder to conduct their attacks.”
So which is it? When providing buzz-quote conclusions such as these, that are designed to support the data analysis, they shouldn’t contradict each other. This goes back to what I have been saying for a long time; vulnerability statistics need qualifications, caveats, and explanations.
“Just because a vulnerability is listed in the NVD as hard to exploit doesn’t mean that attackers aren’t developing PoC code to exploit it. The key is to keep up with what’s happening in the threat landscape and respond accordingly.”
I’ll end here since this is a glowing endorsement for why vulnerability intelligence has to be more evolved than what CVE and NVD are offering. Part of the CVSS specifications includes Temporal scoring and one of those attributes is Exploit Code Maturity. This is designed to specifically address the problem above; that knowing the capability of potential attackers matters. With over 21,000 vulnerabilities disclosed last year, organizations are finding that just patching based on the CVSSv3 base score isn’t enough. Sure, you patch the 10.0 / 9.8 since those are truly the worst-of-the-worst, and you patch the 9.3 / 8.8 since any random email might carry a payload. Then what? If all things are equal between vulnerabilities that impact your organization you should look to see if a patch is available (also covered by Temporal score) and if an exploit is available.
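To make the “then what?” concrete, here is a minimal sketch of CVSSv3.1 Temporal scoring used as a tie-breaker between findings that share a base score. The weights come from the CVSSv3.1 specification as I understand it; the finding names and their metric values are hypothetical, made up purely for illustration.

```python
import math

# Temporal metric weights from the CVSS v3.1 specification.
EXPLOIT_MATURITY = {"X": 1.0, "H": 1.0, "F": 0.97, "P": 0.94, "U": 0.91}
REMEDIATION_LEVEL = {"X": 1.0, "U": 1.0, "W": 0.97, "T": 0.96, "O": 0.95}
REPORT_CONFIDENCE = {"X": 1.0, "C": 1.0, "R": 0.96, "U": 0.92}


def roundup(value):
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def temporal_score(base, e="X", rl="X", rc="X"):
    """Adjust a base score by exploit maturity, remediation level, and confidence."""
    return roundup(
        base * EXPLOIT_MATURITY[e] * REMEDIATION_LEVEL[rl] * REPORT_CONFIDENCE[rc]
    )


# Hypothetical findings: (name, base score, exploit maturity, remediation level).
findings = [
    ("vuln-a", 9.8, "U", "O"),  # no known exploit, official fix available
    ("vuln-b", 9.8, "F", "U"),  # functional exploit, no fix available
    ("vuln-c", 9.8, "P", "O"),  # proof-of-concept exploit, official fix available
]

ranked = sorted(findings, key=lambda f: temporal_score(f[1], f[2], f[3]), reverse=True)
for name, base, e, rl in ranked:
    print(name, base, temporal_score(base, e, rl))
# vuln-b 9.8 9.6
# vuln-c 9.8 8.8
# vuln-a 9.8 8.5
```

All three share a 9.8 base score, yet the one with a functional exploit and no fix clearly belongs at the front of the queue. That only works if your data source actually tracks exploit and solution status.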
Numeric scores are not enough, you have to understand the context behind them. That CVSSv2 remote information disclosure that partially affects confidentiality by disclosing an admin password is only a 5.0. Score it under CVSSv3 and you are looking at a 9.8 because it immediately leads to privilege escalation which is factored in under that system. Heartbleed was a CVSSv2 5.0 with a functional exploit and available patch; look what hell that brought upon us. If you aren’t getting that type of metadata, reconsider your choice of vulnerability intelligence.
Soul (2020) Medium: Movie (Disney) Rating: 5/5 movie and music magic Reference(s): IMDB Listing || Disney Disney knows how to do modern cartoons and this is no exception. The story follows Joe, a school band teacher who seems to have lost his way. As he sees a spark of passion in one student’s musical ability and then lands the gig of his life, he has a mishap and finds himself at the pearly gates but refuses to accept that fate. In limbo Joe runs into an odd one known as “22” and finds himself on an adventure to help 22 find a spark so that they can live a life on earth. The movie has a great stride and flows very well with an amazing cast of vocal talent as well as some incredible music by an unlikely trio, Trent Reznor and Atticus Ross for the original score and Jon Batiste with original jazz songs. The movie brings the laughs and the feels and is perfect for all ages.
Kajillionaire (2020) Medium: Movie (Netflix) Rating: 4.5/5 stick with it Reference(s): IMDB Listing || Amazon This quirky movie is billed as a Crime/Drama but as far as modern movies go, that is about the farthest thing from what it really is. It’s more of a slow-play dry commentary on the nature of humans and how odd we can be, with a splash of low-end grifting, wrapped into a family-dynamic sleeper hit that also moonlights as a love-story. For me, it started out slow and confused as I couldn’t figure out what type of movie it was. About half-way through I was hooked as I realized it wasn’t trying to be any specific genre; it just did its thing with Evan Rachel Wood stealing the show. If you dig on off-the-beaten-path flicks, this one is worth a go.
Greenland (2020) Medium: Movie (Multiple) Rating: 2/5 the title is the most redeeming quality Reference(s): IMDB Listing || Amazon It must have been a few years since the last earth-snuffing porn, as we tend to get one movie like that every so often, although more recently in the form of plagues and zombies. Gerard Butler and end of the world, pretty much tells you what you need to know about this movie. All the stereotypical things from this genre of movie too; poorly manufactured explosions to tide you over before the real city-snuffing comes, impromptu gangs that make no sense, cell service outages for plot advancement, and really bad dialogue snippets. I definitely feel like I watched this so you wouldn’t have to.
Lupin, Part 1 (2020) Medium: TV (Netflix) Rating: 3.9/5 pas une série de braquages Reference(s): IMDB Listing || Netflix This 10 episode series is described as “inspired by the adventures of Arsène Lupin, gentleman thief Assane Diop sets out to avenge his father for an injustice inflicted by a wealthy family”. The first episode of five in part one sets the stage of a master thief and the heist of a 20-million dollar piece of jewelry. Unfortunately, we quickly learn that the main character is not really a master thief. While he has skill in makeup, blending in, and pickpocketing, there are no other grand heists involved. Instead, it becomes more of a drama around avenging his father’s death with the thief / con man / grifter components as a side piece to facilitate the main story. Overall it is fairly entertaining but entirely too predictable and not very thought-provoking. Great for falling asleep to.
News of the World (2020) Medium: Movie (Multiple) Rating: 3.5/5 bit of a slow read Reference(s): IMDB Listing || Amazon We follow Captain Kidd (Tom Hanks) as he travels from town to town reading the “news of the world”. Along the way he encounters a young girl, Johanna, played by Helena Zengel, who has grown up in an Indian tribe that was decimated by whites and speaks no English. Kidd decides to take her to the family she was going to before becoming stranded, and the story progresses. Given the movie stars Hanks and is a period piece, I expected an amazing movie. Unfortunately it just didn’t come together and became disjointed the farther it went. At almost two hours it still felt like parts ended up on the cutting room floor that might have tied some of the beginning to the end better. Worth a watch, wait for it to hit Netflix.
Freaks (2018) Medium: Movie (Netflix) Rating: 4/5 every single character is a freak Reference(s): IMDB Listing || Netflix This Canadian-made super-(anti)-hero movie is a different style than many movies of the genre. It starts out a bit slow and leaves you wondering what is happening and some of those questions go unanswered until very late in the movie. But it has a good slow buildup, good casting, a simple premise, and a solid conclusion. Slightly dystopian where anyone with any power is labeled a ‘freak’ and hunted by the government. This movie doesn’t spoon feed you a simple person with powers like most mainstream films of the sort. Worth a watch.
Joker (2019) Medium: Movie (HBO Max) Rating: 5/5 he’ll laugh, you’ll laugh Reference(s): IMDB Listing || Amazon I saw this in theaters, you know, just before the society-crippling pandemic robbed us of basic joys. I left the theater confused, not sure if I really liked the movie or really didn’t. By that night, after a discussion with Lyger, I realized that I really liked it. I re-watched it recently and still really enjoy it. The biggest factor is that it is a complete break from the DC universe as far as style goes. While we have seen Batman’s origin story, in one form or another, many times over, the villains’ origin stories are often relegated to fairly quick scenes (Suicide Squad) or not explored (The Dark Knight). Having an entire movie to see how Todd Phillips envisioned this iconic villain’s origin was worth the adventure. This movie leans a bit toward Nolan’s Batman trilogy as far as feel and is the polar opposite of other DC offerings like Superman, Wonder Woman, or Aquaman. Forget the DC universe when you go into this, just focus on this movie and Phoenix’s incredible portrayal of Joker.
Aquaman (2018) Medium: Movie (Multiple) Rating: 0.5/5 this movie s(t)inks Reference(s): IMDB Listing || Amazon For some reason, DC Comics has a problem making good movies with few exceptions, and this isn’t Nolan’s Batman or Wonder Woman. Instead, Aquaman had the feel of a franchise desperate to create the feel of a Marvel Universe movie. Every single thing was predictable, cliché, and boring. “There’s too many casualties!” But let’s stop for a sloppy wet kiss of course. Seriously, we need a new word for “overdone movie cliché”. They tried to make this work by loading it with big names but as we often see, put that many big names together and they still can’t save a movie. Skip this, take a bath instead.
Prospect (2018) Medium: Movie (Netflix) Rating: 4.5/5 I dig it Reference(s): IMDB Listing || Netflix A sci-fi movie I hadn’t heard of that turned out pretty damn good, what gives? Oh, Pedro Pascal is in it and he has enjoyed a little attention recently. This movie has a small cast set on some distant world where brave adventurers go to prospect a part of an alien life form that requires some skill and finesse rather than brute strength. When a father / daughter duo touch down chasing the ultimate score, things go sideways. The movie is more of a thriller, with sci-fi as the vehicle to deliver the underlying story, which is compelling and well-done. If you can look past a few simple plot holes, you may find this movie really enjoyable like I did.
Rememory (2017) Medium: Movie (Netflix) Rating: 3.5/5 A bit forgettable Reference(s): IMDB Listing || Amazon Sam, the main character played by Peter Dinklage, injects himself into the life of a brilliant scientist who is brilliant, and the movie makes sure you know he is brilliant. The science is being able to record and play back memories, à la Strange Days. But for some reason Sam plays back mostly on a tiny screen in a briefcase that is the device. Anyway, he ends up in the middle of the life and murder of this scientist and decides to find out who did it, with this new technology being the central piece of the story. Ultimately, the movie has some neat ideas, good acting, but just falls short as it all doesn’t fully come together. It’s the kind where you can’t quite put your finger on it but just know something was lacking.
Each miserable incident is scored by a team of psychologists based on the “three pillars of misery” which are “physical pain, emotional trauma, and long-term psychological impact“. That boils it down to a numeric score between 0 and 100. After watching a few episodes I was curious how they compared… so I made a spreadsheet. Big surprise there, I know. The more I watched, the more metadata I started tracking, ultimately having to re-watch some past episodes to pick out data I hadn’t originally collected. In doing so it brought fun rewards quickly.
For example, in S02E09, contestant Katherine says that Sal helped win the most money during the first season (10 episodes) and Brian ‘Q’ Quinn came in second. In reality, Sal helped win $64,000 while Q helped bring in $71,000. Either way, sound choice as they bring in the most by a good margin over the other two Jokers up to that point. Even better that she watched the first 10 episodes with that data point in mind. But… how about getting to that final stage where the big money is? Knowing how miserable events are scored is what it takes. That’s where this data comes in.
If you want to get on this game show you now have everything you need to better understand scoring and be ready for events. Even knowing, for example, that no event has been scored lower than 11 can help immensely in the final stage. The data:
Tab 1, “Ep Metadata” includes the season/episode number, air date, contestants, the two Jokers paired with each, winner, winner’s gender, final stage Joker assistant, total won, and notes. By the end of the second season there were four “perfect games” where the contestant won the maximum amount ($33,000 during a normal game, $50,000 during the Christmas special). Finally, it includes a running total of the prize money to date, $607,500 by the end of season two.
Tab 2, “Misery Data” is the meat of it while containing relatively few columns, but representing the most work. It includes the episode number, the miserable incident as listed on the game board, the score, a VNTO designation, the reward, if the contestant won the money, and comments. The VNTO designation indicates the format of the event which is a video, news article, text, or ‘other’. The column with the reward is color coded green or red to indicate if they won or lost that money. The time-consuming part is in column B, that lists the miserable incident but also links to it. I actually spent the time finding the exact news article or video in most cases. More on that in a bit.
Tab 3, “Statistics” is where we get the fun digestible information and the bigger take-aways like there being a single miserable incident scored at 100 or the average score of all incidents across two seasons is 56.0. It also has joker pairings, types of media totals, contestant breakdowns, and more.
Tab 4, “Charts” is a set of visual representations of the statistics, because people like colors and shapes!
Jumping back to finding the exact news article or video, that effort made it very clear early on that the news article headlines they show are often not real. The show will take the headline and make minor edits to it presumably for readability and to convey the relevant points. That’s fine, I get it. But… the problem is that on rare occasion they actually leave out something specific that might drastically alter the misery score. What isn’t clear is if the panel that scored knew that detail. Let’s look at the biggest example:
In a video clip from ABC News, the show includes some of the audio recording and transcription as seen above. The contestants are asked to then score “Your Doctor Disses You During Surgery“. OK, based on that info you consider the three pillars and make your guess and maybe you got it right (51) or maybe you guessed lower because some people said mean things behind your back. The real question is, did the panelists score based on that or did they have knowledge that it resulted in a malpractice suit that yielded $500,000 for the victim? Pretty sure that would drop the misery quite a bit. These little omissions are interesting since they can impact the game, but a contestant has no way of knowing the missing details or if it was factored in on a score.
So, there we have it! Going into season three, hopefully contestants choose Sal (213k) or Q (190k) in the final round instead of Joe (144k) or Murr (60.5k). I know it seems like Murr doesn’t do well but he has only been selected four times as compared to Sal who was selected 11 times. With that factored in, that means Sal only averages $19,363 and Murr $15,125. But when you are playing for that kind of money, every dollar matters.
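The per-selection averages are simple arithmetic on the figures quoted above; a quick sketch for anyone who wants to rerun it as new episodes air. The variable names are mine, and selection counts for Q and Joe are left out because they aren’t stated in this post.

```python
# Final-round totals per Joker through season two, as quoted above.
totals = {"Sal": 213_000, "Q": 190_000, "Joe": 144_000, "Murr": 60_500}

# Number of times each Joker was selected for the final round.
# Only Sal and Murr are given in the post, so only they are averaged.
selections = {"Sal": 11, "Murr": 4}

# Whole-dollar average won per selection (rounded down).
averages = {name: totals[name] // count for name, count in selections.items()}

print(averages)              # {'Sal': 19363, 'Murr': 15125}
print(sum(totals.values()))  # 607500, the running prize total
```

The four totals add up to the $607,500 running total from the “Ep Metadata” tab, which is a handy sanity check on the sheet.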
Hopefully this data will help future contestants! If you notice any errors in my data please leave a comment so I can fix it up. As time permits, I will continue to update the sheet for future episodes of the show. Enjoy!
I really don’t have the time or desire to dig into this too heavily, but my response to the friend was “immediately problematic“. I’ll cliff notes some of the things that stand out to me, starting with the first graphic included which she specifically asked me about.
The header graphic displays the metrics for the CVSSv3 scoring system, but is just labeled “CVSS”. Not only is this sloppy, it belies an important point of this summary that the paper’s work is based on CVSSv2 scores, not CVSSv3. They even qualify that just below: “We should note the analysis conducted by Ross et al. is based upon the CVSS Version 2 scoring system…“
“Ross et al. note that many exploits exist without associated CVE-IDs. For example, only 9% of the Symantec data is associated with a CVE-ID. The authors offered additional caveats related to their probability calculation.” That sounds odd, but it is readily explained above when they summarize what that data is: “Symantec’s Threat Database (SYM): A database extracted from Symantec by Allodi and Massacci that contains references to over 1000 vulnerabilities.” First, that data set contains a lot more than vulnerabilities. Second, if Symantec is really sitting on over 900 vulnerabilities that don’t have a CVE ID, then as a CNA they should either assign them an ID or work with MITRE to get an ID assigned. Isn’t that the purpose of CVE?
“Ross et al. use four datasets reporting data on vulnerabilities and CVSS scores…” and then we see one dataset is “Exploit Database (Exploit-DB): A robust database containing a large collection of vulnerabilities and their corresponding public exploit(s).” Sorry, EDB doesn’t assign CVSS scores so the only ones that would be present are ones given by the people disclosing the vulnerabilities via EDB, some of whom are notoriously unreliable. While EDB is valuable in the disclosure landscape, serving as a dataset of CVSS scores is not one of its strengths.
“About 2.7% of the CVE entries in the dataset have an associated exploit, regardless of the CVSS V2 score.” This single sentence is either very poorly written, or it is all the evidence you need that the authors of the paper simply don’t understand vulnerabilities and disclosures. With a simple search of VulnDB, I can tell you at least 55,280 vulnerabilities have a CVE and a public exploit. There were 147,490 live CVE IDs as of last night, meaning that roughly 37.5% have a public exploit. Not sure how they arrived at 2.7% but that number should have been immediately suspect.
“In other words, less than half of the available CVSS V2 vector space had been explored despite thousands of entries…” Well sure, this statement doesn’t qualify one major reason for that. Enumerate all the possible CVSSv2 metric combinations and derive their scores, then look at which numbers don’t show up on that list. A score of 0.1 through 0.7 is not possible for example. Then weed out the combinations that are extremely unlikely to appear in the wild, which is most that have “Au:M” as an example, and it weeds out a lot of possible values.
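That enumeration is small enough to do in a few lines. Below is a minimal sketch that walks all 729 CVSSv2 base vectors, scores each with the base equation from the CVSSv2 guide, and lists the one-decimal values that never appear. The weights are from the published CVSSv2 documentation as I understand it; the function and variable names are mine, and plain `round()` is used for the final rounding, which may differ from an official calculator on exact-boundary values.

```python
from itertools import product

# Metric weights from the CVSS v2 guide.
AV = {"L": 0.395, "A": 0.646, "N": 1.0}
AC = {"H": 0.35, "M": 0.61, "L": 0.71}
AU = {"M": 0.45, "S": 0.56, "N": 0.704}
CIA = {"N": 0.0, "P": 0.275, "C": 0.660}


def cvss2_base(av, ac, au, c, i, a):
    """CVSSv2 base score for one combination of metric values."""
    impact = 10.41 * (1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a]))
    if impact == 0:
        return 0.0  # f(Impact) is 0 when there is no impact at all
    exploitability = 20 * AV[av] * AC[ac] * AU[au]
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * 1.176, 1)


# Score every possible base vector.
scores = {}
for av, ac, au, c, i, a in product(AV, AC, AU, CIA, CIA, CIA):
    vector = f"AV:{av}/AC:{ac}/Au:{au}/C:{c}/I:{i}/A:{a}"
    scores[vector] = cvss2_base(av, ac, au, c, i, a)

# Which values between 0.0 and 10.0 can never be produced?
possible = set(scores.values())
all_values = {round(n / 10, 1) for n in range(0, 101)}
missing = sorted(all_values - possible)

# Vectors left once the rarely seen Au:M is weeded out.
without_au_m = [v for v in scores if "/Au:M/" not in v]

print(len(scores), "vectors,", len(possible), "distinct scores")
print("never produced:", missing)
print(len(without_au_m), "vectors without Au:M")
```

The lowest non-zero score any vector can reach is 0.8, so 0.1 through 0.7 will always be on the “never produced” list, and dropping Au:M alone removes a third of the vector space before you even consider what shows up in real disclosures.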
“Only 17 unique CVSS vectors described 80% of the NVD.” Congrats on figuring out a serious flaw in CVSSv2! Based on the 2.7% figure above, I would immediately question the 80% here too. That said, there is a serious weighting of scores primarily in web application vulnerabilities where e.g. an XSS, SQLi, RFI, LFI, and limited code execution could all overlap heavily.
“Input: Vulnerabilities (e.g., NVD), exploit existence, (e.g., Exploit-DB), the number of clusters k” This is yet another point where they are introducing a dataset they don’t understand and make serious assumptions about. Just because something is posted to EDB does not mean it is a public exploit. Another quick search of VulnDB tells us there are at least 733 EDB entries that are actually not a vulnerability. This goes back to the reliability of the people submitting content to the site.
“The authors note their approach outperforms CVSS scoring when compared to Exploit-DB.” What does this even mean? Exploit-DB does not do CVSS scoring! How can you compare their approach to a site that doesn’t do it in the first place?
Perhaps this summary is not well written and the paper actually has more merit? I doubt it, the summary seems like it is comprehensive and captures key points, but I don’t think the summary author works with this content either. Stats and math yes. Vulnerabilities no.