THE CYBER SECURITY PROJECT

Taking Stock: Estimating Vulnerability Rediscovery

Trey Herr, Bruce Schneier, and Christopher Morris

Paper, July 2017
The Cyber Security Project
Belfer Center for Science and International Affairs
Harvard Kennedy School
79 JFK Street
Cambridge, MA 02138
www.belfercenter.org/Cyber

Statements and views expressed in this report are solely those of the authors and do not imply endorsement by Harvard University, the Harvard Kennedy School, or the Belfer Center for Science and International Affairs.

Design & Layout by Andrew Facini

Cover image and opposite page 1: A screenshot from the Google Chrome source code.

Copyright 2017, President and Fellows of Harvard College
Printed in the United States of America
About the Authors

Trey Herr, Post-Doctoral Fellow, Belfer Center Cyber Security Project, Harvard Kennedy School (trey_herr@hks.harvard.edu)

Bruce Schneier, Research Fellow and Lecturer, Belfer Center Cyber Security Project, Harvard Kennedy School (schneier@schneier.com)

Christopher Morris, Research Assistant, Harvard School of Engineering and Applied Sciences (christophermorris@college.harvard.edu)

Acknowledgments

This paper acknowledges support from the Belfer Family and the Flora and William Hewlett Foundation. The dataset and the resulting paper would not have been possible without Eduardo Vela Nava and Andrew Whalley of Google; Casey Ellis and Payton O’Neal of Bugcrowd; Rich Salz of OpenSSL; Richard Barnes, Dan Veditz, and Al Billings of Mozilla; and Art Manion and Allen Householder of CERT/CC. Special thanks are also owed to Martin Shelton, Katherine Bjelde, and Jim Waldo for their help cleaning and formatting the data. The authors would additionally like to thank Beth Friedman, Annie Boustead, Sasha Romanosky, Jay Healey, Mailyn Fidler, Thomas Dullien, Herb Lin, Fabio Massacci, Gary Belvin, Beau Woods, Tudor Dumitras, Ben Laurie, Tod Beardsley, and the Belfer Cyber Security team for their feedback.
Table of Contents

Abstract
1 Introduction
2 Background
    Value of Rediscovery
    Rediscovery and Previous Literature
3 Methodology and Data
    Counting Duplicate Vulnerabilities
    Coding and Data Sources
4 Analysis
    Vulnerability Rediscovery in the Aggregate
    Multiple Rediscovery
    Vulnerability Rediscovery Lag
    Rediscovery Over Time
5 Limitations of the Dataset
6 Implications
7 Conclusions
Abstract

How often do multiple, independent parties discover the same vulnerability? There are ample models of vulnerability discovery, but little academic work on this issue of rediscovery. The immature state of this research and subsequent debate is a problem for the policy community, where the government’s decision to disclose a given vulnerability hinges in part on that vulnerability’s likelihood of being discovered and used maliciously by another party. Research into the behavior of malicious software markets and the efficacy of bug bounty programs would similarly benefit from an accurate baseline estimate for how often vulnerabilities are discovered by multiple independent parties. This paper presents a new dataset of more than 4,300 vulnerabilities, and estimates vulnerability rediscovery across different vendors and software types. It concludes that rediscovery happens more than twice as often as the 1-9% range previously reported. For our dataset, 15% to 20% of vulnerabilities are discovered independently at least twice within a year. For just Android, 13.9% of vulnerabilities are rediscovered within 60 days, rising to 20% within 90 days, and above 21% within 120 days. For the Chrome browser we found 12.57% rediscovery within 60 days; and the aggregate rate for our entire dataset generally rises over the eight-year span, topping out at 19.6% in 2016. We believe that the actual rate is even higher for certain types of software. When combined with an estimate of the total count of vulnerabilities in use by the NSA, these rates suggest that rediscovery of vulnerabilities kept secret by the U.S. government may be the source of up to one-third of all zero-day vulnerabilities detected in use each year. These results indicate that the information security community needs to map the impact of rediscovery on the efficacy of bug bounty programs and policymakers should more rigorously evaluate the costs of non-disclosure of software vulnerabilities.
Belfer Center for Science and International Affairs Harvard Kennedy School 1
1 Introduction

Vulnerabilities are an important resource. Both intelligence and law enforcement activities increasingly emphasize the use of software vulnerabilities to gain access to targeted systems. These same software flaws are also a critical piece of defensive information, granting companies like Apple and open source projects like Apache insight into where there are holes in their software in need of repair. Left unfixed, software vulnerabilities provide malicious parties a point of access into any computer system running the software. Programs to pay researchers and others who disclose vulnerabilities to software developers, so-called bug bounties, are an increasingly popular way for companies to discover flaws in their code. Underlying the choices to pay for a software vulnerability, as well as government decisions to keep some of them secret, are assumptions about how often those same software flaws could be discovered by someone else, a process called rediscovery. There is very little rigorous research into how often rediscovery takes place, and yet we know it happens, sometimes in high-profile ways. For example, the Heartbleed vulnerability in OpenSSL lay dormant for three years, and yet was discovered twice within just a few days, by both Neel Mehta of Google and researchers at the Finnish information security firm Codenomicon.1 This rediscovery rate becomes particularly important if the software in question is a widely used open source project or a cryptographic library. This paper introduces the issue of vulnerability rediscovery, presents our research, and addresses some of the larger questions raised by this scholarly collision. A particularly challenging issue with rediscovery is that this rate changes over time and varies for different types of software. We collected data from multiple software vendors and an open source project to track vulnerability records and look for duplicates, where a single software flaw had been disclosed multiple times.
We then used those duplicates to estimate the rate at which vulnerabilities were discovered more than once by independent parties. We also use this data to track how the rediscovery rate can grow over time and to measure rediscovery lag, the time between when a vulnerability is first disclosed and subsequent duplicate disclosures. The paper begins with background information on vulnerabilities and their discovery, along with a review of relevant literature and previous work. This leads into a discussion of the data collected for this project and our methodology, including the challenges present in counting vulnerabilities and measuring rediscovery. Following this are several analyses of this data, considering variation such as the vendor in question and change over time, as well as other measures such as rediscovery over time and rediscovery lag. The final two sections discuss some of the paper’s limitations and then conclude by suggesting implications of this work for scholarly and policy communities.

1 Lee, “How Codenomicon Found The Heartbleed Bug Now Plaguing The Internet,” ReadWrite, April 13, 2014, http://readwrite.com/2014/04/13/heartbleed-security-codenomicon-discovery/.

2 Background

Software vulnerabilities are flaws or features in code that allow a third party to manipulate the computer running this software. The level of design security in major commercial software products varies widely, from the vulnerability-rich history of Adobe’s products to Apple’s comparatively locked-down iOS operating system. Such design insecurity is generally the result of poorly secured software, insecure programming languages, the growing complexity of commercial code bases, and simple human error, among a host of other causes. For example, a program that expects to retrieve a simple image file but fails to check the supplied file type might return an executable software program instead. The procedure to retrieve an image is intentional, but failing to check the file type allows a third party to manipulate it. The Love Letter virus of 2000 relied on the fact that Windows 2000 and XP hid known file extensions when reading file names from right to left.
The virus file (LOVE-LETTER-FOR-YOU.TXT.vbs) hid itself by placing the executable file extension (.vbs) after a benign one (.txt), so Windows would only show .txt and the user would be none the wiser.2 Vulnerabilities may also be introduced directly to hardware through compromises in chip design or manufacture somewhere along the supply chain.3 Not all vulnerabilities are created equal—some are easier to find than others and only a small number will provide easy access to the best-secured software. What this means is that not all groups looking for vulnerabilities are necessarily looking for the same vulnerabilities. An intelligence organization is likely to have the engineering and mathematics capacity to take low-value or difficult-to-use vulnerabilities and combine them into a working exploit. Less capable groups may have to wait until they find a vulnerability which can immediately be used to gain access to a computer system to develop a useful exploit. The knowledge of a vulnerability’s existence is valuable information, with similar properties to the location of buried treasure on a map or a secret told to a friend. The bug hunter’s challenge is to choose who to whisper their secret to, along with proof-of-concept code that proves their secret is in fact true. This secret is a source of value and creates a dilemma: malicious actors who wish to gain access to a vulnerability are almost always willing to pay more than the software’s original vendor—sometimes a great deal more.4 Because a vulnerability is something embedded in a piece of software, a person who independently discovers it has no guarantee of being the only one who knows about its existence. Every passing day brings a higher probability that someone else working to find vulnerabilities in the same piece of software will stumble upon the bug, leading to rediscovery. This leads to an economy of buying and selling vulnerability information among criminal groups, companies, and governments.5

2 Microsoft, “VBS.LOVELETTER Worm Virus” (Microsoft, January 29, 2007), http://support.microsoft.com/kb/282832.
3 Georg T. Becker et al., “Stealthy Dopant-Level Hardware Trojans,” August 21, 2013, http://people.umass.edu/gbecker/BeckerChes13.pdf; Bruce Schneier, “How to Design — And Defend Against — The Perfect Security Backdoor,” WIRED, October 16, 2013, https://www.wired.com/2013/10/howto-design-and-defend-against-the-perfect-backdoor/.
4 A. Algarni and Y. Malaiya, “Software Vulnerability Markets: Discoverers and Buyers,” International Journal of Computer, Information Science and Engineering 8, no. 3 (2014): 71–81.
5 T. J. Holt, “Examining the Forces Shaping Cybercrime Markets Online,” Social Science Computer Review 31, no. 2 (September 10, 2012): 165–77, doi:10.1177/0894439312452998; Kurt Thomas et al., “Framing Dependencies Introduced by Underground Commoditization,” 2015, http://damonmccoy.com/papers/WEIS15.pdf; Herr and Ellis, Ch. 7, “Disrupting Malware Markets,” in Richard Harrison and Trey Herr, eds., Cyber Insecurity: Navigating the Perils of the Next Information Age (Lanham, MD: Rowman & Littlefield, 2016), https://books.google.com/books?id=NAp7DQAAQBAJ&source. For more on the policy debate around which vulnerabilities governments should disclose, see Ari Schwartz and Rob Knake, “Government’s Role in Vulnerability Disclosure” (Belfer Center, June 2016), http://www.belfercenter.org/sites/default/files/legacy/files/vulnerability-disclosure-web-final3.pdf; Fidler, Ch. 17, “Government Acquisition and Use of Zero-Day Software Vulnerabilities,” in Harrison and Herr, Cyber Insecurity.
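The extension-hiding behavior that Love Letter exploited can be sketched in a few lines. This is a simplified, hypothetical model of the Windows “hide extensions for known file types” setting, not its actual implementation:

```python
# Simplified model of Windows' "hide extensions for known file types"
# setting: only the final, recognized extension is stripped from the
# displayed name, so a doubled extension leaves a benign-looking one visible.
KNOWN_EXTENSIONS = {".vbs", ".txt", ".exe"}

def displayed_name(filename: str) -> str:
    base, dot, ext = filename.rpartition(".")
    if dot and ("." + ext).lower() in KNOWN_EXTENSIONS:
        return base  # final known extension hidden from the user
    return filename

# The executable .vbs is hidden, leaving the apparent .TXT visible.
print(displayed_name("LOVE-LETTER-FOR-YOU.TXT.vbs"))  # LOVE-LETTER-FOR-YOU.TXT
```

Because the user sees what appears to be a harmless text file, the deception relies on display settings rather than any flaw in the file system itself.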
Value of Rediscovery

Because vulnerabilities are non-rivalrous, they can be discovered and held by more than one party simultaneously. A rediscovered vulnerability may be the same as the original, or a closely related flaw deemed similar enough to be identical. Rediscovery describes the likelihood that two independent parties will discover the same flaw in a piece of software. This is slightly different from a bug collision, which is when a vulnerability which had previously only been known to a single party enters the public domain. Asking about the rate of rediscovery only assumes that the two groups that find the same bug are independent of each other. Rediscovery is an important issue for the cybersecurity community, perhaps most prominently in the debate over how government should balance the choice to keep secret or disclose vulnerabilities to software developers. In the United States, the Vulnerabilities Equities Process (VEP) is the interagency process intended to make decisions about the disclosure of software vulnerabilities known by U.S. government agencies and organizations.6 The key choice in this VEP process is whether to keep the vulnerability secret or disclose the flaw to its developer, whereupon the government loses the opportunity to use the vulnerability (though the time this takes to happen can vary dramatically). Measuring the cost of disclosure vs. non-disclosure involves a range of factors, but an important one is the likelihood that a vulnerability, kept secret from a vendor and unpatched, might be rediscovered by another party and used against U.S. citizens.7 The answer is critical to determining the cost of non-disclosure of a vulnerability. If a vulnerability in the possession of the U.S. government is not likely to be discovered by another party, then the risk of keeping it a secret is lower than if the likelihood of rediscovery is high. However, there is more value in the question of rediscovery than just the VEP.
Rediscovery can impact how the software security industry thinks about bug-bounty programs, which pay researchers in exchange for disclosure of a software flaw. In paying for a vulnerability, companies expect that it can be patched (fixed). As these patches accumulate and users apply them, one of two things should happen: either these companies slowly reduce the total number of flaws in their codebase, or they find and fix enough old bugs to keep pace with new ones created as software is updated over time. Understanding the speed of rediscovery helps inform companies, showing how quickly a disclosed but unpatched bug could be rediscovered by a malicious party and used to attack the company’s software. This information should drive patch cycles to be more responsive to vulnerabilities with short rediscovery lag, while allowing more time for those where the lag is longer. With additional work, rediscovery may also contribute to more accurate estimates of the density of vulnerabilities in software. Academic research into the malware markets is also likely to benefit from better estimates of vulnerability rediscovery. Rediscovery impacts the lifespan of a vulnerability; the likelihood of its being disclosed to or discovered by the vendor grows with every instance of rediscovery. Just as one can compare a supermarket’s need to renew its stock of bread vs. salted herring, some vulnerabilities are likely to “go stale,” and thus be of little value, much faster than others.8 Estimating rediscovery can help shed light on which types of vulnerabilities are more likely to decay relative to others, based on the frequency with which they are discovered by multiple parties.

6 Schwartz and Knake, “Government’s Role in Vulnerability Disclosure”; Jason Healey, “The U.S. Government and Zero-Day Vulnerabilities: From Pre-Heartbleed to Shadow Brokers,” Columbia Journal of International Affairs, November 2016, 4, https://jia.sipa.columbia.edu/sites/default/files/attachments/Healey%20VEP.pdf.
7 Mailyn Fidler and Trey Herr, “PATCH: Debating Codification of the VEP,” Lawfare, May 17, 2017, https://www.lawfareblog.com/patch-debating-codification-vep.
Rediscovery and Previous Literature

Despite the importance of this issue, the academic record on rediscovery is relatively sparse.9 A 2005 paper by Andy Ozment applied software reliability growth models to vulnerability discovery to gauge the total population of software vulnerabilities in the operating system BSD.10 This paper also addressed rediscovery, using a collection of vulnerability bulletins from Microsoft between 2002 and 2004 to catalogue when the company credited more than one disclosing party. Ozment found an average rediscovery rate of just under 8%, aggregated over different types of software, including operating systems, applications, and supporting libraries. Writing in 2013, Finifter et al. looked at the relative cost efficiency of vulnerability reward programs against directly employing security personnel. Looking at Firefox and Chrome, they found that most vulnerabilities are reported from within firms, though by 2012 this trend had shifted for critical vulnerabilities in Chrome and more were reported from outside the company. In a small section looking at rediscovery, the group’s paper calculated a mean rediscovery rate for Chrome of 4.6% and provided anecdotal evidence of similar rates in Firefox.11 A collaboration between the bug bounty company HackerOne and researchers at Harvard and MIT produced a system dynamics model of the vulnerability discovery and stockpiling process.12 As part of this work, the group presented results of a random discovery simulation at RSA that showed a 9% rediscovery rate for immature software and less than 1% for “hardened” or more mature codebases. Unfortunately, no formal paper describing the methodology and data employed in this study has yet been published. In each of these instances, the question of rediscovery was tangential to a different debate. Ozment’s work was a response to Eric Rescorla’s contention that there was little long-term utility to vulnerability discovery and patching.13 Finifter and his group were studying the efficacy of bounties over hiring security talent as full-time employees, and the HackerOne estimate of rediscovery came in the context of discussing the relative density of vulnerabilities in old vs. new software.

8 A vulnerability can rise in value if integrated with many others as part of an exploit kit, a mechanism to deploy malicious software using dozens of vulnerabilities, or if few targets apply the patch that fixes the corresponding flaw.
9 There is an ample literature on vulnerability discovery that deals with an important but slightly different question from this paper’s focus on rediscovery. For a detailed literature review, see Fabio Massacci and Viet Hung Nguyen, “An Empirical Methodology to Evaluate Vulnerability Discovery Models,” IEEE Transactions on Software Engineering 40, no. 12 (2014): 1147–1162.
10 Andy Ozment, “The Likelihood of Vulnerability Rediscovery and the Social Utility of Vulnerability Hunting,” 2005, http://www.infosecon.net/workshop/pdf/10.pdf.
In each of these, there was little

11 Matthew Finifter, Devdatta Akhawe, and David Wagner, “An Empirical Study of Vulnerability Rewards Programs,” in USENIX Security, 2013, https://www.usenix.org/system/files/conference/usenixsecurity13/sec13-paper_finifter.pdf.
12 Katie Moussouris and Michael Siegel, “The Wolves of Vuln Street: The 1st Dynamic Systems Model of the 0day Market” (RSA, 2015), https://www.rsaconference.com/events/us15/agenda/sessions/1749/the-wolves-of-vuln-street-the-1st-dynamic-systems.
13 Eric Rescorla, “Is Finding Security Holes a Good Idea?,” IEEE Security and Privacy 3, no. 1 (January 2005): 14–19.
actual data available to judge the rate of rediscovery and other potentially interesting characteristics. Most recently, a 2017 study from the RAND Corporation used a small private dataset to evaluate the nature and behavior of zero-day vulnerabilities—those used by attackers before the vendors learn about them.14 The study found that over the span of a year, on average only 5.76% of vulnerabilities were rediscovered in the public domain, while for 90 days or less the figure was less than 1%. This is much lower than our findings that 13% to 20% of vulnerabilities are rediscovered within a year, including more than 21% in the Android operating system. While the two papers are scoped to different ends, much of the distinction is likely attributable to differences in data sources and methodology. For more on these differences, see Section 6.

14 Lillian Ablon and Andy Bogart, “Zero Days, Thousands of Nights” (Santa Monica, CA: The RAND Corporation, 2017), https://www.rand.org/content/dam/rand/pubs/research_reports/RR1700/RR1751/RAND_RR1751.pdf.
3 Methodology and Data

This paper addresses a gap in the literature by integrating vulnerabilities reported in several different codebases, including the browsers Firefox and Chrome, the open source project OpenSSL, and the Android operating system, to generate estimates of vulnerability rediscovery and related measures such as rediscovery lag.15 The goal in selecting these codebases was to cover more than one software type, span multiple vendors, and have the best possible access to complete data.16

Counting Duplicate Vulnerabilities

Measuring rediscovery is difficult because once the original vulnerability is disclosed and made public, there is little incentive for anyone to come forward and make a new disclosure about the same vulnerability, except where doing so might result in reputational rewards. The dataset collected here, like all those built from disclosure records, does not measure discovery directly. Instead, this data captures disclosure as a proxy for discovery. This limits the “rediscovery window” in which rediscovery can be captured for each vulnerability record to the period between an initial vulnerability disclosure and public notice of the bug’s existence. Limiting our data to this rediscovery window has some benefit as well. Looking at rediscovery, there is a reasonable assumption that over a long enough period any vulnerability will be discovered multiple times. Because the window in which we can observe rediscovery is effectively limited to the period between initial disclosure and when a patch is made publicly available, there is a natural time constraint. This is built on Ozment’s method of counting multiple credited discoverers, a record-keeping process that would end with a patch being made available to the public.

15 A codebase is the collection of code used to develop a piece of software, including both current and previous versions.
16 All the data we used is available in our GitHub repository (https://github.com/mase-gh/Vulnerability-Rediscovery), and we continue to work to expand this dataset to include closed source software and other open source projects.
Within this rediscovery window, our method of tabulating a rediscovered vulnerability looks for two criteria: are there multiple parties given credit for independently disclosing the same vulnerability, and/or has the bug been marked a duplicate and merged with another? We employ one or both approaches with each codebase. More detail on the specific method of counting duplicates for each piece of software can be found below. To calculate rediscovery, we measure the number of vulnerability records with duplicates as a proportion of all vulnerability records. In a given period, if there are ten different vulnerability records and one of those records has received duplicate disclosures, then only one vulnerability record has a duplicate. As a result, the rediscovery rate is 1/10 or 10%.17 In our estimates, we collect the total population of vulnerabilities and sample those of high or critical severity to measure this rediscovery rate. The other two measures presented in this paper, rediscovery over time and rediscovery lag, are derived from the same data. The likelihood of rediscovery appears to grow in the months after a vulnerability’s initial disclosure, eventually leveling off within a few months. Rediscovery over time measures this rate of change, the brevity of which suggests that the distribution of discovery attention is not uniform and changes over time. Rediscovery lag measures the time between an original disclosure and any subsequent duplicate disclosures; e.g., if the time between the original disclosure and duplicate disclosure A is X, and the time between duplicate disclosures A and B is Y, then the rediscovery lag for duplicate A is X while the lag for duplicate B is X + Y.

17 This 10% figure shows the proportion of vulnerabilities that are rediscovered and not the total number of duplicates. If that same vulnerability was rediscovered 10 more times, for a total of 20 vulnerability disclosures, the rediscovery rate is unchanged in this methodology. This is not to say these additional duplicates aren’t of interest; we address their frequency and significance in Multiple Rediscovery in Section 4 below.
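The two measures described above, the rediscovery rate and rediscovery lag, can be sketched in a few lines of Python. The record structure and field names here are hypothetical illustrations, not the format of our actual dataset:

```python
from datetime import date

# Hypothetical record structure: each vulnerability record carries its
# original disclosure date and the dates of any duplicate disclosures.
records = [
    {"id": "VULN-1", "disclosed": date(2015, 3, 1),
     "duplicates": [date(2015, 3, 20), date(2015, 5, 2)]},
    {"id": "VULN-2", "disclosed": date(2015, 4, 10), "duplicates": []},
]
# Eight more records with no duplicates, for ten records in total.
records += [{"id": f"VULN-{i}", "disclosed": date(2015, 1, 1), "duplicates": []}
            for i in range(3, 11)]

# Rediscovery rate: records with at least one duplicate, as a share of all
# records. Additional duplicates on the same record do not change the rate.
rate = sum(1 for r in records if r["duplicates"]) / len(records)

# Rediscovery lag: days between the ORIGINAL disclosure and each duplicate,
# so the lag for each later duplicate accumulates (X, then X + Y, ...).
lags = [(dup - r["disclosed"]).days
        for r in records for dup in sorted(r["duplicates"])]

print(rate)  # 0.1  (1 of 10 records has duplicates)
print(lags)  # [19, 62]
```

Note that, as in the methodology, the rate counts records with duplicates rather than total duplicate disclosures, while each lag is measured back to the original disclosure rather than to the preceding duplicate.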
Coding and Data Sources

In each source of data for vulnerability rediscovery, we used only vulnerabilities of high or critical severity to improve the quality of our data and the impact of our analysis. Software bugs are generally given a severity score to help organize them and prioritize which need to be fixed first. The definitions of high and critical severity vary somewhat between organizations, but critical generally covers anything that allows the execution of arbitrary code, while high covers most instances where an attacker could manipulate software functions or operate without restriction if local to the targeted computer. For example, Google defines severity as:

• Critical: “issues allow an attacker to run arbitrary code on the underlying platform with the user’s privileges in the normal course of browsing.”

• High: “...vulnerabilities allow an attacker to execute code in the context of, or otherwise impersonate other origins. Bugs which would normally be critical severity with unusual mitigating factors may be rated as high severity.”18

By constraining this dataset to high and critical vulnerabilities, the paper offers analyses based on the most impactful software flaws. This means that the dataset comprises only a subset of the total population of vulnerabilities but emphasizes those most critical to developers and policymakers. This has also generally improved the quality of data, as record keeping appears to be most detailed for these most important bugs. We intend to expand to medium- and low-severity bugs, and their equivalent naming conventions across different vendors, in future work. Data for this range of products and vendors came from four sources that we integrated into a single dataset. There are numerous challenges in counting vulnerabilities, and the variable state of record keeping discovered across vendors and open source projects only underlines this. By

18 The Chromium Projects, “Severity Guidelines for Security Issues,” For Developers, https://sites.google.com/a/chromium.org/dev/developers/severity-guidelines.
limiting our collection to these high- and critical-severity vulnerabilities, the dataset is made less generalizable but more reliable.

• Firefox: The Firefox dataset is scraped from records in the Bugzilla bug tracker for Firefox and related software dependencies.19 This data constitutes all bugs labeled as high-priority or critical in the Severity field from Extended Support Release (ESR) advisories for Firefox between 2012 and 2016 and comprises 473 vulnerability records and 81 records with duplicates. Firefox presented a challenge in how to create a working subset of only those vulnerabilities from the total pool of bug records. Firefox has nightly builds—new versions of the codebase with small changes or trial features—and many of the vulnerabilities discovered in Firefox are found there and fixed immediately. This dataset does not include these vulnerabilities, since they are never exposed to the public as part of an ESR, and are thus unlikely to represent bugs that might be independently rediscovered. By counting only bugs from ESRs, we could obtain a subset of vulnerabilities that were exposed for discovery and exploitation by all users. In Bugzilla, each vulnerability record has a report page, with a Tracker subsection. For some vulnerabilities, that Tracker subsection contains a Duplicates field with a record of associated bug codes and their status. Those records with data in the Duplicates field were coded as duplicates.

• Chrome: The Chrome dataset is scraped from bugs collected for Chromium, an open source software project whose code constitutes most of the Chrome browser.20 On top of Chromium, Google adds a few additional features, such as a PDF viewer, but there is substantial overlap, so we treat this as essentially identical to Chrome.21 Chrome presented a similar problem to Firefox, so to

19 Mozilla, “Bugzilla,” Bugzilla@Mozilla, https://bugzilla.mozilla.org/buglist.cgi?quicksearch=ALL%20kw:sec%20cf_status_firefox_esr31%3Afixed%2Cverified.
20 The Chromium Projects, “Chromium Bug Tracker,” Bugs, current, https://bugs.chromium.org/p/chromium/issues/list?can=1&q=label%3ASecurity_Severity&sort=security_severity&colspec=ID+Component+Summary+Security_Severity+Reward+Reporter+Status&x=m&y=releaseblock&cells=ids.
21 Others have previously made this same choice, including Finifter, Akhawe, and Wagner, “An Empirical Study of Vulnerability Rewards Programs.”
record only vulnerabilities with a reasonable likelihood of public discovery, we limited our collection to bugs labeled as high or critical severity from the Chromium bug tracker. This portion of the dataset comprises 3,397 vulnerability records, of which 468 have duplicates. For Chrome, we coded a vulnerability record as a duplicate if it had been merged with another (merges are noted in the comments associated with each vulnerability record) or if it was marked as a Duplicate in the Status field.

• OpenSSL: Our data comprises all vulnerability records from 2014 to 2016 for the OpenSSL project.22 The start of this window is the disclosure of Heartbleed. Based on conversations with members of the OpenSSL community, the consistency of record keeping appears to have improved commensurate with a new influx of resources after the Heartbleed disclosure in April 2014. This portion of the dataset comprises 85 vulnerability records, of which 2 have duplicate disclosures.23 In OpenSSL, vulnerabilities are noted as duplicates if they credit two separate disclosers for the bug. Records with an 'and' between credited reporters are coded as collaborations, and so not duplicate disclosures; records with an '&' between reporters are coded as independent discoveries, and thus duplicates.

• Android: We were provided access to data from Google's tracking systems for Android vulnerabilities: the public system, CodeSite, and an internal-use-only platform called Issue Tracker. The internal tracker, used by Google employees, records disclosures from within the company as well as some from outside, while CodeSite records disclosures from the public. This data provides a glimpse into rediscovery rates for Android over a 17-month time frame between July 2015 and November 2016. A member of the Google security team downloaded bugs from both tracking systems, then correlated them to remove identical records that existed in both systems.24 This individual then added a CVE ID for all records, to replace the internal tracking IDs. Instances of rediscovery (duplicate disclosures) were coded when a vulnerability record in CodeSite was noted as a duplicate and linked to a report credited to a researcher or researchers separate from the original. Duplicates were also coded if two bugs were merged together. The Android portion of the paper's dataset comprises 352 vulnerability records, of which 77 had at least one duplicate.

22. OpenSSL, "OpenSSL Vulnerabilities," News/Vulnerabilities, https://www.openssl.org/news/vulnerabilities.xml.
23. Underlining the record-keeping issue—neither of these disclosures includes the Heartbleed bug, which was discovered twice within the span of a few days and reported in April 2014.
24. This is possible because Google maintains both tracking systems' vulnerability ID values for all records on both platforms.

Table 1—Dataset Summary25

Source           Date Range  Total Population  Sample Vulnerabilities  Sample Duplicates  Rediscovery Rate
Google—Chrome    2009–2016   6817              3397                    468                13.8%
Mozilla—Firefox  2012–2016   1112              473                     81                 17.1%
Google—Android   2015–2016   682*              352                     77                 21.9%
OpenSSL          2014–2016   85                85                      2                  2.4%
Total            2009–2016   8696              4307                    628                14.6%

* Because of our inability to directly access Google's records for the total population of Android vulnerabilities, we use here the total number for Android reported to the National Vulnerability Database in the specified date range.

The above table summarizes the four sources for this paper's dataset, covering more than 4,300 vulnerability records over eight years from four different software projects.

• Source—the software these vulnerabilities come from, explained in detail above
• Date Range—the time range for the population of vulnerabilities and our sample
• Total Population—the total number of vulnerabilities available from each of the four data sources, of all criticality levels, from the date range specified
• Sample Vulnerabilities—our sample of high and critical vulnerabilities, a subset of the total population of vulnerabilities for the source software
• Sample Duplicates—the number of vulnerability records from our sample that had duplicates; see above for a more detailed explanation of what constitutes a duplicate for each source
• Rediscovery Rate—the proportion of vulnerabilities from each source with at least one duplicate disclosure

Analysis

Our analysis covers vulnerabilities in a range of software types, including standalone applications like Chrome and Firefox, and the library OpenSSL. The first section of analysis takes all this software together, producing an aggregate measure of vulnerability rediscovery. Following this is an analysis of multiple rediscovery (where there are multiple duplicate bugs), evaluating trends in specific codebases, and then an analysis of rediscovery lag, the time between initial disclosure and the first duplicate report. The final subsection evaluates rediscovery over time. Each section draws from all or part of the dataset, with explanations for the use of subsets where appropriate.

Vulnerability Rediscovery in the Aggregate

Vulnerabilities in the eight-year span of this dataset see an aggregate 14.9% rate of rediscovery. This is higher than previous open-source estimates, which ranged from 6.84% in early empirical work to 9% in more recent simulations. Figure 1 below charts the annualized rediscovery rate over the whole dataset.
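The rediscovery rates reported in Table 1 follow directly from the definitions above: each rate is the share of sampled vulnerabilities with at least one duplicate disclosure. A minimal Python sketch (the counts are transcribed from Table 1; the helper name is ours, not from the paper) reproduces each row and the Total-row aggregate:

```python
# Rediscovery rate = sample duplicates / sample vulnerabilities, in percent,
# using the Sample Vulnerabilities and Sample Duplicates counts from Table 1.
TABLE_1 = {
    # source: (sample vulnerabilities, sample duplicates, reported rate %)
    "Google—Chrome":   (3397, 468, 13.8),
    "Mozilla—Firefox": (473,  81,  17.1),
    "Google—Android":  (352,  77,  21.9),
    "OpenSSL":         (85,   2,   2.4),
}

def rediscovery_rate(sample: int, duplicates: int) -> float:
    """Proportion of sampled vulnerabilities with at least one duplicate disclosure."""
    return 100.0 * duplicates / sample

for source, (sample, dups, reported) in TABLE_1.items():
    rate = rediscovery_rate(sample, dups)
    # Each computed rate should match the table's column to within rounding.
    assert abs(rate - reported) <= 0.05, (source, rate)
    print(f"{source}: {rate:.2f}% (reported {reported}%)")

# The Total row aggregates all four samples: 4307 records, 628 with duplicates.
total = rediscovery_rate(sum(s for s, _, _ in TABLE_1.values()),
                         sum(d for _, d, _ in TABLE_1.values()))
print(f"Aggregate: {total:.2f}%")  # rounds to the 14.6% Total row
```

This is only a consistency check on the published counts; the paper's own coding of what counts as a duplicate differs per source, as described above.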
Figure 1—Aggregate Vulnerability Rediscovery Rate Change Over Time

This sample of high and critical vulnerability records shows a rate of rediscovery that is twice as high as previously thought, and one that has been steadily increasing over the past decade in the software we examined.

Multiple Rediscovery

This section looks at the rate of rediscovery by codebase and at the phenomenon of multiple rediscovery, where more than two parties disclose the same vulnerability. While we are not able to control for each vendor characteristic individually—for example, the difference between bug bounty payouts or secure coding practices—this section demonstrates that there are relatively consistent trends in multiple rediscovery rates between the vendors in our sample. Across this paper's entire dataset, where rediscovery did take place, one duplicate was the norm, though some vulnerabilities saw as many as four or five duplicates, and in one case 11 duplicate disclosures. The three charts below in Figures 2–4 describe this multiple rediscovery, putting it in context across Firefox, Chrome, and Android. Each pie chart has three values:

• No Duplicates: