I am a postdoctoral researcher at the University of Washington in the Allen School of Computer Science and Engineering, where I am a member of both the Security and Privacy Lab and the Molecular Information Systems Lab. My research is focused on understanding how increased computerization and automation in the biotechnology sector is creating new cyber-security threats. I also research the security of consumer genetic services and was the first to demonstrate that molecular information, like DNA, can be a vector for computer security issues. I previously worked to understand the state of cell phone security and have built and deployed measurement systems designed to detect rogue base stations. I completed my Ph.D. in Computer Science at the University of Washington and received my bachelor's degrees from the University of Wisconsin-Madison.
Customers of direct-to-consumer (DTC) genetic testing services routinely download their raw genetic data and give it to third-party companies that support additional features. One type of analysis, called genetic genealogy, uses genetic data and genealogical methods to find new relatives. While genetic genealogy is quite popular, it has raised new privacy concerns. Genetic genealogy services can be leveraged to find the person corresponding to anonymous genetic data and have been used dozens of times by law enforcement to solve crimes. We hypothesized that the open design and broad API offered by some genetic genealogy services raise other significant security and privacy issues. To test this hypothesis, we analyzed the security practices of GEDmatch, the largest third-party genetic genealogy service. Here, we experimentally show how the GEDmatch API is vulnerable to a number of attacks by an adversary who only uploads normally formatted genetic data files and runs standard queries. Using a small number of specifically designed files and queries, an attacker can extract a large percentage of the genetic markers from other users; 92% of markers can be extracted with 98% accuracy, including hundreds of medically sensitive markers. We also find that an adversary can construct genetic data files that falsely appear to be relatives of other samples in the database; in certain situations, these false relatives can be used to make the re-identification of genetic data more difficult. These attacks are possible because of the rich set of features supported by the API, including detailed visualizations, that are meant to enhance usability. We conclude with security recommendations for genetic genealogy services.
The rapid improvement in DNA sequencing has sparked a big data revolution in genomic sciences, which has in turn led to a proliferation of bioinformatics tools. To date, these tools have encountered little adversarial pressure. This paper evaluates the robustness of such tools if (or when) adversarial attacks manifest. We demonstrate, for the first time, the synthesis of DNA that, when sequenced and processed, gives an attacker arbitrary remote code execution. To study the feasibility of creating and synthesizing a DNA-based exploit, we performed our attack on a modified downstream sequencing utility with a deliberately introduced vulnerability. After sequencing, we observed information leakage in our data due to sample bleeding. While this phenomenon is known to the sequencing community, we provide the first discussion of how this leakage channel could be used adversarially to inject data or reveal sensitive information. We then evaluate the general security hygiene of common DNA processing programs and, unfortunately, find concrete evidence of poor security practices used throughout the field. Informed by our experiments and results, we develop a broad framework and guidelines to safeguard security and privacy in DNA synthesis, sequencing, and processing.
Cell-site simulators are cell phone surveillance devices that are used around the world by governments and criminals. These powerful devices are capable of precisely locating phones, eavesdropping on conversations, and sending spam or malware. However, our primary source of information about their use comes from journalists and anonymous leaks. To gain a better technical understanding of how often, when, and where cell-site simulators are used, we built and deployed SeaGlass, a city-wide cell-site simulator detection system. SeaGlass cellular sensors are designed to be robust, low-maintenance, and deployable in vehicles for long durations. The data they generate is used to learn a city's network properties to find anomalies consistent with cell-site simulators. We installed SeaGlass sensors into 15 ridesharing vehicles across two cities, collecting two months of data in each city. Using this data, we evaluate the system and show how SeaGlass can be used to detect portable cell-site simulators.