Cybersecurity researcher Birhanu Eshete scores prestigious NSF CAREER award

April 5, 2023

Eshete’s project could have wide-ranging implications for one of artificial intelligence’s most stubborn problems.

Assistant Professor of Computer and Information Science Birhanu Eshete. Photo by Lou Blouin

The young talent in UM-Dearborn’s College of Engineering and Computer Science continues to impress big national funders. Along with fellow Assistant Professors DeLean Tolbert Smith and Fred Feng, Birhanu Eshete can now count himself among the CECS faculty who’ve recently landed the National Science Foundation’s CAREER Award, a prestigious nationwide grant given annually to about 500 early-career faculty who are emerging leaders in their fields. 

You don’t get the nod for a CAREER Award with boilerplate ideas, and Eshete will be going after some big fish with his five-year, $619,838 grant. Specifically, he’ll be exploring a new approach to machine learning’s “robustness” problem, which continues to be one of the biggest challenges in artificial intelligence, primarily because it’s so intimately tied up with the safety and security aspects of AI systems. So what exactly is robustness? Within AI disciplines, robustness can actually refer to a few different things. If we’re talking about an autonomous vehicle, for example, we’d say its navigation system is robust if it can respond correctly to slightly novel situations it’s never seen before, like stopping at a stop sign even if there’s a minor glare or a little snow on it.

Alongside safety, the other area where robustness is a big deal is cybersecurity, which is Eshete’s specialty. In this space, robustness primarily refers to whether a machine learning system can protect itself from a range of attacks. For example, if you’re a medical researcher building an AI-powered tool that can diagnose cancer by looking at a patient’s medical scans, your system was almost surely trained on real people’s medical information. Because of this, protecting people’s privacy is a big concern, especially as other doctors and hospitals start using your tool in the real world. Eshete says if your model isn’t sufficiently protected, it could be susceptible to an “inference attack,” in which an attacker probes the model to identify a particular person whose data was used in training. In another kind of attack, Eshete says a malicious actor could provide scans that look legitimate but contain “noise” specifically designed to manipulate the algorithm’s “decision boundary.” By intentionally “poisoning” the system with bad data, an attacker could make the model’s high-stakes medical diagnoses inaccurate.
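To make that idea concrete, here is a minimal toy sketch of a data-poisoning attack via label flipping. Everything in it (the data, the model, the numbers) is invented for illustration and is not drawn from Eshete’s work; it only shows how corrupted training labels can pull a classifier’s decision boundary out of place.

```python
# Hypothetical toy illustration of a label-flipping poisoning attack.
# Synthetic data and a simple classifier stand in for the medical-scan example.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n_per_class):
    """Two well-separated synthetic classes in 5 dimensions."""
    X = np.vstack([rng.normal(-1.0, 1.0, (n_per_class, 5)),
                   rng.normal(+1.0, 1.0, (n_per_class, 5))])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(100)

clean_model = LogisticRegression().fit(X_train, y_train)

# Poison the training set: relabel 40% of class-1 examples as class 0.
y_poisoned = y_train.copy()
class1_idx = np.where(y_train == 1)[0]
flipped = rng.choice(class1_idx, size=int(0.4 * len(class1_idx)), replace=False)
y_poisoned[flipped] = 0

poisoned_model = LogisticRegression().fit(X_train, y_poisoned)

print("clean model test accuracy:   ", clean_model.score(X_test, y_test))
print("poisoned model test accuracy:", poisoned_model.score(X_test, y_test))
```

Because the flipped labels all come from one class, the learned boundary gets pulled toward that class, so the poisoned model misclassifies noticeably more clean test examples than the model trained on untampered data.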

Eshete says AI practitioners typically use two kinds of strategies to make models robust against attacks. You can make the model itself harder to attack, or you can try to “clean” the inputs of malicious “noise” before they get into the system, sort of like an antivirus pre-scan of email attachments. But Eshete thinks both of these approaches are ultimately limited, as attackers will invariably find novel ways to thwart new defenses. Instead, he’s proposing a more fundamental solution that borrows insights from key parts of his work on nation-state cyberattacks. Such highly sophisticated attacks evolve over longer periods of time, beginning with an infiltration point that lets a hacker gradually probe more deeply into other parts of the network, where real damage can be done. Eshete developed techniques for describing these pathways of attack, a practice known in the cybersecurity world as attack provenance. “It basically gives you a narrative of how the attack unfolded, and within that you’ll find all kinds of juicy information that can help with attack detection and forensic analysis,” he says. For example, such information helped Eshete develop anomaly detection systems, which thwart attacks by using machine learning to develop models of what normal network activity looks like, so unusual activities stick out as potential threats.
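The anomaly-detection idea can be pictured with a generic sketch like the one below, which fits an off-the-shelf isolation forest to made-up “network session” features; it is a stand-in for the general approach, not Eshete’s actual system.

```python
# Minimal sketch of anomaly detection: learn a model of "normal" activity,
# then flag inputs that deviate from it. Features are invented for illustration.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Pretend each row summarizes one network session: [bytes sent, connections, duration].
normal_activity = rng.normal(loc=[500, 10, 60], scale=[50, 2, 10], size=(1000, 3))

detector = IsolationForest(contamination=0.01, random_state=1).fit(normal_activity)

# A suspicious session: a huge data transfer over an unusually short connection.
suspicious = np.array([[50000, 300, 5]])
print(detector.predict(suspicious))           # -1 means flagged as anomalous
print(detector.predict(normal_activity[:3]))  # mostly 1, i.e., treated as normal
```

The detector never needs examples of attacks; anything sufficiently unlike the normal traffic it was trained on stands out as a potential threat.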

For his CAREER award project, he’ll be using this idea of provenance capture and analysis as the basis for a robust cybersecurity defense strategy for AI models. Here’s how it would work: First, as a machine learning model is trained on its initial data, Eshete would capture a narrative of the model’s evolution as “training provenance,” which helps establish the typical trajectory of what happens during training. Then, as the model goes live and users are feeding it new real-world inputs, he would similarly track what happens as an input moves all the way through the model to a final predictive output, thus capturing the equivalent of a “thought process” that led to a decision. This is called “prediction provenance.”
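The article doesn’t spell out what these records would contain, but one way to picture them, under simplifying assumptions of my own, is as structured logs captured during training and again at prediction time. The sketch below uses a hand-rolled logistic regression and reduces “provenance” to a few logged quantities; it is an interpretation of the idea, not Eshete’s design.

```python
# Speculative sketch: "training provenance" as a per-epoch log of the model's
# trajectory, and "prediction provenance" as a trace of how one output was reached.
import numpy as np

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1, 1, (100, 3)), rng.normal(1, 1, (100, 3))])
y = np.array([0] * 100 + [1] * 100)

w, b = np.zeros(3), 0.0
training_provenance = []  # one record per epoch: the model's training trajectory

for epoch in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))      # current predicted probabilities
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    training_provenance.append(
        {"epoch": epoch, "loss": float(loss), "weight_norm": float(np.linalg.norm(w))}
    )
    grad_w = X.T @ (p - y) / len(y)             # gradient of the log loss
    grad_b = float(np.mean(p - y))
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

def predict_with_provenance(x):
    """Return a prediction plus a record of how that prediction was reached."""
    score = float(x @ w + b)
    prob = 1.0 / (1.0 + np.exp(-score))
    prediction_provenance = {"input": x.tolist(), "score": score, "probability": prob}
    return int(prob > 0.5), prediction_provenance

label, trace = predict_with_provenance(np.array([0.8, 1.1, 0.9]))
print(label, trace)
print(training_provenance[-1])  # final point on the training trajectory
```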

Importantly, each kind of provenance builds a signature for how the model normally functions, and Eshete says some critical insights can be gathered when you retrain the model with new inputs, which is a typical periodic maintenance procedure for machine learning systems. “When you retrain a model for which training provenance has been established, but with newly acquired and hence potentially poisonous data, if there is enough deviation from the stabilized training provenance, then that is a reason to suspect a data poisoning attack,” Eshete says. Similarly, if a particular prediction provenance sufficiently resembles a known malicious prediction provenance signature, Eshete knows he may be dealing with an input aimed at misleading the model. If the prediction provenance is more in line with the clean provenance signature, he’s more likely dealing with a benign input.
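The comparison step can also be sketched in the abstract, again under placeholder assumptions of my own: that each provenance record has been boiled down to a numeric signature, and that simple distances and thresholds are enough to compare them. Eshete’s actual techniques for measuring deviation are not described in the article.

```python
# Speculative sketch of comparing provenance signatures. Thresholds and distance
# measures here are placeholders, not Eshete's method.
import numpy as np

def training_deviation(baseline_losses, new_losses):
    """Mean absolute gap between the established and the new training trajectory."""
    baseline, new = np.asarray(baseline_losses), np.asarray(new_losses)
    return float(np.mean(np.abs(baseline - new)))

def looks_poisoned(baseline_losses, new_losses, threshold=0.1):
    """Flag a retraining run whose trajectory drifts too far from the baseline."""
    return training_deviation(baseline_losses, new_losses) > threshold

def resembles_malicious(trace, malicious_signatures, clean_signature):
    """Is this prediction trace closer to known-malicious signatures than to the clean one?"""
    trace = np.asarray(trace)
    dist_to_malicious = min(np.linalg.norm(trace - np.asarray(s)) for s in malicious_signatures)
    dist_to_clean = np.linalg.norm(trace - np.asarray(clean_signature))
    return dist_to_malicious < dist_to_clean

# Example: a retraining run whose loss curve drifts far from the baseline is suspect.
baseline = [0.69, 0.45, 0.32, 0.25, 0.21]
retrain = [0.69, 0.60, 0.55, 0.52, 0.50]
print(looks_poisoned(baseline, retrain))  # True under the placeholder threshold

# Example: a prediction trace that sits nearer a known-malicious signature than the clean one.
print(resembles_malicious([0.9, 0.1], [[1.0, 0.0]], [0.2, 0.8]))  # True
```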

Interestingly, Eshete’s core idea could have implications far beyond the cybersecurity realm. As we wrote about in a story earlier this year, AI’s so-called “black box” problem continues to be a major hurdle to building safe, trustworthy AI systems. The problem stems from the fact that most machine learning systems are completely opaque about how they come to their conclusions — we get the decision but not the “thought process.” Eshete’s methods of creating prediction provenance could therefore provide new insights into how models are making their decisions — giving us the potential to build more transparent, repairable, trustworthy and less-biased AI systems.

Eshete will be working on the project through 2028, and it will also include an educational component for Taylor High School students in the final two years. Right now, though, he’s taking a moment to enjoy the big vote of confidence that comes with being a CAREER awardee. “It does feel like a validation of your ideas and your career trajectory,” Eshete says. “And I’m grateful to the Office of Research and my colleagues and the campus in general, who’ve been very supportive and optimistic about research on our campus. It’s a very big undertaking to apply for a grant like this — at least it was for me. You can’t operate in a vacuum, and this kind of support is invaluable for moving research forward.”

###

Want to read more about Eshete’s work? Check out our stories “A dispatch from the cybersecurity arms race” and “Should we view cyberattacks as acts of war?” which spotlight Eshete’s research on nation-state attacks. Story by Lou Blouin.