Helping scientists become better coders

November 17, 2021

A new NSF-funded project from two UM-Dearborn computer scientists is taking aim at an often overlooked challenge in scientific research.

A collage graphic featuring an African American woman scientist in a lab coat, coding at a laptop.

Software for modeling, prediction and analysis is now needed in nearly every field of scientific research. Whether you’re a cancer researcher or a mechanical engineer, you’re almost certainly using software in your research, and often, researchers have to create their own software tools or modify existing open-source ones for their specific needs. While it’d be ideal if scientists had professional programmers to help with this, Associate Professor of Computer and Information Science Marouane Kessentini says that in the real world, academic labs often take on these programming responsibilities themselves and do their best with the resources they have. A big company might be able to call on the talents of a professional software engineer, he says, but a physicist is going to consider themselves lucky if one of their PhD students has taken a few coding classes.

The software tools scientists create, even if ad hoc, can certainly help advance their work. But they often run into challenges once they want to take their research beyond the lab. “It’s not enough to publish a paper, you need to have your tools available so people can evaluate them and build on your work,” Kessentini says, referring to the expectations for “reproducibility” in scientific research. “And if your code is messy and not well-structured, it’s going to be very difficult for someone else to even understand it. Or, if your hope is to take your idea to industry, how are you going to convince them to use your prototype if your software is full of security vulnerabilities or hard to integrate because it’s of poor quality?”

In a research environment where everyone is hungry for new software tools, quickly adopted code that doesn’t meet basic quality and security standards can even wreak havoc within a scientific community. For example, several years ago, astrophysics researchers started an open-source project to develop a core software package for their field. Within just a few years, Astropy was being used by hundreds of researchers and became one of the most widely adopted open-source scientific platforms in the astrophysics community. But it was later discovered to contain hundreds of quality issues, including a major security vulnerability, which had to be patched for scientists to be able to safely use the software.

To be clear, Kessentini and his colleague Birhanu Eshete aren’t blaming scientists for not being better coders. Nor are they suggesting that scientists should start including grant support for professional software programmers in every project. Instead, they think there’s a way for scientists to help themselves. For years, Kessentini has been a leading researcher in refactoring, a field that involves fixing quality issues in legacy software source code, and has created some pretty amazing tools to help companies automate what is otherwise a tedious and expensive process. Down the hall, Eshete, who’s a cybersecurity expert, has been pouring an increasing amount of research energy into automated threat detection. So the two got to thinking: what if they could make an easy-to-use tool to help scientists automatically fix the quality and security issues in their code, without having to hire, or become, professional programmers?

Headshots of Associate Professor Marouane Kessentini and Assistant Professor Birhanu Eshete

Through a new NSF-funded project, Kessentini (who is the principal investigator for the grant) and Eshete are now working on building a platform that could do just that. The platform is envisioned as a cloud-based service: scientists would simply subscribe, and it would then analyze their code for quality and security issues. After a scan, the software would suggest fixes, which researchers could implement with a few mouse clicks. They’re even building in a chatbot, so inquisitive scientists can ask questions and get more information before they implement fixes.

Moreover, Kessentini and Eshete hope their platform can finally establish some quality and security standards for open-source scientific programming tools. For example, one cool feature of their platform is a benchmarking tool, which allows researchers to see how their code compares to others in their research community. They’re also offering “badges,” which allow scientists to advertise high-quality, “vetted” projects to potential collaborators or industry partners. This, Eshete says, could be particularly helpful for stopping problematic software from proliferating rapidly across a research community.

Kessentini and Eshete say they’re working with some initial scientific partners to fine-tune the features of their platform and hope to launch it for wider use in the next year. By the end of the three-year project, they hope to have hundreds of scientists subscribing to the platform, providing a new way for researchers to advance their best work.

###

Story by Lou Blouin. If you’re a member of the media and would like to interview Associate Professor Marouane Kessentini or Assistant Professor Birhanu Eshete about this project, drop us a line at [email protected] and we’ll put you in touch.