Software for modeling, prediction and analysis is now essentially ubiquitous across scientific fields. Whether you’re a cancer researcher or a mechanical engineer, you’re almost certainly using software in your research, and researchers often have to create their own software tools or modify existing open-source ones for their specific needs. While it’d be ideal if scientists had professional programmers to help with this, Associate Professor of Computer and Information Science Marouane Kessentini says that in the real world, academic labs often take on these programming responsibilities themselves and do their best with the resources they have. A big company might be able to call on the talents of a professional software engineer, he says, but a physicist is going to consider themselves lucky if one of their PhD students has taken a few coding classes.
The software tools scientists create, even if ad hoc, can certainly help advance their work. But researchers often run into challenges once they want to take that work beyond the lab. “It’s not enough to publish a paper, you need to have your tools available so people can evaluate them and build on your work,” Kessentini says, referring to the expectations for “reproducibility” in scientific research. “And if your code is messy and not well-structured, it’s going to be very difficult for someone else to even understand it. Or, if your hope is to take your idea to industry, how are you going to convince them to use your prototype if your software is full of security vulnerabilities or hard to integrate because it’s of poor quality?”
In a research environment where everyone is hungry for new software tools, quickly adopted code that doesn’t meet basic quality and security standards can even wreak havoc within a scientific community. For example, several years ago, astrophysics researchers started an open-source project, Astropy, to develop a core software package for their field. Within just a few years, Astropy was being used by hundreds of researchers and had become one of the most widely adopted open-source scientific platforms in the astrophysics community. But it was later found to contain hundreds of quality issues, including a major security vulnerability, which had to be patched before scientists could safely use the software.
To be clear, Kessentini and his colleague Birhanu Eshete aren’t blaming scientists for not being better coders. Nor are they suggesting that scientists should start including grant support for professional programmers in every project. Instead, they think there’s a way for scientists to help themselves. For years, Kessentini has been a leading researcher in refactoring, a field that involves fixing quality issues in legacy software source code, and has created tools that help companies automate what is otherwise a tedious and expensive process. Down the hall, Eshete, a cybersecurity expert, has been pouring an increasing amount of research energy into automated threat detection. So the two got to thinking: what if they could make an easy-to-use tool to help scientists automatically fix the quality and security issues in their code, without having to hire, or become, professional programmers?