If robots could talk (or at least make intelligible sounds)

February 22, 2023

During a recent NSF-funded project, Assistant Professor Alireza Mohammadi stumbled on an interesting discovery that could make it easier to understand what robots might be trying to tell us.

A collage graphic featuring a humanoid robot surrounded by representations of sound waves
Graphic by Violet Dashi. Images by Martin and sanee via Adobe Stock

It's fair to say that Assistant Professor of Electrical and Computer Engineering Alireza Mohammadi had his hands full figuring out a public-facing component for his very technical research, which is often an expectation of NSF-funded projects these days. As we wrote about in a story last June, he’s exploring whether robotic motion algorithms can be used to describe protein folding, a complex challenge that’s a hot topic in science and one that involves a lot of complicated mathematics. But he came up with a pretty cool idea for broadening the impact of the work and sharing it with local K-12 STEM classrooms: Mohammadi planned to take those complex robotic motion algorithms and use them to generate music.

A headshot of Alireza Mohammadi
Assistant Professor Alireza Mohammadi

Mohammadi thought that would be fairly straightforward, but when he started looking into how he'd actually do it, he immediately ran into a challenge. “If you consider sound, it’s one-dimensional from a mathematical perspective. But if you have a robotic system, and the robot is moving around, there are many variables, like all the angles, positions and speed of its linkages, which are constantly changing with time,” Mohammadi explains. “Consider, for example, a protein backbone chain with 82 degrees of freedom, which we modeled as a hyper-redundant robotic mechanism. There are 82 different values changing over time. So how do we create a meaningful relationship between these two entities and essentially map 82 dimensions down to a one-dimensional sound signal? What would that sound like? And would it be meaningful to us in any significant way?”
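
To get a feel for the problem Mohammadi describes, here is a minimal sketch in Python, and emphatically not his method, of the most naive way to collapse many joint trajectories into a single waveform: give each joint its own sine “partial” whose loudness follows that joint’s angle, then sum everything. The joint data, frequencies and sample rate below are invented for illustration.

```python
# Naive many-to-one sonification sketch (illustrative assumptions only).
import numpy as np

SAMPLE_RATE = 22_050          # audio samples per second (assumed)
DURATION_S = 4.0              # length of the rendered clip
N_JOINTS = 82                 # degrees of freedom in the article's example

t = np.linspace(0.0, DURATION_S, int(SAMPLE_RATE * DURATION_S), endpoint=False)

# Fake joint-angle trajectories: 82 slowly varying signals scaled to [0, 1].
rng = np.random.default_rng(0)
phases = rng.uniform(0, 2 * np.pi, N_JOINTS)
rates = rng.uniform(0.1, 0.5, N_JOINTS)   # Hz, slow "robot motion"
joint_angles = 0.5 * (1 + np.sin(2 * np.pi * rates[:, None] * t + phases[:, None]))

# Assign each joint its own sine partial between 200 Hz and 2 kHz.
partial_freqs = np.linspace(200.0, 2000.0, N_JOINTS)

# Each joint's angle modulates the amplitude of its partial; sum to one signal.
partials = joint_angles * np.sin(2 * np.pi * partial_freqs[:, None] * t)
audio = partials.sum(axis=0)
audio /= np.abs(audio).max()  # normalize to [-1, 1]

# One-dimensional signal, as the quote describes; to listen, it could be
# written out with, e.g., scipy.io.wavfile.write("joints.wav", SAMPLE_RATE,
# audio.astype(np.float32)).
print(audio.shape)
```

A flat sum like this tends to come out as an undifferentiated wash of tones, which is exactly why a more meaningful mapping is worth looking for.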

Things got even more interesting when he realized this “sonification” challenge is actually an unsolved problem in robotics, and one of particular interest to engineers and psychologists working on human-robot interaction. Using sound to interpret the workings of machines is not a new idea. Think, for example, of how auto mechanics use sound to diagnose complex mechanical problems in vehicles. Or how the crackle of a Geiger counter is an amplified electrical pulse produced when radiation from decaying unstable atoms ionizes the gas inside the tube. With robots, however, the sonification challenge is often a little trickier, precisely because they typically don’t generate much natural sound, or at least not sounds that are intuitively intelligible to humans. If, for example, you’re an engineer building a companion robot for an elderly person or a child with severe autism, how do you communicate something like comfort? The subtle whirring of gears and hydraulics may not be enough. To get around this problem, engineers often assign external non-verbal sounds that are easier for us to interpret, like clicks or tones. When you add in additional variables like changes in rhythm, pitch, timbre and amplitude to communicate, say, changes in speed, direction or intensity, you’re getting close to something resembling a simple robotic auditory language that humans can understand.
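
As a rough illustration of that kind of hand-designed mapping, and purely a hypothetical sketch rather than anything from Mohammadi’s project, the snippet below drives a single tone’s pitch with a robot’s speed and its loudness with effort; the telemetry traces and frequency range are made up.

```python
# Hand-designed motion-to-sound mapping: speed -> pitch, effort -> loudness.
import numpy as np

SAMPLE_RATE = 22_050
t = np.linspace(0.0, 3.0, int(SAMPLE_RATE * 3.0), endpoint=False)

# Pretend telemetry: the robot accelerates, cruises, then slows down.
speed = np.clip(np.interp(t, [0, 1, 2, 3], [0.0, 1.0, 1.0, 0.2]), 0, 1)
effort = np.clip(np.interp(t, [0, 1, 2, 3], [0.2, 0.9, 0.5, 0.1]), 0, 1)

# Map speed to pitch (220-880 Hz) and effort to amplitude.
freq = 220.0 + speed * (880.0 - 220.0)
amp = 0.2 + 0.8 * effort

# Integrate frequency to get phase so pitch changes stay smooth (no clicks).
phase = 2 * np.pi * np.cumsum(freq) / SAMPLE_RATE
audio = amp * np.sin(phase)
```

Every one of those mappings has to be chosen and programmed by hand, which is manageable for two or three variables but, as the next section explains, quickly breaks down for dozens of them.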

Mohammadi says this approach can work pretty well for simple robots that have just a few “degrees of freedom,” the term engineers use to describe the total number of independent ways a robot can move. (Accounting for all its joints, for example, a typical robotic arm has seven or so degrees of freedom.) But for more complex robots, like the “snake” robots with 80-plus linkages that Mohammadi is working with, manually programming relationships between changes in motion and sound quickly becomes unwieldy. So Mohammadi began to wonder: What if there were a way to create a more organic, direct relationship between movement and sound, the way a car’s mechanical systems naturally generate useful auditory byproducts? As mentioned above, robotic movements don’t often generate enough useful sound on their own, but what about the things that govern those movements? Could you take the motion algorithms that control a robot’s movements and translate them directly into sound? Could you essentially make a robot’s programming sing?

That’s what Mohammadi is exploring now in an unexpected spinoff of his NSF project. He’s currently investigating types of mathematical functions that could help translate the many dimensions of robotic motion into the one dimension that characterizes our experience of sound. A heat map, for example, uses a mathematical concept called a scalar field to relate temperature, which is one-dimensional, to space, which is three-dimensional; a motion heat map can add a fourth dimension of time. Mohammadi is hoping similar concepts could be used to render one-dimensional sounds that represent the dozens of variables inherent in complicated robotic movements. For instance, in his protein folding project, he says the evolution of the robot’s movement over time can generally be described as a progression from “disorder to order.” Could a sonic environment that makes a similar progression, from atonal chaos to ordered rhythm and harmony, convey useful information about what the robot is doing? Mohammadi is hoping so.
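
A toy version of that scalar-field idea might look like the sketch below, which rests entirely on invented assumptions rather than Mohammadi’s actual mapping: the full joint state is collapsed to one “disorder” number (here, distance from a hypothetical folded target configuration), and that single scalar controls how much noise is mixed over a steady tone, so the disorder-to-order progression becomes audible.

```python
# Scalar-field-style sonification sketch: many joints -> one scalar -> sound.
import numpy as np

SAMPLE_RATE = 22_050
DURATION_S = 5.0
N_JOINTS = 82

t = np.linspace(0.0, DURATION_S, int(SAMPLE_RATE * DURATION_S), endpoint=False)

rng = np.random.default_rng(1)
target = rng.uniform(-np.pi, np.pi, N_JOINTS)   # hypothetical "folded" pose

# Fake trajectory: joints start scattered and relax toward the target.
start = rng.uniform(-np.pi, np.pi, N_JOINTS)
progress = t / DURATION_S                        # 0 -> 1 over the clip
joints = start[:, None] + (target - start)[:, None] * progress[None, :]

# Scalar field: one "disorder" value per instant (RMS distance from target).
disorder = np.sqrt(np.mean((joints - target[:, None]) ** 2, axis=0))
disorder /= disorder.max()                       # normalize to [0, 1]

# Sonify: disorder controls how much noise is mixed over a 440 Hz tone,
# so the clip drifts from hiss toward a pure, "ordered" pitch.
tone = np.sin(2 * np.pi * 440.0 * t)
noise = rng.standard_normal(t.size)
audio = (1.0 - disorder) * tone + disorder * 0.3 * noise
audio /= np.abs(audio).max()
```

The appeal of this kind of reduction is that the mapping is computed from the motion itself rather than programmed joint by joint, which is the “organic” quality Mohammadi is after.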

The applications could go far beyond generating attention-grabbing sounds for Mohammadi’s forthcoming classroom presentations. For starters, this more organic approach could give engineers a simpler, more elegant way to program sonification into robots that use sound to communicate with humans. It could also make it possible to add auditory feedback to more complex machines, like surgical robots or robots used to clear minefields. “Right now, when a surgeon is using a da Vinci robot, the cognitive load is placed all on the surgeon’s ability to interpret visual signals from the instrument’s cameras,” he says. “If we could add a layer of sound that would give feedback about the robot's movements, it could lighten that cognitive load and make surgeries more efficient and accurate.”

###

Story by Lou Blouin