Researchers develop a way to hear photos using artificial intelligence

Northeastern University researchers have used artificial intelligence to develop a method for extracting audio from still images and muted videos.

The project is called Side Eye.

Kevin Fu, a professor of electrical and computer engineering at Northeastern University, said, “Most cameras today feature what’s called image stabilization hardware. It turns out that when you talk near a camera lens with some of these features, the camera lens will move very slightly, altering the pixels in the image and what’s known as modulating your voice.”

According to the research team, these minute lens movements can be converted into crude audio, which Side Eye's artificial intelligence can then accurately translate into specific words.

“Thousands of samples can be obtained every second. Why does this matter? It essentially means that you receive a very primitive microphone,” Fu said.

“Things like understanding what is the gender of the speaker, not on camera, but in the room while the photograph or video is being taken, that’s nearly 100% accurate,” he said.

So what applications are there for such technology?

“For instance, in legal cases or in investigations of either proving or disproving somebody’s presence, it gives you evidence that can be backed up by science of whether somebody was likely in the room speaking or not,” Fu added.

He stated, “This is yet another tool we can use to add authenticity to evidence, possibly to investigations, but also trying to solve criminal applications.”
