As smartphone manufacturers improve the ear speakers in their devices, it may become easier for malicious actors to exploit a particular side channel to eavesdrop on a targeted user’s conversations, according to a team of researchers from several universities in the United States.
The attack method, named EarSpy, is described in a paper published just before Christmas by researchers from Texas A&M University, Temple University, New Jersey Institute of Technology, Rutgers University, and the University of Dayton.
EarSpy relies on the phone’s ear speaker — the speaker at the top of the device that is used when the phone is held to the ear — and the device’s built-in accelerometer for capturing the tiny vibrations generated by the speaker.
Previous research focused on vibrations generated by a phone’s loudspeakers, or it involved an external component for capturing data. However, an individual is more likely to use the ear speaker rather than the loudspeaker when receiving sensitive information in a phone call.
The obvious choice for eavesdropping on a conversation would be for an attacker to plant a piece of malware that can record calls through the phone’s microphone. However, Android security has improved significantly and it has become increasingly difficult for malware to obtain the required permissions.
On the other hand, accessing raw data from the motion sensors in a smartphone does not require any special permissions. Android developers have started placing some restrictions on sensor data collection, but the EarSpy attack is still possible, the researchers said.
A piece of malware planted on a device could use the EarSpy attack to capture potentially sensitive information and send it back to the attacker.
The researchers found that attacks such as EarSpy are becoming increasingly feasible because of the improvements smartphone manufacturers have made to ear speakers. They ran tests on the OnePlus 7T and OnePlus 9 smartphones, both running Android, and found that the accelerometer can capture significantly more data from the ear speaker on these newer models, which have stereo speakers, than on older OnePlus models, which did not.
The experiments conducted by the academic researchers analyzed the reverberation effect of ear speakers on the accelerometer by extracting time-frequency domain features and spectrograms. The analysis focused on gender recognition, speaker recognition, and speech recognition.
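To give a rough sense of what extracting a spectrogram from accelerometer readings involves, here is a minimal sketch in Python. It is not the researchers' pipeline: the 400 Hz sampling rate, the 64-sample frame, the Hann window, and the synthetic 100 Hz test tone are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math


def synthetic_trace(freq_hz, sample_rate, num_samples):
    """Stand-in for one accelerometer axis: a pure tone (assumed, not real data)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]


def hann(n):
    """Symmetric Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Short-time DFT magnitudes.

    Returns one list per frame, holding magnitudes for
    frequency bins 0 .. frame_len // 2.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at bin k (slow but dependency-free).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def dominant_frequency(mags, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in a single frame."""
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * sample_rate / frame_len


SAMPLE_RATE = 400   # Hz, assumed for illustration
FRAME_LEN = 64      # samples per frame, assumed

trace = synthetic_trace(100, SAMPLE_RATE, 256)
frames = spectrogram(trace, frame_len=FRAME_LEN, hop=32)
peak_hz = dominant_frequency(frames[0], SAMPLE_RATE, FRAME_LEN)
```

In a real analysis the per-frame magnitudes would come from recorded sensor data and would feed a classifier. Here they only show that a tone present in the vibration signal appears as a peak at the matching frequency bin.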
In the gender recognition test, whose goal is to determine whether the target is male or female, the EarSpy attack achieved 98% accuracy. Speaker identification was nearly as accurate, at 92%.
For actual speech recognition, accuracy reached up to 56% when capturing digits spoken in a phone call.
“[This] accuracy still exhibits five times greater accuracy than a random guess, which implies that vibration due to the ear speaker induced a reasonable amount of distinguishable impact on accelerometer data,” the researchers said.
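The "five times" comparison can be checked with simple arithmetic. The sketch below assumes the digit task has ten equally likely classes (0 through 9), which this article does not state explicitly. Under that assumption a random guess is right 10% of the time. The function names are hypothetical.

```python
def chance_accuracy(num_classes):
    """Accuracy of a uniform random guess over equally likely classes."""
    return 1.0 / num_classes


def improvement_over_chance(accuracy, num_classes):
    """How many times better than random guessing a given accuracy is."""
    return accuracy / chance_accuracy(num_classes)


# Assumption: ten digit classes, so chance level is 10%.
ratio = improvement_over_chance(0.56, 10)  # roughly 5.6
```

Under that assumption, a 56% accuracy is a little more than five times the chance level, which matches the researchers' description.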