A researcher managed to bypass Google’s ReCaptcha v2 and has decided to make the discovery public after Google failed to patch it for several months.
Dubbed ReBreakCaptcha, the logic vulnerability was discovered last year, and the security researcher says that it remains unpatched. He explains that the exploit targets ReCaptcha’s audio challenges and abuses Google’s Speech Recognition API to solve them.
The exploit works in three stages: it first obtains the correct challenge type (the audio challenge), then performs recognition (converting the challenge’s audio file and sending it to Google’s Speech Recognition API), and finally performs verification (checking the Speech Recognition result and submitting it in an attempt to bypass the ReCaptcha).
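The three-stage flow can be pictured as a simple pipeline in which each stage feeds the next. What follows is only a minimal sketch of that control flow, not the researcher’s code: the function `run_stages` and its three callable parameters are hypothetical names introduced here, and the stages themselves are left as placeholders.

```python
from typing import Callable, Optional


def run_stages(
    get_audio_challenge: Callable[[], Optional[bytes]],
    recognize: Callable[[bytes], Optional[str]],
    verify: Callable[[str], bool],
) -> bool:
    """Run the three stages in order and stop at the first one that fails.

    All three callables are hypothetical placeholders supplied by the caller.
    """
    # Stage 1: obtain the audio challenge (None means another type was served).
    audio = get_audio_challenge()
    if audio is None:
        return False

    # Stage 2: turn the audio into text (None or "" means nothing was recognized).
    text = recognize(audio)
    if not text:
        return False

    # Stage 3: submit the recognized text and report whether it was accepted.
    return verify(text)


if __name__ == "__main__":
    # Stub stages with canned values, used only to show the control flow.
    accepted = run_stages(
        lambda: b"fake-audio",
        lambda audio: "some words",
        lambda text: text == "some words",
    )
    print(accepted)
```

Passing the stages in as callables keeps the sketch independent of any particular browser automation or speech service, and the early returns make it clear that a failure in any one stage ends the attempt.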
ReCaptcha v2 presents users with an “I’m not a robot” checkbox, which usually prompts an image challenge for verification purposes. Users who opt for the audio challenge instead are required to click a Play button and enter the words they hear. They can also download the audio file. Sometimes, however, users get a text challenge instead of an audio challenge, in which case clicking “Reload Challenge” eventually brings up the audio challenge.
After getting the audio challenge, one can download the audio file and send it to the Google Speech Recognition API, though it has to be converted to the “wav” format first. The result that Speech Recognition sends back can be used as the solution to the audio challenge, with nothing more than a simple copy and paste.
The researcher’s proof-of-concept code, written in Python, is available on GitHub. It implements all three stages described above and uses the SpeechRecognition Python library (a wrapper that supports the Google Speech Recognition API, among other engines) to perform the speech recognition.
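The convert-then-recognize step can be illustrated on an ordinary local audio file. The sketch below is not taken from the researcher’s proof of concept: it assumes the `ffmpeg` command-line tool is installed and on the PATH, that the third-party SpeechRecognition package is installed, and the helper names `wav_path_for`, `convert_to_wav` and `transcribe_wav` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Optional


def wav_path_for(audio_path: str) -> str:
    """Return the output path for the converted file (same name, .wav suffix)."""
    return str(Path(audio_path).with_suffix(".wav"))


def convert_to_wav(audio_path: str) -> str:
    """Convert an audio file to WAV by calling the ffmpeg CLI (assumed installed)."""
    out_path = wav_path_for(audio_path)
    # -y overwrites an existing output file, -i names the input file.
    subprocess.run(["ffmpeg", "-y", "-i", audio_path, out_path], check=True)
    return out_path


def transcribe_wav(wav_path: str) -> Optional[str]:
    """Transcribe a WAV file with the SpeechRecognition library's Google backend.

    Returns None when the audio is unintelligible or the service is unreachable.
    """
    # Imported here so the rest of the module loads without the package.
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        return None
```

A caller would chain the two helpers, for example `transcribe_wav(convert_to_wav("clip.mp3"))`, where `clip.mp3` is a placeholder file name. Treating both recognition errors as `None` is a simplification; a real tool would likely want to tell unintelligible audio apart from a network failure.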
The researcher did not say whether he reported the vulnerability to Google or, if he did, what response he received from the company.
Because ReBreakCaptcha automates a whole chain of steps (getting the audio challenge, downloading the file, sending it to Speech Recognition and returning the result to ReCaptcha), it is bound to fail sometimes, and some of those who tested it say that it does, more often than expected.
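A rough way to see why a chain of automated steps fails more often than any single step: if the steps are assumed to succeed independently, the overall success rate is the product of the per-step rates. The sketch below works through that arithmetic. The rates are made-up numbers chosen purely for illustration, not measurements of ReBreakCaptcha, and `chain_success_rate` is a hypothetical helper name.

```python
from math import prod
from typing import Iterable


def chain_success_rate(step_rates: Iterable[float]) -> float:
    """Overall success rate of a chain, assuming the steps succeed independently."""
    return prod(step_rates)


if __name__ == "__main__":
    # Hypothetical per-step success rates, for illustration only.
    rates = {
        "get audio challenge": 0.90,
        "download file": 0.95,
        "speech recognition": 0.80,
        "submit result": 0.90,
    }
    overall = chain_success_rate(rates.values())
    print(f"overall success rate: {overall:.0%}")
```

With these illustrative numbers, four steps that each work at least 80 percent of the time combine into a chain that succeeds only about 62 percent of the time.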