A researcher has demonstrated that lossy image compressors can be used to hide arbitrary code inside PDF documents. The method could prove highly effective for malicious actors because security products are designed to ignore such data.
It’s not uncommon for cybercriminals to hide malicious code in PDF files. The malicious code is usually designed to exploit vulnerabilities in the application that is used to open the document, in most cases Adobe Reader.
Exploits can be hidden inside PDF files by using data compressors, such as Lempel–Ziv–Welch (LZW) and Deflate, and even image compressors, such as CCITTFaxDecode and JBIG2Decode. Security products are designed to scan PDF files for payloads compressed using these algorithms.
On the other hand, antivirus products and PDF forensic tools usually ignore data compressed with lossy compressors such as JPXDecode and DCTDecode. Lossy compression uses inexact approximations for representing the encoded content, which leads to a certain amount of information being discarded.
Lossy compression works well for images, where small inaccuracies go unnoticed, but not for code, where a single altered byte can break execution. This is why security solutions assume that lossily compressed data can’t contain malicious code.
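The difference can be illustrated with a minimal Python sketch. It compares a Deflate round trip (the algorithm behind PDF's FlateDecode filter) with a toy lossy codec that snaps every byte to the nearest multiple of a step size. The toy codec is an assumption made purely for illustration: it only mimics the quantization idea and is not how DCTDecode works. The payload string and the function names are hypothetical as well.

```python
import zlib


def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress with Deflate; the output is byte-identical."""
    return zlib.decompress(zlib.compress(data))


def quantized_roundtrip(data: bytes, step: int) -> bytes:
    """Toy lossy codec (illustrative only, not JPEG): snap each byte
    to the nearest multiple of `step`, clamped to the 0-255 range."""
    return bytes(min(255, round(b / step) * step) for b in data)


# Hypothetical, harmless JavaScript snippet standing in for "code".
payload = b"app.alert(1);"

print(lossless_roundtrip(payload))       # survives unchanged
print(quantized_roundtrip(payload, 1))   # step 1 discards nothing
print(quantized_roundtrip(payload, 16))  # coarse step corrupts the bytes
```

With a step of 1 the toy codec discards nothing, which loosely mirrors the intuition that a sufficiently high quality setting leaves less room for corruption. A coarse step alters most bytes, and the result is no longer valid code.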
However, CSIS researcher Dénes Óvári has demonstrated that hiding malicious code in a JPEG image compressed with the DCTDecode lossy compressor is possible. The expert has determined that while encoding a color JPEG image would result in data loss that would lead to corrupted code, a high-quality grayscale JPEG image could do the trick.
Óvári has developed a proof-of-concept in which he encoded a piece of JavaScript code as a grayscale image and embedded it in a PDF document. The image was inserted into an Image object filtered with DCTDecode, which was then referenced by a JavaScript action entry. This ensures that the code is executed when the PDF file is opened.
“Although this is not a security breach in itself (an exploit still needs to be used inside the stream for malicious activity), the fact that the usage of DCTDecode for this purpose has seemingly been ruled out by the industry means that even known threats could be hidden in this way from anti-virus scanners and/or researchers,” Óvári wrote in a research paper published by Virus Bulletin.
“In order to provide users with maximum protection, the DCTDecode stream must no longer be overlooked: in PDF reader implementations, the referencing of uncompressed image data as parameters from objects expecting binary data should be prohibited. We should also perhaps re-examine the handling of other file formats in which data in JPEG format is assumed always to be lossily compressed, while a greyscale mode is still available,” the expert added.
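One way a scanner might act on this recommendation is to flag any JavaScript action whose /JS entry points at a stream filtered with a lossy image codec. The Python sketch below is a rough heuristic under stated assumptions, not a PDF parser: it only sees uncompressed indirect objects, ignores object streams and incremental updates, and can be confused by binary stream data. The function name `find_lossy_js_references` and the sample bytes are hypothetical.

```python
import re

# Lossy image filters named in the article.
LOSSY_FILTERS = (b"/DCTDecode", b"/JPXDecode")

# Naive patterns for indirect objects and for "/JS <num> <gen> R" references.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
JS_REF_RE = re.compile(rb"/JS\s+(\d+)\s+(\d+)\s+R")


def find_lossy_js_references(pdf_bytes: bytes) -> list[tuple[int, int]]:
    """Return (action_object, target_object) pairs where a /JS entry
    references an object whose dictionary names a lossy filter."""
    objects = {int(m.group(1)): m.group(3) for m in OBJ_RE.finditer(pdf_bytes)}
    hits = []
    for number, body in objects.items():
        for ref in JS_REF_RE.finditer(body):
            target = int(ref.group(1))
            # Inspect only the dictionary part, before the stream data.
            header = objects.get(target, b"").split(b"stream", 1)[0]
            if any(name in header for name in LOSSY_FILTERS):
                hits.append((number, target))
    return hits


# Hypothetical fragment mirroring the structure described in the article;
# the image stream is empty and carries no payload.
sample = (
    b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /JS 3 0 R >>\nendobj\n"
    b"3 0 obj\n<< /Type /XObject /Subtype /Image /Filter /DCTDecode /Length 0 >>\n"
    b"stream\n\nendstream\nendobj\n"
)

print(find_lossy_js_references(sample))
```

A production scanner would need a real PDF parser to resolve references reliably. The sketch only shows that the combination Óvári describes, a script entry pointing at an image stream, is straightforward to express as a rule.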

Eduard Kovacs (@EduardKovacs) is a contributing editor at SecurityWeek. He worked as a high school IT teacher for two years before starting a career in journalism as Softpedia’s security news reporter. Eduard holds a bachelor’s degree in industrial informatics and a master’s degree in computer techniques applied in electrical engineering.
