Academics Devise Side-Channel Attack Targeting Multi-GPU Systems

A group of academic researchers has devised a side-channel attack targeting architectures that rely on multiple graphics processing units (GPUs) for resource-intensive computational operations.

Used in high-performance computing and cloud data centers, multi-GPU machines are shared among multiple users, so protecting the applications and data that flow through them is critical.

“These systems are emerging and increasingly important computational platforms, critical to continuing to scale the performance of important applications such as deep learning. They are already offered as cloud instances offering opportunities for an attacker to spy on a co-located victim,” the researchers say in their paper.

For their experiments, academics from Pacific Northwest National Laboratory, Binghamton University, and University of California, along with an independent contributor, used NVIDIA’s Pascal-based DGX-1 system containing two GPUs linked using a combination of custom interconnect (NVLink) and PCIe connections.

By reverse-engineering the cache hierarchy, the researchers showed that it was possible for an attacker on one GPU to cause cache contention on the other. They also demonstrated that the attacker could “recover the cache hit and miss behavior of another workload,” essentially allowing for the fingerprinting of an application running on the remote GPU.

When reverse-engineering the sharing of caches, the researchers discovered that one GPU can remotely access the caches of others, which allowed them to develop eviction sets – “collections of memory addresses hashing to the same cache set” – from both GPUs.
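
The idea of an eviction set can be illustrated with a toy Python sketch that groups addresses by the cache set they map to. The cache geometry (128-byte lines, 2,048 sets, 16 ways) and the simple modulo index function are assumptions made purely for illustration; the real mapping had to be reverse-engineered by the researchers and is not reproduced here.

```python
from collections import defaultdict

LINE_SIZE = 128   # bytes per cache line (assumed)
NUM_SETS = 2048   # number of cache sets (assumed)
WAYS = 16         # lines per set, i.e. associativity (assumed)

def cache_set(addr: int) -> int:
    """Hypothetical index function: line number modulo the number of sets."""
    return (addr // LINE_SIZE) % NUM_SETS

def build_eviction_sets(addresses, ways=WAYS):
    """Group addresses by set index and keep `ways` addresses per full group."""
    groups = defaultdict(list)
    for addr in addresses:
        groups[cache_set(addr)].append(addr)
    return {s: addrs[:ways] for s, addrs in groups.items() if len(addrs) >= ways}

# A buffer of WAYS * NUM_SETS consecutive lines covers every set WAYS times.
buffer = [i * LINE_SIZE for i in range(WAYS * NUM_SETS)]
eviction_sets = build_eviction_sets(buffer)
```

Accessing every address in one of these groups fills the corresponding cache set, which is what lets an attacker displace, and later detect, another process's data in that set.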

Next, they aligned the eviction sets discovered from each GPU so that they could create contention “at the same physical set from both processes,” and built a “high quality, high bandwidth, prime-and-probe covert channel across GPUs” that reached 3.95 megabytes per second (MBps) with a low error rate of 1.3%.
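
A prime-and-probe channel can be sketched in a few lines of Python as a simulation of a single shared cache set. The four-way set, the LRU replacement policy, and the `transmit` helper are assumptions for illustration only; a real attack infers hits and misses from access latency rather than reading them directly.

```python
from collections import OrderedDict

WAYS = 4  # associativity of the toy cache set (assumed)

class CacheSet:
    """One set of a set-associative cache with LRU replacement."""
    def __init__(self, ways=WAYS):
        self.ways = ways
        self.lines = OrderedDict()

    def access(self, tag) -> bool:
        """Touch `tag`; return True on a hit, False on a miss."""
        hit = tag in self.lines
        if hit:
            self.lines.move_to_end(tag)
        else:
            if len(self.lines) >= self.ways:
                self.lines.popitem(last=False)  # evict the least recently used line
            self.lines[tag] = True
        return hit

def transmit(bits):
    """Send `bits` from a trojan to a spy through contention on one shared set."""
    shared = CacheSet()
    spy_lines = [("spy", i) for i in range(WAYS)]
    trojan_lines = [("trojan", i) for i in range(WAYS)]
    received = []
    for bit in bits:
        for line in spy_lines:        # prime: the spy fills the set
            shared.access(line)
        if bit:                       # the trojan signals 1 by evicting the spy's lines
            for line in trojan_lines:
                shared.access(line)
        # probe: the spy re-reads its lines; any miss is decoded as 1
        misses = sum(not shared.access(line) for line in spy_lines)
        received.append(1 if misses > 0 else 0)
    return received
```

In this noise-free model, `transmit([1, 0, 1, 1])` returns the same bits it was given; on real hardware, timing noise is what produces a nonzero error rate.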

In their microarchitectural covert-channel attacks across the two GPUs, the researchers showed that a trojan process running on one GPU could send secrets to a spy process running on the other.

Furthermore, the academics demonstrated proof-of-concept side-channel attacks in which they recovered the memorygram of a remote victim's memory accesses and used it to fingerprint applications on the victim GPU and to identify “the number of neurons in a hidden layer of a machine learning model.”

The researchers trained a deep learning network to accurately identify applications based on their memorygrams, and say that this can serve as a basis for future attacks that not only identify a target application but also infer information about it.
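
The fingerprinting step can be illustrated with a toy Python sketch. Here a memorygram is modeled as a grid of observed misses per cache set over time, which is an assumption about its shape, and a simple nearest-reference comparison stands in for the deep learning network the researchers trained. The two victim "applications" and their access probabilities are hypothetical.

```python
import random

NUM_SETS = 16   # cache sets the spy monitors (assumed)
NUM_SLOTS = 32  # probe rounds per memorygram (assumed)

def streaming_app(t, s):
    """Hypothetical victim that sweeps across the sets over time."""
    return 0.9 if s == t % NUM_SETS else 0.05

def hot_set_app(t, s):
    """Hypothetical victim that keeps hitting the same few sets."""
    return 0.9 if s in (2, 3, 11) else 0.05

APPS = {"streaming": streaming_app, "hot_set": hot_set_app}

def memorygram(app, rng):
    """Simulate one noisy trace: 1 means the spy saw a miss in set s at round t."""
    return [[1 if rng.random() < app(t, s) else 0 for s in range(NUM_SETS)]
            for t in range(NUM_SLOTS)]

def distance(a, b):
    """Number of cells in which two memorygrams disagree."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def classify(sample, references):
    """Return the name of the reference memorygram closest to the sample."""
    return min(references, key=lambda name: distance(sample, references[name]))

rng = random.Random(0)
references = {name: memorygram(app, rng) for name, app in APPS.items()}
guess = classify(memorygram(hot_set_app, rng), references)
```

Because the two hypothetical workloads touch cache sets in visibly different patterns, even this crude comparison separates them; a trained network is better suited to the subtler differences between real applications.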

“This attack can be used to identify and reverse engineer the scheduling of applications on a multi-GPU system (simply by spying on all other GPUs in a GPU-box), and identify a target GPUs that are running a specific victim application, and even identify the kernels running on each GPU,” the researchers say.

The academics note that, while GPUs do have some defenses to prevent side-channel attacks on a single GPU, they are not set up to mitigate this new type of attack, which is conducted from user level and does not require the system-level features needed in other attacks.

The researchers also point out that all experiments were conducted in a quiet environment, and that a real-world scenario would add noise to the attack because other applications would be running concurrently on the GPUs. They have also proposed a series of noise mitigation techniques.

As protection measures, the academics propose adapting existing defenses against covert and side-channel attacks on CPUs and GPUs, such as static or dynamic partitioning of shared resources, to cross-GPU attacks as well.
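
To see why partitioning helps, consider a toy Python model of one cache set whose ways are statically split between two tenants. The `PartitionedSet` class, the two-ways-per-tenant split, and the LRU policy are all assumptions for illustration, not a description of any real GPU mechanism.

```python
from collections import OrderedDict

class PartitionedSet:
    """One cache set whose ways are statically split between tenants."""
    def __init__(self, ways_per_tenant):
        self.ways = dict(ways_per_tenant)
        self.lines = {tenant: OrderedDict() for tenant in self.ways}

    def access(self, tenant, tag) -> bool:
        """Touch `tag` in the tenant's own partition; return True on a hit."""
        part = self.lines[tenant]
        hit = tag in part
        if hit:
            part.move_to_end(tag)
        else:
            if len(part) >= self.ways[tenant]:
                part.popitem(last=False)  # evict only within this tenant's ways
            part[tag] = True
        return hit

def probe_misses(victim_active: bool) -> int:
    """Prime as the attacker, optionally run the victim, then count probe misses."""
    cache = PartitionedSet({"attacker": 2, "victim": 2})
    for tag in range(2):              # prime the attacker's partition
        cache.access("attacker", tag)
    if victim_active:                 # the victim thrashes its own partition only
        for tag in range(8):
            cache.access("victim", tag)
    return sum(not cache.access("attacker", tag) for tag in range(2))
```

In this model the attacker's probe sees zero misses whether or not the victim is running, so the contention signal that prime-and-probe relies on disappears, at the cost of each tenant having less cache available.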

“Making these cross-GPU data transfers more coarse-grained in normal applications will significantly increase the detection accuracy of high-bandwidth attacks, leading to more efficient defenses,” the researchers note.
