How it works? Prime and Probe
All the illustrations used in this post were made with Excalidraw. If you want high-quality, hand-drawn-looking illustrations in your project, consider using Excalidraw.
If you haven't read the previous installment, How it works? Flush and Reload, I highly recommend reading it first. I'll be reusing terminology explained there, such as inclusive caches, Copy on Write, and timers, and it will make the concept discussed here easier to understand. If you already understand how caching works, you can dive right in.
If you prefer watching a video to reading, I suggest the Prime+Probe video by Chester Rebeiro, IIT Madras, created as part of the NPTEL online course.
In the last article we saw how flushing a cache line and timing the reload of shared data can help us piece together a secret stored in the victim's address space when the victim process is running on an adjacent thread that shares the same cache as the attacker thread.
In this article we'll look at a similar timing attack that needs fewer preconditions than Flush and Reload. This cache side channel attack is called Prime and Probe. Just like Flush and Reload, it is a two-phase attack in which the attacker reads a secret from the victim's address space.
We still need the victim to execute a device that leaks this secret into the micro-architectural state. Let us take the same device we saw in the Flush and Reload article:
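while (secret_copy > 0) {
    /* Test the lowest remaining bit of the secret. */
    if (secret_copy % 2) {
        some_shard_library_function();  /* executed only for 1 bits */
    }
    secret_copy = secret_copy >> 1;     /* advance to the next bit */
}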
In this device, the victim executes a shared function depending on the bits of the secret data. Our old friend Copy on Write appears here again: to improve performance, the operating system lets multiple processes that share a piece of code load it from the same memory location for reading. If a process wants to modify what is stored at this location, a copy is made lazily and altered, and that process uses the modified copy from then on. This saves the operating system a lot of replication that would otherwise be necessary to execute programs.
Now let us look at the core of how Prime and Probe works.
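To make the leak concrete, here is a minimal sketch of the device's call pattern, assuming the loop inspects the lowest bit of the secret and shifts by one bit per pass. The helper `trace_victim` and its `trace` buffer are hypothetical names used only for illustration; the real victim simply calls the shared function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: for each pass of the victim loop, record whether the
 * shared function would have been called (1) or skipped (0).
 * Returns the number of passes written to trace. */
size_t trace_victim(uint32_t secret, uint8_t *trace, size_t capacity)
{
    assert(trace != NULL);
    size_t passes = 0;
    uint32_t secret_copy = secret;
    while (secret_copy > 0 && passes < capacity) {
        trace[passes++] = (uint8_t)(secret_copy & 1u); /* 1 => function called */
        secret_copy >>= 1;                             /* move to the next bit */
    }
    return passes;
}
```

For a secret of 0xB (binary 1011) the trace is 1, 1, 0, 1, starting from the least significant bit. This call/no-call sequence is what the attacker tries to recover from the cache.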
As mentioned before, this is a two phase attack, with the phases being:
- Prime Phase: The attacker floods the cache with dummy data from the attacker's own address space, so that the entire cache is filled with the attacker's data and any new data brought into the cache must evict some of it.
- Probe Phase: The attacker reads back the data that was cached during the prime phase and times each access.
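In practice, priming is usually organised per cache set: the attacker picks addresses that all map to the same set and touches as many of them as the set has ways. The sketch below shows that address arithmetic under simplifying assumptions: 64-byte lines, 1024 sets, and a plain modulo index with no slice hashing. `LINE_SIZE`, `NUM_SETS`, `cache_set_index` and `build_eviction_set` are illustrative names and values, not taken from any particular CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_SIZE 64u    /* assumed cache line size in bytes */
#define NUM_SETS  1024u  /* assumed number of cache sets */

/* Set index = cache line number modulo the number of sets. */
static inline uint32_t cache_set_index(uint64_t addr)
{
    return (uint32_t)((addr / LINE_SIZE) % NUM_SETS);
}

/* Fill out[] with `ways` addresses that all map to target_set.
 * base must be aligned to NUM_SETS * LINE_SIZE, so that addresses one
 * stride apart land in the same set. */
void build_eviction_set(uint64_t base, uint32_t target_set,
                        uint64_t *out, size_t ways)
{
    const uint64_t stride = (uint64_t)NUM_SETS * LINE_SIZE;
    assert(out != NULL);
    assert(target_set < NUM_SETS);
    assert(base % stride == 0);
    for (size_t i = 0; i < ways; i++) {
        out[i] = base + (uint64_t)target_set * LINE_SIZE + i * stride;
    }
}
```

On real hardware the index is derived from the physical address, so an attacker working with virtual addresses needs additional effort to find addresses that truly collide. The arithmetic above only illustrates the idea.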
When the victim reads the secret bit by bit and encounters a 1, it tries to load some_shard_library_function. Because of the inclusive cache hierarchy, this function has to be loaded into a cache that is filled with the attacker's data, so the data in at least one cache line must be evicted to make room for it. When the attacker then reads this data in the Probe Phase, the access takes longer than usual, which signifies that the cache line has been evicted and hence that the victim has executed some_shard_library_function at least once since the last prime phase.
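The eviction argument can be checked with a toy model. The sketch below simulates a single 4-way cache set with LRU replacement (an assumption, since real replacement policies vary) and replaces timing with a hit/miss flag: a miss stands in for a slow access. All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WAYS 4  /* assumed associativity of the modelled set */

typedef struct {
    uint64_t tag[WAYS];  /* tag[0] is the most recently used line */
    int      used;       /* number of valid entries */
} cache_set_t;

void set_init(cache_set_t *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
}

/* Access one line. Returns true on a hit, false on a miss.
 * A miss in a full set evicts the least recently used line. */
bool set_access(cache_set_t *s, uint64_t tag)
{
    int pos = -1;
    for (int i = 0; i < s->used; i++) {
        if (s->tag[i] == tag) { pos = i; break; }
    }
    bool hit = (pos >= 0);
    if (!hit) {
        if (s->used < WAYS) s->used++;
        pos = s->used - 1;  /* free slot, or the LRU slot to overwrite */
    }
    for (int i = pos; i > 0; i--) s->tag[i] = s->tag[i - 1];
    s->tag[0] = tag;        /* accessed line becomes most recently used */
    return hit;
}

/* Prime: fill the set with attacker lines (tags 1..WAYS). */
void prime(cache_set_t *s)
{
    for (uint64_t t = 1; t <= WAYS; t++) (void)set_access(s, t);
}

/* Probe: re-access the attacker lines in reverse order and count misses. */
int probe(cache_set_t *s)
{
    int misses = 0;
    for (uint64_t t = WAYS; t >= 1; t--) {
        if (!set_access(s, t)) misses++;
    }
    return misses;
}
```

If nothing touches the set between `prime` and `probe`, the probe sees no misses. If the victim brings in one line of its own in between (for example `set_access(&s, 100)`), exactly one attacker line is evicted and the probe reports one miss. Probing in reverse order is a deliberate choice in this model: with LRU, probing in the same order as priming would make every probe access evict the next line to be probed.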
Visualizing the process
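Compared to Flush and Reload, Prime and Probe can monitor a larger portion of the cache, since the attacker only needs to fill cache sets with its own data rather than share specific lines with the victim. The trade-off is granularity: the attacker learns which cache set was touched, not which exact line.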
Again, as with any other side channel attack, the timing between the phases plays a significant part in piecing the secret together efficiently. The window between prime and probe needs to match a single pass of the victim's loop. If the window is too short, you'll read the data before the eviction happens and interpret a 1 as a 0. If the window is too long, you cannot say for certain that your observation corresponds to a single pass of the victim's loop: the victim might have executed some_shard_library_function() multiple times between a prime and a probe.
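The effect of an oversized window can be illustrated with a small simulation. It assumes the victim performs one loop pass per time unit and that each prime-probe window covers a fixed number of passes. The attacker observes an eviction if the shared function was called at least once inside the window. `observe_windows` is a hypothetical name for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* calls[i] is 1 if the victim called the shared function on pass i.
 * Each attacker window spans `window` consecutive passes and records 1
 * if any pass inside it made the call.
 * Writes one observation per window and returns the observation count. */
size_t observe_windows(const uint8_t *calls, size_t passes,
                       size_t window, uint8_t *observed)
{
    assert(calls != NULL && observed != NULL);
    assert(window > 0);
    size_t n = 0;
    for (size_t start = 0; start < passes; start += window) {
        uint8_t seen = 0;
        for (size_t i = start; i < start + window && i < passes; i++) {
            seen |= calls[i];
        }
        observed[n++] = seen;
    }
    return n;
}
```

With the call pattern 1, 1, 0, 1 and a window of one pass, the attacker recovers all four bits. With a window of two passes the observations collapse to 1, 1, and the 0 bit is lost. The model does not cover the opposite failure, where a window that is too short probes before the eviction has happened.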
An ideal scenario would look something like this:
The prime and probe are perfectly aligned with a single pass of the loop in victim's device code
Less ideal scenarios also exist, such as a window that is shorter or longer than a single pass of the loop.
There has been research into making Prime and Probe an effective technique. Papers such as Last-Level Cache Side-Channel Attacks are Practical by Liu et al. show cases where Prime and Probe can be used very effectively to extract secrets from the victim's address space. Unlike Flush and Reload, it does not rely on an instruction like CLFLUSH to evict a cache line programmatically, but rather exploits the fundamental hardware design of caches. Closing this channel completely would likely incur a performance penalty, one too big to bear.
Thank you for reading the second article of How it Works? I will be back with more content pertaining to microprocessor and computer architecture in the future.
References:
- CPU Security: Basics Part - II by Mr. Sanket Kadam, a CPU Verification Engineer at Nvidia
- Wikipedia: Side Channel Attacks article
- Flush and Reload Attack - NPTEL open course by Chester Rebeiro, IIT Madras
- Information Security - 5 - Secure Systems Engineering playlist by NPTEL-NOC IITM
- Cache Side Channel Attack: Exploitability and Countermeasures talk from Black Hat Asia 2017