Meltdown - Exploiting hardware features, not bugs

In the modern world, we face a lot of security issues with the electronic devices that we use in day-to-day life. These security holes are usually due to bugs in software, which hackers can exploit to gain access to sensitive information (passwords, credit card info, etc.). Normally, these exploits are carried out by running a rogue program in the background on a target device, and the vulnerabilities can be patched with software updates. However, some vulnerabilities don’t take advantage of programmer mistakes. Instead, they make use of “features”. Among such vulnerabilities is “Meltdown”.

A brief history of Meltdown

Meltdown was first discovered in June 2017 by researchers at Google. They were working as part of Google’s “Project Zero” security team when they came across two severe security vulnerabilities. One of them was “Meltdown”, and the other was “Spectre”. The magnitude of the issue was so high that some higher officials reportedly could not believe it at first. The flaw was kept secret from the public until something could be done about it, because these exploits affected almost all modern electronic devices, including televisions, tablets, phones, and PCs. It wasn’t until January 3rd, 2018 that the vulnerability was made public, and Meltdown was classified as CVE-2017-5754 (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-5754).

Meltdown and Spectre are two vulnerabilities that target microprocessors themselves, which is noteworthy because hardware vulnerabilities of this kind and scale had not been publicly known before. The two are similar in some respects but different in others. Meltdown, for instance, is most prominent in Intel processors and some ARM processors. Attackers using “Meltdown” take advantage of processor features known as “speculative execution” and “out-of-order execution”. Because it abuses these features to load data that it should not be able to read, Meltdown is also known as “Variant 3: Rogue Data Cache Load” (https://en.wikipedia.org/wiki/Meltdown_(security_vulnerability)).

Who discovered Meltdown specifically

Meltdown was independently discovered and reported by three teams:

  • Jann Horn (Google Project Zero),
  • Werner Haas, Thomas Prescher (Cyberus Technology),
  • Daniel Gruss, Moritz Lipp, Stefan Mangard, Michael Schwarz (Graz University of Technology)

So far, this might sound scary, but there are software patches for this vulnerability. The downside is that they reduce performance. These patches were made right before the exploit was made public. Still, the only true fix for Meltdown is a redesign of the hardware, which is quite expensive and, at the time of writing, is yet to be done (https://meltdownattack.com/#faq-fix).

The specifics and example

Before delving into how Meltdown works, it is important to review several aspects of computer hardware and take note of a few features used by modern CPUs, because Meltdown is possible only when all of these properties come together.

There are two types of memory in a computer that matter here:

  • Main memory, in the form of RAM, is internal storage that houses the data that software on the PC needs in order to run. This data includes kernel code, temporary program data, and, most importantly, sensitive data such as passwords. Of course, the PC needs to protect the more important bits of information so that attackers, or even the user for that matter, don’t have access to them. For this reason, memory is sectioned into 2 parts. Privileged, or private, memory addresses are where sensitive data is stored, gated off behind security measures present in the CPU. Non-privileged, or public, memory addresses make up the rest of the available memory, and it’s this memory that, say, C programs use to perform their tasks. If you’ve ever had a segmentation fault occur in a C program, one of the possible reasons is that the program tried to access a private memory address. As an attacker, therefore, you would want to find out what data is stored in the private memory addresses.
  • CPU cache memory is the other type of memory that matters here. It is located inside the CPU and is much faster than RAM. Essentially, this is working memory for the CPU: it is where the CPU stores data related to the instructions that it is currently executing. Every time the CPU accesses data from RAM, it copies this data into the CPU cache so that later accesses to that data are much faster. The cache is key for Meltdown because what ends up stored in it is what allows attackers to, more or less, read what is located in private memory.
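The speed difference between the two kinds of memory can be pictured with a toy model. The sketch below is not how a real cache works internally; the class name `ToyCache` and both latency figures are made up for illustration. It only shows the behaviour that matters later: the first read of an address is slow, and a repeated read is fast.

```python
# Toy model only: the latencies are assumed numbers, not measurements.
RAM_LATENCY_CYCLES = 200    # assumed cost of fetching from main memory
CACHE_LATENCY_CYCLES = 4    # assumed cost of a cache hit


class ToyCache:
    """Remembers which addresses have already been copied in from RAM."""

    def __init__(self):
        self.lines = set()

    def read(self, address):
        """Return the simulated latency of reading `address`."""
        if address in self.lines:
            return CACHE_LATENCY_CYCLES   # hit: data is already in the cache
        self.lines.add(address)           # miss: copy the data in from RAM
        return RAM_LATENCY_CYCLES

    def flush(self):
        """Forget everything, so the next read of any address is slow again."""
        self.lines.clear()


cache = ToyCache()
first = cache.read(0x1000)    # slow: comes from RAM
second = cache.read(0x1000)   # fast: comes from the cache
print(first, second)          # prints: 200 4
```

A program that can time its own reads can therefore tell whether an address was touched recently, which is the observation the rest of the attack relies on.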

Now that we understand the two types of memory present in a computer, we need to understand a feature of modern processors that ties this all together. This feature is called speculative execution. Speculative execution is a means of reducing the idle time of a CPU, essentially letting it complete more instructions per CPU cycle, thereby greatly speeding up execution. Consider the following lines of code:

if (foo() == true) {
    bar();
}

Normally, one would expect that before the function “bar” executes, the function “foo” must run and its return value must be checked to be true. However, speculative execution gives the CPU the ability to “guess” what the return value of “foo” will be and to start executing based on that guess. If the CPU speculates that there is a high likelihood that “foo” will return true, it will begin executing “bar” while “foo” is still running, before actually checking the return value. If the guess turns out to be wrong, the results of “bar” are thrown away. But, critically, during speculative execution the CPU doesn’t check whether the speculated instructions are actually allowed to execute until after they have already been processed. So what does this mean? Well, the CPU is doing an “execute first, ask questions later” sort of thing, and if you can get a few special lines of code executed this way, regardless of whether you’re allowed to or not, you can create some interesting side effects, hence Meltdown.
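The “execute first, ask questions later” behaviour can be sketched as a toy simulation. Real CPUs do this in hardware and in far more detail; the function `run_speculatively` and the dictionary standing in for registers are hypothetical names used only for this illustration. The point is that a wrong guess rolls back the visible results but leaves the cache as it was.

```python
def run_speculatively(condition, body, cache):
    """Toy CPU: run `body` before `condition` has been resolved."""
    registers = {}
    body(registers, cache)     # speculative work, done ahead of the check
    if not condition():        # the guess turned out to be wrong ...
        registers.clear()      # ... so the visible results are rolled back
    return registers           # note: `cache` is never rolled back


def bar(registers, cache):
    registers["result"] = 42   # a visible result of running bar
    cache.add("bar_data")      # a side effect left behind in the cache


cache = set()
registers = run_speculatively(lambda: False, bar, cache)
print(registers, cache)        # prints: {} {'bar_data'}
```

Even though the condition was false and the result of `bar` was discarded, the cache still shows that `bar` ran.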

This is one example of the Meltdown exploit (taken from: https://www.youtube.com/watch?v=bs0xswK0eZk&vl=en) that shows a simplified version of how the exploit may be carried out. We know that sensitive information is stored in private memory in RAM. We know the starting address of this memory (let’s say 1000). We know that we can copy memory from RAM and store it in the CPU cache. What if there was a way for us to store private memory in the CPU cache and then find out what’s in the CPU cache? That is the idea of Meltdown.

At the start of the program, the attacker loads an image file into memory. The contents of this image are not important. The next few lines of code look like this:

if (ReadMemoryAtPrivateAddress(123456) == 'P')
{
    ReadPixelFromImageAt(1, 1);
}

Now, if the byte stored at the given private memory address is actually ‘P’, here is what happens. The CPU speculatively executes the read and the comparison before the permission check on the private address has completed. Since the byte is ‘P’, the if statement evaluates to true, and the CPU also reads the pixel from the loaded image file. That pixel is now stored in the CPU cache. Even though the private memory read will throw an error once the permission check completes, the pixel remains in the CPU cache. At that point, the program catches the exception and continues executing some more code.

for (Row and Column in Image)
{
    TimerStart();
    ReadPixelFromImageAt(Row, Column);
    TimerEnd();
    ReadTimer();
}

The next thing the attacker’s program does is loop through all of the pixels in the image and read them one by one, while keeping track of the time it takes to read each pixel. The trick is that, assuming the image was not in the CPU cache to begin with, the pixel that was loaded into the cache by the speculative execution earlier will be read much, much faster, as it can be drawn from the CPU cache rather than having to be fetched from main memory. If, while looping through all of the pixels, one pixel is read much faster than the others, then we know that the speculative read of that pixel took place. That means the comparison against private memory succeeded, and the letter ‘P’ was in fact located at that private memory address. So, essentially, the attacker was able to deduce the contents of that private address. At this point, the attacker would just repeat the process with a different address and/or a different byte value (of which there are 256 possibilities), and keep doing these checks until they have built a map of the contents of private memory. From that map they could pick out key bits of information, such as passwords. That is Meltdown in a nutshell.
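The two snippets above can be put together in a small simulation. This is a sketch under loud assumptions: nothing here touches real private memory or measures real time. The dictionary `PRIVATE_MEMORY`, the 4x4 image, the latency numbers, and every function name are invented for illustration, and the “cache” is just a Python set.

```python
RAM_LATENCY, CACHE_LATENCY = 200, 4       # made-up cycle counts, not measurements
IMAGE_SIZE = 4                            # tiny 4x4 stand-in for the image
PRIVATE_MEMORY = {123456: ord("P")}       # the byte the attacker wants to learn


def speculative_if(address, guess, cache):
    """Toy version of the `if` block: its body runs before the fault arrives."""
    if PRIVATE_MEMORY[address] == guess:
        cache.add((1, 1))                 # ReadPixelFromImageAt(1, 1) fills the cache
    # A real program would now get an access fault; the cache is left as it is.


def timed_read(pixel, cache):
    """Return the simulated latency of reading one pixel."""
    if pixel in cache:
        return CACHE_LATENCY
    cache.add(pixel)                      # a miss pulls the pixel into the cache
    return RAM_LATENCY


def any_fast_pixel(cache):
    """The timing loop: read every pixel and report whether one was fast."""
    timings = [timed_read((row, col), cache)
               for row in range(IMAGE_SIZE) for col in range(IMAGE_SIZE)]
    return min(timings) == CACHE_LATENCY


def recover_byte(address):
    """Try all 256 byte values until the timing loop sees a fast pixel."""
    for guess in range(256):
        cache = set()                     # assume the image starts out uncached
        speculative_if(address, guess, cache)
        if any_fast_pixel(cache):
            return guess
    return None


print(chr(recover_byte(123456)))          # prints: P
```

Note that `recover_byte` never reads `PRIVATE_MEMORY` itself; it learns the byte only from which guess made a pixel read fast.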

Here is another example of Meltdown that uses out-of-order execution rather than speculative execution to perform the exploit. Out-of-order execution is similar to speculative execution in that multiple lines of code are executed at the same time. The difference is that here the processor doesn’t guess whether a condition is going to end up true or false. Rather, it executes multiple lines of code that don’t depend on each other at the same time, in order to increase the number of instructions processed per cycle. For instance, instead of…

X = 4 * 5;
Y = 3 + 10;

… executing one right after the other, both of these lines will be processed at the same time. That being said, consider this scenario (taken from https://hackernoon.com/a-simplified-explanation-of-the-meltdown-cpu-vulnerability-ad316cd0f0de).

  • Attacker allocates 256 pages (1 page = 4096 bytes) of memory and fills them with random data. The attacker allocates 256 pages because each page represents one of the 256 possible values that a byte of data can take. Essentially, this will help identify what bytes are stored in private memory. More on that later.
  • Attacker then launches a sequence of code instructions.

    1. Instruction that throws an exception.
    2. Instruction that reads a byte from a set private memory address (this byte will be referred to as the “secret” byte).
    3. Instruction that multiplies the secret byte by 4096 (i.e., the size of a page).
    4. Instruction that uses that product to index into one of the 256 pages, and loads that page into the CPU cache by reading one byte from it.
  • At this point, if the CPU performs out-of-order execution, it is possible that instructions 2 - 4 have completed before instruction 1 threw its exception. This means that a secret byte was read, and then a page from the allocated memory was stored in the cache. Here is where Meltdown comes together.

    1. Instruction that catches the exception to keep the CPU cache from flushing.
    2. Instruction that loops through all of the pages in the allocated memory and tries to read one byte from each page.
    3. Instruction that times how long it takes to read one byte from each page.
  • The trick here is that, during the first 4 instructions, one page of the 256 was loaded into the CPU cache, while the rest are still stored only in RAM. Remember that reads from CPU cache are much, much faster than reads from RAM. This means that, as a byte from each page is read, there will be one page whose byte is read much faster than that of any other page. What this tells us is that the byte value this page represents (remember, there are 256 pages, each representing 1 possible byte value) is the byte found in private memory, because to get to that particular page, we multiplied the “secret” byte by 4096. At this point, the attacker would:

    1. Set the private memory address to the next increment.
    2. Repeat the 7 instructions above to figure out the next byte in the private address space.
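The scenario above can be sketched as a simulation. As before, this is a toy model and not an exploit: `PRIVATE_MEMORY` is an ordinary Python bytes object with a made-up value, the latencies are invented, the cache is a set of offsets, and the out-of-order behaviour is simply assumed to have happened. It shows how multiplying the secret byte by the page size turns the byte's value into the index of the one fast page.

```python
PAGE_SIZE = 4096
RAM_LATENCY, CACHE_LATENCY = 200, 4       # made-up cycle counts, not measurements
PRIVATE_MEMORY = b"hunter2"               # hypothetical secret the attacker is after


def transient_instructions(address, cache):
    """Instructions 2 to 4, assumed to finish before the exception from 1."""
    secret = PRIVATE_MEMORY[address]      # 2. read the secret byte
    offset = secret * PAGE_SIZE           # 3. multiply it by the page size
    cache.add(offset)                     # 4. touching that page caches it


def fastest_page(cache):
    """Time one read per page and return the index of the fastest page."""
    timings = [CACHE_LATENCY if page * PAGE_SIZE in cache else RAM_LATENCY
               for page in range(256)]
    return timings.index(min(timings))


def dump(start, length):
    """Recover `length` bytes of private memory, one byte per round."""
    recovered = bytearray()
    for address in range(start, start + length):
        cache = set()                     # probe pages start out uncached
        transient_instructions(address, cache)
        recovered.append(fastest_page(cache))
    return bytes(recovered)


print(dump(0, len(PRIVATE_MEMORY)))       # prints: b'hunter2'
```

Only `transient_instructions` reads the secret; `dump` reconstructs it purely from which page index was fastest in each round.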

By repeating these steps, an attacker can essentially read out all of the data stored in the private memory addresses and eventually pick out passwords or other information of interest. According to the above source, this can be done at a rate of 500 kB/s.
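To put that rate in perspective, here is a rough back-of-the-envelope calculation. It assumes that kB means 1000 bytes and that the rate holds steady for the whole read, neither of which is stated above; the 1 GiB figure is just an example size.

```python
RATE_BYTES_PER_SECOND = 500 * 1000   # 500 kB/s, reading kB as 1000 bytes (assumption)


def seconds_to_read(num_bytes):
    """Time to leak `num_bytes` at the quoted rate, assuming it holds steady."""
    return num_bytes / RATE_BYTES_PER_SECOND


one_gib = 2 ** 30                    # example size: 1 GiB of private memory
print(round(seconds_to_read(one_gib) / 60, 1))   # minutes needed; prints: 35.8
```

Under those assumptions, a full gigabyte could be read out in a little over half an hour.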

What can be done?

Various operating system developers have released patches, at the cost of some performance. Since this is a hardware issue, CPU manufacturers will have to come up with new designs for future CPUs. There is nothing really that a normal user of a PC can do to further protect themselves from Meltdown, other than to not run untrusted software.