---
title: "Average Memory Access Time (AMAT)"
short_title: "Average Memory Access Time"
---
## Learning Outcomes

- Define hit rate, hit time, miss rate, and miss penalty.
- Use the average memory access time (AMAT) formula to compare multi-level cache designs.
## 🎥 Lecture Video
Because performance is the major reason for a memory hierarchy, it is important to measure the time to service hits or misses. We therefore define the following terminology in Table 1:
Table 1: Key cache terminology
| Request Outcome | Rate | Time |
|---|---|---|
| Cache Hit | Hit rate: fraction of accesses that hit in the cache. | Hit time: time (latency) to access the cache, including the time needed to determine whether the access is a hit or a miss. |
| Cache Miss | Miss rate: 1 − hit rate. | Miss penalty: time to replace a line with the corresponding line from a lower level of the memory hierarchy. |
Because the cache is smaller and built using faster memory parts, the hit time will be much smaller than the miss penalty, which includes the time to access the next level in the hierarchy.
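As a quick arithmetic check on the definitions in Table 1, hit rate and miss rate are complementary fractions of the total accesses. A minimal sketch (the access counts below are made up for illustration):

```python
# Hypothetical access counts, for illustration only.
total_accesses = 1_000
hits = 940                          # accesses serviced directly by the cache

hit_rate = hits / total_accesses    # fraction of accesses that hit
miss_rate = 1 - hit_rate            # fraction that must go to the next level

print(f"hit rate:  {hit_rate:.1%}")   # 94.0%
print(f"miss rate: {miss_rate:.1%}")  # 6.0%
```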
## Average Memory Access Time
The time to access data on both hits and misses affects performance. Designers often use average memory access time (AMAT) to compare cache designs. From P&H 5.4:

> Average memory access time is the average time to access memory considering both hits and misses and the frequency of different accesses.

For a single-level cache, this gives:

AMAT = Hit time + (Miss rate × Miss penalty)
We will use the following assumptions in this course:

- On a cache miss, the total time to retrieve data is the sum of the hit time plus the miss penalty.
- The miss rate of a lower-level cache (e.g., L2) is the fraction of misses from the higher-level cache (e.g., L1) that also miss in this lower-level cache.
Applying the formula level by level, a two-level design has:

AMAT = Hit time_L1 + Miss rate_L1 × (Hit time_L2 + Miss rate_L2 × Miss penalty_L2)

With this accounting, the L1 + L2 cache design is 4 times as fast as the L1-only cache design!
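To see how such a comparison works out, here is a small sketch that plugs numbers into the AMAT formula. All latencies and miss rates below are assumed for illustration (they are not taken from the lecture); with these particular values, the two-level design happens to come out 4× faster.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time for one cache level, in cycles."""
    return hit_time + miss_rate * miss_penalty

# Assumed parameters (illustrative only):
L1_HIT = 1            # cycles
L1_MISS_RATE = 0.10
MEM_PENALTY = 100     # cycles to reach main memory

L2_HIT = 10           # cycles
L2_MISS_RATE = 0.075  # fraction of L1 misses that also miss in L2

# L1-only design: every L1 miss pays the full trip to main memory.
amat_l1_only = amat(L1_HIT, L1_MISS_RATE, MEM_PENALTY)

# L1 + L2 design: an L1 miss first tries L2; only L2 misses go to memory,
# so the L1 miss penalty is itself an AMAT expression for L2.
amat_l1_l2 = amat(L1_HIT, L1_MISS_RATE, amat(L2_HIT, L2_MISS_RATE, MEM_PENALTY))

print(f"L1 only: {amat_l1_only:.2f} cycles")   # 11.00
print(f"L1 + L2: {amat_l1_l2:.2f} cycles")     # 2.75
print(f"speedup: {amat_l1_only / amat_l1_l2:.1f}x")
```

Note how the nested call mirrors the two-level formula: the L1 miss penalty is the AMAT of everything below L1.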
## Reducing Miss Rate
We mentioned that AMAT is used to compare cache designs. The term that usually dominates AMAT is the miss rate, since it multiplies the large miss penalty. Miss rate can be measured over multiple program benchmarks, each with different memory access patterns.
In this section, we have seen one way to optimize cache performance: introducing multilevel caches to reduce the miss penalty.

To optimize cache performance:

- Introduce multilevel caches to reduce the miss penalty, as we just saw.
- Build a larger cache. This is limited by cost and by physical technology; furthermore, bigger caches are slower. We would like higher-level caches (such as the L1 cache) to have a hit time of less than the cycle time.
- Place lines in the cache in a way that maximizes the temporal and spatial locality exhibited by the average program.
- Use a larger cache line size to reduce the miss rate, though this can increase the miss penalty.
- Use higher associativity to reduce the miss rate, though this can increase the hit time.

The last group of techniques is the core of cache design and placement policies. Up next!
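To see how one of these knobs trades off, here is a sketch that plugs assumed (purely illustrative) miss rates and miss penalties into the AMAT formula for three cache line sizes: larger lines capture more spatial locality (lower miss rate) but take longer to refill (higher miss penalty), so AMAT can go either way.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in cycles."""
    return hit_time + miss_rate * miss_penalty

HIT_TIME = 1  # cycles; assume hit time is unaffected by line size

# Assumed (miss rate, miss penalty) pairs for growing line sizes.
# Bigger lines exploit more spatial locality but cost more to fetch.
configs = {
    "32B lines":  (0.060, 40),
    "64B lines":  (0.040, 50),
    "128B lines": (0.035, 70),
}

results = {name: amat(HIT_TIME, mr, mp) for name, (mr, mp) in configs.items()}
for name, value in results.items():
    print(f"{name}: AMAT = {value:.2f} cycles")
```

With these made-up numbers, the 64B configuration wins: the 128B lines reduce the miss rate a little more, but the larger refill penalty outweighs the gain.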