CS 466 - Notes
Notes for 11/19/2001
Unified Cache vs Split Cache
- Unified Cache - Instructions and Data cached together
- Split Cache - Two Caches, one for Instructions, one for Data
- instructions normally have a lower miss rate than data
(instructions are accessed in a more consistent manner than data)
- removes misses due to conflicts between instruction blocks and data blocks
- removes structural hazard in pipelines for data load/store and instruction fetches
Figure 5.7: Cache performance by size (miss rates)
Caches are direct mapped with 32-byte blocks, running SPEC92; 75% of accesses
are instruction reads
Size (KB) | Instruction Cache | Data Cache | Weighted Ave | Unified Cache
----------|-------------------|------------|--------------|--------------
        1 |             3.06% |     24.61% |        8.45% |        13.34%
        2 |             2.26% |     20.57% |        6.84% |         9.78%
        4 |             1.78% |     15.94% |        5.32% |         7.24%
        8 |             1.10% |     10.19% |        3.37% |         4.57%
       16 |             0.64% |      6.47% |        2.10% |         2.89%
       32 |             0.39% |      4.82% |        1.50% |         1.99%
       64 |             0.15% |      3.77% |        1.06% |         1.35%
      128 |             0.02% |      2.88% |        0.74% |         0.95%
Weighted Ave = (75% * Instr Cache Rate) + (25% * Data Cache Rate)
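The weighted-average formula can be checked against a row of the table; a minimal sketch, using the 16 KB row (0.64% instruction, 6.47% data) and the 75%/25% access mix:

```python
# Weighted average miss rate for a split cache: combine the instruction
# and data miss rates, weighted by the fraction of each access type.

def weighted_miss_rate(instr_rate, data_rate, instr_frac=0.75):
    """75% of accesses are instruction reads, 25% are data accesses."""
    return instr_frac * instr_rate + (1 - instr_frac) * data_rate

rate = weighted_miss_rate(0.0064, 0.0647)  # 16 KB row of the table
print(f"{rate:.2%}")  # 2.10%
```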
Example p.384
Determine which has the lower average memory access time: a 32KB unified
cache, or a 16KB instruction cache plus a 16KB data cache.
- hit time = 1 clock cycle
- miss penalty = 50 clock cycles
- miss rate - use above table
- effective hit time for a unified cache on a pipelined machine reflects
  the structural hazard (assuming a stall of 1 clock cycle)
- instruction hit time = 1 clock cycle
- data hit time = 2 clock cycles (includes stall)
- Ave Memory Access Time = Hit Time + Miss Rate * Miss Penalty
- Ave Memory Access Time (Split Cache) =
%Instr * (Hit Time + Instr Miss Rate * Miss Penalty) +
%Data * (Hit Time + Data Miss Rate * Miss Penalty)
- Ave Memory Access Time (Split Cache)
= 75% * (1 + 0.64% * 50) + 25% * (1 + 6.47% * 50)
= 2.05 clock cycles
- Ave Memory Access Time (Unified Cache with Structural Hazard) =
%Instr * (Instr Hit Time + Miss Rate * Miss Penalty) +
%Data * (Data Hit Time + Miss Rate * Miss Penalty)
- Ave Memory Access Time (Unified Cache with Structural Hazard)
= 75% * (1 + 1.99% * 50) + 25% * (2 + 1.99% * 50)
= 2.25 clock cycles, so the split cache has the lower average memory
access time despite its higher weighted miss rate
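The two calculations above can be reproduced directly from the formula; a small sketch using the miss rates and penalties given in the example:

```python
# Average memory access time (AMAT) for the example: 16KB+16KB split
# cache vs. a 32KB unified cache with a 1-cycle structural-hazard stall
# on data accesses.

MISS_PENALTY = 50  # clock cycles
INSTR_FRAC, DATA_FRAC = 0.75, 0.25

def amat(hit_time, miss_rate, miss_penalty=MISS_PENALTY):
    return hit_time + miss_rate * miss_penalty

# Split: each side hits in 1 cycle; 16KB miss rates from the table.
split = INSTR_FRAC * amat(1, 0.0064) + DATA_FRAC * amat(1, 0.0647)

# Unified: data hits take 2 cycles (stall); 32KB unified miss rate.
unified = INSTR_FRAC * amat(1, 0.0199) + DATA_FRAC * amat(2, 0.0199)

print(f"split = {split:.3f}, unified = {unified:.3f}")
```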
Improving Cache Performance
- Determining Proper Block Size
- Larger Block Size reduces Compulsory Misses - due to Spatial Locality
- Larger Block Size may increase conflict and capacity misses
- Larger Block Size increases Miss Penalty (longer transfer time)
Miss Penalty = access time + transfer time
- Example pp 394, 395
Best miss rates and average access times per cache size
(only the best entry recorded for each cache size)

Block Size (bytes) | Miss Penalty (cc) |       1K        |   4K    |   16K   |   64K   |  256K
-------------------|-------------------|-----------------|---------|---------|---------|--------
                16 |                42 |                 |         |         |         |
                32 |                44 | 13.34%  6.870cc | 4.186cc | 2.263cc |         |
                64 |                48 |                 |  7.00%  |  2.64%  | 1.509cc | 1.245cc
               128 |                56 |                 |         |         |  1.02%  |  0.49%
               256 |                72 |                 |         |         |         |  0.49%
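One table entry can be checked against the access-time formula; a quick sketch for the 1K cache with 32-byte blocks (miss rate 13.34%, miss penalty 44 cycles), which should give the 6.870-cycle figure:

```python
# AMAT from a block-size table entry: larger blocks raise the miss
# penalty (longer transfer time), so the formula trades miss rate
# against penalty.

def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

t = amat(1, 0.1334, 44)  # 1K cache, 32-byte blocks
print(f"{t:.3f} clock cycles")  # 6.870 clock cycles
```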
- Higher Associativity
- General Rule #1 - 8-way is generally as good as Full Assoc. for removing misses
- General Rule #2 - Direct Mapped of size N has approx. the same miss rate as a 2-way cache of size N/2
- A longer clock cycle is normally needed for higher associativity
- 2-way clock cycle time = 1.10 * 1-way clock cycle time
- 4-way clock cycle time = 1.12 * 1-way clock cycle time
- 8-way clock cycle time = 1.14 * 1-way clock cycle time
- Ave Access Time = Hit Time + Miss Rate * Miss Penalty
  Hit Time increases as associativity increases
  Miss Rate decreases as associativity increases
  Miss Penalty remains the same as associativity increases
- For Cache Sizes below 32KB, 8-way is best
- For Cache Sizes greater than or equal to 32KB, 4-way is best
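The hit-time vs. miss-rate trade-off above can be sketched numerically. The clock-cycle multipliers are the rule-of-thumb values from the notes; the miss rates below are hypothetical placeholders for one cache size, not figures from the text:

```python
# Associativity trade-off: hit time grows with the clock-cycle
# multipliers, while miss rate falls with more ways.

CLOCK_FACTOR = {1: 1.00, 2: 1.10, 4: 1.12, 8: 1.14}
MISS_PENALTY = 50  # clock cycles, as in the earlier example

def amat(ways, miss_rate, base_hit_time=1.0):
    hit_time = base_hit_time * CLOCK_FACTOR[ways]
    return hit_time + miss_rate * MISS_PENALTY

# Hypothetical miss rates for one cache size at each associativity.
for ways, rate in [(1, 0.040), (2, 0.033), (4, 0.030), (8, 0.029)]:
    print(f"{ways}-way: {amat(ways, rate):.3f} cycles")
```

Whether higher associativity wins depends on whether the miss-rate drop outweighs the slower clock, which is why the best choice varies with cache size.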
- Victim Caches
- A small fully associative cache between the cache (normally
direct mapped) and memory
- deals with high conflict rates from direct mapped caches
- Access both regular cache and victim cache concurrently
- reduces conflict misses by 20% to 95%
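The victim-cache behavior described above can be sketched as a toy model. This is an illustrative sketch, not the actual hardware design: it tracks tags only, assumes FIFO replacement in the victim cache, and swaps a block back into the main cache on a victim hit:

```python
# Toy victim cache: a direct-mapped main cache backed by a small fully
# associative buffer of recently evicted blocks. Both are probed on an
# access; a victim-cache hit swaps the block back into the main cache.

from collections import deque

class VictimCache:
    def __init__(self, num_sets, victim_entries=4, block_bytes=32):
        self.block_bytes = block_bytes
        self.num_sets = num_sets
        self.lines = [None] * num_sets               # direct-mapped block tags
        self.victims = deque(maxlen=victim_entries)  # FIFO of evicted blocks

    def access(self, addr):
        block = addr // self.block_bytes
        idx = block % self.num_sets
        if self.lines[idx] == block:
            return "hit"
        if block in self.victims:        # probed concurrently in hardware
            self.victims.remove(block)
            evicted, self.lines[idx] = self.lines[idx], block
            if evicted is not None:
                self.victims.append(evicted)
            return "victim hit"
        if self.lines[idx] is not None:  # conflict: save the evicted block
            self.victims.append(self.lines[idx])
        self.lines[idx] = block
        return "miss"

# Two addresses that conflict in a direct-mapped cache ping-pong between
# the main cache and the victim cache instead of missing every time.
c = VictimCache(num_sets=4)
results = [c.access(a) for a in (0, 128, 0, 128)]
print(results)  # ['miss', 'miss', 'victim hit', 'victim hit']
```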
- General Policy on Two level Caches
- Ave Access Time = Hit Time L1 + Miss Rate L1 * Miss Penalty L1
- Miss Penalty L1 = Hit Time L2 + Miss Rate L2 * Miss Penalty L2
- L1 - Level 1
- L2 - Level 2
- Extend formula to as many levels as needed.
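The two-level formula extends naturally by recursion: each level's miss penalty is the access time of the level below it. A minimal sketch, with hypothetical hit times and local miss rates chosen only for illustration:

```python
# Multi-level AMAT: Ave Access Time = Hit Time L1 + Miss Rate L1 *
# (Hit Time L2 + Miss Rate L2 * ...), down to main memory.

def amat(levels, memory_time):
    """levels: list of (hit_time, local_miss_rate) pairs, L1 first."""
    if not levels:
        return memory_time  # a miss at the last level goes to memory
    hit_time, miss_rate = levels[0]
    return hit_time + miss_rate * amat(levels[1:], memory_time)

# L1: 1 cycle, 5% misses; L2: 10 cycles, 20% local misses; memory: 100.
t = amat([(1, 0.05), (10, 0.20)], memory_time=100)
print(t)  # 2.5
```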