

# Improving DRAM Performance by Parallelizing Refreshes with Accesses

Kevin Chang<sup>†</sup>, Donghyuk Lee<sup>†</sup>, Zeshan Chishti<sup>§</sup>, Alaa Alameldeen<sup>§</sup>, Chris Wilkerson<sup>§</sup>, Yoongu Kim<sup>†</sup>, Onur Mutlu<sup>†</sup>

<sup>†</sup>Carnegie Mellon University, <sup>§</sup>Intel Labs

## Background and Problems

- DRAM cells require **periodic refresh** to prevent data loss from leakage

- **Problems:**

- **1. System performance degradation**

**All-bank refresh (REF<sub>ab</sub>)**: memory controllers refresh **every bank** within a rank, blocking the rank from servicing memory requests



- **2. DRAM scaling**

As DRAM density increases (more cells), refresh latency is expected to increase



- **Per-bank refresh (REF<sub>pb</sub>)**: refresh one bank at a time, following a **strict sequential round-robin order**

**Advantage:** enables DRAM to serve requests in non-refreshing banks while another bank is refreshing
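
The difference between the two refresh schemes can be illustrated with a minimal sketch. It assumes 8 banks per rank and models only which banks are blocked during a refresh, not timing. The function names (`refab_blocked_banks`, `refpb_order`, `available_banks`) are hypothetical and chosen for illustration; they do not come from the authors' simulator.

```python
from itertools import cycle, islice

NUM_BANKS = 8  # assumption: 8 banks per rank


def refab_blocked_banks(num_banks):
    """All-bank refresh (REFab): every bank in the rank is refreshing."""
    return set(range(num_banks))


def refpb_order(num_banks):
    """Per-bank refresh (REFpb): banks are refreshed one at a time
    in strict sequential round-robin order (0, 1, ..., n-1, 0, ...)."""
    return cycle(range(num_banks))


def available_banks(num_banks, blocked):
    """Banks that can still serve memory requests during a refresh."""
    return [b for b in range(num_banks) if b not in blocked]


# First ten per-bank refreshes: the order wraps around after bank 7.
refresh_sequence = list(islice(refpb_order(NUM_BANKS), 10))

for bank in refresh_sequence[:3]:
    free = available_banks(NUM_BANKS, {bank})
    print(f"REFpb refreshing bank {bank}: banks {free} can serve requests")

free_ab = available_banks(NUM_BANKS, refab_blocked_banks(NUM_BANKS))
print(f"REFab refreshing all banks: banks {free_ab} can serve requests")
```

Under this simplified model, a per-bank refresh leaves seven of the eight banks available, while an all-bank refresh leaves none.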



- **Our goal:** improve system performance over existing refresh schemes by mitigating refresh penalty
- **Our key idea:** hide refresh latency by parallelizing refresh operations with memory accesses to avoid delaying memory requests

## Results

### Methodology

- 8 OoO cores, 4GHz, 3-wide issue
- 64KB L1, 512KB private L2 cache slice/core
- Memory controller: 64-entry request queue, FR-FCFS scheduling
- DRAM: DDR3-1333, 2 channels, 2 ranks/channel, 8 banks/rank, 8 subarrays/bank
- Simulation: cycle-level x86 multi-core simulator
- Workloads: TPC, STREAM, SPEC CPU2006
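
As a rough aid to reading the configuration above, the following sketch records the listed parameters and derives the aggregate counts they imply. The class and field names (`SystemConfig`, `DramConfig`) are hypothetical and are not taken from the simulator used in this work; only the numeric values come from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramConfig:
    # Values from the methodology list (DDR3-1333 organization).
    channels: int = 2
    ranks_per_channel: int = 2
    banks_per_rank: int = 8
    subarrays_per_bank: int = 8

    @property
    def total_ranks(self) -> int:
        return self.channels * self.ranks_per_channel

    @property
    def total_banks(self) -> int:
        return self.total_ranks * self.banks_per_rank

    @property
    def total_subarrays(self) -> int:
        return self.total_banks * self.subarrays_per_bank


@dataclass(frozen=True)
class SystemConfig:
    # Values from the methodology list (processor and cache parameters).
    cores: int = 8
    l1_kb: int = 64
    l2_kb_per_core: int = 512
    request_queue_entries: int = 64

    @property
    def total_l2_kb(self) -> int:
        return self.cores * self.l2_kb_per_core


dram = DramConfig()
system = SystemConfig()
print(f"ranks={dram.total_ranks}, banks={dram.total_banks}, "
      f"subarrays={dram.total_subarrays}, total L2={system.total_l2_kb} KB")
```

The simulated memory system therefore has 4 ranks, 32 banks, and 256 subarrays in total, shared by 8 cores with 4 MB of aggregate L2 capacity.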

