# RTL Review Findings

Systematic review of each Leaf processor module.
Started: 2026-05-29

## Methodology

Each VHDL module is analyzed individually. Issues are categorized as:

| Severity | Description |
|----------|-------------|
| **BUG** | Functionally incorrect — affects instruction execution or ISA compliance |
| **WARN** | Potentially problematic — may cause failures in certain scenarios |
| **INFO** | Style, clarity, or non-critical improvement |

---

## Open Issues

### WARN: Reset distribution asymmetry

`rtl/leaf.vhdl:79-105`, `rtl/wb_ctrl.vhdl:122`

The `reset` signal feeding the core is generated by the `wb_ctrl` FSM (`reset <= '1' when curr_state = START else '0'`). `clk_ctrl` and `counters` receive `rst_i` directly.

Sequence: `rst_i` deasserts → `wb_ctrl` moves from START to IDLE → `reset` goes low → core exits reset.

This introduces a **1-cycle delay** between `rst_i` and `reset` for the core, while `counters` is released immediately. `clk_ctrl` is also released immediately (`or rst_i` gate on clock gate output).

**Impact**: Low — the core starts executing 1 cycle after the counters.

**Suggestion**: Either document this behavior explicitly, or unify reset distribution (use `rst_i` in all modules with a dedicated reset synchronizer separate from the Wishbone FSM).

---

### WARN: COP interface lacks handshake signals

`rtl/leaf.vhdl:27-30`, `rtl/core.vhdl:32-35`

The coprocessor interface has `cop_adr_o` (6 bits), `cop_dat_o`, `cop_we_o`, `cop_dat_i`. No `cop_ack_i`, `cop_err_i`, or `cop_ready_i`.

**Impact**: A multi-cycle coprocessor cannot stall the pipeline. The interface is effectively single-cycle.

**Suggestion**: Add `cop_stall_i` (or `cop_ready_i`) for the core to freeze the pipeline while waiting. Alternatively, document that the interface is single-cycle only.

---

### INFO: Gated clock via transparent latch

`rtl/clk_ctrl.vhdl:28-33`

Clock gating uses a transparent latch (enable sample on falling edge) + AND gate. Classic technique, but:

- FPGAs generally don't synthesize latches well (LUT + routing, not dedicated latch)
- ASIC flows prefer dedicated clock gating cells (ICG)
- Yosys needs configuration to handle generated clocks

**Suggestion**: Use clock enables on registers (`if clk_en = '1'` in each clocked process) instead. Avoids the latch and doesn't create an additional clock domain.

---

### INFO: Bus error reporting uses current enable signals

`rtl/wb_ctrl.vhdl:124-126`

```vhdl
imrd_err <= imrd_en when curr_state = ERROR else '0';
dmrd_err <= dmrd_en when curr_state = ERROR else '0';
dmwr_err <= dmwr_en when curr_state = ERROR else '0';
```

The transaction type that caused the error is not latched — uses the *current* value of `imrd_en`/`dmrd_en`/`dmwr_en` in the cycle when `curr_state = ERROR`. If the enable changes between the error and the ERROR state, the error may not be correctly attributed.

**Impact**: Low in practice — enables rarely change during a bus error — but fragile.

**Suggestion**: Latch the transaction type (`error_source`) when `err_i` is received.

---

### INFO: No bus timeout

`rtl/wb_ctrl.vhdl:61-111`

The Wishbone FSM waits indefinitely for `ack_i` or `err_i` in `READ_INSTR`, `READ_DATA`, and `WRITE_DATA` states. No timeout counter.

**Impact**: A Wishbone slave that never responds locks up the processor forever.

**Suggestion**: Add an external timeout watchdog or document the limitation.

---

### INFO: `sel_o` active when bus is idle

`rtl/wb_ctrl.vhdl:117`

```vhdl
sel_o <= dmwr_be when curr_state = WRITE_DATA else (others => '1');
```

`sel_o` is `1111` even when `cyc_o` is low. Wishbone slaves ignore `sel_o` when `cyc_o` is low, so not a functional error.

---

### INFO: Flush condition includes `not pcwr_en_i`

`rtl/if_stage.vhdl:71`

`flush_o <= taken_i or imrd_err_i or not pcwr_en_i` — `flush_o` is asserted when `pcwr_en_i` is low (pipeline stall). The fetched instruction in the pipeline register receives `flush_o = 1` and is discarded by ID/EX. On resume, the instruction must be re-fetched — wasting 1 cycle.

**Suggestion**: Don't update the pipeline register during stalls (gate `out_pipe_proc` with `pcwr_en_i`).

---

### INFO: Speculative fetch wasted on taken branches

When `taken_i = '1'`:
1. `pc_reg` receives `target_i(XLEN-1 downto 2)`
2. `imrd_addr_o` (`pc_reg & "00"`) changes to the target address
3. The previous Wishbone transaction (sequential) is already in flight — completes with discarded data
4. `wb_ctrl` returns to IDLE, sees `imrd_en_o = 1` with new address, starts correct fetch

Functionally correct, but wastes 1 bus transaction per taken branch. Inherent to the 2-stage pipeline without branch prediction.

---

### INFO: Don't-care values in datapath during flush/invalid opcode

`rtl/main_ctrl.vhdl:50-63, 77`

The immediate generator injects don't-care values (`'-'`) during flush and unknown opcodes. These propagate to ALU and shifter inputs, reaching `numeric_std` conversions — producing simulation warnings (`metavalue detected`). During `flush = '1'`, `exec_ctrl_o` and `regwr_en_o` are zeroed, so metavalues on `imm_o` are never functionally captured. Yosys ignores don't-cares.

**Not a functional bug** — only visual pollution in simulation.

---

## WONTFIX

### Invalid CSR accesses don't generate traps

`rtl/main_ctrl.vhdl:148`, `rtl/csrs.vhdl:116`

- `main_ctrl` treats all `SYSTEM_OPCODE` as valid
- The CSR read path returns zero for unknown addresses

**Intentional** for this simple core — invalid CSR reads 0, writes ignored. Acceptable for Leaf's scope.

---

## Validation Notes

- `make run` fails because the default target depends on `verif/tests/dump/out.bin`, which doesn't exist in the repository.
- `verif/tests/addi` was fixed — no longer diverges, matches Spike signature.
- `make -C verif/tests run TEST=...` doesn't match the current Makefile; the correct invocation is `make -C verif/tests/<test> run`.

---

## Planned Improvements

### `mcountinhibit` (CSR `0x320`) — counter inhibit

Add the `mcountinhibit` WARL register to allow software to pause `mcycle` and `minstret`.

**Implementation**:
1. `csrs.vhdl`: add `mcountinhibit_reg` (bits 0 and 2 writable, others hardwired to 0), write process, read in `read_csr`, `mcountinhibit_o` port
2. `id_stage.vhdl`: add `mcountinhibit_o` port — wire-through from csrs
3. `core.vhdl`: add `mcountinhibit_o` port — wire-through from id_stage
4. `counters.vhdl`: add `inhibit_i` port — `inhibit_i(0)` freezes `cycle_reg`, `inhibit_i(2)` freezes `instret_reg`
5. `leaf.vhdl`: connect `core.mcountinhibit_o` → `counters.inhibit_i`
6. `leaf_pkg.vhdl`: add `CSR_ADDR_MCOUNTINHIBIT` constant, update component declarations

---

### `mtimecmp` — internal timer interrupt

The core does not generate `tm_irq` internally. The `time` counter (CSR `0xC01`/`0xC81`) exists and is readable by software, but without an `mtimecmp` register the timer interrupt depends on external hardware.

**Future implementation**:
1. Add `mtimecmp` to CSR space (address `0x321`, alongside `mtime` at `0xC01`)
2. Hardware comparator: `tm_irq <= '1' when timer >= mtimecmp`
3. Option 1: implement in `csrs.vhdl` with internal register and comparator
4. Option 2: external module connected via Wishbone, with `tm_irq` as output

---

## Modules Reviewed

All 17 RTL modules reviewed:

`if_stage`, `main_ctrl`, `reg_file`, `csrs_logic`, `csrs`, `id_stage`, `alu_ctrl`, `alu`, `br_detector`, `ex_block`, `dmls_block`, `core`, `wb_ctrl`, `leaf`, `leaf_pkg`, `clk_ctrl`, `counters`

Bug fixes applied to: `csrs.vhdl` (mret path, write_mcause condition, mtval priority), `if_stage.vhdl` (30-bit pc_reg), `counters.vhdl` (instret/timer counters), `id_stage.vhdl` (dmld_fault wiring).

All ports standardized to `_i`/`_o` convention and `XLEN` generic across every module.
