anuraagw.me

Registers: What to Emulate

Analysis of which CSRs and tile control/debug registers the emulator actually needs, based on what firmware accesses in practice.

Must Emulate (firmware breaks without these)

CSR: cfg0 (0x7C0)

Every firmware binary (BRISC, NCRISC, all TRISCs) writes this at startup via csrrs/csrrc. Always write-only in current firmware (rd=zero), so read-back value doesn’t matter today, but store it correctly anyway.

Firmware sequence (configure_csr()):

csrrs zero, 0x7c0, 2       # set bit 1 (DisBp)
csrrs zero, 0x7c0, 1<<18   # set bit 18 (DisTriscCache)
csrrc zero, 0x7c0, 2       # clear bit 1
csrrs zero, 0x7c0, 8       # set bit 3 (DisLowCash)

Bit fields:

| Bit | Name | Default | Effect |
|-----|------|---------|--------|
| 0 | DisLdBufByp | 0 | Load waits for store queue empty |
| 1 | DisBp | 0 | Disable branch predictor (no effect in emulator) |
| 3 | DisLowCash | 0 | Disable L0 data cache |
| 18 | DisTriscCache | 0 | Disable .ttinsn fusion (no effect in emulator) |
| 24 | DisLowCachePeriodicFlush | 0 | Disable random L0 flush |
| 30 | EnBFloat | 0 | BF16 mode for Zfh instructions |
| 31 | EnBFloatRTNE | 0 | BF16 rounding mode (0=RTZ, 1=RTNE) |

For the emulator, only bits 30-31 have observable effects (they change FPU behavior). The rest control caches and branch prediction that don’t exist in the emulator.
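A minimal sketch of how an emulator might model cfg0, assuming a plain 32-bit register with set/clear semantics. The class and property names (`Cfg0`, `bf16_enabled`, `bf16_rounding`) are hypothetical and chosen only for illustration; the bit positions come from the table above.

```python
# Sketch only: models cfg0 as a 32-bit value with CSRRS/CSRRC-style updates.
MASK32 = 0xFFFFFFFF
EN_BFLOAT = 1 << 30        # bit 30: BF16 mode for Zfh instructions
EN_BFLOAT_RTNE = 1 << 31   # bit 31: 0 = RTZ, 1 = RTNE


class Cfg0:
    def __init__(self):
        self.value = 0  # every documented field defaults to 0

    def csrrs(self, mask):
        """Read-and-set: return the old value, then OR in the mask."""
        old = self.value
        self.value = (self.value | mask) & MASK32
        return old

    def csrrc(self, mask):
        """Read-and-clear: return the old value, then clear the masked bits."""
        old = self.value
        self.value = self.value & ~mask & MASK32
        return old

    @property
    def bf16_enabled(self):
        return bool(self.value & EN_BFLOAT)

    @property
    def bf16_rounding(self):
        return "RTNE" if self.value & EN_BFLOAT_RTNE else "RTZ"


# Replay the configure_csr() sequence listed earlier.
cfg0 = Cfg0()
cfg0.csrrs(2)        # set bit 1
cfg0.csrrs(1 << 18)  # set bit 18
cfg0.csrrc(2)        # clear bit 1
cfg0.csrrs(8)        # set bit 3
```

After this sequence only bits 3 and 18 remain set, so the FPU-visible fields (bits 30-31) are still at their defaults.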

SOFT_RESET_0 (0xFFB121B0)

Core launch sequencer. This must actually control which cores execute.

Boot sequence:

  1. Host writes 0x47800 (all cores in reset)
  2. Host writes 0x47000 (release BRISC only)
  3. BRISC firmware writes 0x00000 (release all cores)

Bit assignments:

| Bit | Target |
|-----|--------|
| 0, 1, 7 | Unpackers |
| 2-5 | Packers 0-3 |
| 6 | Mover |
| 8 | TDMA-RISC |
| 9 | Scalar Unit + THCON |
| 10 | FPU + SFPU + SrcA |
| 11 | RISCV B (BRISC) |
| 12 | RISCV T0 (TRISC0) |
| 13 | RISCV T1 (TRISC1) |
| 14 | RISCV T2 (TRISC2) |
| 15-17 | SrcA/SrcB ownership, Packer-Dst |
| 18 | RISCV NC (NCRISC) |
| 19-22 | SrcA data columns |
| 23 | Auto TTSync |

Key values:

  • SOFT_RESET_ALL = 0x47800 — all 5 RISC-V cores held in reset
  • SOFT_RESET_BRISC_ONLY_RUN = 0x47000 — TRISCs + NCRISC in reset, BRISC released
  • SOFT_RESET_NONE = 0x00000 — all cores running

For the emulator, bits 11-14 and 18 (the five RISC-V cores) are the ones that matter. Bits 0-10, 15-17, and 19-23 control coprocessor blocks and can be tracked but don’t need to gate execution.
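As a rough illustration, the set of running cores can be derived from the register value by testing the five RISC-V reset bits. The helper name `running_cores` and the core labels are hypothetical; the bit positions are taken from the table above.

```python
# Sketch only: a set bit holds the core in reset, a clear bit lets it run.
CORE_RESET_BITS = {
    "BRISC": 11,
    "TRISC0": 12,
    "TRISC1": 13,
    "TRISC2": 14,
    "NCRISC": 18,
}


def running_cores(soft_reset_0):
    """Return the names of cores whose reset bit is clear."""
    return {
        name
        for name, bit in CORE_RESET_BITS.items()
        if not (soft_reset_0 >> bit) & 1
    }


all_in_reset = running_cores(0x47800)  # SOFT_RESET_ALL
brisc_only = running_cores(0x47000)    # SOFT_RESET_BRISC_ONLY_RUN
all_running = running_cores(0x00000)   # SOFT_RESET_NONE
```

The three key values decode to no cores, BRISC alone, and all five cores respectively, which matches the boot sequence.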

RESET_PC Registers

Written by host during firmware upload to set each core’s boot address.

| Address | Register | Who writes |
|---------|----------|------------|
| 0xFFB12228 | TRISC0_RESET_PC | Host |
| 0xFFB1222C | TRISC1_RESET_PC | Host |
| 0xFFB12230 | TRISC2_RESET_PC | Host |
| 0xFFB12234 | TRISC_RESET_PC_OVERRIDE | BRISC (writes 0b111) |
| 0xFFB12238 | NCRISC_RESET_PC | Host |
| 0xFFB1223C | NCRISC_RESET_PC_OVERRIDE | BRISC (writes 0x1) |

The OVERRIDE registers are 1-bit (NCRISC) or 3-bit (TRISCs) enables. When set, the core uses the programmed RESET_PC instead of the default reset vector.

Implementation: when a core is released from reset (SOFT_RESET_0 bit cleared) and its override bit is set, start execution at the corresponding RESET_PC value.
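One way to express this rule is sketched below. Two things are assumptions rather than facts from the source: that bit i of TRISC_RESET_PC_OVERRIDE corresponds to TRISCi, and the value of `DEFAULT_RESET_VECTOR`, which is a placeholder because the default reset vector is not stated here. The class name and the example addresses are hypothetical.

```python
# Sketch only: choose a core's boot address when it leaves reset.
DEFAULT_RESET_VECTOR = 0x0  # placeholder; the real default vector is not given here


class ResetPcRegs:
    def __init__(self):
        self.trisc_reset_pc = [0, 0, 0]  # TRISC0..2_RESET_PC
        self.trisc_override = 0          # 3-bit enable (assumed: bit i -> TRISCi)
        self.ncrisc_reset_pc = 0         # NCRISC_RESET_PC
        self.ncrisc_override = 0         # 1-bit enable

    def boot_pc(self, core):
        """Return the address a core should start at when released from reset."""
        if core in ("TRISC0", "TRISC1", "TRISC2"):
            idx = int(core[-1])
            if (self.trisc_override >> idx) & 1:
                return self.trisc_reset_pc[idx]
        elif core == "NCRISC" and self.ncrisc_override & 1:
            return self.ncrisc_reset_pc
        return DEFAULT_RESET_VECTOR


# Example with made-up addresses: host programs the PCs, BRISC enables overrides.
regs = ResetPcRegs()
regs.trisc_reset_pc = [0x6000, 0x7000, 0x8000]  # illustrative values only
regs.ncrisc_reset_pc = 0x5000                   # illustrative value only
before_override = regs.boot_pc("TRISC1")
regs.trisc_override = 0b111
regs.ncrisc_override = 0x1
after_override = regs.boot_pc("TRISC1")
```

The lookup would be called at the moment a SOFT_RESET_0 write clears a core's reset bit, so the override state at release time decides the start address.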

WALL_CLOCK (0xFFB121F0 / 0xFFB121F8)

TRISC firmware spins in riscv_wait(600) reading these at startup. If they always return 0, the TRISCs hang forever, so the value must increase monotonically.

| Address | Register | Behavior |
|---------|----------|----------|
| 0xFFB121F0 | WALL_CLOCK_0 | Low 32 bits of 64-bit counter. Reading this latches WALL_CLOCK_1_AT. |
| 0xFFB121F4 | WALL_CLOCK_1 | High 32 bits (live, may change between reads) |
| 0xFFB121F8 | WALL_CLOCK_1_AT | High 32 bits latched at time of WALL_CLOCK_0 read |

There is also an alias at 0xFFB11024 (WALL_CLOCK_L in the TDMA region) which BRISC writes with value 63 during device_setup. The write likely initializes or configures the clock.

Implementation: track a global cycle counter. On read of WALL_CLOCK_0, return low 32 bits and snapshot high 32 bits into WALL_CLOCK_1_AT. WALL_CLOCK_1 returns live high bits. The counter should increment with instruction execution (doesn’t need to be cycle-accurate, just monotonically increasing).
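The latching behavior described above could look roughly like this. The `WallClock` class and its `tick` method are hypothetical names; how often `tick` is called (per instruction, per batch) is left to the emulator.

```python
# Sketch only: 64-bit counter with a latched high word.
WALL_CLOCK_0 = 0xFFB121F0
WALL_CLOCK_1 = 0xFFB121F4
WALL_CLOCK_1_AT = 0xFFB121F8
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


class WallClock:
    def __init__(self):
        self.counter = 0       # global 64-bit counter
        self.latched_high = 0  # snapshot taken on WALL_CLOCK_0 reads

    def tick(self, n=1):
        """Advance the counter, e.g. once per executed instruction."""
        self.counter = (self.counter + n) & MASK64

    def read(self, addr):
        if addr == WALL_CLOCK_0:
            # Reading the low word snapshots the high word.
            self.latched_high = (self.counter >> 32) & MASK32
            return self.counter & MASK32
        if addr == WALL_CLOCK_1:
            return (self.counter >> 32) & MASK32  # live high word
        if addr == WALL_CLOCK_1_AT:
            return self.latched_high
        raise KeyError(hex(addr))


# Read the low word just before it wraps, then let the counter roll over.
clock = WallClock()
clock.tick(0x1_FFFF_FFFF)
low = clock.read(WALL_CLOCK_0)           # latches high word = 1
clock.tick(1)                            # counter is now 0x2_0000_0000
live_high = clock.read(WALL_CLOCK_1)     # live value has moved on
latched = clock.read(WALL_CLOCK_1_AT)    # still consistent with `low`
```

Pairing `low` with `latched` gives a consistent 64-bit timestamp even though the live high word changed between the two reads.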

Write-Sink No-ops (firmware writes, never reads)

These registers are written during BRISC device_setup() but control clock gating which is meaningless in an emulator. Accept writes, discard them.

| Address | Register | Value written |
|---------|----------|---------------|
| 0xFFB12240 | DEST_CG_CTRL | 0 |
| 0xFFB12244 | CG_CTRL_EN | 0 |
| 0xFFB12190 | RISCV_TDMA_REG_CLK_GATE_EN | 0x3F |
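A write sink can be as small as an address set checked by the MMIO store handler. The function names below are hypothetical, and returning 0 on reads is an assumption, since current firmware never reads these registers.

```python
# Sketch only: accept and discard writes to the clock-gating registers.
WRITE_SINKS = {
    0xFFB12240: "DEST_CG_CTRL",
    0xFFB12244: "CG_CTRL_EN",
    0xFFB12190: "RISCV_TDMA_REG_CLK_GATE_EN",
}


def mmio_write_sink(addr, value):
    """Return True if the write hit a sink register (the value is dropped)."""
    return addr in WRITE_SINKS


def mmio_read_sink(addr):
    """Assumed behavior: sink registers read back as 0."""
    if addr in WRITE_SINKS:
        return 0
    raise KeyError(hex(addr))


handled = mmio_write_sink(0xFFB12190, 0x3F)
unhandled = mmio_write_sink(0xFFB121B0, 0x47000)  # SOFT_RESET_0 is not a sink
```

Keeping the register names in the table makes it easy to log unexpected accesses while debugging firmware bring-up.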

Return-Zero Stubs (specified, not used by current firmware)

These are defined in the spec or in ckernel.h but no firmware binary in the current disassemblies reads them. Implement as simple registers that return 0 (or a sensible default). User kernels or LLK code may eventually use them.

Standard RISC-V counters

| CSR | Address | Notes |
|-----|---------|-------|
| mcycle | 0xB00 | Cycle counter low. Could return wall clock for correctness. |
| mcycleh | 0xB80 | Cycle counter high. |
| minstret | 0xB02 | Instructions retired low. Could track actual count. |
| minstreth | 0xB82 | Instructions retired high. |

Worth implementing properly since user kernels might use them for profiling. Returning the wall clock counter for mcycle and an instruction counter for minstret would be faithful.
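A small sketch of that mapping, assuming the emulator already keeps a 64-bit cycle (or wall clock) count and a 64-bit retired-instruction count. The function name `read_counter_csr` and its parameters are hypothetical.

```python
# Sketch only: split two 64-bit counters into the four 32-bit counter CSRs.
MCYCLE, MCYCLEH = 0xB00, 0xB80
MINSTRET, MINSTRETH = 0xB02, 0xB82
MASK32 = 0xFFFFFFFF


def read_counter_csr(csr, cycles, instret):
    """Return the 32-bit CSR value for the given 64-bit counters."""
    if csr == MCYCLE:
        return cycles & MASK32
    if csr == MCYCLEH:
        return (cycles >> 32) & MASK32
    if csr == MINSTRET:
        return instret & MASK32
    if csr == MINSTRETH:
        return (instret >> 32) & MASK32
    raise KeyError(hex(csr))


cycles = 0x0000_0003_0000_0010
instret = 0x0000_0000_0000_0007
cycle_low = read_counter_csr(MCYCLE, cycles, instret)
cycle_high = read_counter_csr(MCYCLEH, cycles, instret)
instret_low = read_counter_csr(MINSTRET, cycles, instret)
```

Feeding `cycles` from the same counter that backs WALL_CLOCK keeps the two time sources consistent for profiling code.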

Tensix custom CSRs

| CSR | Address | Notes |
|-----|---------|-------|
| tt_cfg_qstatus | 0xBC0 | Queue status. 0 = queues empty (safe for emulation). |
| tt_cfg_bstatus | 0xBC1 | Backend busy. 0 = not busy (safe for emulation). |
| tt_cfg_sstatus0-7 | 0xBC2-0xBC9 | Stream status (T0/T1/T2) or scratch (B/NC). |
| intp_restore_pc | 0xBCA | Interrupt return PC. Only matters with interrupt emulation. |

For tt_cfg_qstatus and tt_cfg_bstatus, returning 0 means “not busy” which is correct for an emulator that executes coprocessor ops synchronously.

The tt_cfg_sstatus registers are scratch space for BRISC/NCRISC. For TRISCs they reflect stream state which would need real stream emulation to be useful.
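These stubs could be sketched as follows. The class name `TensixCsrStubs` is hypothetical, and two behaviors are assumptions: that scratch writes on BRISC/NCRISC read back unchanged, and that TRISC sstatus reads return 0 until stream emulation exists.

```python
# Sketch only: return-zero stubs for the Tensix custom CSRs.
TT_CFG_QSTATUS = 0xBC0
TT_CFG_BSTATUS = 0xBC1
TT_CFG_SSTATUS_BASE = 0xBC2  # sstatus0..7 occupy 0xBC2..0xBC9
INTP_RESTORE_PC = 0xBCA
MASK32 = 0xFFFFFFFF


class TensixCsrStubs:
    def __init__(self, core):
        # Assumed: only BRISC and NCRISC treat sstatus as writable scratch.
        self.scratch = core in ("BRISC", "NCRISC")
        self.sstatus = [0] * 8
        self.intp_restore_pc = 0

    def read(self, csr):
        if csr in (TT_CFG_QSTATUS, TT_CFG_BSTATUS):
            return 0  # queues empty, backend not busy
        if TT_CFG_SSTATUS_BASE <= csr < TT_CFG_SSTATUS_BASE + 8:
            return self.sstatus[csr - TT_CFG_SSTATUS_BASE]
        if csr == INTP_RESTORE_PC:
            return self.intp_restore_pc
        raise KeyError(hex(csr))

    def write(self, csr, value):
        if TT_CFG_SSTATUS_BASE <= csr < TT_CFG_SSTATUS_BASE + 8:
            if self.scratch:
                self.sstatus[csr - TT_CFG_SSTATUS_BASE] = value & MASK32
        elif csr == INTP_RESTORE_PC:
            self.intp_restore_pc = value & MASK32
        # Writes to qstatus/bstatus are ignored in this sketch.


brisc = TensixCsrStubs("BRISC")
brisc.write(0xBC3, 0x1234)
trisc = TensixCsrStubs("TRISC0")
trisc.write(0xBC3, 0x1234)
```

With this split, BRISC reads back its scratch value while the TRISC register stays at 0.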

Defer Entirely (profiler/debug only)

Not needed for functional emulation. Implement only if adding profiling or debug tool support.

| Address | Register | Used by |
|---------|----------|---------|
| 0xFFB120B4 | FPU_STICKY_BITS | LLK math layer (not startup firmware) |
| 0xFFB12054 | DBG_BUS_CTRL | Host read_risc_pc() debug function |
| 0xFFB1205C | DBG_BUS_RD_DATA | Host read_risc_pc() debug function |
| 0xFFB12000-0x124 | PERF_CNT_* | Profiler builds only |
| 0xFFB12218 | PERF_CNT_MUX_CTRL | Profiler builds only |
| 0xFFB12070 | CG_CTRL_HYST0 | Power management (dead code in rvir path) |
| 0xFFB12074 | CG_CTRL_HYST1 | Power management (dead code in rvir path) |
| 0xFFB1207C | CG_CTRL_HYST2 | Power management (dead code in rvir path) |
| 0xFFB121D0 | ECC_CTRL | Not accessed by firmware |
| 0xFFB121D4 | ECC_STATUS | Not accessed by firmware |
| 0xFFB121E0 | WATCHDOG_TIMER | Not accessed by firmware |

Summary

| Priority | Count | Registers |
|----------|-------|-----------|
| Must work | 8 | cfg0, SOFT_RESET_0, 4x RESET_PC, 2x RESET_PC_OVERRIDE, WALL_CLOCK_0/1_AT |
| Write sinks | 3 | DEST_CG_CTRL, CG_CTRL_EN, CLK_GATE_EN |
| Return-zero stubs | 10 | mcycle/h, minstret/h, qstatus, bstatus, sstatus0-7, intp_restore_pc |
| Defer | ~12 | Perf counters, debug bus, FPU sticky, ECC, watchdog, etc. |