30 veds sail feedback #31

Merged
merged 5 commits into from
Jun 14, 2024
19 changes: 8 additions & 11 deletions body.adoc
@@ -228,7 +228,7 @@ The 32-bit `sctrstatus` register grants access to CTR status information and is
[width="100%",cols="15%,85%",options="header",]
|===
|Field |Description
-|WRPTR |WARL field that indicates the physical CTR buffer entry to be written next. Incremented on new transfers recorded (see <<Behavior>>), and decremented on qualified returns when `mctrctl`.RASEMU=1 (see <<RAS (Return Address Stack) Emulation Mode>>). For a given CTR depth (where depth = 2^(DEPTH+4)^), WRPTR wraps to 0 on an increment when the value matches depth-1, and to depth-1 on a decrement when the value is 0. Bits above those needed to represent depth-1 (e.g., bits 7:4 for a depth of 16) are read-only 0. On depth changes, WRPTR holds an unspecified but legal value.
+|WRPTR |WARL field that indicates the physical CTR buffer entry to be written next. It is incremented after new transfers are recorded (see <<Behavior>>), though there are exceptions when `mctrctl`.RASEMU=1, see <<RAS (Return Address Stack) Emulation Mode>>. For a given CTR depth (where depth = 2^(DEPTH+4)^), WRPTR wraps to 0 on an increment when the value matches depth-1, and to depth-1 on a decrement when the value is 0. Bits above those needed to represent depth-1 (e.g., bits 7:4 for a depth of 16) are read-only 0. On depth changes, WRPTR holds an unspecified but legal value.
|FROZEN |Inhibit transfer recording. See <<Freeze>>.
|===
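
The WRPTR wrap rules in the field description above can be sketched in a few lines of Python. This is only an illustrative model, not text from the specification: the function names `ctr_depth`, `wrptr_inc`, and `wrptr_dec` are hypothetical, and the model assumes the depth has already been derived from the DEPTH field.

```python
def ctr_depth(depth_field: int) -> int:
    """Depth encoded by the DEPTH field: depth = 2^(DEPTH+4)."""
    return 1 << (depth_field + 4)


def wrptr_inc(wrptr: int, depth: int) -> int:
    """Increment WRPTR, wrapping to 0 when the value matches depth-1."""
    return 0 if wrptr == depth - 1 else wrptr + 1


def wrptr_dec(wrptr: int, depth: int) -> int:
    """Decrement WRPTR, wrapping to depth-1 when the value is 0."""
    return depth - 1 if wrptr == 0 else wrptr - 1
```

For a depth of 16, only bits 3:0 of WRPTR are ever nonzero in this model, which is consistent with the read-only 0 upper bits described in the field text.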

@@ -238,7 +238,7 @@ and software should ignore but preserve any fields that it does not recognize.
[NOTE]
[%unbreakable]
====
-_Logical entry 0, accessed via `sireg*` when `siselect`=0x200, is always the physical entry preceding the WRPTR entry ((WRPTR-1) % depth), where depth = 2^(DEPTH+4)^._
+_Logical entry 0, accessed via `sireg*` when `siselect`=0x200, is always the physical buffer entry preceding the WRPTR entry. More generally, the physical buffer entry Y associated with logical entry X (X < depth) can be determined using the formula Y = (WRPTR - X - 1) % depth, where depth = 2^(DEPTH+4)^. Logical entries >= depth are read-only 0._
====
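
As a worked illustration of the formula Y = (WRPTR - X - 1) % depth from the note above, the following Python sketch maps a logical entry to its physical buffer entry. The function name `physical_entry` is hypothetical, and returning `None` for X >= depth is just this sketch's way of signaling that no physical entry backs the logical entry (the note says such entries are read-only 0).

```python
from typing import Optional


def physical_entry(wrptr: int, logical: int, depth: int) -> Optional[int]:
    """Return physical entry Y for logical entry X, or None if X >= depth."""
    if logical >= depth:
        return None  # no backing entry; reads as 0 per the note
    # Python's % always yields a non-negative result for a positive modulus,
    # so the wrap below zero is handled without extra code.
    return (wrptr - logical - 1) % depth
```

For example, with depth 16 and WRPTR = 0, logical entry 0 maps to physical entry 15; with WRPTR = 3, logical entry 5 maps to physical entry 13.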
[NOTE]
[%unbreakable]
@@ -251,7 +251,7 @@ _When restoring CTR state, `sctrstatus` should be written before CTR entry state
[NOTE]
[%unbreakable]
====
-_Exposing the WRPTR provides a more efficient means for synthesizing CTR entries. If a qualified control transfer is emulated, the emulator can simply increment the WRPTR, then write the synthesized record to entry 0. If a qualified function return is emulated while RASEMU=1, the emulator can clear `ctrsource`.V for entry 0, then decrement the WRPTR._
+_Exposing the WRPTR provides a more efficient means for synthesizing CTR entries. If a qualified control transfer is emulated, the emulator can simply increment the WRPTR, then write the synthesized record to logical entry 0. If a qualified function return is emulated while RASEMU=1, the emulator can clear `ctrsource`.V for logical entry 0, then decrement the WRPTR._

_Exposing the WRPTR may also allow support for Linux perf's https://lwn.net/Articles/802821[[.underline]#stack stitching#] capability._
====
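
The two emulator sequences described in the note above can be sketched against a toy software model of the buffer. Everything here is an assumption made for illustration: the `CtrModel` class, its dictionary-based entries, and the helper names are hypothetical, and real software would go through `sctrstatus` and the `sireg*` indirect registers rather than a Python list.

```python
class CtrModel:
    """Toy model: physical entries plus WRPTR, viewed through logical indices."""

    def __init__(self, depth: int = 16) -> None:
        self.depth = depth
        self.wrptr = 0
        self.entries = [{"v": 0, "source": 0, "target": 0} for _ in range(depth)]

    def logical(self, x: int) -> dict:
        # Logical entry X lives at physical entry (WRPTR - X - 1) % depth.
        return self.entries[(self.wrptr - x - 1) % self.depth]


def emulate_transfer(ctr: CtrModel, source: int, target: int) -> None:
    """Synthesize a record: increment WRPTR, then write logical entry 0."""
    ctr.wrptr = (ctr.wrptr + 1) % ctr.depth
    ctr.logical(0).update(v=1, source=source, target=target)


def emulate_return_rasemu(ctr: CtrModel) -> None:
    """Emulated return with RASEMU=1: clear V on logical entry 0, then decrement WRPTR."""
    ctr.logical(0)["v"] = 0
    ctr.wrptr = (ctr.wrptr - 1) % ctr.depth
```

The ordering matters in both helpers: logical entry 0 is defined relative to WRPTR, so incrementing first makes the write land in the slot that becomes the newest record, and clearing V before decrementing invalidates the record that is about to rotate out to logical entry depth-1.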
@@ -287,7 +287,7 @@ _Smctr/Ssctr depends upon implementation of S-mode because much of CTR state is
Control transfer records are stored in a CTR buffer, such that each buffer entry stores information about a single transfer. The CTR buffer entries are logically accessed via the indirect register access mechanism defined by the
https://github.com/riscv/riscv-indirect-csr-access/releases[[.underline]#Sscsrind#]
extension. The `siselect` index range 0x200 through 0x2FF is reserved for CTR
-entries 0 through 255. When `siselect` holds a value in this range, `sireg` provides access to <<_control_transfer_record_source_ctrsource, `ctrsource`>>, `sireg2` provides access to <<_control_transfer_record_target_ctrtarget, `ctrtarget`>>, and `sireg3` provides access to <<_control_transfer_record_source_ctrdata, `ctrdata`>>. `sireg4`, `sireg5`, and `sireg6` are read-only 0.
+logical entries 0 through 255. When `siselect` holds a value in this range, `sireg` provides access to <<_control_transfer_record_source_ctrsource, `ctrsource`>>, `sireg2` provides access to <<_control_transfer_record_target_ctrtarget, `ctrtarget`>>, and `sireg3` provides access to <<_control_transfer_record_source_ctrdata, `ctrdata`>>. `sireg4`, `sireg5`, and `sireg6` are read-only 0.

When `vsiselect` holds a value in 0x200..0x2FF, the `vsireg*` registers provide access to the same CTR entry register state as the analogous `sireg*` registers. There is not a separate set of entry registers for V=1.

@@ -441,8 +441,7 @@ CTR records qualified control transfers. Control transfers are qualified if the
* `sctrstatus`.FROZEN is not set
* The transfer completes/retires

-Such qualified transfers update the <<_entry_registers, Entry Registers>> at logical entry 0. As a result, older entries are pushed down the stack: the record previously in entry 0
-moves to entry 1, the record in entry 1 moves to entry 2, and so on. If the CTR buffer is full, the oldest recorded entry (previously at entry depth-1) is lost.
+Such qualified transfers update the <<_entry_registers, Entry Registers>> at logical entry 0. As a result, older entries are pushed down the stack; the record previously in logical entry 0 moves to logical entry 1, the record in logical entry 1 moves to logical entry 2, and so on. If the CTR buffer is full, the oldest recorded entry (previously at entry depth-1) is lost.

Recorded transfers will set the `ctrsource`.V bit to 1, and will update all implemented record fields.

@@ -688,11 +687,9 @@ The CC value is only valid when the Cycle Count Valid (CCV) bit is set. If CCV=
When the optional `mctrctl`.RASEMU bit is implemented and set to 1, transfer recording behavior is altered to emulate the behavior of a return-address stack (RAS).

* Indirect and direct calls are recorded as normal
-* Function returns pop the most recent call, by invalidating entry 0 (setting `ctrsource`.V=0)
-and decrementing the WRPTR, such that (invalidated) entry 0 moves to
-entry depth-1, and entries 1..depth-1 move to 0..depth-2.
-* Co-routine swaps affect both a return and a call. Entry 0 is
-overwritten.
+* Function returns pop the most recent call, by invalidating logical entry 0 (by setting `ctrsource`.V=0) and decrementing the WRPTR, such that (invalidated) logical entry 0 moves to logical entry depth-1, and logical entries 1..depth-1 move to 0..depth-2.
+* Co-routine swaps affect both a return and a call. Logical entry 0 is
+overwritten, and WRPTR is not modified.
* Other transfer types are inhibited
* Transfer type filtering bits (`__x__ctrctl`[47:32]) and external trap enable bits (`__x__ctrctl`.__x__TE) are ignored
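
The RASEMU rules in the list above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the specified hardware behavior: the `RasEmuCtr` class and the transfer-kind strings `"call"`, `"return"`, and `"coswap"` are hypothetical names, entries are reduced to `(V, source, target)` tuples, and keeping the old source and target when V is cleared is a choice of this model only.

```python
class RasEmuCtr:
    """Toy model of CTR recording with mctrctl.RASEMU=1."""

    def __init__(self, depth: int = 16) -> None:
        self.depth = depth
        self.wrptr = 0
        self.entries = [(0, 0, 0)] * depth  # (V, source, target) per physical entry

    def _logical0(self) -> int:
        # Physical index of logical entry 0: the entry preceding WRPTR.
        return (self.wrptr - 1) % self.depth

    def record(self, kind: str, source: int, target: int) -> None:
        if kind == "call":
            # Recorded as normal: write the WRPTR entry, then increment.
            self.entries[self.wrptr] = (1, source, target)
            self.wrptr = (self.wrptr + 1) % self.depth
        elif kind == "return":
            # Pop: invalidate logical entry 0, then decrement WRPTR.
            idx = self._logical0()
            _, old_source, old_target = self.entries[idx]
            self.entries[idx] = (0, old_source, old_target)
            self.wrptr = idx
        elif kind == "coswap":
            # Return plus call: overwrite logical entry 0, WRPTR unchanged.
            self.entries[self._logical0()] = (1, source, target)
        # Any other transfer kind is inhibited and leaves the state untouched.
```

A short usage example: two calls followed by a co-routine swap leave WRPTR at 2 with the second record replaced, and a subsequent return invalidates that record and moves WRPTR back to 1.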
