From df7f80db0a9ab84e9a55003e535b831deab9480c Mon Sep 17 00:00:00 2001 From: beeman Date: Tue, 11 Jun 2024 16:44:51 -0700 Subject: [PATCH 1/5] clarify that siselect values map to CTR logical entries --- body.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/body.adoc b/body.adoc index 2d84b99..0c9b95d 100644 --- a/body.adoc +++ b/body.adoc @@ -287,7 +287,7 @@ _Smctr/Ssctr depends upon implementation of S-mode because much of CTR state is Control transfer records are stored in a CTR buffer, such that each buffer entry stores information about a single transfer. The CTR buffer entries are logically accessed via the indirect register access mechanism defined by the https://github.com/riscv/riscv-indirect-csr-access/releases[[.underline]#Sscsrind#] extension. The `siselect` index range 0x200 through 0x2FF is reserved for CTR -entries 0 through 255. When `siselect` holds a value in this range, `sireg` provides access to <<_control_transfer_record_source_ctrsource, `ctrsource`>>, `sireg2` provides access to <<_control_transfer_record_target_ctrtarget, `ctrtarget`>>, and `sireg3` provides access to <<_control_transfer_record_source_ctrdata, `ctrdata`>>. `sireg4`, `sireg5`, and `sireg6` are read-only 0. +logical entries 0 through 255. When `siselect` holds a value in this range, `sireg` provides access to <<_control_transfer_record_source_ctrsource, `ctrsource`>>, `sireg2` provides access to <<_control_transfer_record_target_ctrtarget, `ctrtarget`>>, and `sireg3` provides access to <<_control_transfer_record_source_ctrdata, `ctrdata`>>. `sireg4`, `sireg5`, and `sireg6` are read-only 0. When `vsiselect` holds a value in 0x200..0x2FF, the `vsireg*` registers provide access to the same CTR entry register state as the analogous `sireg*` registers. There is not a separate set of entry registers for V=1. 
From 4d446f9f7739e221b9d1e71fdc8cc163f7a057fe Mon Sep 17 00:00:00 2001 From: beeman Date: Thu, 13 Jun 2024 17:37:22 -0700 Subject: [PATCH 2/5] clarify WRPTR behavior, and clean up references to logical entries --- body.adoc | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/body.adoc b/body.adoc index 0c9b95d..493ea83 100644 --- a/body.adoc +++ b/body.adoc @@ -228,7 +228,7 @@ The 32-bit `sctrstatus` register grants access to CTR status information and is [width="100%",cols="15%,85%",options="header",] |=== |Field |Description -|WRPTR |WARL field that indicates the physical CTR buffer entry to be written next. Incremented on new transfers recorded (see <>), and decremented on qualified returns when `mctrctl`.RASEMU=1 (see <>). For a given CTR depth (where depth = 2^(DEPTH+4)^), WRPTR wraps to 0 on an increment when the value matches depth-1, and to depth-1 on a decrement when the value is 0. Bits above those needed to represent depth-1 (e.g., bits 7:4 for a depth of 16) are read-only 0. On depth changes, WRPTR holds an unspecified but legal value. +|WRPTR |WARL field that indicates the physical CTR buffer entry to be written next. Incremented after new transfers are recorded (see <>), though there are exceptions when `mctrctl`.RASEMU=1, see <>). For a given CTR depth (where depth = 2^(DEPTH+4)^), WRPTR wraps to 0 on an increment when the value matches depth-1, and to depth-1 on a decrement when the value is 0. Bits above those needed to represent depth-1 (e.g., bits 7:4 for a depth of 16) are read-only 0. On depth changes, WRPTR holds an unspecified but legal value. |FROZEN |Inhibit transfer recording. See <>. |=== @@ -251,7 +251,7 @@ _When restoring CTR state, `sctrstatus` should be written before CTR entry state [NOTE] [%unbreakable] ==== -_Exposing the WRPTR provides a more efficient means for synthesizing CTR entries. 
If a qualified control transfer is emulated, the emulator can simply increment the WRPTR, then write the synthesized record to entry 0. If a qualified function return is emulated while RASEMU=1, the emulator can clear `ctrsource`.V for entry 0, then decrement the WRPTR._ +_Exposing the WRPTR provides a more efficient means for synthesizing CTR entries. If a qualified control transfer is emulated, the emulator can simply increment the WRPTR, then write the synthesized record to logical entry 0. If a qualified function return is emulated while RASEMU=1, the emulator can clear `ctrsource`.V for logical entry 0, then decrement the WRPTR._ _Exposing the WRPTR may also allow support for Linux perf's https://lwn.net/Articles/802821[[.underline]#stack stitching#] capability._ ==== @@ -441,8 +441,7 @@ CTR records qualified control transfers. Control transfers are qualified if the * `sctrstatus`.FROZEN is not set * The transfer completes/retires -Such qualified transfers update the <<_entry_registers, Entry Registers>> at logical entry 0. As a result, older entries are pushed down the stack: the record previously in entry 0 -moves to entry 1, the record in entry 1 moves to entry 2, and so on. If the CTR buffer is full, the oldest recorded entry (previously at entry depth-1) is lost. +Such qualified transfers update the <<_entry_registers, Entry Registers>> at logical entry 0. As a result, older entries are pushed down the stack; the record previously in logical entry 0 moves to logical entry 1, the record in logical entry 1 moves to logical entry 2, and so on. If the CTR buffer is full, the oldest recorded entry (previously at entry depth-1) is lost. Recorded transfers will set the `ctrsource`.V bit to 1, and will update all implemented record fields. @@ -688,11 +687,9 @@ The CC value is only valid when the Cycle Count Valid (CCV) bit is set. 
If CCV= When the optional `mctrctl`.RASEMU bit is implemented and set to 1, transfer recording behavior is altered to emulate the behavior of a return-address stack (RAS). * Indirect and direct calls are recorded as normal -* Function returns pop the most recent call, by invalidating entry 0 (setting `ctrsource`.V=0) -and decrementing the WRPTR, such that (invalidated) entry 0 moves to -entry depth-1, and entries 1..depth-1 move to 0..depth-2. -* Co-routine swaps affect both a return and a call. Entry 0 is -overwritten. +* Function returns pop the most recent call, by invalidating logical entry 0 (by setting `ctrsource`.V=0) and decrementing the WRPTR, such that (invalidated) logical entry 0 moves to logical entry depth-1, and logical entries 1..depth-1 move to 0..depth-2. +* Co-routine swaps affect both a return and a call. Logical entry 0 is +overwritten, and WRPTR is not modified. * Other transfer types are inhibited * Transfer type filtering bits (`__x__ctrctl`[47:32]) and external trap enable bits (`__x__ctrctl`.__x__TE) are ignored From 837d7e1cf6c96d8120046c1b59a580ee7ce3d0e4 Mon Sep 17 00:00:00 2001 From: beeman Date: Thu, 13 Jun 2024 17:39:39 -0700 Subject: [PATCH 3/5] wording tweak --- body.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/body.adoc b/body.adoc index 493ea83..1b6aec4 100644 --- a/body.adoc +++ b/body.adoc @@ -228,7 +228,7 @@ The 32-bit `sctrstatus` register grants access to CTR status information and is [width="100%",cols="15%,85%",options="header",] |=== |Field |Description -|WRPTR |WARL field that indicates the physical CTR buffer entry to be written next. Incremented after new transfers are recorded (see <>), though there are exceptions when `mctrctl`.RASEMU=1, see <>). For a given CTR depth (where depth = 2^(DEPTH+4)^), WRPTR wraps to 0 on an increment when the value matches depth-1, and to depth-1 on a decrement when the value is 0. 
Bits above those needed to represent depth-1 (e.g., bits 7:4 for a depth of 16) are read-only 0. On depth changes, WRPTR holds an unspecified but legal value. +|WRPTR |WARL field that indicates the physical CTR buffer entry to be written next. It is incremented after new transfers are recorded (see <>), though there are exceptions when `mctrctl`.RASEMU=1, see <>. For a given CTR depth (where depth = 2^(DEPTH+4)^), WRPTR wraps to 0 on an increment when the value matches depth-1, and to depth-1 on a decrement when the value is 0. Bits above those needed to represent depth-1 (e.g., bits 7:4 for a depth of 16) are read-only 0. On depth changes, WRPTR holds an unspecified but legal value. |FROZEN |Inhibit transfer recording. See <>. |=== From 9dbb5b75b14316bdd090a2e786fba61c6416832d Mon Sep 17 00:00:00 2001 From: beeman Date: Fri, 14 Jun 2024 09:49:04 -0700 Subject: [PATCH 4/5] add formulat for logical to physical entry translation --- body.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/body.adoc b/body.adoc index 1b6aec4..28c4747 100644 --- a/body.adoc +++ b/body.adoc @@ -238,7 +238,7 @@ and software should ignore but preserve any fields that it does not recognize. [NOTE] [%unbreakable] ==== -_Logical entry 0, accessed via `sireg*` when `siselect`=0x200, is always the physical entry preceding the WRPTR entry ((WRPTR-1) % depth), where depth = 2^(DEPTH+4)^._ +_Logical entry 0, accessed via `sireg*` when `siselect`=0x200, is always the physical buffer entry preceding the WRPTR entry. 
More generally, the physical buffer entry Y associated with logical entry X can be determined using the formula Y = (WRPTR - X - 1) % depth, where depth = 2^(DEPTH+4)^._ ==== [NOTE] [%unbreakable] From 1503711b4bde91e00dfb3489537515646757e145 Mon Sep 17 00:00:00 2001 From: beeman Date: Fri, 14 Jun 2024 10:09:47 -0700 Subject: [PATCH 5/5] account for depth in logical to physical translation formula desc --- body.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/body.adoc b/body.adoc index 28c4747..9c5569f 100644 --- a/body.adoc +++ b/body.adoc @@ -238,7 +238,7 @@ and software should ignore but preserve any fields that it does not recognize. [NOTE] [%unbreakable] ==== -_Logical entry 0, accessed via `sireg*` when `siselect`=0x200, is always the physical buffer entry preceding the WRPTR entry. More generally, the physical buffer entry Y associated with logical entry X can be determined using the formula Y = (WRPTR - X - 1) % depth, where depth = 2^(DEPTH+4)^._ +_Logical entry 0, accessed via `sireg*` when `siselect`=0x200, is always the physical buffer entry preceding the WRPTR entry. More generally, the physical buffer entry Y associated with logical entry X (X < depth) can be determined using the formula Y = (WRPTR - X - 1) % depth, where depth = 2^(DEPTH+4)^. Logical entries >= depth are read-only 0._ ==== [NOTE] [%unbreakable]
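
Review note on the series: the logical-to-physical translation added in patches 4/5 and the WRPTR wrap rules clarified in patches 2/3 can be checked with a small model. The following is a minimal sketch, not a reference implementation. All function and constant names are hypothetical and chosen only for illustration; the formula Y = (WRPTR - X - 1) % depth, the depth = 2^(DEPTH+4) relation, and the 0x200..0x2FF `siselect` range are taken from the patched text. It relies on Python's `%` returning a non-negative result for a positive modulus; a language whose remainder operator can return negative values would need a mask or an added `depth` instead.

```python
# Hypothetical helper names; only the formulas and constants come from the patches.
CTR_SISELECT_BASE = 0x200  # siselect value for logical entry 0
CTR_SISELECT_LAST = 0x2FF  # siselect value for logical entry 255


def ctr_depth(depth_field: int) -> int:
    """Number of CTR buffer entries for a DEPTH field value: 2^(DEPTH+4)."""
    return 1 << (depth_field + 4)


def siselect_for_logical(x: int) -> int:
    """siselect value that selects logical entry x (0..255)."""
    if not 0 <= x <= CTR_SISELECT_LAST - CTR_SISELECT_BASE:
        raise ValueError("logical entry outside the reserved siselect range")
    return CTR_SISELECT_BASE + x


def physical_entry(wrptr: int, x: int, depth: int):
    """Physical entry Y for logical entry X: Y = (WRPTR - X - 1) % depth.

    Returns None for X >= depth, which the patched text describes as
    read-only 0 (no backing physical entry).
    """
    if x >= depth:
        return None
    return (wrptr - x - 1) % depth


def wrptr_increment(wrptr: int, depth: int) -> int:
    """WRPTR wraps to 0 on an increment from depth-1."""
    return (wrptr + 1) % depth


def wrptr_decrement(wrptr: int, depth: int) -> int:
    """WRPTR wraps to depth-1 on a decrement from 0."""
    return (wrptr - 1) % depth


if __name__ == "__main__":
    depth = ctr_depth(0)  # DEPTH=0 gives 16 entries
    for x in (0, 1, depth - 1, depth):
        print(hex(siselect_for_logical(x)), physical_entry(5, x, depth))
```

The model also reproduces the stack behaviour described in the hunks: after an increment, the physical entry that was logical entry 0 is reached as logical entry 1, and after a decrement (the RASEMU return case) it is reached as logical entry depth-1.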