Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pan fix #6169

Merged
merged 10 commits into from
Aug 3, 2023
Merged

Pan fix #6169

merged 10 commits into from
Aug 3, 2023

Conversation

jenswi-linaro
Copy link
Contributor

Fixes the missing PSTATE.PAN bit when returning with syscall_sys_return(). Did crash as expected without PR #6161.

@gagachang
Copy link
Contributor

Dear @jenswi-linaro

I have a question about ldelf_syscall_* functions.
Take ldelf_syscall_open_bin() for example, it seems that the const TEE_UUID *uuid is located within user memory.
If the kernel wants to access it, it needs user-access functions.
Am I correct?

I got the following stack trace when I tried CFG_PAN=y.
I seems that the exception is raised when kernel accesses uuid.

E/TC:0 0 Core data-abort at address 0x40013fa8 (translation fault)
E/TC:0 0 cpu	#0
E/TC:0 0 cause	000000000000000d epc	000000008e0639e2
E/TC:0 0 tval	0000000040013fa8 status	8000000200006100
E/TC:0 0 ra	000000008e064124 sp	0000000000000000
E/TC:0 0 gp	0000000000000000 tp	0000000000000aff
E/TC:0 0 t0	0000000000000001 t1	0000000000000000
E/TC:0 0 t2	0000000000000000 s0	000000008e0d78d0
E/TC:0 0 s1	000000008e0d77c0 a0	000000008e0d7960
E/TC:0 0 a1	0000000000000025 a2	0000000040013fa8
E/TC:0 0 a3	0000000000000050 a4	0000000040013fa8
E/TC:0 0 a5	0000000040013fa8 a5	0000000040013fa8
E/TC:0 0 a6	000000008e074060 a7	0000000000000000
E/TC:0 0 s2	000000008e061254 s3	000000008e0d7810
E/TC:0 0 s4	000000008e0d7810 s5	000000008e0d7830
E/TC:0 0 s6	000000008e099ac0 s7	000000008e0d7830
E/TC:0 0 s8	000000008e00bb5c s9	000000008e0b5220
E/TC:0 0 s10	000000008e0d79d8 s11	000000008e0d7a7f
E/TC:0 0 t3	0000000000000000 t4	0000000000000000
E/TC:0 0 t5	0000000000000000 t6	0000000000000000
E/TC:0 0 TEE load address @ 0x8e000000
E/TC:0 0 Call stack:
E/TC:0 0  0x8e0639e2 uuid2str at optee_os/lib/libutils/ext/snprintk.c:229
E/TC:0 0  0x8e064120 kprintf at optee_os/lib/libutils/ext/snprintk.c:427
E/TC:0 0  0x8e063958 __vsnprintf at optee_os/lib/libutils/ext/snprintk.c:143
E/TC:0 0  0x8e0638f6 vsnprintk at optee_os/lib/libutils/ext/snprintk.c:131
E/TC:0 0  0x8e064d7c trace_vprintf at optee_os/lib/libutils/ext/trace.c:163
E/TC:0 0  0x8e064b26 trace_printf at optee_os/lib/libutils/ext/trace.c:108
E/TC:0 0  0x8e0145ac ldelf_syscall_open_bin at optee_os/core/kernel/ldelf_syscalls.c:142
E/TC:0 0  0x8e00067a scall_do_call at optee_os/core/arch/riscv/kernel/arch_scall_rv.S:32
E/TC:0 0  0x8e001b82 thread_scall_handler at optee_os/core/arch/riscv/kernel/thread_arch.c:126
E/TC:0 0  0x8e001daa thread_user_ecall_handler at optee_os/core/arch/riscv/kernel/thread_arch.c:175
E/TC:0 0  0x8e0021fe thread_exception_handler at optee_os/core/arch/riscv/kernel/thread_arch.c:251
E/TC:0 0  0x8e0022da thread_trap_handler at optee_os/core/arch/riscv/kernel/thread_arch.c:297
E/TC:0 0  0x8e000358 trap_from_user at optee_os/core/arch/riscv/kernel/thread_rv.S:132

@jenswi-linaro
Copy link
Contributor Author

I assume you have removed the enter_user_access() and exit_user_access() from scall_handle_ldelf().
Yes, some fixing is needed for the ldelf syscalls too. I'm currently looking at that, I've also run into the uuid abort when testing some small changes.

@gagachang
Copy link
Contributor

I assume you have removed the enter_user_access() and exit_user_access() from scall_handle_ldelf().

Thanks for hint.
After deeper debugging I found it is our trusted-firmware's issue, not OP-TEE.
Now the exception is gone.

@seonghp
Copy link
Contributor

seonghp commented Jul 13, 2023

Oh, IIUC so it was not fault of the QEMU (it's always easy to blame others), but the SPSR_EL1.PAN was not set properly before eret to EL1 after handling return() or panic() system calls, so the PAN exceptions were not raised. Thank you for catching this.

@gagachang
Copy link
Contributor

gagachang commented Jul 13, 2023

Did you guys encounter PAN exception on xtest 1033 ?

It seems that in the pass value test, TA provides user buffer to system PTA.
(https://github.com/OP-TEE/optee_test/blob/master/ta/supp_plugin/ta_entry.c#L60)
The system PTA will crash when it accesses that user buffer.

@jenswi-linaro
Copy link
Contributor Author

Did you guys encounter PAN exception on xtest 1033 ?

It seems that in the pass value test, TA provides user buffer to system PTA. (https://github.com/OP-TEE/optee_test/blob/master/ta/supp_plugin/ta_entry.c#L60) The system PTA will crash when it accesses that user buffer.

Oh, another SPSR fabrication problem hiding that. I'll take care of that too in the PR.

@gagachang
Copy link
Contributor

gagachang commented Jul 17, 2023

Oh, another SPSR fabrication problem hiding that. I'll take care of that too in the PR.

Thanks!

I just found that xtest 2001 has the same problem.
socket_open(), socket_send(), socket_recv() and socket_ioctl() also access user buffers.

@jenswi-linaro
Copy link
Contributor Author

Yeah, I just noticed that too. I'm taking care of that. Then there's also xtest 6001, I'm looking at that at the moment.

@jenswi-linaro
Copy link
Contributor Author

Rewrote "core: arm64: preserve PSTATE.PAN when making SPSR" to address another SPSR fabrication issue. Now we have a common function to fabricate an SPSR based on PSTATE. Also added a few more patches fixing more PAN problems discovered with the SPSR fabrication problem out of the way.

This update fixes all issues with PAN I'm currently aware of. I'm planning to add more fine-grained PAN for ldelf syscalls too, but that will go into another pull request.

@jenswi-linaro
Copy link
Contributor Author

CI timeout should be fixed by OP-TEE/build#665

@jenswi-linaro
Copy link
Contributor Author

All checks has passed now.

@jenswi-linaro
Copy link
Contributor Author

Please help to review these PAN fixes.

@seonghp
Copy link
Contributor

seonghp commented Jul 19, 2023

Thank you for the work! I will take time to further review the PR.

At the first glance, regarding the commit "core: arm64: preserve PSTATE.PAN when making SPSR", IIUC the function handle_user_mode_panic(), also setup the SPSR, too, but the commit seems to not take into account this case.

@seonghp
Copy link
Contributor

seonghp commented Jul 19, 2023

Regarding "core: add user buffer to tee_invoke_supp_plugin_rpc()", do we have any call-sites where buf_core is not NULL? Is data_core argument left for this added for future-proofing?

@jenswi-linaro
Copy link
Contributor Author

At the first glance, regarding the commit "core: arm64: preserve PSTATE.PAN when making SPSR", IIUC the function handle_user_mode_panic(), also setup the SPSR, too, but the commit seems to not take into account this case.

Thanks, I'll fix that.

Regarding "core: add user buffer to tee_invoke_supp_plugin_rpc()", do we have any call-sites where buf_core is not NULL? Is data_core argument left for this added for future-proofing?

Yes, that's for future-proofing.

@jenswi-linaro
Copy link
Contributor Author

Update to take care of the handle_user_mode_panic() issue. Let me know if I should squash in the "review" commits.

Comment on lines 383 to 384
daif = (ai->regs->spsr >> SPSR_32_AIF_SHIFT) & SPSR_32_AIF_MASK;
/* XXX what about DAIF_D? */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I am not sure about is that if it is okay to remove this line. It seems like el0_sync_abort does not touch the memory at the offset THREAD_ABG_REG_SPSR, but I am just wondering if I am missing something. Other than this, it looks good to me and I think it can be squashed into previous commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THREAD_ABT_REG_SPSR is updated at:

mrs x2, spsr_el1
/* Store spsr, sp_el0 */
stp x2, x3, [sp, #THREAD_ABT_REG_SPSR]

However, there is a problem. We might need to create the new SPSR based on the EL0 SPSR since for instance (some of?) the DAIF bits are masked when taking an exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we need to keep the previous EL0 SPSR value here. If the abort_handler() hits the handle_user_mode_panic() then it will never return to the EL0 iiuc, and the previous spsr value has been already printed (in get_fault_type()).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed back to the old way of setting the SPSR, except that I preserve the PAN bit in the same way as the DAIF bits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as sticking to the old way would introduce less side effects.

@jenswi-linaro
Copy link
Contributor Author

Squashed in the updates and added another sentence to "core: arm64: preserve PSTATE.PAN when making SPSR"

@gagachang
Copy link
Contributor

Applied parts of this PR and tested on RISC-V architecture with CFG_PAN=y.
Now we can pass xtest without PAN exception.
Thanks @jenswi-linaro !

+-----------------------------------------------------
26795 subtests of which 0 failed
104 test cases of which 0 failed
0 test cases were skipped

@jenswi-linaro
Copy link
Contributor Author

I noticed that we don't have the PAN bit set while at S-EL0 so SPSR_EL1.PAN isn't set. So we need to copy the PSTATE.PAN bit instead.

@seonghp
Copy link
Contributor

seonghp commented Jul 26, 2023

I am wondering if this PR is on radar of other maintainers for review.

@jforissier
Copy link
Contributor

@seonghp it is on my to-do list and hopefully others can review too, but note this is summer time and some people are on vacation 😉

@seonghp
Copy link
Contributor

seonghp commented Jul 26, 2023

Good news! Thank you for the confirmation 👍

Copy link
Contributor

@jforissier jforissier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small fixes in the commit descriptions:

  • For "core: fix reading result in ldelf_dlopen()": s/start to a use/start to use a/
  • For "core: syscall_storage_obj_rename(): fix direct user memory access": s/lead a data/lead to a data/
  • For "core: arm64: preserve PSTATE.PAN when making SPSR": s/only go to/only going to/ plus see one comment in the code below.

With that and for the whole series:

Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>

One more thing: how come the CI wasn't failing without these fixes? The QEMUv8_check job has a line with CFG_PAN=y and QEMU is started with -cpu max, so PAN is supposed to be active. Is it only because the test cases are not exercising the proper parts of the code?

{
uint64_t spsr = SPSR_64(SPSR_64_MODE_EL1, SPSR_64_MODE_SP_EL0, 0);

spsr = SPSR_64(SPSR_64_MODE_EL1, SPSR_64_MODE_SP_EL0, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated initialization

@jenswi-linaro
Copy link
Contributor Author

One more thing: how come the CI wasn't failing without these fixes? The QEMUv8_check job has a line with CFG_PAN=y and QEMU is started with -cpu max, so PAN is supposed to be active. Is it only because the test cases are not exercising the proper parts of the code?

It's because "core: arm64: preserve PSTATE.PAN when making SPSR" fixes a few bugs that otherwise effectively disables PAN in some cases.

Adds check_user_access() to simplify checking if a user mode memory
buffer may be accessed as expected.

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Adds default implementations for copy_to_user_private() and
copy_to_user() when CFG_WITH_USER_TA=n.

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Adds a user buffer to tee_invoke_supp_plugin_rpc() so direct user memory
access can be used when called with a buffer in user memory instead of
core memory. tee_invoke_supp_plugin_rpc() can still take a core memory
buffer as an argument if needed.

PTA_SYSTEM_SUPP_PLUGIN_INVOKE in the system PTA is updated to pass the
memref as a user memory buffer instead of a core memory buffer.

This fixes a direct privileged memory access to user space memory.

Fixes: 4e15432 ("core: Apply finer-grained PAN")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
The commit 52e7b1a ("core: use user-access functions in ldelf
interaction") start to use a bounce buffer to initialize the argument
for LDELF_DL_ENTRY_DLSYM. However, it also reads the result of
LDELF_DL_ENTRY_DLSYM from the bounce buffer. This is an error since the
result of LDELF_DL_ENTRY_DLSYM still remains on the stack used by ldelf.
So fix this by reading the result from the ldelf stack.

Fixes: 52e7b1a ("core: use user-access functions in ldelf interaction")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Replaces direct user memory accesses in the socket PTA with
copy_to_user() and copy_from_user(). This avoids PAN errors when PAN is
active.

Fixes: 4e15432 ("core: Apply finer-grained PAN")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Fixes a direct user memory access in syscall_storage_obj_rename() which
can lead to a data abort if PAN is enabled.

Fixes: 84f7897 ("core: use user-access functions for storage svc")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Updates the create(), read(), and write() function pointers in struct
ts_store_ops to take a user space buffer in addition to the previous
core buffer. Core buffers are normal secure memory while user space
buffers should only be accessed using the user_access.h functions.

The different FS storage implementations are updated accordingly.

Note that the RPMB FS storage implementation resorts to using
enter_user_access() and exit_user_access() due to internal complexities.

Fixes: 4e15432 ("core: Apply finer-grained PAN")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Adds the wrapper function read_pan() to read PSTATE.PAN, also adds a
SPSR_64_PAN define.

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
Adds the helper function feat_pan_implemented() to extract the
implemented PAN version. No version is 0 so this function can be used
tested as a boolean too.

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
When setup_unwind_user_mode() prepares to resume execution after
syscall_sys_return() or when a thread is suspended a new SPSR is
fabricated base on the current PSTATE.

Until now when remaining in S-EL1 to fabricate an SPSR only the
PSTATE.DAIF bits had to be taken into account. However, with PSTATE.PAN
there's yet another bit to consider. Since PSTATE has a few more bits
and more may be added as AArch64 evolves this problem is only going to
get worse. So implement this in a single internal C function to replace
current open codes C and assembly versions.

The AArch64 assembly versions of thread_rpc() are renamed to
thread_rpc_spsr() to indicate that SPSR is passed in the second argument
instead of having it open coded internally in the assembly function.

New C wrapper functions are added to preserve the old thread_rpc()
interface as needed.

handle_user_mode_panic() is still basing its created SPSR on the saved
SPSR from S-EL0, but now PAN bit is copied too.

Fixes: 6fa59c9 ("arm64: Introduce permissive PAN implementation")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org>
@jenswi-linaro
Copy link
Contributor Author

Comments addressed, tag applied.

@jforissier
Copy link
Contributor

@gagachang @seonghp any comment?

@gagachang
Copy link
Contributor

@gagachang @seonghp any comment?

LGTM, I tested this PR on RISC-V and worked well.

@seonghp
Copy link
Contributor

seonghp commented Aug 2, 2023

LGTM. Thanks for your work!

@jforissier jforissier merged commit 0e84f8a into OP-TEE:master Aug 3, 2023
6 checks passed
@jenswi-linaro jenswi-linaro deleted the pan_fix branch August 4, 2023 06:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants