Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seams that task sheduler handles timeouts incorrectly. #411

Open
asamuta opened this issue Sep 27, 2024 · 3 comments
Open

Seams that task sheduler handles timeouts incorrectly. #411

asamuta opened this issue Sep 27, 2024 · 3 comments
Labels
bug Something isn't working hardware New hardware or architecture support request

Comments

@asamuta
Copy link

asamuta commented Sep 27, 2024

System components timeout issue
Call to system functions like tx_event_flags_get, tx_semaphore_get etc. with timeout != TX_WAIT_FOREVER results in thread wake before timeout in some conditions.

Description
In my program I am using threadx with the netxduo on with STM32H723 processor.
From the high level the program creates server socket and receives commands from the network. There is a command to make a USART transaction and return the response back via the socket. The transaction is just USART send/receive with timeout.
There is "DeviceMaster" thread, it makes USART transactions. USART is configured to transmit/receive data using DMA and USART idle interrupt.

DMA TX ISR handle:

...
tx_event_flags_set(&deviceMaster.eventFlags, DM_WRITE_FLAG, TX_OR);
...

USART ISR:

    /* Check for IDLE line interrupt */
    if (LL_USART_IsEnabledIT_IDLE(UART_INSTANCE) && LL_USART_IsActiveFlag_IDLE(UART_INSTANCE)) {
        LL_USART_ClearFlag_IDLE(UART_INSTANCE);                     /* Clear IDLE line flag */
        ...
        if (tx_event_flags_set(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR) != TX_SUCCESS) {
            Error_Handler();
        }
    }

Transaction processing in the thread pseudo code, simplified:

// Checks if previous send completed
tx_event_flags_get(&deviceMaster.eventFlags, DM_WRITE_FLAG, TX_OR_CLEAR, &flags, 10);
sendDataToUsart();
// wait for remote device response with timeout=10
tx_event_flags_get(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR_CLEAR, &flags, 10);
getDataFromUsart()

Problem
Sometimes call tx_event_flags_get(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR_CLEAR, &flags, 10); takes 1-2 ticks instead of 10 (ends before timeout). I have investigated the behavior a little bit using debug out and TraceX. What I have fonud is that ethernet ISR happened when the "DeviceMaster" thread suspended wakes not only ethernet-related thread but the "DeviceMaster" as well. Attaching TraceX file.

TraceX good/expected flow
Event/Description

1017 - Transaction start marker; start DMA transfer
1018 - check for DM_WRITE_FLAG
1019 - DM_WRITE_FLAG set in USART TX DMA
1020 - Get time to check later (tx_get_time)
1021 - Wait for DM_READ_FLAG flag
1022 -  "DeviceMaster" suspended
...
1024 - DM_READ_FLAG set in the USART RX ISR
1025 - system resumes "DeviceMaster"
1026 - Transaction end marker

TraceX problem flow

1036 - Transaction start marker; start DMA transfer
1037 - check for DM_WRITE_FLAG
1038 - DM_WRITE_FLAG set in USART TX DMA
1039 - Get time to check later (tx_get_time)
1040 - Wait for DM_READ_FLAG flag
1041 - "DeviceMaster" suspended
1042 - "NetX ip instance" has started
...
1047 - set event
1048 - scheduler resumes "NetX ip instance"
1049 - scheduler resumes "DeviceMaster", but it still should wait for the timeout.
1050 - "DeviceMaster" gets console mutex to print timeout error
1051 - "DeviceMaster" puts console mutex back
1052 - "DeviceMaster" gets time to check how many ticks the thread was suspended.
1053 - Custom TraceX event that indicates the error and the number of ticks the thread was suspended = 2 ticks, not 10

Assumptions about root cause
I have taken a quick look at the threadx sources and assume that the following logic in tx_timer_expiration_process.c file probably is not correct. this and this

if (current_timer -> tx_timer_internal_remaining_ticks > TX_TIMER_ENTRIES) 
{
...
} else {
 /* Timer did expire.  */
}                    

Not sure, but looks like in case if tx_timer_internal_remaining_ticks < 32 the system considers the timer is expired. But in my case the timer value is just 10...

Expected behavior
The system scheduler wakes threads when the timeout happened not before.

Impact
Timeouts work incorrectly across the system

TraceX files
CustomEvents.zip
Ticks to microsecond is 544 in this case.

@asamuta asamuta added bug Something isn't working hardware New hardware or architecture support request labels Sep 27, 2024
@asamuta asamuta changed the title Seams that task sheduler incorrectly handles timeouts incorrectly. Seams that task sheduler handles timeouts incorrectly. Sep 27, 2024
@amgross
Copy link

amgross commented Sep 29, 2024

Hi @asamuta ,
The code you showed in threadx timer looks OK, current_timer is timer that expired:

current_timer = expired_timers;

And the check you are showing is to check if it expired due to real timeout or due to 32 ticks finished (in threaedx architecture, if some timer defined to more than TX_TIMER_ENTRIES, it will be in expired state each TX_TIMER_ENTRIES ticks, and then re-trigger till getting to 'real' expiration).

For me (as none threadx expert) it is hard to understand your problematic flow. If you can create minimal reproducible example it may be easier for you and for other users to debug it.

@asamuta
Copy link
Author

asamuta commented Sep 29, 2024

Hi @amgross,
The high level description of the problem:

result = tx_event_flags_get(&deviceMaster.eventFlags, DM_READ_FLAG, TX_OR_CLEAR, &flags, 10);

doesn't really wait 10 tick and wakes the thread after 2 ticks with result == 0x7 witch is

#define TX_NO_EVENTS                    ((UINT) 0x07)

And it hapens when another flag is set while my thread is waiting for its flag. (Ethernet ISR set a flag that is used by netxduo in my case).

@amgross
Copy link

amgross commented Sep 30, 2024

Sorry, without full code with minimal reproducible example (probably main which create one thread that one catch the event flag and the other releases it) I can't help, maybe someone else can

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working hardware New hardware or architecture support request
Projects
None yet
Development

No branches or pull requests

2 participants