TL;DR
On multiprocessor/multicore machines, several real-time SCHED_FIFO threads may be scheduled at the same time, one per execution unit. So a thread with priority 60 and a thread with priority 40 may run simultaneously on 2 different cores.
This may be counter-intuitive, especially when simulating embedded systems that (often, even today) run on single-core processors and rely on strict priority execution.
See my other answer in this post for a summary.
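If the simulation needs the strict single-core behaviour described above, one option is to confine the process to a single CPU so that two SCHED_FIFO threads can never run in parallel. The following is only a minimal sketch under that assumption, not part of the original MCVE: pin_to_first_allowed_cpu and allowed_cpu_count are hypothetical helper names, built on the Linux-specific sched_getaffinity/sched_setaffinity calls.

```cpp
#include <sched.h>   // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)

// Hypothetical helper: restrict the calling thread to the first CPU it is
// currently allowed to run on. Threads created afterwards inherit this mask.
// Returns the chosen CPU index, or -1 on failure.
int pin_to_first_allowed_cpu()
{
    cpu_set_t current;
    CPU_ZERO(&current);
    if (sched_getaffinity(0, sizeof(current), &current) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &current)) {
            cpu_set_t single;
            CPU_ZERO(&single);
            CPU_SET(cpu, &single);
            if (sched_setaffinity(0, sizeof(single), &single) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}

// Hypothetical helper: number of CPUs the calling thread may run on, or -1.
int allowed_cpu_count()
{
    cpu_set_t current;
    CPU_ZERO(&current);
    if (sched_getaffinity(0, sizeof(current), &current) != 0)
        return -1;
    return CPU_COUNT(&current);
}
```

Calling pin_to_first_allowed_cpu() at the top of main, before any std::thread is created, should make all threads share one core. Launching the unmodified binary with taskset -c 0 ./main is a similar approach that needs no code change.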
Original problem description
Even with very simple code, I have difficulties making Linux respect the priority of my threads with scheduling policy SCHED_FIFO.
- See MCVE at the end of the question.
- See modified MCVE in answer
This situation comes from the need to simulate embedded code on a Linux PC in order to perform integration tests.
The main thread with fifo priority 10 will launch the threads divisor and ratio. The divisor thread should get priority 2 so that the ratio thread with priority 1 will not evaluate a/b before b gets a decent value (this is a completely hypothetical scenario only for the MCVE, not a real-life case with semaphores or condition variables).
Potential prerequisite: you need to be root or, BETTER, to setcap the program so that it can change the scheduling policy and priority
sudo setcap cap_sys_nice+ep main
johndoe@VirtualBox:~/Code/gdb_sched_fifo$ getcap main
main = cap_sys_nice+ep
First experiments were done under a VirtualBox environment with 2 vCPUs (gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git), where code behaviour was almost OK under normal execution but NOK under GDB.
Other experiments on native Ubuntu 20.04 show very frequent NOK behaviours even in normal execution with an I3-1005 2C/4T (gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, GNU gdb (Ubuntu 9.1-0ubuntu1) 9.1).
Compile basically:
johndoe@VirtualBox:~/Code/gdb_sched_fifo$ g++ main.cc -o main -pthread
Normal execution sometimes OK sometimes not if no root or no setcap
johndoe@VirtualBox:~/Code/gdb_sched_fifo$ ./main
Problem with setschedparam: Operation not permitted(1) <<-- err msg if no root or setcap
Result: 0.333333 or Result: Inf <<-- 1/3 or div by 0
Normal execution OK (e.g with setcap )
johndoe@VirtualBox:~/Code/gdb_sched_fifo$ ./main
Result: 0.333333
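The "Operation not permitted" case above is easier to diagnose when every thread checks the return value of pthread_setschedparam. Below is a minimal sketch of such a check; set_fifo_priority is a hypothetical helper name, not something from the MCVE. It relies on the fact that pthread functions return the error number directly instead of setting errno.

```cpp
#include <pthread.h>  // pthread_setschedparam, pthread_self
#include <sched.h>    // SCHED_FIFO, sched_param, sched_get_priority_min/max
#include <cerrno>     // EINVAL
#include <cstring>    // std::strerror
#include <iostream>

// Hypothetical helper: switch the calling thread to SCHED_FIFO at `priority`.
// Returns 0 on success, or the error number reported by pthread_setschedparam
// (typically EPERM when the process lacks root or CAP_SYS_NICE).
int set_fifo_priority(int priority)
{
    // Reject values outside the range the kernel accepts for SCHED_FIFO.
    const int lo = sched_get_priority_min(SCHED_FIFO);
    const int hi = sched_get_priority_max(SCHED_FIFO);
    if (priority < lo || priority > hi)
        return EINVAL;

    sched_param param{};
    param.sched_priority = priority;
    const int ret = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (ret != 0)
        std::cerr << "Problem with setschedparam: " << std::strerror(ret)
                  << '(' << ret << ")\n";
    return ret;
}
```

Each thread function could then start with something like if (set_fifo_priority(2) != 0) return; so that a missing capability stops the test instead of silently producing a wrong result.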
Now if you want to debug this program, you again get the error message.
(gdb) run
Starting program: /home/johndoe/Code/gdb_sched_fifo/main
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7f929a6a9700 (LWP 2633)]
Problem with setschedparam: Operation not permitted(1) <<--- ERROR MSG
Result: inf <<--- DIV BY 0
[New Thread 0x7f9299ea8700 (LWP 2634)]
[Thread 0x7f929a6a9700 (LWP 2633) exited]
[Thread 0x7f9299ea8700 (LWP 2634) exited]
[Inferior 1 (process 2629) exited normally]
This is explained in this question: gdb appears to ignore executable capabilities (almost all answers may be relevant).
So in my case I did:
- sudo setcap cap_sys_nice+ep /usr/bin/gdb
- create a ~/.gdbinit with
set startup-with-shell off
And as a result I got:
(gdb) run
Starting program: /home/johndoe/Code/gdb_sched_fifo/main
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff6e85700 (LWP 2691)]
Result: inf <<-- NO ERR MSG but DIV BY 0
[New Thread 0x7ffff6684700 (LWP 2692)]
[Thread 0x7ffff6e85700 (LWP 2691) exited]
[Thread 0x7ffff6684700 (LWP 2692) exited]
[Inferior 1 (process 2687) exited normally]
(gdb)
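Since this run prints no error yet still divides by zero, it can help to confirm that the policy and priority were really applied to each thread before suspecting the scheduler. A minimal sketch, assuming a hypothetical helper named describe_current_scheduling that reads the values back with pthread_getschedparam:

```cpp
#include <pthread.h>  // pthread_getschedparam, pthread_self
#include <sched.h>    // SCHED_FIFO, SCHED_RR, SCHED_OTHER, sched_param
#include <string>

// Hypothetical helper: report the calling thread's scheduling policy and
// priority as text, or "unknown" if the values cannot be read.
std::string describe_current_scheduling()
{
    int policy = 0;
    sched_param param{};
    if (pthread_getschedparam(pthread_self(), &policy, &param) != 0)
        return "unknown";

    std::string name;
    switch (policy) {
        case SCHED_FIFO:  name = "SCHED_FIFO";  break;
        case SCHED_RR:    name = "SCHED_RR";    break;
        case SCHED_OTHER: name = "SCHED_OTHER"; break;
        default:          name = "policy " + std::to_string(policy); break;
    }
    return name + " priority " + std::to_string(param.sched_priority);
}
```

Printing this string from ratio and divisor right after their pthread_setschedparam calls shows whether both threads really run as SCHED_FIFO with priorities 1 and 2. If they do, the wrong result is more likely due to the threads running on different cores than to a failed priority change.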
So conclusion and question
- I thought the only problem came from GDB
- Testing on another (non-virtual) target showed even worse results under normal execution
I saw other questions about RT SCHED_FIFO priorities not being respected, but I find that their answers reach no conclusion or an unclear one. My MCVE is also much smaller, with fewer potential side effects:
Linux SCHED_FIFO not respecting thread priorities
SCHED_FIFO higher priority thread is getting preempted by the SCHED_FIFO lower priority thread?
Comments brought some elements of an answer, but I am still not convinced ... ( ... it should work like this )
The MCVE:
#include <chrono>
#include <cstring>
#include <iostream>
#include <thread>
#include <pthread.h>
#include <sched.h>

double a = 1.0;
double b = 0.0;

// Lowest-priority thread: expected to run only after divisor has set b.
void ratio(void)
{
    struct sched_param param;
    param.sched_priority = 1;
    // pthread functions return the error number directly; they do not set errno.
    int ret = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if ( 0 != ret )
        std::cout << "Problem with setschedparam: " << std::strerror(ret) << '(' << ret << ')' << "\n" << std::flush;
    std::cout << "Result: " << a/b << "\n" << std::flush;
}

// Higher-priority thread: gives b a non-zero value, then sleeps.
void divisor(void)
{
    struct sched_param param;
    param.sched_priority = 2;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    b = 3.0;
    std::this_thread::sleep_for(std::chrono::milliseconds(2000u));
}

int main(int argc, char * argv[])
{
    // The main thread gets the highest priority and launches both workers.
    struct sched_param param;
    param.sched_priority = 10;
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    std::thread thr_ratio(ratio);
    std::thread thr_divisor(divisor);
    thr_ratio.join();
    thr_divisor.join();
    return 0;
}