4.4: Scheduler
We have already learned a lot about the inner workings of the Linux scheduler, so there is not much left to cover. To complete the picture, in this chapter we will take a look at 2 important scheduler entry points:
- scheduler_tick() function, which is called at each timer interrupt.
- schedule() function, which is called each time the current task needs to be rescheduled.
The third major thing that we are going to investigate in this chapter is the concept of a context switch. A context switch is the process that suspends the current task and runs another task instead. This process is highly architecture-specific and closely mirrors what we have been doing when working with RPi OS.
scheduler_tick
This function is important for 2 reasons:
- It provides a way for the scheduler to update time statistics and runtime information for the current task.
- The runtime information is then used to determine whether the current task needs to be preempted, and if so, the schedule() function is called.
Like most of the previously explored functions, scheduler_tick is too complex to be fully explained. Instead, as usual, I will just highlight the most important parts.
- The main work is done inside the CFS method task_tick_fair. This method calls entity_tick for the sched_entity corresponding to the current task. When looking at the source code, you may wonder why the for_each_sched_entity macro is used instead of just calling entity_tick for the current sched_entity. The macro doesn't iterate over all sched_entity instances in the system; it only traverses the sched_entity hierarchy from the current entity up to the root. This is useful when tasks are grouped: after the runtime information for a particular task is updated, the sched_entity corresponding to the whole group is updated as well.
- entity_tick does 2 main things:
  - Calls update_curr, which is responsible for updating the task's vruntime as well as the runqueue's min_vruntime. An important thing to remember here is that vruntime is always based on 2 things: how long the task has actually been executing and the task's priority.
  - Calls check_preempt_tick, which checks whether the current task needs to be preempted. Preemption happens in 2 cases:
    - If the current task has been running for too long (the comparison is made using normal time, not vruntime).
    - If there is a task with a smaller vruntime and the difference between the vruntime values is greater than some threshold.
    In both cases the current task is marked for preemption by calling the resched_curr function.
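The upward walk and the weighted vruntime update described above can be sketched in plain C. This is only a toy model, not the kernel code: the toy_entity structure, the toy_ function names and the TOY_NICE_0_WEIGHT constant are hypothetical names introduced for illustration, and the arithmetic is a simplified version of what the kernel does with fixed-point math.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the kernel's sched_entity. */
struct toy_entity {
    struct toy_entity *parent; /* group entity one level up, NULL at the root */
    uint64_t weight;           /* derived from priority: larger = higher priority */
    uint64_t vruntime;         /* weighted runtime in nanoseconds */
};

/* Reference weight of a default-priority task (an assumption of this sketch). */
#define TOY_NICE_0_WEIGHT 1024u

/*
 * Advance vruntime by the real execution time scaled by the weight.
 * A heavier (higher-priority) entity accumulates vruntime more slowly.
 * The sketch ignores overflow for very large deltas.
 */
static void toy_update_curr(struct toy_entity *se, uint64_t delta_exec_ns)
{
    assert(se->weight > 0);
    se->vruntime += delta_exec_ns * TOY_NICE_0_WEIGHT / se->weight;
}

/*
 * Walk from the task's own entity up to the root, in the spirit of
 * for_each_sched_entity, and update every entity on the way.
 * Returns the number of entities visited.
 */
static int toy_task_tick(struct toy_entity *se, uint64_t delta_exec_ns)
{
    int visited = 0;

    for (; se != NULL; se = se->parent) {
        toy_update_curr(se, delta_exec_ns);
        visited++;
    }
    return visited;
}
```

With a task of weight 2048 inside a group of weight 1024, one millisecond of real execution advances the task's vruntime by half a millisecond and the group's vruntime by a full millisecond, and only those two entities are touched.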
We have already seen in the previous chapter how calling resched_curr leads to the TIF_NEED_RESCHED flag being set for the current task and, eventually, to schedule being called.
That's it for scheduler_tick. Now we are finally ready to take a look at the schedule function.
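To make the two preemption conditions and the role of the flag more concrete, here is a minimal sketch in C. It is not the kernel implementation: the toy_task structure, the need_resched field (standing in for TIF_NEED_RESCHED), the function names and the explicit slice and threshold parameters are all assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task model; need_resched stands in for TIF_NEED_RESCHED. */
struct toy_task {
    uint64_t vruntime;       /* weighted runtime in nanoseconds */
    uint64_t slice_start_ns; /* real time at which the task got the CPU */
    bool need_resched;
};

/*
 * Mark curr for preemption if it has run too long in real time, or if the
 * leftmost waiting task is behind it in vruntime by more than a threshold.
 * Returns true when the task was marked.
 */
static bool toy_check_preempt_tick(struct toy_task *curr,
                                   const struct toy_task *leftmost,
                                   uint64_t now_ns,
                                   uint64_t ideal_slice_ns,
                                   uint64_t threshold_ns)
{
    assert(curr != NULL);

    /* Case 1: compared in real time, not in vruntime. */
    if (now_ns - curr->slice_start_ns > ideal_slice_ns) {
        curr->need_resched = true;
        return true;
    }

    /* Case 2: another task has a sufficiently smaller vruntime. */
    if (leftmost != NULL &&
        curr->vruntime > leftmost->vruntime &&
        curr->vruntime - leftmost->vruntime > threshold_ns) {
        curr->need_resched = true;
        return true;
    }
    return false;
}

/*
 * Called at a point where rescheduling is allowed. Returns true if the
 * scheduler should run now; the real kernel would call schedule() here.
 */
static bool toy_resched_point(struct toy_task *curr)
{
    if (!curr->need_resched)
        return false;
    curr->need_resched = false;
    return true;
}
```

Note that the tick handler in this sketch only sets a flag. The actual switch is deferred until a later point checks the flag, which matches the "eventually" in the paragraph above.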
schedule
We have already seen so many examples where schedule is used, so you are probably eager to see how this function actually works. You may be surprised to learn that the internals of this function are rather simple.
- The main work is done inside the __schedule function.
- __schedule calls pick_next_task, which redirects most of the work to the pick_next_task_fair method of the CFS scheduler.
- As you might expect, pick_next_task_fair in the normal case just selects the leftmost element from the red-black tree and returns it.
- __schedule calls context_switch, which does some preparation work and calls the architecture-specific __switch_to function, where low-level, arch-specific task parameters are prepared for the switch.
- __switch_to first switches some additional task components, such as TLS (Thread-Local Storage) and the saved floating-point and NEON registers.
- The actual switch takes place in the assembler cpu_switch_to function. This function should already be familiar to you because I copied it almost without any changes to the RPi OS. As you might remember, this function switches callee-saved registers and the task stack. After it returns, the new task will be running using its own kernel stack.
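The overall shape of this flow (pick the task with the smallest vruntime, then switch to it if it differs from the current one) can be illustrated with a small C sketch. Everything here is hypothetical: the toy_runnable structure and function names are invented for the example, a linear scan over an array replaces the red-black tree's leftmost node, and updating a pointer replaces the real register and stack switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runnable task: just an id and its vruntime. */
struct toy_runnable {
    int pid;
    uint64_t vruntime;
};

/*
 * Stand-in for taking the leftmost node of the red-black tree:
 * scan the array and return the task with the smallest vruntime,
 * or NULL if nothing is runnable.
 */
static struct toy_runnable *toy_pick_next(struct toy_runnable *tasks, size_t n)
{
    struct toy_runnable *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || tasks[i].vruntime < best->vruntime)
            best = &tasks[i];
    }
    return best;
}

/*
 * Stand-in for the schedule flow: pick the next task and "switch" to it.
 * Returns true if a switch happened. The real kernel would call
 * context_switch at the marked line instead of updating a pointer.
 */
static bool toy_schedule(struct toy_runnable **current,
                         struct toy_runnable *tasks, size_t n)
{
    assert(current != NULL);

    struct toy_runnable *next = toy_pick_next(tasks, n);

    if (next == NULL || next == *current)
        return false;
    *current = next; /* real kernel: context_switch, then cpu_switch_to */
    return true;
}
```

The linear scan is chosen only to keep the sketch short. The red-black tree used by CFS keeps tasks ordered by vruntime, so the leftmost element is found without scanning every task.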
Conclusion
Now we are done with the Linux scheduler. The good thing is that it turns out to be not so difficult if you focus only on the very basic workflow. Once you understand the basic workflow, you might want to make another pass through the scheduler code and pay more attention to the details, because there are so many of them. But for now, we are happy with our current understanding and ready to move to the following lesson, which describes user processes and system calls.