CFS scheduler to appear in Linux kernel 2.6.23

The Linux kernel process scheduler as you know it has been ripped out and replaced with an entirely new one called the Completely Fair Scheduler (CFS). How fair it will be remains to be seen, but in the meantime here's what its original creator, Ingo Molnar, has to say on the subject:

80% of CFS's design can be summed up in a single sentence: CFS basically models an "ideal, precise multi-tasking CPU" on real hardware.

"Ideal multi-tasking CPU" is a (non-existent :-)) CPU that has 100% physical power and which can run each task at precise equal speed, in parallel, each at 1/nr_running speed. For example: if there are 2 tasks running then it runs each at 50% physical power - totally in parallel.

On real hardware, we can run only a single task at once, so while that one task runs, the other tasks that are waiting for the CPU are at a disadvantage - the current task gets an unfair amount of CPU time. In CFS this fairness imbalance is expressed and tracked via the per-task p->wait_runtime (nanosec-unit) value. "wait_runtime" is the amount of time the task should now run on the CPU for it to become completely fair and balanced.
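To make the bookkeeping described above a little more concrete, here is a toy model in Python. It is only a sketch under my own assumptions, not the kernel's code: the names `account` and `pick_next` are made up for illustration, and the rule used (the running task is charged for everything beyond its 1/nr_running share, while each waiting task is credited with the share it missed) is a simplification of the quoted description.

```python
def account(wait_runtime, curr, delta_ns):
    """Update per-task balances after task `curr` ran alone for delta_ns.

    wait_runtime is a list of nanosecond balances, one per runnable task.
    On the "ideal" CPU every task would have received delta_ns / nr_running.
    """
    nr_running = len(wait_runtime)
    fair_share = delta_ns // nr_running
    for i in range(nr_running):
        if i == curr:
            # The running task got delta_ns but was only entitled to fair_share.
            wait_runtime[i] -= delta_ns - fair_share
        else:
            # Waiting tasks got nothing, so they are owed their fair share.
            wait_runtime[i] += fair_share


def pick_next(wait_runtime):
    """Return the index of the task that is owed the most CPU time."""
    return max(range(len(wait_runtime)), key=lambda i: wait_runtime[i])


# Two runnable tasks; task 0 runs alone for 10 ms.
wait_runtime = [0, 0]
account(wait_runtime, curr=0, delta_ns=10_000_000)
print(wait_runtime)             # [-5000000, 5000000]
print(pick_next(wait_runtime))  # 1
```

In this model the balances always sum to zero (as long as the slice divides evenly among the tasks), and letting task 1 run for the same 10 ms brings both tasks back to a balance of 0, which is the "completely fair and balanced" state the quote refers to.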

CFS is very young, having been in development for only about three months, but I suppose we should trust Ingo when he says:

This project is a complete rewrite of the Linux task scheduler. My goal is to address various feature requests and to fix deficiencies in the vanilla scheduler that were suggested/found in the past few years, both for desktop scheduling and for server scheduling workloads.

And also, credit where credit's due: Con Kolivas was the first to prove via RSDL/SD that 'fair scheduling' is possible and that it results in better desktop scheduling.

More info:
The Completely Fair Scheduler
Schedulers: the plot thickens

Attachment: sched-design-CFS.txt (5.77 KB)



So yet another scheduler change in a 'stable' kernel series?

Anonymous, are you suggesting that an entirely new unstable series should be created just for this, creating a Linux 2.8? Yes, process scheduling is one of the most fundamental parts of the OS, but with a top/bottom-half design (which Linux has), the policy of the scheduler (how it makes its decisions) can be changed independently of the procedure (how it switches between processes). Changing the policy carries considerably less risk of introducing bugs in the code, and it can be tested rather easily in comparison.

Having worked on a policy/procedure-separated scheduler, and having studied the previous Linux 2.6 scheduling code, I don't think that changing the scheduling policy, even if it does require a few extra fields in the process control block, really warrants a new unstable series. Compared to the changes made in moving from 2.4 to 2.6 (driver models, memory models, and a handful of other critically low-level components), this is quite minor.

Ingo Molnar's the 'original creator' of CFS? Shame, shame, shame.