Linux Kernel
Process Scheduling
9662541 張文軒
9665510 許晉榮
9665531 林奕翔
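What is scheduling??
Simply put, it means deciding the order in which processes run
Why arrange an order at all?
So that many, many programs can run at the same time
So that every process runs smoothly and responds quickly
Processes must not keep cutting in line
Everyone must be treated fairly
Fair, yes, but some things are still more urgent than others
What method does Linux process scheduling use?
Just one word
Priority
What are the secret weapons of Linux process scheduling?
Process preemption
Heuristic algorithm
New features in 2.6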
[Figure: Scheduler v2.4 timeline: User mode, Kernel mode, User mode]
[Figure: Scheduler v2.6 timeline: User mode, Kernel mode, User mode, Kernel mode]
[Figure: Process A and Process B under Scheduler v2.4 and Scheduler v2.6]
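Why let another process cut in line?
Sometimes there is no choice
The bigger picture comes first
How does a process cut in line?
The process must of course be runnable
TASK_RUNNING
There must be a secret signal
TIF_NEED_RESCHED
When does a process have to let another one cut in?
The boss shows up
A higher-priority process
The seat has been occupied for too long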
Exceeds time quantum
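The two triggers above can be sketched in C. This is a toy model rather than kernel code: `struct toy_task`, `toy_wakeup_check()` and `toy_tick()` are hypothetical names, and a plain `need_resched` field stands in for the TIF_NEED_RESCHED flag. Lower numbers are assumed to mean higher priority.

```c
#include <stdbool.h>

/* Toy task: only the fields needed to illustrate preemption (hypothetical). */
struct toy_task {
    int prio;           /* dynamic priority, lower value = higher priority */
    int time_slice;     /* ticks left in the current time quantum */
    bool need_resched;  /* stands in for TIF_NEED_RESCHED */
};

/* Trigger 1: a process that just became runnable has higher priority. */
void toy_wakeup_check(struct toy_task *curr, const struct toy_task *woken)
{
    if (woken->prio < curr->prio)
        curr->need_resched = true;
}

/* Trigger 2: called on every timer tick; the time quantum may run out. */
void toy_tick(struct toy_task *curr)
{
    if (curr->time_slice > 0)
        curr->time_slice--;
    if (curr->time_slice == 0)
        curr->need_resched = true;
}
```

In both cases the flag only requests a reschedule; the switch itself happens the next time the scheduler is invoked.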
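What is a time quantum?
It means dividing CPU time into slices
Why divide time into slices?
Three words
Following the trend
Into how many slices should time be divided?
And how long should one slice be?
This must be NP-complete, right!?
What happens if a slice is too short?
The system wears itself out (too much process-switch overhead)
What happens if a slice is too long?
Processes wait a long time (poor response)
So what should we do?
Let's look at how Linux kernel 2.6 does it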
Description              Static priority  Nice value  Base time quantum  Interactive delta  Sleep time threshold
Highest static priority  100              -20         800 ms             -3                 299 ms
High static priority     110              -10         600 ms             -1                 499 ms
Default static priority  120              0           100 ms             +2                 799 ms
Low static priority      130              +10         50 ms              +4                 999 ms
Lowest static priority   139              +19         5 ms               +6                 1199 ms
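Does this table mean anything?
It lets us compute the base time quantum
Base time quantum (in ms) =
1) if static priority < 120: (140 - static priority) * 20
2) if static priority >= 120: (140 - static priority) * 5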
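As a quick check, the base time quantum rule can be written as a small C function. This sketch assumes the two-case form of the formula (multiply by 20 below static priority 120, by 5 from 120 upward), which reproduces every row of the table above; `base_time_quantum_ms()` is a hypothetical name, not a kernel function.

```c
/* Base time quantum in milliseconds for a conventional process.
 * static_prio is expected to be in the range 100..139. */
int base_time_quantum_ms(int static_prio)
{
    if (static_prio < 120)
        return (140 - static_prio) * 20;
    return (140 - static_prio) * 5;
}
```

For example, the default static priority 120 gives (140 - 120) * 5 = 100 ms, and the highest static priority 100 gives (140 - 100) * 20 = 800 ms.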
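What is static priority?
It is what distinguishes the priorities of conventional processes
Since there is a static priority, is there also a dynamic priority?
Of course, and it is what the scheduler actually bases its scheduling decisions on
Dynamic priority = max(100, min(static priority - bonus + 5, 139))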
Average sleep time   Bonus
0 ms ~ 100 ms        0
100 ms ~ 200 ms      1
200 ms ~ 300 ms      2
300 ms ~ 400 ms      3
400 ms ~ 500 ms      4
500 ms ~ 600 ms      5
600 ms ~ 700 ms      6
700 ms ~ 800 ms      7
800 ms ~ 900 ms      8
900 ms ~ 1000 ms     9
1 second             10
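The bonus table and the dynamic priority formula combine as in the following C sketch. It assumes each range in the table includes its lower bound and excludes its upper bound (so exactly 100 ms already earns bonus 1); `sleep_bonus()` and `dynamic_priority()` are hypothetical names used only for illustration.

```c
/* Bonus (0..10) from the average sleep time: one step per 100 ms,
 * capped at 10 once the average reaches 1 second. */
int sleep_bonus(unsigned long avg_sleep_ms)
{
    unsigned long bonus = avg_sleep_ms / 100;
    return bonus > 10 ? 10 : (int)bonus;
}

/* dynamic priority = max(100, min(static priority - bonus + 5, 139)) */
int dynamic_priority(int static_prio, unsigned long avg_sleep_ms)
{
    int prio = static_prio - sleep_bonus(avg_sleep_ms) + 5;
    if (prio > 139)
        prio = 139;
    if (prio < 100)
        prio = 100;
    return prio;
}
```

A process that never sleeps is penalized by 5 levels, while one that sleeps a full second on average gains 5 levels, always within the range 100 to 139.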
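What is the trick behind this table?
A process is considered interactive if
dynamic priority <= 3 * static priority / 4 + 28
which is equivalent to
bonus - 5 >= static priority / 4 - 28
static priority / 4 - 28 is called the interactive delta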
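The interactive delta can be checked against the earlier table with a short C sketch. It assumes the interactivity test from Understanding the Linux Kernel (a process is interactive when bonus - 5 >= static priority / 4 - 28) and relies on C integer division, which reproduces the interactive delta column (-3, -1, +2, +4, +6). The function names are hypothetical.

```c
#include <stdbool.h>

/* Interactive delta: static priority / 4 - 28 (integer division). */
int interactive_delta(int static_prio)
{
    return static_prio / 4 - 28;
}

/* A process counts as interactive when bonus - 5 >= interactive delta. */
bool is_interactive(int static_prio, int bonus)
{
    return bonus - 5 >= interactive_delta(static_prio);
}
```

With the default static priority 120 the delta is +2, so a process needs a bonus of at least 7 to be treated as interactive.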
Why determine whether a process is interactive?
In general, processes can be classified as
Interactive process
Batch process
Real-time process
CPU-bound or I/O-bound process
Classifying a process correctly helps the scheduler
So how is scheduling actually implemented?
Process descriptor fields related to the scheduler
Type            Name                Description
unsigned long   thread_info->flags  Stores the TIF_NEED_RESCHED flag, which is set if the scheduler must be invoked.
unsigned int    thread_info->cpu    Logical number of the CPU owning the runqueue to which the runnable process belongs.
unsigned long   sleep_avg           Average sleep time of the process.
int             prio                Dynamic priority of the process.
int             static_prio         Static priority of the process.
prio_array_t *  array               Pointer to the runqueue's prio_array_t set that includes the process.
unsigned long   policy              The scheduling class of the process.
unsigned int    time_slice          Ticks left in the time quantum of the process.
unsigned long   rt_priority         Real-time priority of the process.
Runqueue fields
Type              Name                Description
spinlock_t        lock                Spin lock protecting the lists of processes.
unsigned long     nr_running          Number of runnable processes in the runqueue lists.
unsigned long     nr_switches         Number of process switches performed by the CPU.
unsigned long     nr_uninterruptible  Number of processes that were previously in the runqueue lists and are now sleeping in TASK_UNINTERRUPTIBLE state.
unsigned long     expired_timestamp   Insertion time of the oldest process in the expired lists.
task_t *          curr                Process descriptor pointer of the currently running process.
task_t *          idle                Process descriptor pointer of the swapper process for this CPU.
prio_array_t *    active              Pointer to the lists of active processes.
prio_array_t *    expired             Pointer to the lists of expired processes.
prio_array_t [2]  arrays              The two sets of active and expired processes.
atomic_t          nr_iowait           Number of processes that were previously in the runqueue lists and are now waiting for a disk I/O operation to complete.
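The relationship between `active`, `expired` and `arrays` can be sketched as follows. This is a simplified model, not the kernel's definition: the `toy_` types and functions are hypothetical, and each set only counts its tasks instead of holding real lists. The idea it illustrates is that a process that has used up its time quantum moves to the expired set, and once the active set is empty the two pointers simply exchange roles.

```c
#define TOY_MAX_PRIO 140

/* Toy stand-in for prio_array_t: only counts tasks per priority. */
struct toy_prio_array {
    unsigned int nr_active;                /* tasks in this set */
    unsigned int queue_len[TOY_MAX_PRIO];  /* tasks queued per priority */
};

/* Toy stand-in for the runqueue: two sets plus pointers naming their roles. */
struct toy_runqueue {
    struct toy_prio_array *active;
    struct toy_prio_array *expired;
    struct toy_prio_array arrays[2];
};

void toy_rq_init(struct toy_runqueue *rq)
{
    for (int i = 0; i < 2; i++) {
        rq->arrays[i].nr_active = 0;
        for (int p = 0; p < TOY_MAX_PRIO; p++)
            rq->arrays[i].queue_len[p] = 0;
    }
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* When no active task is left, the two sets exchange roles in O(1). */
void toy_rq_switch_if_empty(struct toy_runqueue *rq)
{
    if (rq->active->nr_active == 0) {
        struct toy_prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
}
```

Because only two pointers are swapped, starting a new round costs the same no matter how many processes are waiting.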
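What is the active process list?
And what is the expired process list?
Before explaining, let's first look at how scheduling differs between kernel 2.4 and 2.6!
What is the point of splitting into active and expired process lists?
The algorithm that selects the next process has a large impact on scheduling efficiency
In kernel 2.4, selecting the next process to run is O(n)
In kernel 2.6, selecting the next process to run is O(1)
How does 2.6 pull off something this impressive?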
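[Figure: Bit Map with one bit per priority (bit 0 = priority 0, bit 2 = priority 2); a set bit points to the queue of runnable tasks for that priority, for example the queue of runnable tasks for priority 2. schedule() calls sched_find_first_bit() on the bit map.]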
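The bitmap lookup can be illustrated with a portable C sketch. The real sched_find_first_bit() is architecture-specific; the version below is only an illustration under stated assumptions: 140 priority levels, 32-bit words, and hypothetical `toy_` names. The point is that the search cost depends on the fixed number of priority levels, not on the number of runnable tasks.

```c
#include <stdint.h>

#define TOY_MAX_PRIO 140
#define TOY_WORDS ((TOY_MAX_PRIO + 31) / 32)

/* One bit per priority level; bit p is set when queue p is non-empty. */
struct toy_bitmap {
    uint32_t word[TOY_WORDS];
};

void toy_set_prio(struct toy_bitmap *bm, int prio)
{
    bm->word[prio / 32] |= (uint32_t)1 << (prio % 32);
}

void toy_clear_prio(struct toy_bitmap *bm, int prio)
{
    bm->word[prio / 32] &= ~((uint32_t)1 << (prio % 32));
}

/* Returns the lowest set bit (= highest priority), or TOY_MAX_PRIO if
 * no bit is set. The work is bounded by the bitmap size (5 words),
 * independent of how many tasks are runnable. */
int toy_find_first_bit(const struct toy_bitmap *bm)
{
    for (int w = 0; w < TOY_WORDS; w++) {
        uint32_t v = bm->word[w];
        if (v != 0) {
            int bit = 0;
            while ((v & 1u) == 0) {
                v >>= 1;
                bit++;
            }
            return w * 32 + bit;
        }
    }
    return TOY_MAX_PRIO;
}
```

Once the first set bit is known, the scheduler can take the first task from that priority's queue without looking at any other task.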
Why can't 2.4 reach this level?
1. Initialize some local variables
2. Release the kernel lock and get the lock of the run queue
3. Set the status of the previous process
4. Calculate the goodness of each process and get the next process to run
   - All processes have used up their time quantum: go to step 5
   - A different process is selected: go to step 6
   - The same process is selected: go to step 7
5. Start a new epoch and reset the time quantum, then repeat step 4
6. Perform context switching
7. Return
goodness(p, this_cpu, prev->active_mm)
Recomputing it for every process on every call is of course slow
O(N)
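Why the 2.4 approach is linear can be seen in a minimal C sketch. `toy_goodness()` below is a placeholder weight, not the kernel's goodness() formula, and `struct toy_task24` and `toy_pick_next()` are hypothetical names; the only point is that every runnable task is visited on every scheduling decision.

```c
#include <stddef.h>

/* Toy task for a 2.4-style scan (hypothetical fields). */
struct toy_task24 {
    int counter;      /* ticks left in the current epoch */
    int nice_weight;  /* placeholder for the priority contribution */
};

/* Placeholder weight: 0 means the time quantum is used up. */
int toy_goodness(const struct toy_task24 *t)
{
    if (t->counter == 0)
        return 0;
    return t->counter + t->nice_weight;
}

/* Visits every runnable task, so the cost grows linearly with n.
 * Returns NULL when every quantum is exhausted (start a new epoch). */
const struct toy_task24 *toy_pick_next(const struct toy_task24 *tasks, size_t n)
{
    const struct toy_task24 *best = NULL;
    int best_weight = 0;
    for (size_t i = 0; i < n; i++) {
        int w = toy_goodness(&tasks[i]);
        if (w > best_weight) {
            best_weight = w;
            best = &tasks[i];
        }
    }
    return best;
}
```

Doubling the number of runnable tasks doubles the work of each call, which is exactly what the bitmap approach in 2.6 avoids.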
So what exactly did 2.6 improve?
New ultra-scalable O(1) scheduler, designed by Ingo Molnar
Preemptible kernel bits, designed by Robert Love
Heuristic algorithm
Multiprocessor support
Do these improvements really work? Which parts actually got better?
More efficient
More scalable
After all this talk, why haven't we started tracing code yet??
Linux kernel 2.6.24
include/linux/sched.h
kernel/sched.c
How does the scheduler start?
kernel/sched.c
What is the core of scheduling?
Some important related functions
Computing process priority & time slice
kernel/sched.c
How a sleeping process is woken up to run
kernel/sched.c
Whew, after all that, does kernel 2.6 have any other special features?
Of course: much better support for multiprocessors!!
Some functions related to both scheduling and multiprocessors
kernel/sched.c
After all that, it is time to wrap up
Greatly improves on the performance of kernel 2.4
New ultra-scalable O(1) scheduler
Preemptible kernel bits
Heuristic algorithm
Real-time support
Better multiprocessor support
Reference
• Understanding the Linux Kernel, 3rd Edition, Daniel P. Bovet, Marco Cesati
• Linux 2.4/2.6 核心排程機制剖析 (Analysis of the Linux 2.4/2.6 kernel scheduling mechanism)
  http://loda.zhupiter.com/Linux%202.4-2.6%20KernelSchedulandRealTimeSupport_2.1.htm
• Linux 2.6調度系統分析 (Analysis of the Linux 2.6 scheduling system)
  http://blog.chinaunix.net/u/7270/showart_300343.html
• Previous PowerPoint slides
Q&A