FAQ Style Article

Linux Scheduler – CFS and Red Black Tree

In Linux Scheduler, Work In Progress on June 8, 2012 at 1:56 pm

This article explains how Red Black Tree data structure is used in Linux Scheduler.

What is a Red Black Tree?
It is a type of self-balancing binary search tree, a data structure used in computer science, typically to implement associative arrays.

What is binary tree?
A tree of nodes, where each node has two children – left and right, and one parent.

What are the properties of RB tree?
(a) It is self balancing: no path in the tree will ever be more than twice as long as any other.
(b) Operations (insertion, search and delete) occur in O(log n) time (where n is the number of nodes in the tree)

Where is it used in Linux?
It is used by CFS scheduler implementation and many other places.

How does CFS use RB tree?
To represent tasks in tree and to find out which task to run next.
Each task is stored in RB tree based on its virtual run time(vruntime).
Left most node in tree will be one with least vruntime.
When CFS needs to pick next task to run, it picks left most node.

Where is RB tree declared and defined in Linux?
linux/rbtree.h – Declarations of the RB Tree
linux/lib/rbtree.c – Implementation of RB tree functions

What are major structures related to RB tree
struct rb_root – RB tree itself
struct rb_node – Node of RB tree

What does rb_root structure contain?
Just single member, pointer to root node.
struct rb_root
{
struct rb_node *rb_node;
};

What does rb_node contain?
Three members
struct rb_node
{
unsigned long rb_parent_color; /* Color of the parent – needed for implementation of RB tree */
struct rb_node *rb_right; /* Right child */
struct rb_node *rb_left; /* Left child */
};

What are the major functions/macros provided by RB tree?
rb_erase(node, tree)
– Delete a node from tree

rb_link_node(node, parent, link)
– Insert a node to either right or left tree as indicated by link.

rb_insert_color(node, tree)
– Rebalance tree, called in conjunction with rb_link_node

rb_entry(node, type, member)
– Return the address of structure of which has set to

rb_next(node)
– Return the node next to

How does CFS use RB tree?
There are two related things
(a) Where and how CFS uses rb_node (RB node structure)?
(b) Where and how CFS uses rb_root (RB tree structure)?

Which CFS structure uses rb_root?
CFS main contain context structure is struct cfs_rq.
This represents CFS run queue (list of tasks that can be run on CPU in simplest case).
Defined in sched.c, cfs_rq contains information like number of tasks, amongst other things.

What is the relation between struct cfs_rq and struct rb_root?
struct cfs_rq contains member of type rb_root;
struct cfs_rq
{

struct rb_root tasks_timeline;
…..
}
cfs_rq.tasks_timeline is the root of RB tree representing runnable tasks in CFS run queue.

Which CFS structure(s) uses rb_node?
rb_node is used by struct cfs_rq and struct sched_entity.

What is struct sched_entity?
CFS declares struct sched_entity to represent scheduleable entity.
In a basic case, a schedulable entity consists of single task’s scheduling information.
Defined in sched.h, sched_entity structure contains weight of the task amongst other information

struct sched_entity contains member of type rb_node.
struct sched_entity
{
…..
struct rb_node run_node;
…..
};
This allows, sched_entity to be treated as if it is of type struct rb_node as far RB functions are concerned.

For what does cfs_rq use rb_node?
struct cfs_rq contains member of type struct rb_root *;
sturct cfs_rq
{
…..
struct rb_node *rb_leftmost;
…..
}
rb_leftmost points to left most mode in RB tree. More details on why this is needed later.

Which are key functions of CFS related to RB tree?

__enqueue_entity()
– Add a scheduling entity to RB tree, based on scheduling entity’s virtual run time.

__dequeue_entity()
– Removes a scheduling entity from RB tree

__pick_first_entity()
– Returns left most scheduling entity of CFS run queue

__pick_next_entity()
– Returns node next to one passed.

More about __enqueue_entity
Defined:
sched_fair.c
Prototype:
static void __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se);
Description:
Based on entity virtual run time(se->vruntime), RB node (se->run_node) will be inserted at right place, inside CFS runqueue tree (cfs_rq->tasks_timeline)
CFS Runqueue’s left most node (cfs_rq->rb_leftmost) is updated, if new node has least vruntime.

More about __dequeue_entity
Defined:
sched_fair.c
Prototype:
static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
Description:
Remove RB node (se->run_node) from CFS runqueue RB tree (cfs_rq->tasks_timeline)
If leftmost is being removed, update cfs_rq->rb_leftmost to next node of removed node.

More about __pick_first_entity
Defined:
sched_fair.c
Prototype:
static struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
Description:
Return scheduling entity of RB tree left most node (cfs_rq->rb_leftmost)

More about __pick_next_entity
Defined:
sched_fair.c
Prototype:
static struct sched_entity *__pick_next_entity(struct sched_entity *se)
Description:
Return scheduling entity of RB node next to one passed (se->run_node)

Leave a comment