Linux Kernel Development reading notes

#Chapter 3: Process Management

#What is a process

Linux has a unique implementation of threads: It does not differentiate between threads and processes. To Linux, a thread is just a special kind of process.

In other words, Linux draws no line between a thread and a process.
A thread is just a special kind of process.

On modern operating systems, processes provide two virtualizations: a virtualized processor and virtual memory. The virtual processor gives the process the illusion that it alone monopolizes the system, despite possibly sharing the processor among hundreds of other processes.

Chapter 4, “Process Scheduling,” discusses this virtualization.

Virtual memory lets the process allocate and manage memory as if it alone owned all the memory in the system.

Virtual memory is covered in Chapter 12, “Memory Management.”

From the process's point of view, it appears to have the CPU and memory all to itself.

#The process descriptor

#What is the process descriptor

The task_struct is a relatively large data structure, at around 1.7 kilobytes on a 32-bit machine. This size, however, is quite small considering that the structure contains all the information that the kernel has and needs about a process. The process descriptor contains the data that describes the executing program—open files, the process’s address space, pending signals, the process’s state, and much more.

#Where the process descriptor is stored

The kernel stores the list of processes in a circular doubly linked list called the task list.

In the kernel, process descriptors are stored in a circular doubly linked list (the task list).
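To make the shape of the task list concrete, here is a minimal userspace sketch of a circular doubly linked list with an embedded list node. The names `list_node`, `task`, `task_of`, `list_init`, `list_add_tail`, and `count_tasks_after` are hypothetical stand-ins, not the kernel's own definitions, and the `task` struct is reduced to a single field.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an embedded list node (hypothetical). */
struct list_node {
    struct list_node *next, *prev;
};

/* Hypothetical, heavily reduced process descriptor. */
struct task {
    int pid;
    struct list_node tasks;   /* links this task into the task list */
};

/* Recover the enclosing struct task from a pointer to its list node. */
#define task_of(node) \
    ((struct task *)((char *)(node) - offsetof(struct task, tasks)))

/* Make a node point at itself: a circular list with one element. */
static void list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'node' just before 'head', i.e. at the tail of the circle. */
static void list_add_tail(struct list_node *node, struct list_node *head)
{
    assert(node && head);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk the circle once, starting after 'first', and count the other tasks. */
static int count_tasks_after(struct task *first)
{
    int n = 0;
    for (struct list_node *p = first->tasks.next; p != &first->tasks; p = p->next)
        n++;
    return n;
}
```

Because the list is circular, a walk that starts at any task and follows `next` eventually comes back to that same task, so the traversal needs no NULL check.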

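The process descriptor (PD) is defined in <linux/sched.h>.
The post "Understanding the Linux Kernel notes (3), Chapter 3: Processes" lists the full definition of the PD.
Where the process descriptor sits within the kernel stack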

The system identifies processes by a unique process identification value or PID.

The default maximum PID value is 32768.

This is controlled in <linux/threads.h>:

/*
* This controls the default maximum pid allocated to a process
*/
#define PID_MAX_DEFAULT 0x8000

/*
* A maximum of 4 million PIDs should be enough for a while:
*/
#define PID_MAX_LIMIT (sizeof(long) > 4 ? 4*1024*1024 : PID_MAX_DEFAULT)
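As a worked check of the two macros above, the sketch below copies them into a userspace file and evaluates the conditional for both word sizes. The helper `pid_max_limit_for` is hypothetical and exists only so that the 32-bit and 64-bit cases can both be examined on one machine.

```c
#include <assert.h>

/* Copied from the listing above. */
#define PID_MAX_DEFAULT 0x8000
#define PID_MAX_LIMIT (sizeof(long) > 4 ? 4*1024*1024 : PID_MAX_DEFAULT)

/* Hypothetical helper: the same conditional, with sizeof(long) passed in. */
static long pid_max_limit_for(unsigned long sizeof_long)
{
    assert(sizeof_long > 0);
    return sizeof_long > 4 ? 4L * 1024 * 1024 : PID_MAX_DEFAULT;
}
```

With a 4-byte `long` the limit stays at 0x8000 = 32768. With an 8-byte `long` it becomes 4 * 1024 * 1024 = 4194304, which is the "4 million" in the comment.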

The administrator may increase the maximum value via /proc/sys/kernel/pid_max.

So the PID limit can be changed by writing to /proc/sys/kernel/pid_max, and reading the file shows the current value:

# cat /proc/sys/kernel/pid_max
32768

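x86 cannot keep the PD of the currently running process in a register because it has too few registers. PPC can, because it has plenty.

So IBM wins this one: it can afford to be that generous with registers.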

Contrast this approach with that taken by PowerPC (IBM’s modern RISC-based microprocessor), which stores the current task_struct in a register. Thus, current on PPC merely returns the value stored in the register r2. PPC can take this approach because, unlike x86, it has plenty of registers. Because accessing the process descriptor is a common and important job, the PPC kernel developers deem using a register worthy for the task.

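On x86, the kernel instead keeps a struct thread_info at the end of each process's kernel stack and reaches the PD through it. The thread_info structure is defined on x86 in include/asm-x86_64/thread_info.h: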

struct thread_info {
        struct task_struct *task;         /* main task structure: pointer to the PD */
        struct exec_domain *exec_domain;  /* execution domain */
        __u32 flags;                      /* low level flags */
        __u32 status;                     /* thread synchronous flags */
        __u32 cpu;                        /* current CPU */
        int preempt_count;

        mm_segment_t addr_limit;
        struct restart_block restart_block;
};
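The book describes how 32-bit x86 finds thread_info without a dedicated register: it masks off the low bits of the stack pointer, then follows the `task` pointer to reach the PD. The sketch below simulates that idea in userspace. It assumes an 8 KB, size-aligned stack. `THREAD_SIZE` is given an assumed value here, the two structs are reduced to what the demo needs, and `alloc_stack`, `thread_info_from_sp`, and `current_from_sp` are hypothetical helpers rather than kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u   /* assumed kernel stack size: two 4 KB pages */

struct task_struct { int pid; };                   /* hypothetical, reduced */
struct thread_info { struct task_struct *task; };  /* only the field used here */

/* Allocate a THREAD_SIZE-aligned block standing in for a kernel stack
 * and place thread_info at its lowest address. Returns NULL on failure. */
static void *alloc_stack(struct task_struct *task)
{
    void *stack = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (stack)
        ((struct thread_info *)stack)->task = task;
    return stack;
}

/* Mask off the low bits of any address inside the stack to find thread_info. */
static struct thread_info *thread_info_from_sp(const void *sp)
{
    assert(sp);
    return (struct thread_info *)((uintptr_t)sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

/* Finding the current task then costs one extra pointer dereference. */
static struct task_struct *current_from_sp(const void *sp)
{
    return thread_info_from_sp(sp)->task;
}
```

The trick relies on the stack being aligned to its own size, so every address inside it shares the same high bits. That is why the sketch allocates with `aligned_alloc` instead of `malloc`.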

#Process state

The state variable in the PD structure:

volatile long state;	/* -1 unrunnable, 0 runnable, >0 stopped */

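There are five states in total:

  • TASK_RUNNING
    Currently running, or on a run queue waiting to run.

  • TASK_INTERRUPTIBLE
    Sleeping (blocked) until some condition holds. A signal can also wake it.

  • TASK_UNINTERRUPTIBLE
    The hardest state to grasp: the task sleeps like TASK_INTERRUPTIBLE, but a signal does not wake it. It becomes runnable only when the event it is waiting for occurs.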

• TASK_UNINTERRUPTIBLE—This state is identical to TASK_INTERRUPTIBLE except that it does not wake up and become runnable if it receives a signal. This is used in situations where the process must wait without interruption or when the event is expected to occur quite quickly. Because the task does not respond to signals in this state, TASK_UNINTERRUPTIBLE is less often used than TASK_INTERRUPTIBLE.

This is why you have those dreaded unkillable processes with state D in ps(1). Because the task will not respond to signals, you cannot send it a SIGKILL signal. Further, even if you could terminate the task, it would not be wise because the task is supposedly in the middle of an important operation and may hold a semaphore.

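  • __TASK_TRACED
    The process is being traced by another process, such as a debugger via ptrace.

  • __TASK_STOPPED
    Execution has stopped and the task is not eligible to run, for example after receiving SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU.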
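The five states can be summarized in a small sketch that maps a state value to the letter ps(1) prints and records which sleeping state a signal can end. The `ST_*` names are hypothetical (chosen to avoid reserved identifiers in userspace), and the numeric values mirror those in 2.6-era <linux/sched.h> as best I recall. Treat both as assumptions. The letters follow the ps(1) man page on recent systems.

```c
#include <assert.h>

/* Hypothetical names; values assumed to match 2.6-era <linux/sched.h>. */
enum demo_state {
    ST_RUNNING         = 0,
    ST_INTERRUPTIBLE   = 1,
    ST_UNINTERRUPTIBLE = 2,
    ST_STOPPED         = 4,
    ST_TRACED          = 8
};

/* Letter shown in the STAT column of ps(1) for each state. */
static char ps_letter(long state)
{
    assert(state >= 0);
    if (state == ST_RUNNING)        return 'R';
    if (state & ST_INTERRUPTIBLE)   return 'S';
    if (state & ST_UNINTERRUPTIBLE) return 'D';
    if (state & ST_STOPPED)         return 'T';
    if (state & ST_TRACED)          return 't';
    return '?';
}

/* Of the two sleeping states, only the interruptible one ends on a signal. */
static int sleep_ends_on_signal(long state)
{
    return (state & ST_INTERRUPTIBLE) != 0;
}
```

The `D` returned for the uninterruptible state is the same `D` the book quote refers to when it mentions unkillable processes in ps(1).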

#How to change the process state

set_task_state(task, state);        /* set task 'task' to state 'state' */

In <linux/sched.h>, set_task_state is a macro defined in terms of set_mb:

#define set_task_state(tsk, state_value)		\
        set_mb((tsk)->state, (state_value))

The concrete implementation of set_mb differs between processor architectures.

x86_64

include/asm-x86_64/system.h

#define set_mb(var, value) do { xchg(&var, value); } while (0)

ppc

include/asm-ppc/system.h

#define set_mb(var, value)	do { var = value; mb(); } while (0)
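The two set_mb variants above reach the same goal, a store that other processors observe in order, by different means. As a rough userspace analogy, not the kernel's code, the sketch below expresses both strategies with C11 atomics. The function names `set_state_xchg` and `set_state_store_mb` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* x86-64 style: one atomic exchange, which also orders surrounding accesses. */
static void set_state_xchg(atomic_long *state, long value)
{
    assert(state);
    (void)atomic_exchange(state, value);   /* the old value is discarded */
}

/* PPC style: a plain store followed by an explicit full memory barrier. */
static void set_state_store_mb(atomic_long *state, long value)
{
    assert(state);
    atomic_store_explicit(state, value, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}
```

Both functions leave the same value in `*state`. They differ only in how the ordering guarantee is obtained, which is why the kernel can hide the choice behind a single set_mb macro per architecture.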

#Process context