build and run modules [LDD3 02]

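Modularity is the foundation of Linux kernel drivers. Almost every device driver can be built as a module and loaded dynamically, either during boot or later while the system is running, so learning to write a module is a prerequisite for writing device drivers.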

#include <linux/init.h> 
#include <linux/module.h> 
MODULE_LICENSE("Dual BSD/GPL");

static int hello_init(void)
{
    printk(KERN_ALERT "Hello, world\n");
    return 0;
}
     
static void hello_exit(void)
{
    printk(KERN_ALERT "Goodbye, cruel world\n");
}

module_init(hello_init);
module_exit(hello_exit);

The code above is the example taken straight from the book.

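No matter how complex a device driver is, it has this skeleton: init and exit are the driver's first and last contact with the kernel. Once init returns, the module does no work of its own; it just lies there asleep. So when does it work? Drivers are event-driven: a driver has something to do only when an event arrives. That makes sense. If I write a driver for a USB device such as a keyboard and nobody touches the keyboard, the driver has nothing to do.

Even so, init has plenty to do: registering the driver according to its device type, installing interrupt handlers (ISRs), initializing the device, and so on. After initialization the kernel has recorded the device and its driver, and as soon as someone uses the device, the corresponding driver function is called. When the driver is unloaded, the resources acquired in init must be released in order, normally the reverse of the order in which they were acquired. This matters a great deal, because a major benefit of modules is that drivers can be installed and removed without rebooting. If init/exit is done badly, the driver will probably not work and may even take the kernel down with it :(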
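To make the "release in order" point concrete, here is a minimal user-space sketch of the goto-based unwinding pattern commonly used in init functions. It is not kernel code: acquire(), release(), the resource names and the fail_at switch are all hypothetical stand-ins for real registration and allocation calls.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bookkeeping; a real driver would call kernel
 * registration and allocation functions instead. */
static int live_resources;

static int acquire(const char *name, int should_fail)
{
    if (should_fail) {
        fprintf(stderr, "acquire %s failed\n", name);
        return -1;
    }
    live_resources++;
    return 0;
}

static void release(const char *name)
{
    (void)name;
    assert(live_resources > 0);
    live_resources--;
}

/* fail_at: 0 = no failure, 1..3 = force that step to fail (demo only). */
int demo_init(int fail_at)
{
    int err;

    err = acquire("region", fail_at == 1);
    if (err)
        goto fail_region;
    err = acquire("irq", fail_at == 2);
    if (err)
        goto fail_irq;
    err = acquire("device", fail_at == 3);
    if (err)
        goto fail_device;
    return 0;

    /* Undo only what was acquired, in reverse order. */
fail_device:
    release("irq");
fail_irq:
    release("region");
fail_region:
    return err;
}

/* exit mirrors a fully successful init, again in reverse order. */
void demo_exit(void)
{
    release("device");
    release("irq");
    release("region");
}

int demo_live_resources(void)
{
    return live_resources;
}
```

Whichever step fails, nothing is left behind after demo_init returns an error, and demo_exit undoes a successful init completely.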

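This sample only gives a rough framework and a first impression of how a module is installed, run, and removed. A kernel-mode driver differs from an application in several important ways:

1. Kernel-mode drivers are, without exception, event-driven: they rely on user-mode or kernel-mode events to trigger their operations. Some applications are event-driven; others are not.

2. An application can be lazy about releasing resources, because the system cleans up when the process exits. A kernel-mode driver must have released everything it acquired by the time exit returns.

3. An application can call functions from any library it links against; a module can call only functions exported by the kernel. (A module can also export functions of its own for other modules to use.)

4. Kernel code has no ordinary floating-point support; where such arithmetic is needed, it has to be done in software.

5. Error handling is different. When an application fails, at worst it is killed, usually without affecting the system or other applications. When a kernel driver fails, at the very least the current process dies; in worse cases the kernel panics and the whole system is down until it is rebooted.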

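The fifth point brings us to user space and kernel space, which anyone who has studied operating systems should know. Applications run in user space; the kernel, including all kernel drivers, runs in kernel space. The two spaces map to different CPU privilege levels. Modern CPUs have several levels: kernel code runs at the most privileged level and can access every resource, while user code runs at the least privileged level, can access only limited resources, and must go through the kernel to reach hardware. This restriction keeps the damage that unreliable user-space code can do to the system to a minimum.

The two spaces also correspond to different address ranges. On 32-bit Linux with the usual 3G/1G split, for example, each application has a 4 GB virtual address space: user space takes the lower 3 GB and kernel space only the top 1 GB. That top 1 GB is shared by all applications, meaning it maps the same thing in every process, namely the running kernel; only the lower 3 GB is private to each application.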
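The address split can be written down as a small user-space sketch. It assumes the classic 32-bit 3G/1G layout with the boundary at 0xC0000000; the constant and function names are made up for illustration, and real kernels can be configured with other splits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed boundary between user and kernel addresses (3G/1G split). */
#define DEMO_PAGE_OFFSET UINT32_C(0xC0000000)

static_assert(DEMO_PAGE_OFFSET == 3u * 1024u * 1024u * 1024u,
              "boundary sits at exactly 3 GiB");

/* True if a 32-bit virtual address falls in the shared kernel range. */
bool demo_is_kernel_address(uint32_t vaddr)
{
    return vaddr >= DEMO_PAGE_OFFSET;
}

/* Size of the kernel range: 2^32 - boundary, using 32-bit wrap-around. */
uint32_t demo_kernel_space_bytes(void)
{
    return (uint32_t)(0u - DEMO_PAGE_OFFSET);
}
```

Everything below the boundary belongs to the process itself; everything at or above it is the 1 GiB that looks the same in every process.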

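Since applications reach hardware through the kernel, execution must cross from user mode into kernel mode at some point. How does that happen? Through system calls and hardware interrupts. On a system call, user-space code enters kernel space and the same process continues at the most privileged level, where it can access every resource. On a hardware interrupt, the kernel suspends whatever the CPU was running, hands the CPU to the corresponding ISR, and restores the interrupted context once the handler finishes. A system call is in fact also a kind of interrupt, a software interrupt (a trap), similar in principle to the way debuggers such as gdb set breakpoints; I won't go into that here.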
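From the user side, a system call is just a function call that happens to trap into the kernel. The sketch below, which assumes Linux with glibc, asks for the process ID twice: once through the generic syscall() wrapper with the SYS_getpid number and once through the usual getpid() wrapper. The demo_ function names are mine.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Enter the kernel through the generic syscall(2) wrapper. */
long demo_raw_getpid(void)
{
    long pid = syscall(SYS_getpid);

    assert(pid > 0);
    return pid;
}

/* The same request through the ordinary libc wrapper, for comparison. */
long demo_libc_getpid(void)
{
    return (long)getpid();
}
```

Both paths end up in the same kernel code on behalf of the same process, so they return the same value.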

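Next comes something very important: concurrency in the kernel.

When I first started on device drivers, concurrency was hard to grasp. Why? Because of a fixed mindset: I wrote a device driver, so it is one unit, and it runs as one unit. That view is badly wrong. The kernel is a huge system, and a driver is broken into pieces the moment it hands itself over to the kernel. Take a char device: it typically supports open/close/read/write/mmap and so on, all initiated by user-mode applications. When a program calls open, the driver's open method is called; when it closes the file, the driver's release method is called; and so on. The driver is no longer a single unit. It is driven by events that originate in user mode and is invoked by the kernel at the right moments; the driver only has to implement its callbacks as required. Once that is clear, concurrency is easier to understand from the driver's side: if everything is event-driven, there may be more than one event at a time, because more than one program may be using your device.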
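The "driver as a set of callbacks" idea can be sketched in user space as a table of function pointers. struct demo_file_ops is a hypothetical, heavily simplified stand-in for the kernel's struct file_operations, and demo_kernel_open()/demo_kernel_close() play the role of the kernel dispatching a user request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct file_operations. */
struct demo_file_ops {
    int (*open)(void);
    int (*release)(void);
};

static int open_count;   /* how many users currently hold the device open */

static int demo_open(void)    { open_count++; return 0; }
static int demo_release(void) { open_count--; return 0; }

/* The driver only fills in the table; it never calls these itself. */
static const struct demo_file_ops demo_fops = {
    .open    = demo_open,
    .release = demo_release,
};

/* "Kernel" side: forward a user's open() to the driver's callback. */
int demo_kernel_open(const struct demo_file_ops *ops)
{
    assert(ops != NULL);
    return ops->open ? ops->open() : -1;
}

/* "Kernel" side: forward a user's close() to the release callback. */
int demo_kernel_close(const struct demo_file_ops *ops)
{
    assert(ops != NULL);
    return ops->release ? ops->release() : 0;
}

const struct demo_file_ops *demo_get_fops(void) { return &demo_fops; }
int demo_open_count(void) { return open_count; }
```

Two programs opening the device simply means the open callback runs twice, which is exactly where shared state such as open_count starts to need protection.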

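Concurrency arises in many situations, for example:

1. Linux runs many processes at once. As described above, several programs may access your device at the same time.

2. ISRs. An interrupt can arrive at any moment. An ISR has no process of its own; it runs on top of whatever process happened to be executing, and the device interrupt may fire while your driver is in the middle of handling a request.

3. Kernel utilities such as timers. When a timer fires, it can also call into your driver.

Because there are so many sources of concurrency, driver code must be reentrant: it can be called from several contexts at once without side effects. While coding, a driver developer has to keep in mind which functions may run simultaneously, protect the shared (global) data they touch, and use the kernel's locking primitives so that simultaneous access cannot corrupt it. The kernel is also preemptible, so the CPU can be taken away at any point, and by the time your code resumes the data may already have been changed by someone else. Always protect shared data with locks.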
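As a rough user-space illustration of protecting shared data with a lock, the sketch below builds a tiny spinlock from a C11 atomic_flag and uses it to serialize a counter update. struct demo_lock and struct demo_device are hypothetical; a real driver would use the kernel's own locking primitives rather than anything like this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel spinlock. */
struct demo_lock {
    atomic_flag busy;
};

struct demo_device {
    struct demo_lock lock;
    long requests;          /* shared state touched by every caller */
};

static void demo_lock_acquire(struct demo_lock *l)
{
    /* Spin until the flag was previously clear, i.e. we now own it. */
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
        ;
}

static void demo_lock_release(struct demo_lock *l)
{
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

void demo_device_init(struct demo_device *dev)
{
    assert(dev != NULL);
    atomic_flag_clear(&dev->lock.busy);
    dev->requests = 0;
}

/* Callable from several threads: the read-modify-write is serialized. */
long demo_handle_request(struct demo_device *dev)
{
    long seen;

    assert(dev != NULL);
    demo_lock_acquire(&dev->lock);
    seen = ++dev->requests;
    demo_lock_release(&dev->lock);
    return seen;
}
```

Without the lock, two callers could both read the old value of requests and one increment would be lost; with it, every caller sees a distinct count.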

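At this point the book discusses application call stacks versus the kernel stack. An application's stack can be large, so you can put a lot of data on it. The kernel stack is very small (the book notes it can be as small as a single 4096-byte page), so large structures should be allocated dynamically instead; I once wrote a driver that broke as soon as its stack usage went past 1K. Speaking of call stacks, the way the stack changes during a function call deserves an article of its own, which I'll leave for later.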

compile and loading

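There is not much to say about compiling a driver; it is mostly a matter of the Makefile. Builds can be in-tree or out-of-tree, and the two differ. In-tree means the source lives inside the kernel source tree. Besides the driver's own Makefile, this involves the Kconfig and Makefile of the directory the driver sits in. Taking a DRM GPU driver as an example, you put the source under drivers/gpu/drm and also edit the Kconfig and Makefile in the drm directory so that the kernel build can find the driver. Out-of-tree means the driver code lives outside the kernel tree; when building, you point make at the kernel build directory and set the module directory to the driver's own directory (the usual form is make -C KERNELDIR M=DRIVERDIR modules, with both directory names filled in).

The driver's makefile also uses obj-m or obj-y. m builds the driver as a module; y builds it into the kernel. Put plainly, an m build produces a .ko file, and once it is loaded the driver shows up in lsmod; a y build ends up inside the kernel image, there is no .ko, and lsmod does not show it after boot. For an out-of-tree driver the value is always m. For an in-tree driver it should be a variable (the driver's own CONFIG_XXXX), so that the build mode can be chosen in menuconfig.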

module driver load and unload

Loading is done with insmod or modprobe, unloading with rmmod. During insmod, the module's references to kernel function symbols must be resolved, which is done by looking them up in the kernel symbol table.
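Symbol resolution at load time is essentially a table lookup. Here is a minimal user-space sketch of the idea; the table layout, the addresses and the demo_ function names are all made up, and the real loader does considerably more (versions, licenses, per-module tables).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified exported-symbol table entry. */
struct demo_ksym {
    const char *name;
    unsigned long addr;
};

/* Made-up addresses; a real table is produced by the kernel build. */
static const struct demo_ksym demo_symtab[] = {
    { "printk",  0x1000UL },
    { "kmalloc", 0x2000UL },
    { "kfree",   0x3000UL },
};

/* Return the address of an exported symbol, or 0 if it is unknown. */
unsigned long demo_resolve(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(demo_symtab) / sizeof(demo_symtab[0]); i++) {
        if (strcmp(demo_symtab[i].name, name) == 0)
            return demo_symtab[i].addr;
    }
    return 0;
}

/* A module can load only if every symbol it references resolves. */
int demo_can_load(const char *const *undefined, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (demo_resolve(undefined[i]) == 0)
            return 0;
    }
    return 1;
}
```

A single unresolved name is enough to reject the module, which matches the familiar "Unknown symbol" failure when a driver calls something the kernel does not export.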

version dependency

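A module has to be built for the kernel it is loaded into. Kernel structures and functions change between versions; a driver that uses them has to agree with the kernel's definitions, or it may crash the kernel once it runs. So the kernel checks before it loads a module. When a driver is built, it is linked with vermagic.o, which records information about the kernel used for the build; this is effectively the driver's version. If it does not match the running kernel, the kernel refuses to load the driver.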
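In spirit the check is a string comparison. The sketch below is a deliberately simplified user-space version: the function name, the sample strings in the usage note, and the choice of -ENOEXEC as the error value are my assumptions, not a copy of the kernel's loader.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the loader's check: accept the module only if
 * the build string recorded in it equals the running kernel's string. */
int demo_check_vermagic(const char *module_magic, const char *kernel_magic)
{
    assert(module_magic != NULL && kernel_magic != NULL);
    if (strcmp(module_magic, kernel_magic) != 0)
        return -ENOEXEC;    /* assumed error value: refuse to load */
    return 0;
}
```

For example, a module recorded as "2.6.9 SMP" would be rejected by a kernel whose string is "2.6.10 SMP", even though the two differ in only one component.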

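This chapter also covers the kernel symbol table and module stacking. Module stacking means dependencies between modules: A depends on B, B depends on C, and A, B, and C form a module stack. Today's more complex devices have many functions that rely on existing kernel modules, which creates such a stack. Take a GPU: it sits on the PCIe bus and relies on kernel modules such as DRM/KMS, so a complete, running GPU driver depends on many kernel modules and forms a large stack.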
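The A/B/C stack can be modelled as "load my dependency first, then me", which is roughly what modprobe does for you. This is a user-space sketch with a hypothetical struct demo_module that allows at most one dependency per module and assumes there are no dependency cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_MODULES 8

/* Hypothetical module description: a name and at most one dependency. */
struct demo_module {
    const char *name;
    const char *depends_on;   /* NULL if it depends on nothing */
    int loaded;
};

/* A depends on B, B depends on C, as in the text. */
static struct demo_module demo_modules[] = {
    { "A", "B",  0 },
    { "B", "C",  0 },
    { "C", NULL, 0 },
};

static const char *demo_load_order[DEMO_MAX_MODULES];
static size_t demo_load_count;

static struct demo_module *demo_find(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(demo_modules) / sizeof(demo_modules[0]); i++) {
        if (strcmp(demo_modules[i].name, name) == 0)
            return &demo_modules[i];
    }
    return NULL;
}

/* Load the dependency first, then the module itself. 0 on success. */
int demo_load(const char *name)
{
    struct demo_module *m = demo_find(name);

    if (m == NULL)
        return -1;
    if (m->loaded)
        return 0;
    if (m->depends_on != NULL && demo_load(m->depends_on) != 0)
        return -1;
    assert(demo_load_count < DEMO_MAX_MODULES);
    m->loaded = 1;
    demo_load_order[demo_load_count++] = m->name;
    return 0;
}

/* Name of the i-th module that was loaded, or NULL past the end. */
const char *demo_loaded_at(size_t i)
{
    return i < demo_load_count ? demo_load_order[i] : NULL;
}
```

Asking for A ends up loading C, then B, then A, so the bottom of the stack is always in place before anything that needs it.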

Macros a module can use to declare information about itself:

MODULE_LICENSE  //license the module is released under
MODULE_AUTHOR //author of the module
MODULE_DESCRIPTION  //human-readable description of the module
MODULE_VERSION   //version of the module
MODULE_ALIAS   //alternative name for the module
MODULE_DEVICE_TABLE  //tells user space which devices the driver supports

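Two more macros are special: module_init and module_exit. The function registered with module_init is called when the driver is loaded. The function registered with module_exit is called when the driver is unloaded or the system shuts down; a call at any other time is an error. If a module does not define an exit function, the kernel does not allow it to be unloaded.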

module parameters

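Parameters can be passed to a module at insmod/modprobe time as module parameters. Depending on the permission value declared in the module, each parameter gets a node under /sys/module with the corresponding readable/writable attributes. Writable means the value can change at any time while the driver is running, and the module is not notified of the change, so the driver has to detect it and react by itself. If perm is 0, the parameter has no entry under /sys/module; S_IRUGO makes it read-only for everyone; S_IRUGO | S_IWUSR additionally lets root, and only root, change it.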
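The three perm cases are just different bit patterns. The user-space sketch below rebuilds S_IRUGO from the standard POSIX bits (user-space headers do not define it, so DEMO_S_IRUGO is my own name) and classifies a perm value the way the paragraph above describes. The enum and function are hypothetical.

```c
#include <assert.h>
#include <sys/stat.h>

/* The kernel defines S_IRUGO itself; rebuilt here from POSIX bits. */
#define DEMO_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

static_assert(DEMO_S_IRUGO == 0444, "read for user, group and others");

enum demo_visibility {
    DEMO_HIDDEN,        /* perm == 0: no sysfs entry at all */
    DEMO_READ_ONLY,     /* readable, nobody may write */
    DEMO_OWNER_WRITE    /* the owner (root) may also write */
};

/* Classify a module_param perm value as described in the text. */
enum demo_visibility demo_classify_perm(unsigned int perm)
{
    if (perm == 0)
        return DEMO_HIDDEN;
    if (perm & S_IWUSR)
        return DEMO_OWNER_WRITE;
    return DEMO_READ_ONLY;
}
```

S_IRUGO | S_IWUSR works out to the familiar 0644 file mode: everyone can read the parameter, only the owner can write it.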

Finally, here is the chapter's keyword list:

insmod

modprobe

rmmod

User-space utilities that load modules into the running kernel and remove them.

#include <linux/init.h>

module_init(init_function);

module_exit(cleanup_function);

Macros that designate a module’s initialization and cleanup functions.

__init
__initdata

__exit
__exitdata

Markers for functions (__init and __exit) and data (__initdata and __exitdata) that are only used at module initialization or cleanup time. Items marked for initialization may be discarded once initialization completes; the exit items may be discarded if module unloading has not been configured into the kernel. These markers work by causing the relevant objects to be placed in a special ELF section in the executable file.

#include <linux/sched.h>

One of the most important header files. This file contains definitions of much of the kernel API used by the driver, including functions for sleeping and numerous variable declarations.

struct task_struct *current;

The current process.

current->pid

current->comm

The process ID and command name for the current process.

obj-m

A makefile symbol used by the kernel build system to determine which modules should be built in the current directory.

/sys/module

/proc/modules

/sys/module is a sysfs directory hierarchy containing information on currently-loaded modules. /proc/modules is the older, single-file version of that information. Entries contain the module name, the amount of memory each module occupies, and the usage count. Extra strings are appended to each line to specify flags that are currently active for the module.
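Since each /proc/modules entry starts with the name, the size and the usage count, a line can be picked apart with sscanf. This is only a sketch: struct demo_modinfo and the function name are hypothetical, and the remaining fields on the line are simply ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct demo_modinfo {
    char name[64];
    unsigned long size;     /* memory the module occupies, in bytes */
    int usage_count;
};

/* Parse the first three fields of one /proc/modules line.
 * Returns 0 on success, -1 if the line does not match. */
int demo_parse_modules_line(const char *line, struct demo_modinfo *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%63s %lu %d",
               out->name, &out->size, &out->usage_count) != 3)
        return -1;
    return 0;
}
```

Feeding it a made-up line such as "hello 16384 0 - Live 0x0000000000000000" yields the name hello, a size of 16384 and a usage count of 0.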

vermagic.o

An object file from the kernel source directory that describes the environment a module was built for.

#include <linux/module.h>

Required header. It must be included by a module source.

#include <linux/version.h>

A header file containing information on the version of the kernel being built.

LINUX_VERSION_CODE
Integer macro, useful to #ifdef version dependencies.
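LINUX_VERSION_CODE is normally compared against KERNEL_VERSION(a, b, c) in preprocessor conditionals. The sketch below reproduces the classic packing in user space so the arithmetic is visible; DEMO_KERNEL_VERSION and demo_version_at_least are my own names, and I am assuming the plain shift-and-add form of the macro.

```c
#include <assert.h>

/* Same packing as the classic KERNEL_VERSION(a, b, c) macro:
 * one byte each for the minor and patch numbers. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

static_assert(DEMO_KERNEL_VERSION(2, 6, 10) == 132618,
              "2.6.10 packs to 132618");

/* Hypothetical helper: nonzero if `code` is at least version a.b.c. */
int demo_version_at_least(unsigned int code,
                          unsigned int a, unsigned int b, unsigned int c)
{
    return code >= (unsigned int)DEMO_KERNEL_VERSION(a, b, c);
}
```

Because the version becomes a single integer, "is this kernel at least 2.6.0?" turns into one numeric comparison, which is what makes it convenient for conditional compilation.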

EXPORT_SYMBOL (symbol);
EXPORT_SYMBOL_GPL (symbol);

Macro used to export a symbol to the kernel. The second form limits use of the exported symbol to GPL-licensed modules.

MODULE_AUTHOR(author);
MODULE_DESCRIPTION(description);
MODULE_VERSION(version_string);
MODULE_DEVICE_TABLE(table_info);
MODULE_ALIAS(alternate_name);

Place documentation on the module in the object file.

module_init(init_function);
module_exit(exit_function);

Macros that declare a module’s initialization and cleanup functions.

#include <linux/moduleparam.h>

module_param(variable, type, perm);

Macro that creates a module parameter that can be adjusted by the user when the module is loaded (or at boot time for built-in code). The type can be one of bool, charp, int, invbool, long, short, ushort, uint, ulong, or intarray.

#include <linux/kernel.h>
int printk(const char * fmt, ...);

The analogue of printf for kernel code.

scutth
