Linux Drivers for Beginners
This series on Linux device drivers aims to present the usually technical topic in a way that is more interesting to a wider cross-section of readers.
Device controllers are typically connected to the CPU through their respectively named buses (collections of physical lines), for example, the PCI bus, the IDE bus, etc. In today's embedded world, we encounter more micro-controllers than CPUs; these are the CPU plus various device controllers built onto a single chip. This effective embedding of device controllers primarily reduces cost and space, making it suitable for embedded systems. In such cases, the buses are integrated into the chip itself. Does this change anything for the drivers, or more generically, on the software front? The answer is: not much, except that the bus drivers corresponding to the embedded device controllers are now developed under the architecture-specific umbrella.
The device-specific portion of a device driver remains the same across all operating systems, and is more about understanding and decoding the device data sheets than software programming. A data sheet for a device is a document with technical details of the device, including its operation, performance, programming, etc.; in short, a device user manual. Later, I shall show some examples of decoding data sheets as well. However, the OS-specific portion is the one that is tightly coupled with the OS mechanisms of user interfaces, and thus differentiates a Linux device driver from a Windows device driver and from a MacOS device driver.
Verticals
In Linux, a device driver provides a system call interface to the user; this is the boundary line between the so-called kernel space and user-space of Linux, as shown in Figure 2. Figure 3 provides further classification.
Based on the OS-specific interface of a driver, in Linux, a driver is broadly classified into three verticals:
1. Packet-oriented, or the network vertical
2. Block-oriented, or the storage vertical
3. Byte-oriented, or the character vertical
The CPU vertical and memory vertical, taken together with the other three verticals, give the complete overview of the Linux kernel, like any textbook definition of an OS: "An OS performs 5 management functions: CPU/process, memory, network, storage, device I/O." Though these two verticals could be classified as device drivers, where CPU and memory are the respective devices, they are treated differently, for many reasons. These are the core functionalities of any OS, be it a micro-kernel or a monolithic kernel. More often than not, adding code in these areas is mainly a Linux porting effort, which is typically done for a new CPU or architecture. Moreover, the code in these two verticals cannot be loaded or unloaded on the fly, unlike the other three verticals. Henceforth, when we talk about Linux device drivers, we mean to talk only about the latter three verticals in Figure 3.
Let's get a little deeper into these three verticals. The network vertical consists of two parts: a) the network protocol stack, and b) the network interface card (NIC) device drivers, or simply network device drivers, which could be for Ethernet, Wi-Fi, or any other network horizontals. Storage, again, consists of two parts: a) file-system drivers, to decode the various formats on different partitions, and b) block device drivers for various storage (hardware) protocols, i.e., horizontals like IDE, SCSI, MTD, etc.
With this, you may wonder if that is the only set of devices for which you need drivers (or for which Linux has drivers). Hold on a moment; you certainly need drivers for the whole lot of devices that interface with the system, and Linux does have drivers for them. However, their byte-oriented accessibility puts all of them under the character vertical; this is, in reality, the majority bucket. In fact, because of the vast number of drivers in this vertical, character drivers have been further subclassified, so you have tty drivers, input drivers, console drivers, frame-buffer drivers, sound drivers, etc. The typical horizontals here would be RS232, PS/2, VGA, I2C, I2S, SPI, etc.
Multiple-vertical drivers
One final note on the complete picture (placement of all the drivers in the Linux driver ecosystem): the horizontals like USB, PCI, etc., span below multiple verticals. Why is that? Simple: you already know that you can have a USB Wi-Fi dongle, a USB pen drive, and a USB-to-serial converter; all are USB, but come under three different verticals! In Linux, bus drivers, or the horizontals, are often split into two parts, or even two drivers: a) a device controller-specific part, and b) an abstraction layer over that for the verticals to interface with, commonly called a core. A classic example would be the USB controller drivers ohci, ehci, etc., and the USB abstraction, usbcore.
Summing up
So, to conclude, a device driver is a piece of software that drives a device, though there are many classifications. In case it drives only another piece of software, we call it just a driver. Examples are file-system drivers, usbcore, etc. Hence, all device drivers are drivers, but not all drivers are device drivers.
Device Drivers, Part 2: Writing Your First Linux Driver in the Classroom
This article, which is part of the series on Linux device drivers, deals with the concept of dynamically loading drivers, first writing a Linux driver, before building and then loading it.
Shweta and Pugs reached their classroom late, to find their professor already in the middle of a lecture. Shweta sheepishly asked for his permission to enter. An annoyed Professor Gopi responded, "Come on! You guys are late again; what is your excuse, today?" Pugs hurriedly replied that they had been discussing the very topic for that day's class: device drivers in Linux. Pugs was more than happy when the professor said, "Good! Then explain about dynamic loading in Linux. If you get it right, the two of you are excused!" Pugs knew that one way to make his professor happy was to criticise Windows. He explained, "As we know, a typical driver installation on Windows needs a reboot for it to get activated. That is really not acceptable; suppose we need to do it on a server? That's where Linux wins. In Linux, we can load or unload a driver on the fly, and it is active for use instantly after loading. Also, it is instantly disabled when unloaded. This is called dynamic loading and unloading of drivers in Linux." This impressed the professor. "Okay! Take your seats, but make sure you are not late again." The professor continued to the class, "Now you already know what is meant by dynamic loading and unloading of drivers, so I'll show you how to do it, before we move on to write our first Linux driver."
To dynamically load or unload a driver, use these commands, which reside in the /sbin directory, and must be executed with root privileges:
lsmod: lists currently loaded modules
insmod <module_file>: inserts/loads the specified module file
modprobe <module>: inserts/loads the module, along with any dependencies
rmmod <module>: removes/unloads the module
Let's look at the FAT filesystem-related drivers as an example. Figure 2 demonstrates this complete process of experimentation. The module files would be fat.ko, vfat.ko, etc., in the fat (vfat for older kernels) directory under /lib/modules/`uname -r`/kernel/fs. If they are in compressed .gz format, you need to uncompress them with gunzip before you can insmod them.
The vfat module depends on the fat module, so fat.ko needs to be loaded first. To automatically perform decompression and dependency loading, use modprobe instead. Note that you shouldn't specify the .ko extension with the module's name when using the modprobe command. rmmod is used to unload the modules.
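To make the lsmod step a little more concrete: as far as I know, lsmod essentially presents the contents of /proc/modules, one module per line. The following user-space sketch parses such a line; the struct and function names are hypothetical, and the sample line in the comment is made up for illustration rather than taken from a real system.

```c
#include <stdio.h>
#include <string.h>

/* One parsed entry of /proc/modules (hypothetical helper type). */
struct mod_entry {
    char name[64];        /* module name, e.g. "fat" */
    unsigned long size;   /* module size in bytes */
    int refcount;         /* how many users the module currently has */
    char users[256];      /* comma-separated modules using it, or "-" */
};

/* Parse a line such as "fat 61512 1 vfat, Live 0x0000000000000000".
 * Returns 0 on success, -1 if the line does not have the expected shape. */
int parse_module_line(const char *line, struct mod_entry *out)
{
    if (sscanf(line, "%63s %lu %d %255s",
               out->name, &out->size, &out->refcount, out->users) != 4)
        return -1;
    return 0;
}

/* Returns 1 if the named module is listed in /proc/modules, 0 if it is
 * not, and -1 if the file cannot be opened. */
int module_is_loaded(const char *name)
{
    char line[512];
    struct mod_entry e;
    int found = 0;
    FILE *fp = fopen("/proc/modules", "r");

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_module_line(line, &e) == 0 && strcmp(e.name, name) == 0) {
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

A non-zero reference count on fat with vfat in its user list is the reason rmmod fat fails until vfat has been unloaded.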
/* ofd.c - Our First Driver code */
#include <linux/module.h>   /* module_init(), module_exit(), MODULE_* macros */
#include <linux/version.h>
#include <linux/kernel.h>   /* printk() and the KERN_* log levels */

static int __init ofd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofd registered");
    return 0;
}

static void __exit ofd_exit(void) /* Destructor */
{
    printk(KERN_INFO "Alvida: ofd unregistered");
}

module_init(ofd_init);
module_exit(ofd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Driver");
Given above is the complete code for our first driver; let's call it ofd.c. Note that there is no stdio.h (a user-space header); instead, we use the analogous kernel.h (a kernel space header). printk() is the equivalent of printf(). Additionally, version.h is included for the module version to be compatible with the kernel into which it is going to be loaded. The MODULE_* macros populate module-related information, which acts like the module's signature.
To build a Linux driver, you need to have the kernel source (or, at least, the kernel headers) installed on your system. The kernel source is assumed to be installed at /usr/src/linux. If it's at any other location on your system, specify the location in the KERNEL_SOURCE variable in this Makefile.
# Makefile - makefile of our first driver

# if KERNELRELEASE is defined, we've been invoked from the
# kernel build system and can use its language.
ifneq (${KERNELRELEASE},)
	obj-m := ofd.o
# Otherwise we were called directly from the command line.
# Invoke the kernel build system.
# (SUBDIRS= is the older spelling; recent kernels expect M=${PWD} instead.)
else
	KERNEL_SOURCE := /usr/src/linux
	PWD := $(shell pwd)
default:
	${MAKE} -C ${KERNEL_SOURCE} SUBDIRS=${PWD} modules

clean:
	${MAKE} -C ${KERNEL_SOURCE} SUBDIRS=${PWD} clean
endif
With the C code (ofd.c) and Makefile ready, all we need to do is invoke make to build our first driver (ofd.ko).
$ make
make -C /usr/src/linux SUBDIRS=... modules
make[1]: Entering directory `/usr/src/linux'
  CC [M]  .../ofd.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      .../ofd.mod.o
  LD [M]  .../ofd.ko
make[1]: Leaving directory `/usr/src/linux'
Summing up
Once we have the ofd.ko file, perform the usual steps as the root user, or with sudo.
# su
# insmod ofd.ko
# lsmod | head -10
lsmod should show you the ofd driver loaded.
While the students were trying their first module, the bell rang, marking the end of the session. Professor Gopi concluded, "Currently, you may not be able to observe anything other than the lsmod listing showing the driver has loaded. Where's the printk output gone? Find that out for yourselves, in the lab session, and update me with your findings. Also note that our first driver is a template for any driver you would write in Linux. Writing a specialised driver is just a matter of what gets filled into its constructor and destructor. So, our further learning will be to enhance this driver to achieve specific driver functionalities."
This article in the series on Linux device drivers deals with the kernel's message logging, and kernel-specific GCC extensions.
Enthused by how Pugs impressed their professor in the last class, Shweta wanted to do so too. And there was soon an opportunity: finding out where the output of printk had gone. So, as soon as she entered the lab, she grabbed the best system, logged in, and began work. Knowing her professor well, she realised that he would have dropped a hint about the possible solution in the previous class itself. Going over what had been taught, she remembered the error output demonstration from insmod vfat.ko: running dmesg | tail. She immediately tried that, and found the printk output there. But how did it come to be here? A tap on her shoulder roused her from her thoughts. "Shall we go for a coffee?" proposed Pugs. "But I need to..." "I know what you're thinking about," interrupted Pugs. "Let's go, I'll explain all about dmesg to you."
#define KERN_EMERG   "<0>" /* system is unusable */
#define KERN_ALERT   "<1>" /* action must be taken immediately */
#define KERN_CRIT    "<2>" /* critical conditions */
#define KERN_ERR     "<3>" /* error conditions */
#define KERN_WARNING "<4>" /* warning conditions */
#define KERN_NOTICE  "<5>" /* normal but significant condition */
#define KERN_INFO    "<6>" /* informational */
#define KERN_DEBUG   "<7>" /* debug-level messages */
Now depending on these log levels (i.e., the first three characters in the format string), the syslog user-space daemon redirects the corresponding messages to their configured locations. A typical destination is the log file /var/log/messages, for all log levels. Hence, all the printk outputs are, by default, in that file. However, they can be configured differently: to a serial port (like /dev/ttyS0), for instance, or to all consoles, like what typically happens for KERN_EMERG. Now, /var/log/messages is buffered, and contains messages not only from the kernel, but also from various daemons running in user-space. Moreover, this file is often not readable by a normal user. Hence, a user-space utility, dmesg, is provided to directly parse the kernel ring buffer, and dump it to standard output. Figure 1 shows snippets from the two.
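Since the log level is just the first three characters of the message, picking it apart is simple string handling. Here is a small user-space sketch of that idea; the function names are hypothetical, and it assumes the classic "<N>" prefix format listed above (newer kernels encode the level differently, so treat this purely as an illustration of the scheme described here).

```c
#include <stddef.h>

/* Log-level names indexed by their numeric level, 0 through 7. */
static const char *const level_names[8] = {
    "KERN_EMERG", "KERN_ALERT", "KERN_CRIT", "KERN_ERR",
    "KERN_WARNING", "KERN_NOTICE", "KERN_INFO", "KERN_DEBUG"
};

/* If msg begins with "<N>" (N in 0..7), return N and point *body at the
 * text after the prefix. Otherwise return -1 and leave *body at msg. */
int split_log_level(const char *msg, const char **body)
{
    *body = msg;
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>') {
        *body = msg + 3;
        return msg[1] - '0';
    }
    return -1;
}

/* Map a numeric level to its macro name, or NULL if out of range. */
const char *log_level_name(int level)
{
    return (level >= 0 && level <= 7) ? level_names[level] : NULL;
}
```

For instance, the string produced by KERN_INFO "Namaskar: ofd registered" starts with "<6>", so the sketch reports level 6 and the remaining text as the message body.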
Kernel C = pure C
Once back in the lab, Shweta remembered their professor mentioning that no /usr/include headers can be used for kernel programming. But Pugs had said that kernel C is just standard C with some GCC extensions. Why this conflict? Actually this is not a conflict. Standard C is pure C: just the language. The headers are not part of it. Those are part of the standard libraries built in for C programmers, based on the concept of reusing code. Does that mean that all standard libraries, and hence, all ANSI standard functions, are not part of "pure" C? Yes, that's right. Then, was it really tough coding the kernel? Well, not for this reason. In reality, kernel developers have evolved their own set of required functions, which are all part of the kernel code. The printk function is just one of them. Similarly, many string functions, memory functions, and more, are all part of the kernel source, under various directories like kernel, ipc, lib, and so on, along with the corresponding headers under the include/linux directory. "Oh yes! That is why we need to have the kernel source to build a driver," agreed Shweta. "If not the complete source, at least the headers are a must. And that is why we have separate packages to install the complete kernel source, or just the kernel headers," added Pugs. "In the lab, all the sources are set up. But if I want to try out drivers on my Linux system in my hostel room, how do I go about it?" asked Shweta. "Our lab has Fedora, where the kernel sources are typically installed under /usr/src/kernels/<kernelversion>, unlike the standard /usr/src/linux. Lab administrators must have installed it using the command-line yum install kernel-devel. I use Mandriva, and installed the kernel sources using urpmi kernel-source," replied Pugs. "But I have Ubuntu," Shweta said. "Okay! For that, just use the apt-get utility to fetch the source, possibly apt-get install linux-source," replied Pugs.
Summing up
The lab session was almost over when Shweta suddenly asked, out of curiosity, "Hey Pugs, what's the next topic we are going to learn in our Linux device drivers class?" "Hmm... most probably character drivers," threw back Pugs. With this information, Shweta hurriedly packed her bag and headed towards her room to set up the kernel sources, and try out the next driver on her own. "In case you get stuck, just give me a call," smiled Pugs.
W's of character drivers
We already know what drivers are, and why we need them. What is so special about character drivers? If we write drivers for byte-oriented operations (or, in C lingo, character-oriented operations), then we refer to them as character drivers. Since the majority of devices are byte-oriented, the majority of device drivers are character device drivers. Take, for example, serial drivers, audio drivers, video drivers, camera drivers, and basic I/O drivers. In fact, all device drivers that are neither storage nor network device drivers are some type of a character driver. Let's look into the commonalities of these character drivers, and how Shweta wrote one of them.
As shown in Figure 1, for any user-space application to operate on a byte-oriented device (in hardware space), it should use the corresponding character device driver (in kernel space). Character driver usage is done through the corresponding character device file(s), linked to it through the virtual file system (VFS). What this means is that an application does the usual file operations on the character device file. Those operations are translated to the corresponding functions in the linked character device driver by the VFS. Those functions then do the final low-level access to the actual device to achieve the desired results.
Note that though the application does the usual file operations, their outcome may not be the usual ones. Rather, they would be as driven by the corresponding functions in the device driver. For example, a write followed by a read may not fetch what has just been written to the character device file, unlike for regular files. Remember that this is the usual expected behaviour for device files. Let's take an audio device file as an example. What we write into it is the audio data we want to play back, say through a speaker. However, the read would get us audio data that we are recording, say through a microphone. The recorded data need not be the played-back data.
In this complete connection from the application to the device, there are four major entities involved:
1. Application
2. Character device file
3. Character device driver
4. Character device
The interesting thing is that all of these can exist independently on a system, without the others being present. The mere existence of these on a system doesn't mean they are linked to form the complete connection. Rather, they need to be explicitly connected. An application gets connected to a device file by invoking the open system call on the device file. Device file(s) are linked to the device driver by specific registrations done by the driver. The driver is linked to a device by its device-specific low-level operations. Thus we form the complete connection. With this, note that the character device file is not the actual device, but just a place-holder for the actual device.
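The point that a write followed by a read need not return the written data is easy to try from user space. The sketch below is a minimal illustration, assuming a Linux system where the standard /dev/null device file exists; write_then_read is a hypothetical helper name, not part of any library.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to the device file at path, then read from the same
 * descriptor. The raw write() and read() results are stored in
 * *written and *got. Returns 0 on success, -1 if open() fails. */
int write_then_read(const char *path, const char *msg, size_t len,
                    ssize_t *written, ssize_t *got)
{
    char buf[64];
    int fd = open(path, O_RDWR);       /* connects application and device file */

    if (fd < 0)
        return -1;
    *written = write(fd, msg, len);    /* handed over to the driver's write */
    *got = read(fd, buf, sizeof buf);  /* handed over to the driver's read */
    close(fd);
    return 0;
}
```

Called with "/dev/null" and the four bytes of "Pugs", the write reports 4 bytes accepted while the read reports 0 bytes: both outcomes are decided by the driver behind the device file, not by what was written a moment earlier.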
+ int register_chrdev_region(dev_t first, unsigned int cnt, char *name);
+ int alloc_chrdev_region(dev_t *first, unsigned int firstminor, unsigned int cnt, char *name);
The first API registers the cnt number of device file numbers, starting from first, with the given name. The second API dynamically figures out a free major number, and registers the cnt number of device file numbers starting from <the free major, firstminor>, with the given name. In either case, the /proc/devices kernel window lists the name with the registered major number. With this information, Shweta added the following into the first driver code:
    #include <linux/types.h>
    #include <linux/kdev_t.h>
    #include <linux/fs.h>
    static dev_t first; // Global variable for the first device number
In the constructor, she added:
    if (alloc_chrdev_region(&first, 0, 3, "Shweta") < 0)
    {
        return -1;
    }
    printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));
In the destructor, she added:
    unregister_chrdev_region(first, 3);
It's all put together, as follows:
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

static int __init ofcd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofcd registered");
    /* Ask the kernel for a free major number and three minors (0, 1, 2) */
    if (alloc_chrdev_region(&first, 0, 3, "Shweta") < 0)
    {
        return -1;
    }
    printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));
    return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
    /* Give back the same three device numbers */
    unregister_chrdev_region(first, 3);
    printk(KERN_INFO "Alvida: ofcd unregistered");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Character Driver");
Summing up
Additionally, before unloading the driver, she peeped into the /proc/devices kernel window to look for the registered major number with the name "Shweta", using cat /proc/devices. It was right there. However, she couldn't find any device file created under /dev with the same major number, so she created them by hand, using mknod, and then tried reading and writing those. Figure 2 shows all these steps.
Please note that the major number 250 may vary from system to system, based on availability. Figure 2 also shows the results Shweta got from reading and writing one of the device files. That reminded her that the second step to connect the device file with the device driver, which is linking the device file operations to the device driver functions, was not yet done. She realised that she needed to dig around for more information to complete this step, and also to figure out the reason for the missing device files under /dev.
This article is a continuation of the series on Linux device drivers, and carries on the discussion on character drivers and their implementation.
In my previous article, I had mentioned that even with the registration for the <major, minor> device range, the device files were not created under /dev; instead, Shweta had to create them manually, using mknod. However, on further study, Shweta figured out a way to automatically create the device files, using the udev daemon. She also learnt the second step to connect the device file with the device driver: linking the device file operations to the device driver functions. Here is what she learnt.
    struct class *cl = class_create(THIS_MODULE, "<device class name>");
Then, the device info (<major, minor>) under this class is populated by:
    device_create(cl, NULL, first, NULL, "<device name format>", ...);
Here, first is the dev_t with the corresponding <major, minor>. The corresponding complementary or inverse calls, which should be called in chronologically reverse order, are as follows:
    device_destroy(cl, first);
    class_destroy(cl);
Refer to Figure 1 for the /sys entries created using chardrv as the <device class name> and mynull as the <device name format>. That also shows the device file, created by udev, based on the <major>:<minor> entry in the dev file.
In case of multiple minors, the device_create() and device_destroy() APIs may be put in a for loop, and the <device name format> string could be useful. For example, the device_create() call in a for loop indexed by i could be as follows:
    device_create(cl, NULL, MKDEV(MAJOR(first), MINOR(first) + i), NULL, "mynull%d", i);
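To see what such a loop hands over for each minor, here is a user-space sketch that works out the device number and the device file name per index. The MAJOR, MINOR and MKDEV macros are re-created locally (with the kernel's 20 minor bits) only so that the sketch builds outside the kernel; dev_t_model, plan_minors and the "mynull%d" name format are assumptions made for illustration.

```c
#include <stdio.h>

/* User-space re-creation of the kernel's dev_t helpers (20 minor bits). */
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)
#define MAJOR(dev) ((unsigned int)((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int)((dev) & MINORMASK))
#define MKDEV(ma, mi) (((unsigned int)(ma) << MINORBITS) | (unsigned int)(mi))

typedef unsigned int dev_t_model;   /* stand-in for the kernel's dev_t */

/* For cnt consecutive minors starting at first, compute the device number
 * and the name that a device_create() call in the loop would be given. */
void plan_minors(dev_t_model first, int cnt,
                 dev_t_model devs[], char names[][16])
{
    int i;

    for (i = 0; i < cnt; i++) {
        devs[i] = MKDEV(MAJOR(first), MINOR(first) + i);
        snprintf(names[i], 16, "mynull%d", i);
    }
}
```

With a first device number of <250, 0> and three minors, the plan comes out as <250, 0> mynull0, <250, 1> mynull1 and <250, 2> mynull2, which are the device files udev would then be expected to create under /dev.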
File operations
Whatever system calls (or, more commonly, file operations) we talk of on a regular file, are applicable to device files as well. That's what we say: a file is a file, and in Linux, almost everything is a file from the user-space perspective. The difference lies in the kernel space, where the virtual file system (VFS) decodes the file type and transfers the file operations to the appropriate channel, like a filesystem module in case of a regular file or directory, and the corresponding device driver in case of a device file. Our discussion focuses on the second case. Now, for VFS to pass the device file operations onto the driver, it should have been informed about it. And yes, that is what is called registering the file operations by the driver with the VFS. This involves two steps. (The parenthesised code refers to the "null driver" code below.) First, let's fill in a file operations structure (struct file_operations pugs_fops) with the desired file operations (my_open, my_close, my_read, my_write, ...) and initialise the character device structure (struct cdev c_dev) with that, using cdev_init(). Then, hand this structure to the VFS using the call cdev_add(). Both cdev_init() and cdev_add() are declared in <linux/cdev.h>. Obviously, the actual file operations (my_open, my_close, my_read, my_write) also had to be coded. So, to start with, let's keep them as simple as possible; let's say, as easy as the "null driver".
Following these steps, Shweta put the pieces together, attempting her first character device driver. Let's see what the outcome was. Here's the complete code, ofcd.c:
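The registration idea boils down to a table of function pointers that someone else calls through. The following user-space sketch models just that; struct fops_model and the dispatch_* helpers are hypothetical stand-ins and have much simpler signatures than the kernel's struct file_operations and the VFS.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for struct file_operations. */
struct fops_model {
    int (*open)(void);
    int (*release)(void);
    ssize_t (*read)(char *buf, size_t len);
    ssize_t (*write)(const char *buf, size_t len);
};

static int my_open(void) { return 0; }
static int my_close(void) { return 0; }

/* Nothing to hand back, like the null driver. */
static ssize_t my_read(char *buf, size_t len)
{
    (void)buf; (void)len;
    return 0;
}

/* Pretend every byte was consumed. */
static ssize_t my_write(const char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len;
}

/* The "driver" fills in the table with its own functions. */
const struct fops_model pugs_fops = {
    .open = my_open, .release = my_close,
    .read = my_read, .write = my_write
};

/* Stand-ins for the dispatch step: call through the table, and report
 * an operation that was never filled in as an error (-1 in this model). */
ssize_t dispatch_read(const struct fops_model *ops, char *buf, size_t len)
{
    return (ops != NULL && ops->read != NULL) ? ops->read(buf, len) : -1;
}

ssize_t dispatch_write(const struct fops_model *ops, const char *buf, size_t len)
{
    return (ops != NULL && ops->write != NULL) ? ops->write(buf, len) : -1;
}
```

The caller never names my_read or my_write directly; it only knows the table, which is why swapping in a different driver changes the behaviour of the same read and write calls.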
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>
#include <linux/err.h>

static dev_t first; // Global variable for the first device number
static struct cdev c_dev; // Global variable for the character device structure
static struct class *cl; // Global variable for the device class

static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}

static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}

static struct file_operations pugs_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
    .read = my_read,
    .write = my_write
};

static int __init ofcd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofcd registered");
    if (alloc_chrdev_region(&first, 0, 1, "Shweta") < 0)
    {
        return -1;
    }
    /* class_create() and device_create() report failure through an
       error pointer rather than NULL, hence the IS_ERR() checks */
    if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
    {
        unregister_chrdev_region(first, 1);
        return -1;
    }
    if (IS_ERR(device_create(cl, NULL, first, NULL, "mynull")))
    {
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        return -1;
    }
    cdev_init(&c_dev, &pugs_fops);
    if (cdev_add(&c_dev, first, 1) < 0) /* any negative value is an error */
    {
        device_destroy(cl, first);
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        return -1;
    }
    return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
    /* Undo the constructor's steps in reverse order */
    cdev_del(&c_dev);
    device_destroy(cl, first);
    class_destroy(cl);
    unregister_chrdev_region(first, 1);
    printk(KERN_INFO "Alvida: ofcd unregistered");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Character Driver");
Shweta repeated the usual build process, with some new test steps, as follows:
1. Build the driver (.ko file) by running make.
2. Load the driver using insmod.
3. List the loaded modules using lsmod.
4. List the major number allocated, using cat /proc/devices.
5. "null driver"-specific experiments (refer to Figure 2 for details).
6. Unload the driver using rmmod.
Summing up
Shweta was certainly happy; all on her own, she'd got a character driver written, which works the same as the standard /dev/null device file. To understand what this means, check the <major, minor> tuple for /dev/null, and similarly, also try out the echo and cat commands with it. However, one thing began to bother Shweta. She had got her own calls (my_open, my_close, my_read, my_write) in her driver, but wondered why they worked so unusually, unlike any regular file system calls. What was unusual? Whatever was written, she got nothing when reading: unusual, at least from the regular file operations' perspective. How would she crack this problem? Watch out for the next article.
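Checking the <major, minor> tuple of a device file can also be done programmatically. The sketch below uses stat() together with the glibc major() and minor() macros from <sys/sysmacros.h>; chardev_numbers is a hypothetical helper name, and the sketch assumes a Linux system with glibc.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Look up the <major, minor> pair behind a character device file.
 * Returns 0 on success, -1 if the path cannot be examined or is not a
 * character device file. */
int chardev_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
        return -1;
    *maj = major(st.st_rdev);   /* st_rdev holds the device number */
    *min = minor(st.st_rdev);
    return 0;
}
```

On Linux, /dev/null is the character device <1, 3>, so that is what the helper reports for it, while a directory such as / is rejected because it is not a character device file.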
So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn't it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang: not inside her head, but a real one at the door. And for sure, there was Pugs. "How come you're here?" exclaimed Shweta. "I saw your tweet. It's cool that you cracked your first character driver all on your own. That's amazing. So, what are you up to now?" asked Pugs. "I'll tell you, on the condition that you do not play spoil sport," replied Shweta. Pugs smiled, "Okay, I'll only give you advice." "And that too, only if I ask for it! I am trying to understand character device file operations," said Shweta. Pugs perked up, saying, "I have an idea. Why don't you decode and then explain what you've understood about it?" Shweta felt that was a good idea. She tailed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.
static int my_open(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: open()\n");
    return 0;
}

static int my_close(struct inode *i, struct file *f)
{
    printk(KERN_INFO "Driver: close()\n");
    return 0;
}

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    return 0;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    return len;
}
Based on the earlier understanding of the return value of the functions in the kernel, my_open() and my_close() are trivial: their return types are int, and both of them return zero, which means success. However, the return type of both my_read() and my_write() is not int; rather, it is ssize_t. On further digging through kernel headers, that turns out to be a signed word. So, returning a negative number would be a usual error. But a non-negative return value would have additional meaning. For the read operation, it would be the number of bytes read, and for the write operation, it would be the number of bytes written.
static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    buf[0] = c;
    return 1;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    c = buf[len - 1];
    return len;
}
"Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out? Wouldn't this direct access of the user-space buf just crash and oops the kernel?" pounced Pugs. Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs just to ensure that user-space buffers are safe to access, and then updated them. With the complete understanding of the APIs, she rewrote the above code snippet as follows:
static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: read()\n");
    if (copy_to_user(buf, &c, 1) != 0)
        return -EFAULT;
    else
        return 1;
}

static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    printk(KERN_INFO "Driver: write()\n");
    /* Keep only the last byte written by the user */
    if (copy_from_user(&c, buf + len - 1, 1) != 0)
        return -EFAULT;
    else
        return len;
}
Then Shweta repeated the usual build-and-test steps as follows:
1. Build the modified "null" driver (.ko file) by running make.
2. Load the driver using insmod.
3. Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull
4. Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C)
5. Unload the driver using rmmod.
On cat'ing /dev/mynull, the output was a non-stop infinite sequence of s, as my_read() gives the last one character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, "If this is to be changed to 'the last character only once', my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read())." Shweta nodded her head obligingly, just to bolster Pugs' ego.
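Pugs' hint about off can be tried out as a plain user-space model before touching the driver. In the sketch below, model_read is a hypothetical stand-in for my_read(): it takes the same kind of offset pointer, but copies with a plain assignment where the driver would use copy_to_user().

```c
#include <stddef.h>
#include <sys/types.h>

static char c = 's';   /* the last character written, e.g. the s of "Pugs" */

/* User-space model of a read that delivers the stored character once.
 * *off records how much this reader has already consumed. */
ssize_t model_read(char *buf, size_t len, off_t *off)
{
    if (*off > 0 || len == 0)
        return 0;       /* already delivered: report end of file */
    buf[0] = c;         /* the driver would use copy_to_user() here */
    *off += 1;          /* remember that the byte has been consumed */
    return 1;
}
```

The first call returns 1 and the next returns 0, which is what a reader like cat takes as the end of the data, so it would print the character once and stop.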
The basic assumption is that the architecture is 32-bit. For others, the memory map would change accordingly. For a 32-bit address bus, the address/memory map ranges from 0 (0x00000000) to 2^32 - 1 (0xFFFFFFFF). An architecture-independent layout of this memory map would be like what's shown in Figure 1: memory (RAM) and device regions (registers and memories of devices) mapped in an interleaved fashion. The actual addresses are architecture-dependent. For example, in an x86 architecture, the initial 3 GB (0x00000000 to 0xBFFFFFFF) is typically for RAM, and the later 1 GB (0xC0000000 to 0xFFFFFFFF) for device maps. However, if the RAM is less, say 2 GB, device maps could start from 2 GB (0x80000000). Run cat /proc/iomem to list the memory map on your system. Run cat /proc/meminfo to get the approximate RAM size on your system. Refer to Figure 2 for a snapshot.
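The address arithmetic above is easy to double-check in a few lines of C. This sketch uses hypothetical helper names, and is_device_map encodes only the simplified x86 example from the text (device maps starting right after the RAM), not a rule that holds on every system.

```c
#include <stdint.h>

/* n gigabytes expressed in bytes (1 GB taken as 2^30 bytes). */
#define GiB(n) ((uint64_t)(n) << 30)

/* Highest address reachable with the given number of address lines
 * (valid for 1 to 63 lines in this sketch). */
uint64_t max_address(unsigned int bits)
{
    return (UINT64_C(1) << bits) - 1;
}

/* Under the simplified example: does addr lie in the device-map region
 * that begins right after ram_bytes of RAM? */
int is_device_map(uint32_t addr, uint64_t ram_bytes)
{
    return addr >= ram_bytes;
}
```

With 32 address lines the top address works out to 0xFFFFFFFF, 3 GB corresponds to 0xC0000000 and 2 GB to 0x80000000, matching the boundaries quoted above.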
Irrespective of the actual values, the addresses referring to RAM are termed physical addresses, and those referring to device maps are termed bus addresses, since these devices are always mapped through some architecture-specific bus: for example, the PCI bus in the x86 architecture, the AMBA bus in ARM architectures, the SuperHyway bus in SuperH architectures, etc. All the architecture-dependent values of these physical and bus addresses are either dynamically configurable, or are to be obtained from the data sheets (i.e., hardware manuals) of the corresponding architecture processors/controllers. The interesting part is that in Linux, none of these are directly accessible; they are to be mapped to virtual addresses and then accessed through them, thus making the RAM and device accesses generic enough. The corresponding APIs (prototyped in <asm/io.h>) for mapping and unmapping the device bus addresses to virtual addresses are:

void *ioremap(unsigned long device_bus_address, unsigned long device_size);
void iounmap(void *virt_addr);
Once mapped to virtual addresses, it depends on the device data sheet as to which set of device registers and/or device memory to read from or write into, by adding their offsets to the virtual address returned by ioremap(). For that, the following are the APIs (also prototyped in <asm/io.h>):

unsigned int ioread8(void *virt_addr);
unsigned int ioread16(void *virt_addr);
unsigned int ioread32(void *virt_addr);
void iowrite8(u8 value, void *virt_addr);
void iowrite16(u16 value, void *virt_addr);
void iowrite32(u32 value, void *virt_addr);
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>
#include <linux/uaccess.h>
#include <asm/io.h>

#define VRAM_BASE 0x000A0000
#define VRAM_SIZE 0x00020000

static void __iomem *vram;
static dev_t first;
static struct cdev c_dev;
static struct class *cl;

static int my_open(struct inode *i, struct file *f)
{
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    int i;
    u8 byte;

    /* Nothing to read at or beyond the end of the video RAM window */
    if (*off >= VRAM_SIZE)
    {
        return 0;
    }
    /* Trim the request to what is left in the window */
    if (*off + len > VRAM_SIZE)
    {
        len = VRAM_SIZE - *off;
    }
    for (i = 0; i < len; i++)
    {
        byte = ioread8((u8 *)vram + *off + i);
        if (copy_to_user(buf + i, &byte, 1))
        {
            return -EFAULT;
        }
    }
    *off += len;

    return len;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
{
    int i;
    u8 byte;

    if (*off >= VRAM_SIZE)
    {
        return 0;
    }
    if (*off + len > VRAM_SIZE)
    {
        len = VRAM_SIZE - *off;
    }
    for (i = 0; i < len; i++)
    {
        if (copy_from_user(&byte, buf + i, 1))
        {
            return -EFAULT;
        }
        iowrite8(byte, (u8 *)vram + *off + i);
    }
    *off += len;

    return len;
}

static struct file_operations vram_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
    .read = my_read,
    .write = my_write
};

static int __init vram_init(void) /* Constructor */
{
    int ret;
    struct device *dev_ret;

    if ((vram = ioremap(VRAM_BASE, VRAM_SIZE)) == NULL)
    {
        printk(KERN_ERR "Mapping video RAM failed\n");
        return -ENOMEM;
    }
    if ((ret = alloc_chrdev_region(&first, 0, 1, "vram")) < 0)
    {
        iounmap(vram); /* Undo the mapping on every failure path */
        return ret;
    }
    /* class_create() and device_create() report failure through ERR_PTR values */
    if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
    {
        unregister_chrdev_region(first, 1);
        iounmap(vram);
        return PTR_ERR(cl);
    }
    if (IS_ERR(dev_ret = device_create(cl, NULL, first, NULL, "vram")))
    {
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        iounmap(vram);
        return PTR_ERR(dev_ret);
    }

    cdev_init(&c_dev, &vram_fops);
    if ((ret = cdev_add(&c_dev, first, 1)) < 0)
    {
        device_destroy(cl, first);
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        iounmap(vram);
        return ret;
    }
    return 0;
}

static void __exit vram_exit(void) /* Destructor */
{
    cdev_del(&c_dev);
    device_destroy(cl, first);
    class_destroy(cl);
    unregister_chrdev_region(first, 1);
    iounmap(vram);
}

module_init(vram_init);
module_exit(vram_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Video RAM Driver");
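The offset and length checks at the top of my_read() and my_write() are easy to get wrong, so it may help to look at them in isolation. The following is a user-space sketch of that clamping logic only; clamp_len() is a hypothetical helper that does not exist in the driver, and VRAM_SIZE simply repeats the value from the listing.

```c
#include <assert.h>
#include <stddef.h>

#define VRAM_SIZE 0x00020000   /* same window size as in the listing */

/*
 * Number of bytes that may be transferred starting at offset `off`
 * when `len` bytes were requested: 0 at or past the end of the window,
 * otherwise `len` trimmed to what is left.
 */
size_t clamp_len(long long off, size_t len)
{
    assert(off >= 0);

    if (off >= VRAM_SIZE)
        return 0;
    if (len > (size_t)(VRAM_SIZE - off))
        len = (size_t)(VRAM_SIZE - off);
    return len;
}
```

A request for 10 bytes at offset 0x1FFFE is trimmed to 2 bytes, and any request at offset 0x20000 or beyond yields 0, which user-space sees as end-of-file.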
Shweta then repeated the usual steps:

1. Build the vram driver (video_ram.ko file) by running make with a changed Makefile.
2. Load the driver using insmod video_ram.ko.
3. Write into /dev/vram, say, using echo -n "0123456789" > /dev/vram.
4. Read the /dev/vram contents using od -t x1 -v /dev/vram | less. (The usual cat /dev/vram can also be used, but that would give all the binary content. od -t x1 shows it as hexadecimal. For more details, run man od.)
5. Unload the driver using rmmod video_ram.

Summing up

With half an hour still left for the end of the practical class, Shweta decided to walk around and possibly help somebody else with their experiments.
are outb, outw and outl, respectively. The equivalent functions/macros (available through the header <asm/io.h>) are as follows:

u8 inb(unsigned long port);
u16 inw(unsigned long port);
u32 inl(unsigned long port);
void outb(u8 value, unsigned long port);
void outw(u16 value, unsigned long port);
void outl(u32 value, unsigned long port);
The basic question that may arise relates to which devices are I/O mapped and what the port addresses of these devices are. The answer is pretty simple. As per x86-standard, all these devices and their mappings are predefined. Figure 1 shows a snippet of these mappings through the kernel window /proc/ioports. The listing includes predefined DMA, the timer and RTC, apart from serial, parallel and PCI bus interfaces, to name a few.
corresponds to the respective bit of the registers. Also, note that the register addresses start from 0 and go up to 7. The interesting thing about this is that a data sheet always gives the register offsets, which then need to be added to the base address of the device, to get the actual register addresses. Who decides the base address and where is it obtained from? Base addresses are typically board/platform specific, unless they are dynamically configurable like in the case of PCI devices. In this case, i.e., a serial device on x86, it is dictated by the x86 architecture, and that precisely was the starting serial port address mentioned above: 0x3F8. Thus, the eight register offsets, 0 to 7, exactly map to the eight port addresses 0x3F8 to 0x3FF. So, these are the actual addresses to be read or written, for reading or writing the corresponding serial registers, to achieve the desired serial operations, as per the register descriptions. All the serial register offsets and the register bit masks are defined in the header <linux/serial_reg.h>. So, rather than hard-coding these values from the data sheet, the corresponding macros could be used instead. All the following code uses these macros, along with the following:
#define SERIAL_PORT_BASE 0x3F8

Operating on the device registers

To summarise the decoding of the PC16550D UART data sheet, here are a few examples of how to do read and write operations of the serial registers and their bits.

Reading and writing the Line Control Register (LCR):

u8 val;

val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Setting and clearing the Divisor Latch Access Bit (DLAB) in LCR:

u8 val;

val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);

/* Setting DLAB */
val |= UART_LCR_DLAB /* 0x80 */;
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

/* Clearing DLAB */
val &= ~UART_LCR_DLAB /* 0x80 */;
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Reading and writing the Divisor Latch:

u8 dlab;
u16 val;

dlab = inb(SERIAL_PORT_BASE + UART_LCR);
dlab |= UART_LCR_DLAB; /* Setting DLAB to access Divisor Latch */
outb(dlab, SERIAL_PORT_BASE + UART_LCR);

val = inw(SERIAL_PORT_BASE + UART_DLL /* 0 */);
outw(val, SERIAL_PORT_BASE + UART_DLL /* 0 */);
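The address and bit arithmetic in the snippets above can be checked without touching any port. The sketch below is a user-space model under stated assumptions: LCR_OFFSET and LCR_DLAB are local stand-ins for UART_LCR (3) and UART_LCR_DLAB (0x80) as annotated in the snippets, and uart_port(), set_dlab() and clear_dlab() are hypothetical helper names. No inb() or outb() is performed.

```c
#include <assert.h>
#include <stdint.h>

#define SERIAL_PORT_BASE 0x3F8
#define LCR_OFFSET       3      /* stand-in for UART_LCR */
#define LCR_DLAB         0x80   /* stand-in for UART_LCR_DLAB */

/* Port address of the register at data-sheet offset 0..7. */
uint16_t uart_port(unsigned offset)
{
    assert(offset <= 7);
    return (uint16_t)(SERIAL_PORT_BASE + offset);
}

/* Value to write back to LCR so that DLAB becomes set. */
uint8_t set_dlab(uint8_t lcr)
{
    return lcr | LCR_DLAB;
}

/* Value to write back to LCR so that DLAB becomes cleared. */
uint8_t clear_dlab(uint8_t lcr)
{
    return lcr & (uint8_t)~LCR_DLAB;
}
```

Here uart_port(LCR_OFFSET) evaluates to 0x3FB, and setting DLAB on an LCR value of 0x03 gives 0x83 while leaving the other bits unchanged, which is why the driver code reads the register before modifying it.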
Blinking an LED
To get a real experience of low-level hardware access and Linux device drivers, the best way would be to play with the Linux device driver kit (LDDK) mentioned above. However, just for a feel of low-level hardware access, a blinking light emitting diode (LED) may be tried, as follows:
Connect a light-emitting diode (LED) with a 330 ohm resistor in series across Pin 3 (Tx) and Pin 5 (Gnd) of the DB9 connector of your PC.

Pull up and down the transmit (Tx) line with a 500 ms delay, by loading and unloading the blink_led driver, using insmod blink_led.ko and rmmod blink_led, respectively.

Driver file blink_led.ko can be created from its source file blink_led.c by running make with the usual driver Makefile. Given below is the complete blink_led.c:
#include <linux/module.h>
#include <linux/version.h>
#include <linux/types.h>
#include <linux/delay.h>
#include <asm/io.h>

#include <linux/serial_reg.h>

#define SERIAL_PORT_BASE 0x3F8

int __init init_module(void)
{
    int i;
    u8 data;

    data = inb(SERIAL_PORT_BASE + UART_LCR);
    for (i = 0; i < 5; i++)
    {
        /* Pulling the Tx line low */
        data |= UART_LCR_SBC;
        outb(data, SERIAL_PORT_BASE + UART_LCR);
        msleep(500);
        /* Defaulting the Tx line high */
        data &= ~UART_LCR_SBC;
        outb(data, SERIAL_PORT_BASE + UART_LCR);
        msleep(500);
    }
    return 0;
}

void __exit cleanup_module(void)
{
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Blinking LED Hack");
Looking ahead
You might have wondered why Shweta is missing from this article? She bunked all the classes! Watch out for the next article to find out why.
Introducing ioctl()
Input/Output Control (ioctl, in short) is a common operation, or system call, available in most driver categories. It is a "one-bill-fits-all" kind of system call. If there is no other system call that meets a particular requirement, then ioctl() is the one to use. Practical examples include volume control for an audio device, display configuration for a video device, reading device registers, and so on: basically, anything to do with device input/output, or device-specific operations, yet versatile enough for any kind of operation (for example, for debugging a driver by querying driver data structures). The question is: how can all this be achieved by a single function prototype? The trick lies in using its two key parameters: command and argument. The command is a number representing an operation. The argument is the corresponding parameter for the operation. The ioctl() function implementation does a switch ... case over the command to implement the corresponding functionality. The following has been its prototype in the Linux kernel for quite some time:
int ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg);

However, from kernel 2.6.35, it changed to:

long ioctl(struct file *f, unsigned int cmd, unsigned long arg);

If there is a need for more arguments, all of them are put in a structure, and a pointer to the structure becomes the "one" command argument. Whether integer or pointer, the argument is taken as a long integer in kernel-space, and accordingly type-cast and processed.
ioctl() is typically implemented as part of the corresponding driver, and then an appropriate function pointer is initialised with it, exactly as in other system calls like open(), read(), etc. For example, in character drivers, it is the ioctl or unlocked_ioctl (since kernel 2.6.35) function pointer field in the struct file_operations that is to be initialised. Again, like other system calls, it can be equivalently invoked from user-space using the ioctl() system call, prototyped in <sys/ioctl.h> as:

int ioctl(int fd, int cmd, ...);

Here, cmd is the same as what is implemented in the driver's ioctl(), and the variable argument construct (...) is a hack to be able to pass any type of argument (though only one) to the driver's ioctl(). Other parameters will be ignored.
Note that both the command and command argument type definitions need to be shared across the driver (in kernel-space) and the application (in user-space). Thus, these definitions are commonly put into header files for each space.
query_ioctl.h:

#ifndef QUERY_IOCTL_H
#define QUERY_IOCTL_H

#include <linux/ioctl.h>

typedef struct
{
    int status, dignity, ego;
} query_arg_t;

#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t *)
#define QUERY_CLR_VARIABLES _IO('q', 2)
#define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t *)

#endif
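The _IO, _IOR and _IOW macros used in the header pack a direction, a "magic" type character, a sequence number and an argument size into one command number. As a rough illustration, the sketch below repeats the three definitions and decodes them in user-space with the _IOC_DIR, _IOC_TYPE, _IOC_NR and _IOC_SIZE macros from <linux/ioctl.h>. The struct cmd_fields and decode_cmd() names are invented for this example, and it assumes the kernel's user-space headers are installed.

```c
#include <assert.h>
#include <linux/ioctl.h>

typedef struct
{
    int status, dignity, ego;
} query_arg_t;

/* Same definitions as in query_ioctl.h */
#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t *)
#define QUERY_CLR_VARIABLES _IO('q', 2)
#define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t *)

/* The four fields packed into a command number (hypothetical helper type). */
struct cmd_fields
{
    unsigned dir, type, nr, size;
};

/* Split a command number back into its fields. */
struct cmd_fields decode_cmd(unsigned int cmd)
{
    struct cmd_fields f = {
        .dir  = _IOC_DIR(cmd),
        .type = _IOC_TYPE(cmd),
        .nr   = _IOC_NR(cmd),
        .size = _IOC_SIZE(cmd),
    };

    assert(f.type <= 0xFF); /* the type field is a single byte */
    return f;
}
```

One detail this makes visible: because the header passes query_arg_t * as the third macro argument, the size recorded in the command is that of a pointer. Passing the structure type itself (query_arg_t) is the more common convention, but either works as long as the driver and the application share the same header.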
query_ioctl.c:

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/version.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/uaccess.h>

#include "query_ioctl.h"

#define FIRST_MINOR 0
#define MINOR_CNT 1

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;
static int status = 1, dignity = 3, ego = 5;

static int my_open(struct inode *i, struct file *f)
{
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    return 0;
}
#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
static int my_ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg)
#else
static long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
#endif
{
    query_arg_t q;

    switch (cmd)
    {
        case QUERY_GET_VARIABLES:
            /* Copy the driver variables out to the user-space structure */
            q.status = status;
            q.dignity = dignity;
            q.ego = ego;
            if (copy_to_user((query_arg_t *)arg, &q, sizeof(query_arg_t)))
            {
                return -EACCES;
            }
            break;
        case QUERY_CLR_VARIABLES:
            status = 0;
            dignity = 0;
            ego = 0;
            break;
        case QUERY_SET_VARIABLES:
            /* Copy the user-space structure in and update the driver variables */
            if (copy_from_user(&q, (query_arg_t *)arg, sizeof(query_arg_t)))
            {
                return -EACCES;
            }
            status = q.status;
            dignity = q.dignity;
            ego = q.ego;
            break;
        default:
            return -EINVAL;
    }

    return 0;
}

static struct file_operations query_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
    .ioctl = my_ioctl
#else
    .unlocked_ioctl = my_ioctl
#endif
};

static int __init query_ioctl_init(void)
{
    int ret;
    struct device *dev_ret;

    if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "query_ioctl")) < 0)
    {
        return ret;
    }

    cdev_init(&c_dev, &query_fops);

    if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
    {
        unregister_chrdev_region(dev, MINOR_CNT); /* Do not leak the device numbers */
        return ret;
    }

    if (IS_ERR(cl = class_create(THIS_MODULE, "char")))
    {
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(cl);
    }
    if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "query")))
    {
        class_destroy(cl);
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(dev_ret);
    }

    return 0;
}

static void __exit query_ioctl_exit(void)
{
    device_destroy(cl, dev);
    class_destroy(cl);
    cdev_del(&c_dev);
    unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(query_ioctl_init);
module_exit(query_ioctl_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Query ioctl() Char Driver");
query_app.c:

#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>

#include "query_ioctl.h"

void get_vars(int fd)
{
    query_arg_t q;

    if (ioctl(fd, QUERY_GET_VARIABLES, &q) == -1)
    {
        perror("query_apps ioctl get");
    }
    else
    {
        printf("Status : %d\n", q.status);
        printf("Dignity: %d\n", q.dignity);
        printf("Ego : %d\n", q.ego);
    }
}
void clr_vars(int fd)
{
    if (ioctl(fd, QUERY_CLR_VARIABLES) == -1)
    {
        perror("query_apps ioctl clr");
    }
}
void set_vars(int fd)
{
    int v;
    query_arg_t q;

    printf("Enter Status: ");
    scanf("%d", &v);
    getchar();
    q.status = v;
    printf("Enter Dignity: ");
    scanf("%d", &v);
    getchar();
    q.dignity = v;
    printf("Enter Ego: ");
    scanf("%d", &v);
    getchar();
    q.ego = v;

    if (ioctl(fd, QUERY_SET_VARIABLES, &q) == -1)
    {
        perror("query_apps ioctl set");
    }
}

int main(int argc, char *argv[])
{
    char *file_name = "/dev/query";
    int fd;
    enum
    {
        e_get,
        e_clr,
        e_set
    } option;

    /* No option means "get"; otherwise accept exactly one of -g, -c, -s */
    if (argc == 1)
    {
        option = e_get;
    }
    else if (argc == 2)
    {
        if (strcmp(argv[1], "-g") == 0)
        {
            option = e_get;
        }
        else if (strcmp(argv[1], "-c") == 0)
        {
            option = e_clr;
        }
        else if (strcmp(argv[1], "-s") == 0)
        {
            option = e_set;
        }
        else
        {
            fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
            return 1;
        }
    }
    else
    {
        fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
        return 1;
    }
    fd = open(file_name, O_RDWR);
    if (fd == -1)
    {
        perror("query_apps open");
        return 2;
    }

    switch (option)
    {
        case e_get:
            get_vars(fd);
            break;
        case e_clr:
            clr_vars(fd);
            break;
        /* The listing breaks off here; the rest is the obvious completion */
        case e_set:
            set_vars(fd);
            break;
        default:
            break;
    }

    close(fd);

    return 0;
}
Now try out query_app.c and query_ioctl.c with the following operations: Build the query_ioctl driver (query_ioctl.ko file) and the application (query_app file) by running make, using the following Makefile:
# If called directly from the command line, invoke the kernel build system.
ifeq ($(KERNELRELEASE),)

KERNEL_SOURCE := /usr/src/linux
PWD := $(shell pwd)

default: module query_app

module:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) modules

clean:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) clean
	${RM} query_app

# Otherwise KERNELRELEASE is defined; we've been invoked from the
# kernel build system and can use its language.
else

# The listing is cut off after "else"; this is the usual closing for a
# single-module driver Makefile.
obj-m := query_ioctl.o

endif
Device Drivers, Part 10: Kernel-Space Debugging in Linux
This article, which is part of the series on Linux device drivers, talks about kernel-space debugging in Linux.

Shweta, back from hospital, was relaxing in the library, reading various books. Ever since she learned of the ioctl way of debugging, she was impatient to find out more about debugging in kernel-space. She was curious about how and where to run the kernel-space debugger, if there was any. This was in contrast with application/user-space debugging, where we have the OS running underneath, and a shell or a GUI over it to run the debugger (like gdb, and the data display debugger, ddd). Then she came across this interesting kernel-space debugging mechanism using kgdb, provided as part of the kernel itself, since kernel 2.6.26.
make mrproper # To clean up properly
make oldconfig # Configure the kernel same as the current running one
make menuconfig # Start the ncurses based menu for further configuration
See the highlighted selections in Figure 1, for how and where these options would be:

KGDB: kernel debugging with remote gdb -> CONFIG_KGDB
KGDB: use kgdb over the serial console -> CONFIG_KGDB_SERIAL_CONSOLE
Compile the kernel with debug info -> CONFIG_DEBUG_INFO
Compile the kernel with frame pointers -> CONFIG_FRAME_POINTER

Once the configuration is saved, build the kernel (run make), and then run make install to install it, along with adding an entry for the installed kernel in the GRUB configuration file. Depending on the distribution, the GRUB configuration file may be /boot/grub/menu.lst, /etc/grub.cfg, or something similar. Once installed, the kgdb-related kernel boot parameters need to be added to this new entry, as shown in the highlighted text in Figure 2.
kgdboc is for gdb connecting over the console, and the basic format is kgdboc=<serial_device>,<baud-rate> where:

<serial_device> is the serial device file (port) on the system running the kernel to be debugged
<baud-rate> is the baud rate of this serial port

For example, kgdboc=ttyS0,115200 selects the first serial port at 115200 baud. kgdbwait tells the kernel to delay booting till a gdb client connects to it; this parameter should be given only after kgdboc.

With this, we're ready to begin. Make a copy of the vmlinux kernel image for use on the gdb client system. Reboot, and at the GRUB menu, choose the new kernel, and then it will wait for gdb to connect over the serial port.

All the above snapshots are with kernel version 2.6.33.14. The same should work for any 2.6.3x release of the kernel source. Also, the snapshots for kgdb are captured over the serial device file /dev/ttyS0, i.e., the first serial port.
(gdb) file vmlinux
(gdb) set remote interrupt-sequence Ctrl-C
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0
(gdb) continue

In the above commands, vmlinux is the kernel image copied from the system to be debugged.
Summing up
By now, Shweta was excited about wanting to try out kgdb. Since she needed two systems to try it out, she went to the Linux device drivers lab. There, she set up the systems and ran gdb as described above.
A basic listing of all detected USB devices can be obtained using the lsusb command, as root. Figure 2 shows this, with and without the pen drive plugged in. A -v option to lsusb provides detailed information.

In many Linux distributions like Mandriva and Fedora, the usbfs driver is loaded as part of the default configuration. This enables the detected USB device details to be viewed in a more techno-friendly way through the /proc window, using cat /proc/bus/usb/devices. Figure 3 shows a typical snippet of the same, clipped around the pen drive-specific section. The listing basically contains one such section for each valid USB device detected on the system.

Coming back to the USB device sections (Figure 3), the first letter on each line represents the various parts of the USB device specification just explained. For example, D for device, C for configuration, I for interface, E for endpoint, etc. Details about these and various others are available in the kernel source, in Documentation/usb/proc_usb_info.txt.
here this would be done with the corresponding protocol layer, the USB core in this case; instead of providing a user-space interface like a device file, it would get connected with the actual device in hardware-space. The USB core APIs for the same are as follows (prototyped in <linux/usb.h>):

int usb_register(struct usb_driver *driver);
void usb_deregister(struct usb_driver *);

As part of the usb_driver structure, the fields to be provided are the driver's name, the ID table for auto-detecting the particular device, and the two callback functions to be invoked by the USB core during a hot plugging and a hot removal of the device, respectively. Putting it all together, pen_register.c would look like what follows:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/usb.h>

static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
{
    printk(KERN_INFO "Pen drive (%04X:%04X) plugged\n", id->idVendor, id->idProduct);
    return 0;
}

static void pen_disconnect(struct usb_interface *interface)
{
    printk(KERN_INFO "Pen drive removed\n");
}

static struct usb_device_id pen_table[] =
{
    { USB_DEVICE(0x058F, 0x6387) },
    {} /* Terminating entry */
};
MODULE_DEVICE_TABLE (usb, pen_table);

static struct usb_driver pen_driver =
{
    .name = "pen_driver",
    .id_table = pen_table,
    .probe = pen_probe,
    .disconnect = pen_disconnect,
};

static int __init pen_init(void)
{
    return usb_register(&pen_driver);
}

static void __exit pen_exit(void)
{
    usb_deregister(&pen_driver);
}

module_init(pen_init);
module_exit(pen_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("USB Pen Registration Driver");
Then, the usual steps for any Linux device driver may be repeated:

Build the driver (.ko file) by running make.
Load the driver using insmod.
List the loaded modules using lsmod.
Unload the driver using rmmod.

But surprisingly, the results wouldn't be as expected. Check dmesg and the proc window to see the various logs and details. This is not because a USB driver is different from a character driver, but there's a catch. Figure 3 shows that the pen drive has one interface (numbered 0), which is already associated with the usual usb-storage driver. Now, in order to get our driver associated with that interface, we need to unload the usb-storage driver (i.e., rmmod usb-storage) and replug the pen drive. Once that's done, the results would be as expected. Figure 5 shows a glimpse of the possible logs and a proc window snippet. Repeat hot-plugging in and hot-plugging out the pen drive to observe the probe and disconnect calls in action.
Summing up
"Finally! Something in action!" a relieved Shweta said. "But it seems like there are so many things (like the device ID table, probe, disconnect, etc.), yet to be understood to get a complete USB device driver in place."

"Yes, you are right. Let's take them up, one by one, with breaks," replied Pugs, taking a break himself.
To be specific, the "E:" lines in the figure show examples of an interrupt endpoint of a UHCI Host Controller, and two bulk endpoints of the pen drive under consideration. Also, the endpoint numbers (in hex) are, respectively, 0x81, 0x01 and 0x82: the MSB of the first and third being 1, indicating "in" endpoints, represented by "(I)" in the figure; the second is an "(O)" or "out" endpoint. "MxPS" specifies the maximum packet size, i.e., the data size that can be transferred in a single go. Again, as expected, for the interrupt endpoint, it is 2 (<=8), and 64 for the bulk endpoints. "Ivl" specifies the interval in milliseconds to be given between two consecutive data packet transfers for proper transfer, and is more significant for the interrupt endpoints.
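The way the direction is read off the endpoint address can be written down in a few lines. This is a stand-alone sketch, not kernel code: EP_DIR_IN, EP_NUM_MASK and the three helper functions are names chosen for this illustration, using the USB convention that the most significant bit of the address gives the direction and the low four bits give the endpoint number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EP_DIR_IN   0x80   /* MSB of the endpoint address: 1 = in, 0 = out */
#define EP_NUM_MASK 0x0F   /* low four bits: endpoint number */

/* True for an "in" endpoint such as 0x81 or 0x82. */
bool ep_is_in(uint8_t addr)
{
    return (addr & EP_DIR_IN) != 0;
}

/* Endpoint number without the direction bit. */
unsigned ep_number(uint8_t addr)
{
    return addr & EP_NUM_MASK;
}

/* Build an endpoint address from its number and direction. */
uint8_t ep_address(unsigned number, bool in)
{
    assert(number <= EP_NUM_MASK);
    return (uint8_t)((in ? EP_DIR_IN : 0) | number);
}
```

Applied to the three addresses above, 0x81 and 0x82 decode to "in" endpoints numbered 1 and 2, while 0x01 is the "out" endpoint numbered 1.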
The interface class may or may not be the same as that of the device class. And depending on the number of endpoints, there would be as many "E" lines, details of which have already been discussed earlier. The "*" after the "C" and "I" represents the currently active configuration and interface, respectively. The "P" line provides the vendor ID, product ID, and the product revision. "S" lines are string descriptors showing up some vendor-specific descriptive information about the device.

"Peeping into cat /proc/bus/usb/devices is good in order to figure out whether a device has been detected or not, and possibly to get the first-cut overview of the device. But most probably this information would be required to write the driver for the device as well. So, is there a way to access it using C code?" Shweta asked.

"Yes, definitely; that's what I am going to tell you about, next. Do you remember that as soon as a USB device is plugged into the system, the USB host controller driver populates its information into the generic USB core layer? To be precise, it puts that into a set of structures embedded into one another, exactly as per the USB specifications," Pugs replied. The following are the exact data structures defined in <linux/usb.h>, ordered here in reverse, for flow clarity:
struct usb_device
{
    struct usb_device_descriptor descriptor;
    struct usb_host_config *config, *actconfig;
};
struct usb_host_config
{
    struct usb_config_descriptor desc;
    struct usb_interface *interface[USB_MAXINTERFACES];
};
struct usb_interface
{
    struct usb_host_interface *altsetting /* array */, *cur_altsetting;
};
struct usb_host_interface
{
    struct usb_interface_descriptor desc;
    struct usb_host_endpoint *endpoint /* array */;
};
struct usb_host_endpoint
{
    struct usb_endpoint_descriptor desc;
};
So, with access to the struct usb_device handle for a specific device, all the USB-specific information about the device can be decoded, as shown through the /proc window. But how does one get the device handle? In fact, the device handle is not available directly in a driver; rather, the per-interface handles (pointers to struct usb_interface) are available, as USB drivers are written for device interfaces rather than the device as a whole. Recall that the probe and disconnect callbacks, which are invoked by the USB core for every interface of the registered device, have the corresponding interface handle as their first parameter. Refer to the prototypes below:
int (*probe)(struct usb_interface *interface, const struct usb_device_id *id);
void (*disconnect)(struct usb_interface *interface);

So, with the interface pointer, all information about the corresponding interface can be accessed, and to get the container device handle, the following macro comes to the rescue:

struct usb_device *device = interface_to_usbdev(interface);

Adding this new learning into last month's registration-only driver gets the following code listing (pen_info.c):

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/usb.h>

static struct usb_device *device;

static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
{
    struct usb_host_interface *iface_desc;
    struct usb_endpoint_descriptor *endpoint;
    int i;

    /* Descriptor of the currently active setting of this interface */
    iface_desc = interface->cur_altsetting;
    printk(KERN_INFO "Pen i/f %d now probed: (%04X:%04X)\n",
            iface_desc->desc.bInterfaceNumber, id->idVendor, id->idProduct);
    printk(KERN_INFO "ID->bNumEndpoints: %02X\n",
            iface_desc->desc.bNumEndpoints);
    printk(KERN_INFO "ID->bInterfaceClass: %02X\n",
            iface_desc->desc.bInterfaceClass);

    /* Walk through the endpoint descriptors of this interface */
    for (i = 0; i < iface_desc->desc.bNumEndpoints; i++)
    {
        endpoint = &iface_desc->endpoint[i].desc;

        printk(KERN_INFO "ED[%d]->bEndpointAddress: 0x%02X\n",
                i, endpoint->bEndpointAddress);
        printk(KERN_INFO "ED[%d]->bmAttributes: 0x%02X\n",
                i, endpoint->bmAttributes);
        printk(KERN_INFO "ED[%d]->wMaxPacketSize: 0x%04X (%d)\n",
                i, endpoint->wMaxPacketSize, endpoint->wMaxPacketSize);
    }

    device = interface_to_usbdev(interface);
    return 0;
}

static void pen_disconnect(struct usb_interface *interface)
{
    printk(KERN_INFO "Pen i/f %d now disconnected\n",
            interface->cur_altsetting->desc.bInterfaceNumber);
}

static struct usb_device_id pen_table[] =
{
    { USB_DEVICE(0x058F, 0x6387) },
    {} /* Terminating entry */
};
MODULE_DEVICE_TABLE (usb, pen_table);

static struct usb_driver pen_driver =
{
    .name = "pen_driver",
    .probe = pen_probe,
    .disconnect = pen_disconnect,
    .id_table = pen_table,
};

static int __init pen_init(void)
{
    return usb_register(&pen_driver);
}

static void __exit pen_exit(void)
{
    usb_deregister(&pen_driver);
}

module_init(pen_init);
module_exit(pen_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("USB Pen Info Driver");
Then, the usual steps for any Linux device driver may be repeated, along with the pen drive steps:

Build the driver (pen_info.ko file) by running make.
Load the driver using insmod pen_info.ko.
Plug in the pen drive (after making sure that the usb-storage driver is not already loaded).
Unplug the pen drive.
Check the output of dmesg for the logs.
Unload the driver using rmmod pen_info.

Figure 2 shows a snippet of the above steps on Pugs' system. Remember to ensure (in the output of cat /proc/bus/usb/devices) that the usual usb-storage driver is not the one associated with the pen drive interface, but rather the pen_info driver.
Summing up
Before taking another break, Pugs shared two of the many mechanisms for a driver to specify its device to the USB core, using the struct usb_device_id table. The first one is by specifying the <vendor id, product id> pair using the USB_DEVICE() macro (as done above). The second one is by specifying the device class/category using the USB_DEVICE_INFO() macro. In fact, many more macros are available in <linux/usb.h> for various combinations. Moreover, several of these macros could be specified in the usb_device_id table (terminated by a null entry), for matching with any one of the criteria, making it possible to write a single driver for possibly many devices.

"Earlier, you mentioned writing multiple drivers for a single device, as well. Basically, how do we selectively register or not register a particular interface of a USB device?" queried Shweta.

"Sure. That's next in line of our discussion, along with the ultimate task in any device driver: the data-transfer mechanisms," replied Pugs.
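To make the idea of a null-terminated ID table a little more concrete, here is a much simplified user-space model of matching by <vendor id, product id> pair. It is only a sketch: struct model_id, pen_ids and id_matches() are invented for this illustration, and the real USB core matching on struct usb_device_id considers more criteria than these two numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one ID table entry (hypothetical type). */
struct model_id
{
    uint16_t vendor, product;
};

/* One <vendor id, product id> pair followed by the terminating null entry. */
const struct model_id pen_ids[] =
{
    { 0x058F, 0x6387 },
    { 0, 0 } /* Terminating entry */
};

/* Walk the table until the null entry; report whether any entry matches. */
bool id_matches(const struct model_id *table, uint16_t vendor, uint16_t product)
{
    assert(table != NULL);

    for (; table->vendor != 0 || table->product != 0; table++)
    {
        if (table->vendor == vendor && table->product == product)
            return true;
    }
    return false;
}
```

Adding more entries before the terminating one is what lets a single driver claim several devices; the loop stops at the first match or at the null entry.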
Device Drivers, Part 13: Data Transfer to and from USB Devices
This article, which is part of the series on Linux device drivers, continues from the previous two articles. It details the ultimate step of data transfer to and from a USB device, using your first USB driver in Linux.
Pugs continued, "To answer your question about how a driver selectively registers or skips a particular interface of a USB device, you need to understand the significance of the return value of the probe() callback." Note that the USB core would invoke probe for all the interfaces of a detected device, except the ones which are already registered; thus, while doing it for the first time, it will probe for all interfaces. Now, if the probe returns 0, it means the driver has registered for that interface. Returning an error code indicates not registering for it. That's all.
"That was simple," commented Shweta.
"Now, let's talk about the ultimate: data transfers to and from a USB device," continued Pugs.
"But before that, tell me, what is this MODULE_DEVICE_TABLE? This has been bothering me since you explained the USB device ID table macros," asked Shweta, urging Pugs to slow down.
"That's trivial stuff. It is mainly for the user-space depmod," he said. "Module" is another term for a driver, which can be dynamically loaded/unloaded. The macro MODULE_DEVICE_TABLE generates two variables in a module's read-only section, which is extracted by depmod and stored in global map files under /lib/modules/<kernel_version>. Two such files are modules.usbmap and modules.pcimap, for USB and PCI device drivers, respectively. This enables auto-loading of these drivers, as we saw the usb-storage driver getting auto-loaded.
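Coming back to the probe() return value for a moment: its effect can be mimicked in user space. The sketch below is an assumption-laden model, not USB core code; struct iface, probe_fn, offer_interfaces() and probe_only_iface0() are hypothetical names. It only shows the rule described above: every unclaimed interface is offered to probe, and a return value of 0 claims it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one interface of a detected device. */
struct iface {
    int number;  /* interface number */
    int claimed; /* non-zero once some driver has registered for it */
};

typedef int (*probe_fn)(const struct iface *);

/* Example probe: register only for interface 0, skip all the others. */
static int probe_only_iface0(const struct iface *i)
{
    return (i->number == 0) ? 0 : -ENODEV;
}

/*
 * Offers every not-yet-claimed interface to the probe callback.
 * A return value of 0 means "registered"; any error code means "skipped".
 * Returns the number of interfaces claimed during this pass.
 */
static int offer_interfaces(struct iface *ifs, size_t n, probe_fn probe)
{
    int count = 0;
    size_t k;

    for (k = 0; k < n; k++) {
        if (ifs[k].claimed)
            continue; /* already registered: not probed again */
        if (probe(&ifs[k]) == 0) {
            ifs[k].claimed = 1;
            count++;
        }
    }
    return count;
}
```

Offering three interfaces to probe_only_iface0 claims exactly one of them; a second pass claims nothing new for interface 0, because it is no longer offered.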
To expose a USB device to user space through a (character) device file, the USB core provides the following APIs, declared in <linux/usb.h>:

int usb_register_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);
void usb_deregister_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);

Usually, we would expect these functions to be invoked in the constructor and the destructor of a module, respectively. However, to achieve the hot-plug-n-play behaviour for the (character) device files corresponding to USB devices, these are instead invoked in the probe and disconnect callbacks, respectively.
The first parameter in the above functions is the interface pointer received as the first parameter in both probe and disconnect. The second parameter, struct usb_class_driver, needs to be populated with the suggested device file name and the set of device file operations, before invoking usb_register_dev. For the actual usage, refer to the functions pen_probe and pen_disconnect in the code listing of pen_driver.c below. Moreover, as the file operations (write, read, etc.) are now provided, that is exactly where we need to do the data transfers to and from the USB device. So, pen_write and pen_read below show the possible calls to usb_bulk_msg() (prototyped in <linux/usb.h>) to do the transfers over the pen drive's bulk endpoints 0x01 and 0x82, respectively. Refer to the 'E' lines of the middle section in Figure 1 for the endpoint number listings of our pen drive.
Refer to the header file <linux/usb.h> under kernel sources, for the complete list of USB core API prototypes for other endpoint-specific data transfer functions like usb_control_msg(), usb_interrupt_msg(), etc. usb_rcvbulkpipe(), usb_sndbulkpipe(), and many other such macros, also defined in <linux/usb.h>, compute the actual endpoint bit-mask to be passed to the various USB core APIs.
Note that a pen drive belongs to the USB mass storage class, which expects a set of SCSI-like commands to be transacted over the bulk endpoints. So, a raw read/write as shown in the code listing below may not really do a data transfer as expected, unless the data is appropriately formatted. But still, this summarises the overall code flow of a USB driver. To get a feel of a real working USB data transfer in a simple and elegant way, one would need some kind of custom USB device, something like the one available here.
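The two endpoint addresses used here, 0x01 and 0x82, each pack a direction and an endpoint number into a single byte. The user-space sketch below decodes them; the helper names ep_is_in() and ep_number() and the two mask names are made up for this illustration, while the bit layout itself (bit 7 for the IN direction, low four bits for the endpoint number) is the one defined by the USB specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define EP_DIR_IN_MASK 0x80 /* bit 7 set: IN, i.e. device to host */
#define EP_NUMBER_MASK 0x0F /* low four bits: endpoint number */

/* True for an IN endpoint (data flows from the device to the host). */
static bool ep_is_in(uint8_t addr)
{
    return (addr & EP_DIR_IN_MASK) != 0;
}

/* Extracts the endpoint number from an endpoint address. */
static unsigned int ep_number(uint8_t addr)
{
    return addr & EP_NUMBER_MASK;
}
```

So 0x82 is endpoint 2 in the IN direction, which is why pen_read uses it, and 0x01 is endpoint 1 in the OUT direction, which is why pen_write uses it.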
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/usb.h>
#include <linux/uaccess.h> /* copy_to_user(), copy_from_user() */

#define MIN(a,b) (((a) <= (b)) ? (a) : (b))
#define BULK_EP_OUT 0x01
#define BULK_EP_IN 0x82
#define MAX_PKT_SIZE 512

static struct usb_device *device;
static struct usb_class_driver class;
static unsigned char bulk_buf[MAX_PKT_SIZE];

static int pen_open(struct inode *i, struct file *f)
{
    return 0;
}
static int pen_close(struct inode *i, struct file *f)
{
    return 0;
}
static ssize_t pen_read(struct file *f, char __user *buf, size_t cnt, loff_t *off)
{
    int retval;
    int read_cnt;

    /* Read the data from the bulk endpoint */
    retval = usb_bulk_msg(device, usb_rcvbulkpipe(device, BULK_EP_IN),
            bulk_buf, MAX_PKT_SIZE, &read_cnt, 5000);
    if (retval)
    {
        printk(KERN_ERR "Bulk message returned %d\n", retval);
        return retval;
    }
    if (copy_to_user(buf, bulk_buf, MIN(cnt, read_cnt)))
    {
        return -EFAULT;
    }

    return MIN(cnt, read_cnt);
}
static ssize_t pen_write(struct file *f, const char __user *buf, size_t cnt, loff_t *off)
{
    int retval;
    int wrote_cnt = MIN(cnt, MAX_PKT_SIZE);

    if (copy_from_user(bulk_buf, buf, MIN(cnt, MAX_PKT_SIZE)))
    {
        return -EFAULT;
    }

    /* Write the data into the bulk endpoint */
    retval = usb_bulk_msg(device, usb_sndbulkpipe(device, BULK_EP_OUT),
            bulk_buf, MIN(cnt, MAX_PKT_SIZE), &wrote_cnt, 5000);
    if (retval)
    {
        printk(KERN_ERR "Bulk message returned %d\n", retval);
        return retval;
    }

    return wrote_cnt;
}

static struct file_operations fops =
{
    .open = pen_open,
    .release = pen_close,
    .read = pen_read,
    .write = pen_write,
};

static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
{
    int retval;

    device = interface_to_usbdev(interface);

    class.name = "usb/pen%d";
    class.fops = &fops;
    if ((retval = usb_register_dev(interface, &class)) < 0)
    {
        /* Something prevented us from registering this driver */
        printk(KERN_ERR "Not able to get a minor for this device.\n");
    }
    else
    {
        printk(KERN_INFO "Minor obtained: %d\n", interface->minor);
    }

    return retval;
}

static void pen_disconnect(struct usb_interface *interface)
{
    usb_deregister_dev(interface, &class);
}

/* Table of devices that work with this driver */
static struct usb_device_id pen_table[] =
{
    { USB_DEVICE(0x058F, 0x6387) },
    {} /* Terminating entry */
};
MODULE_DEVICE_TABLE (usb, pen_table);

static struct usb_driver pen_driver =
{
    .name = "pen_driver",
    .probe = pen_probe,
    .disconnect = pen_disconnect,
    .id_table = pen_table,
};

static int __init pen_init(void)
{
    int result;

    /* Register this driver with the USB subsystem */
    if ((result = usb_register(&pen_driver)))
    {
        printk(KERN_ERR "usb_register failed. Error number %d\n", result);
    }
    return result;
}

static void __exit pen_exit(void)
{
    /* Deregister this driver with the USB subsystem */
    usb_deregister(&pen_driver);
}

module_init(pen_init);
module_exit(pen_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("USB Pen Device Driver");
As a reminder, the usual steps for any Linux device driver may be repeated with the above code, along with the following steps for the pen drive:
Build the driver (pen_driver.ko) by running make.
Load the driver using insmod pen_driver.ko.
Plug in the pen drive (after making sure that the usb-storage driver is not already loaded).
Check for the dynamic creation of /dev/pen0 (0 being the minor number obtained; check the dmesg logs for the value on your system).
Possibly try some write/read on /dev/pen0 (you will most likely get connection timeout and/or broken pipe errors, because of non-conforming SCSI commands).
Unplug the pen drive and look for /dev/pen0 to be gone.
Unload the driver using rmmod pen_driver.
Meanwhile, Pugs hooked up his first-of-its-kind creation, the Linux device driver kit (LDDK), into his system for a live demonstration of the USB data transfers.
"Aha! Finally a cool complete working USB driver," quipped Shweta, excited.
"Want to have more fun? We could do a block driver over it," added Pugs.
"Oh! Really?" asked Shweta, with glee.
"Yes. But before that, we need to understand the partitioning mechanisms," commented Pugs.
Device Drivers, Part 14: A Dive Inside the Hard Disk for Understanding Partitions
This article, which is part of the series on Linux device drivers, takes you on a tour inside a hard disk.
"Doesn't it sound like a mechanical engineering subject: The design of the hard disk?" questioned Shweta.
"Yes, it does. But understanding it gives us an insight into its programming aspect," reasoned Pugs, while waiting for the commencement of the seminar on storage systems. The seminar started with a few hard disks in the presenter's hand and then a dive into her system, showing the output of fdisk -l (Figure 1).
The first line shows the hard disk size in human-friendly format and in bytes. The second line mentions the number of logical heads, logical sectors per track, and the actual number of cylinders on the disk, together known as the geometry of the disk. The 255 heads indicate the number of platters or disks, as one read-write head is needed per disk. Let's number them, say D1, D2, ..., D255. Now, each disk would have the same number of concentric circular tracks, starting from the outside to the inside. In the above case, there are 60,801 such tracks per disk. Let's number them, say T1, T2, ..., T60801. And a particular track number from all the disks forms a cylinder of the same number. For example, tracks T2 from D1, D2, ..., D255 will together form the cylinder C2. Now, each track has the same number of logical sectors, 63 in our case, say S1, S2, ..., S63. And each sector is typically 512 bytes. Given this data, one can actually compute the total usable hard disk size, using the following formula:
Usable hard disk size in bytes = (Number of heads or disks) * (Number of tracks per disk) * (Number of sectors per track) * (Number of bytes per sector, i.e. sector size)
For the disk under consideration, it would be: 255 * 60801 * 63 * 512 bytes = 500105249280 bytes. Note that this number may be slightly less than the actual hard disk size (500107862016 bytes, in our case). The reason is that the formula doesn't consider the bytes in the last partial or incomplete cylinder. The primary reason for that is the difference between today's technology of organising the actual physical disk geometry and the traditional geometry representation using heads, cylinders and sectors.
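The arithmetic above is easy to check with a few lines of user-space C. This is just a worked example of the formula; the function name chs_capacity_bytes() is invented for it, and the numbers are the ones quoted from the fdisk output.

```c
#include <stdint.h>

/*
 * Usable size in bytes for a disk described by a logical
 * heads/cylinders/sectors geometry. 64-bit arithmetic is used,
 * as the result does not fit into 32 bits.
 */
static uint64_t chs_capacity_bytes(uint64_t heads, uint64_t cylinders,
                                   uint64_t sectors_per_track,
                                   uint64_t sector_size)
{
    return heads * cylinders * sectors_per_track * sector_size;
}
```

For 255 heads, 60801 cylinders, 63 sectors per track and 512-byte sectors, this gives 500105249280 bytes. The difference from the reported 500107862016 bytes is 2612736 bytes, i.e., 5103 sectors, which is indeed less than one complete cylinder of 255 * 63 = 16065 sectors.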
Note that in the fdisk output, we referred to the heads and sectors per track as logical, not physical. One may ask: if today's disks don't have such physical geometry concepts, then why still maintain them and represent them in a logical form? The main reason is to be able to continue with the same concepts of partitioning, and be able to maintain the same partition table formats, especially for the most prevalent DOS-type partition tables, which heavily depend on this simplistic geometry. Note the computation of cylinder size (255 heads * 63 sectors / track * 512 bytes / sector = 8225280 bytes) in the third line and then the demarcation of partitions in units of complete cylinders.
A DOS-type partition table contains four partition entries, each 16 bytes long, which can be described by the following C structure:

typedef struct
{
    unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
    unsigned char start_head;
    unsigned char start_sec:6;
    unsigned char start_cyl_hi:2;
    unsigned char start_cyl;
    unsigned char part_type;
    unsigned char end_head;
    unsigned char end_sec:6;
    unsigned char end_cyl_hi:2;
    unsigned char end_cyl;
    unsigned int abs_start_sec; // 32-bit field
    unsigned int sec_in_part; // 32-bit field
} PartEntry;

This partition table, followed by the two-byte signature 0xAA55, resides at the end of the disk's first sector, commonly known as the Master Boot Record (MBR). Hence, the starting offset of this partition table within the MBR is 512 - (4 * 16 + 2) = 446. Also, a 4-byte disk signature is placed at offset 440. The remaining top 440 bytes of the MBR are typically used to place the first piece of boot code, which is loaded by the BIOS to boot the system from the disk. The part_info.c listing contains these various definitions, along with code for parsing and printing a formatted output of the partition table.
From the partition table entry structure, it can be noted that the start and end cylinder fields are only 10 bits long, thus allowing a maximum of 1024 cylinders (numbered 0 to 1023) only. However, for today's huge hard disks, this is in no way sufficient. Hence, in overflow cases, the corresponding <head, cylinder, sector> triplet in the partition table entry is set to the maximum value, and the actual value is computed using the last two fields: the absolute start sector number (abs_start_sec) and the number of sectors in this partition (sec_in_part). The code for this too is in part_info.c:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define SECTOR_SIZE 512
#define MBR_SIZE SECTOR_SIZE
#define MBR_DISK_SIGNATURE_OFFSET 440
#define MBR_DISK_SIGNATURE_SIZE 4
#define PARTITION_TABLE_OFFSET 446
#define PARTITION_ENTRY_SIZE 16 // sizeof(PartEntry)
#define PARTITION_TABLE_SIZE 64 // sizeof(PartTable)
#define MBR_SIGNATURE_OFFSET 510
#define MBR_SIGNATURE_SIZE 2
#define MBR_SIGNATURE 0xAA55
#define BR_SIZE SECTOR_SIZE
#define BR_SIGNATURE_OFFSET 510
#define BR_SIGNATURE_SIZE 2
#define BR_SIGNATURE 0xAA55

typedef struct
{
    unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
    unsigned char start_head;
    unsigned char start_sec:6;
    unsigned char start_cyl_hi:2;
    unsigned char start_cyl;
    unsigned char part_type;
    unsigned char end_head;
    unsigned char end_sec:6;
    unsigned char end_cyl_hi:2;
    unsigned char end_cyl;
    unsigned int abs_start_sec; // 32-bit field
    unsigned int sec_in_part; // 32-bit field
} PartEntry;

typedef struct
{
    unsigned char boot_code[MBR_DISK_SIGNATURE_OFFSET];
    unsigned int disk_signature; // 32-bit field
    unsigned short pad;
    unsigned char pt[PARTITION_TABLE_SIZE];
    unsigned short signature;
} MBR;

/* Recomputes and prints (H/C/S) from an absolute sector number,
 * assuming the logical geometry of 255 heads and 63 sectors per track */
void print_computed(unsigned int sector)
{
    unsigned int heads, cyls, tracks, sectors;

    sectors = sector % 63 + 1 /* As indexed from 1 */;
    tracks = sector / 63;
    cyls = tracks / 255 + 1 /* As indexed from 1 */;
    heads = tracks % 255;
    printf("(%3u/%5u/%1u)", heads, cyls, sectors);
}

int main(int argc, char *argv[])
{
    char *dev_file = "/dev/sda";
    int fd, i, rd_val;
    MBR m;
    PartEntry *p = (PartEntry *)(m.pt);

    if (argc == 2)
    {
        dev_file = argv[1];
    }
    if ((fd = open(dev_file, O_RDONLY)) == -1)
    {
        fprintf(stderr, "Failed opening %s: ", dev_file);
        perror("");
        return 1;
    }
    if ((rd_val = read(fd, &m, sizeof(m))) != sizeof(m))
    {
        fprintf(stderr, "Failed reading %s: ", dev_file);
        perror("");
        close(fd);
        return 2;
    }
    close(fd);
    printf("\nDOS type Partition Table of %s:\n", dev_file);
    printf(" B Start (H/C/S) End (H/C/S) Type StartSec TotSec\n");
    for (i = 0; i < 4; i++)
    {
        printf("%d:%d (%3d/%4d/%2d) (%3d/%4d/%2d) %02X %10u %9u\n",
            i + 1, !!(p[i].boot_type & 0x80),
            p[i].start_head,
            1 + ((p[i].start_cyl_hi << 8) | p[i].start_cyl),
            p[i].start_sec,
            p[i].end_head,
            1 + ((p[i].end_cyl_hi << 8) | p[i].end_cyl),
            p[i].end_sec,
            p[i].part_type,
            p[i].abs_start_sec, p[i].sec_in_part);
    }
    printf("\nRe-computed Partition Table of %s:\n", dev_file);
    printf(" B Start (H/C/S) End (H/C/S) Type StartSec TotSec\n");
    for (i = 0; i < 4; i++)
    {
        printf("%d:%d ", i + 1, !!(p[i].boot_type & 0x80));
        print_computed(p[i].abs_start_sec);
        printf(" ");
        print_computed(p[i].abs_start_sec + p[i].sec_in_part - 1);
        printf(" %02X %10u %9u\n", p[i].part_type,
            p[i].abs_start_sec, p[i].sec_in_part);
    }
    printf("\n");
    return 0;
}
As the above is an application, compile it with gcc part_info.c -o part_info, and then run ./part_info /dev/sda to check out your primary partitioning information on /dev/sda. Figure 2 shows the output of ./part_info on the presenter's system. Compare it with the fdisk output in Figure 1.

./part_info /dev/sda ## Displays the partition table on /dev/sda
fdisk -l /dev/sda ## To display and compare the partition table entries with the above

In case you have multiple hard disks (/dev/sdb, ...), hard disk device files with other names (/dev/hda, ...), or an extended partition, you may try ./part_info <device_file_name> on them as well. Trying it on an extended partition would give you the information about the starting partition table of the logical partitions.
Right now, we have carefully and selectively played (read-only) with the system's hard disk. Why carefully? Since otherwise, we may render our system non-bootable. But no learning is complete without a total exploration. Hence, in our next session, we will create a dummy disk in RAM and do destructive exploration on it.
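The bit-fields in PartEntry depend on how the compiler lays them out. As a complementary sketch, the same start fields can be decoded from the 16 raw bytes of an entry using only shifts and masks. The names struct part_fields, le32() and decode_entry() are invented for this illustration; the byte layout follows the structure shown earlier (sector in the low 6 bits of byte 2, the two high cylinder bits in its top 2 bits, and little-endian 32-bit fields at offsets 8 and 12).

```c
#include <stdint.h>

/* Decoded view of selected fields of one 16-byte partition table entry. */
struct part_fields {
    unsigned int bootable;   /* 1 if the 0x80 flag is set */
    unsigned int type;       /* partition type, e.g. 0x83 */
    unsigned int start_head;
    unsigned int start_sec;  /* 6 bits, indexed from 1 */
    unsigned int start_cyl;  /* 10 bits, indexed from 0 */
    uint32_t abs_start_sec;  /* absolute start sector */
    uint32_t sec_in_part;    /* number of sectors in the partition */
};

/* Reads a little-endian 32-bit value from four consecutive bytes. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static struct part_fields decode_entry(const uint8_t e[16])
{
    struct part_fields f;

    f.bootable = (e[0] & 0x80) != 0;
    f.start_head = e[1];
    f.start_sec = e[2] & 0x3F;                       /* low 6 bits */
    f.start_cyl = ((unsigned int)(e[2] >> 6) << 8) | e[3]; /* 2 + 8 bits */
    f.type = e[4];
    f.abs_start_sec = le32(e + 8);
    f.sec_in_part = le32(e + 12);
    return f;
}
```

An entry whose bytes 1 to 3 are 0xFE, 0xFF, 0xFF decodes to head 254, sector 63 and cylinder 1023, which is the "maximum value" triplet used in the overflow case described above.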
Device Drivers, Part 15: Disk on RAM, Playing with Block Drivers
This article, which is part of the series on Linux device drivers, experiments with a dummy hard disk on RAM to demonstrate how block drivers work.
After a delicious lunch, theory makes the audience sleepy. So, let's start with the code itself.
partition.h
#ifndef PARTITION_H
#define PARTITION_H

#include <linux/types.h>

extern void copy_mbr_n_br(u8 *disk);
#endif
partition.c
#include <linux/string.h>

#include "partition.h"

#define ARRAY_SIZE(a) (sizeof(a) / sizeof(*a))

#define SECTOR_SIZE 512
#define MBR_SIZE SECTOR_SIZE
#define MBR_DISK_SIGNATURE_OFFSET 440
#define MBR_DISK_SIGNATURE_SIZE 4
#define PARTITION_TABLE_OFFSET 446
#define PARTITION_ENTRY_SIZE 16 // sizeof(PartEntry)
#define PARTITION_TABLE_SIZE 64 // sizeof(PartTable)
#define MBR_SIGNATURE_OFFSET 510
#define MBR_SIGNATURE_SIZE 2
#define MBR_SIGNATURE 0xAA55
#define BR_SIZE SECTOR_SIZE
#define BR_SIGNATURE_OFFSET 510
#define BR_SIGNATURE_SIZE 2
#define BR_SIGNATURE 0xAA55

/* Geometry used by the tables below: 1 head, 32 sectors per track */
#define SECTORS_PER_CYL 32

typedef struct
{
    unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
    unsigned char start_head;
    unsigned char start_sec:6;
    unsigned char start_cyl_hi:2;
    unsigned char start_cyl;
    unsigned char part_type;
    unsigned char end_head;
    unsigned char end_sec:6;
    unsigned char end_cyl_hi:2;
    unsigned char end_cyl;
    unsigned int abs_start_sec; // 32-bit field
    unsigned int sec_in_part; // 32-bit field
} PartEntry;

typedef PartEntry PartTable[4];

/* Primary partition table: rb1, rb2 (extended) and rb3 */
static PartTable def_part_table =
{
    {
        .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x2, .start_cyl = 0x00,
        .part_type = 0x83,
        .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x09,
        .abs_start_sec = 0x00000001, .sec_in_part = 0x0000013F
    },
    {
        .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x1, .start_cyl = 0x0A,
        .part_type = 0x05, /* Extended partition */
        .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x13,
        .abs_start_sec = 0x00000140, .sec_in_part = 0x00000140
    },
    {
        .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x1, .start_cyl = 0x14,
        .part_type = 0x83,
        .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x1F,
        .abs_start_sec = 0x00000280, .sec_in_part = 0x00000180
    },
    {
    }
};
/* Start cylinders of the boot records of the three logical partitions */
static unsigned int def_log_part_br_cyl[] = {0x0A, 0x0E, 0x12};
static const PartTable def_log_part_table[] =
{
    {
        {
            .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x2, .start_cyl = 0x0A,
            .part_type = 0x83,
            .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x0D,
            .abs_start_sec = 0x00000001, .sec_in_part = 0x0000007F
        },
        {
            .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x1, .start_cyl = 0x0E,
            .part_type = 0x05,
            .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x11,
            .abs_start_sec = 0x00000080, .sec_in_part = 0x00000080
        },
    },
    {
        {
            .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x2, .start_cyl = 0x0E,
            .part_type = 0x83,
            .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x11,
            .abs_start_sec = 0x00000001, .sec_in_part = 0x0000007F
        },
        {
            .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x1, .start_cyl = 0x12,
            .part_type = 0x05,
            .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x13,
            .abs_start_sec = 0x00000100, .sec_in_part = 0x00000040
        },
    },
    {
        {
            .boot_type = 0x00, .start_head = 0x00, .start_sec = 0x2, .start_cyl = 0x12,
            .part_type = 0x83,
            .end_head = 0x00, .end_sec = 0x20, .end_cyl = 0x13,
            .abs_start_sec = 0x00000001, .sec_in_part = 0x0000003F
        },
    }
};

static void copy_mbr(u8 *disk)
{
    memset(disk, 0x0, MBR_SIZE);
    *(unsigned int *)(disk + MBR_DISK_SIGNATURE_OFFSET) = 0x36E5756D;
    memcpy(disk + PARTITION_TABLE_OFFSET, &def_part_table, PARTITION_TABLE_SIZE);
    *(unsigned short *)(disk + MBR_SIGNATURE_OFFSET) = MBR_SIGNATURE;
}
static void copy_br(u8 *disk, int start_cylinder, const PartTable *part_table)
{
    memset(disk, 0x0, BR_SIZE);
    memcpy(disk + PARTITION_TABLE_OFFSET, part_table, PARTITION_TABLE_SIZE);
    *(unsigned short *)(disk + BR_SIGNATURE_OFFSET) = BR_SIGNATURE;
}
void copy_mbr_n_br(u8 *disk)
{
    int i;

    copy_mbr(disk);
    for (i = 0; i < ARRAY_SIZE(def_log_part_table); i++)
    {
        /* Each boot record sits in the first sector of its start cylinder */
        copy_br(disk + (def_log_part_br_cyl[i] * SECTORS_PER_CYL * SECTOR_SIZE),
            def_log_part_br_cyl[i], &def_log_part_table[i]);
    }
}
ram_device.h
#ifndef RAMDEVICE_H
#define RAMDEVICE_H

#define RB_SECTOR_SIZE 512

extern int ramdevice_init(void);
extern void ramdevice_cleanup(void);
extern void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors);
extern void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors);
#endif
ram_device.c
#include <linux/types.h>
#include <linux/vmalloc.h>
#include <linux/string.h>
#include <linux/errno.h>

#include "ram_device.h"
#include "partition.h"

#define RB_DEVICE_SIZE 1024 /* sectors */
/* So, total device size = 1024 * 512 bytes = 512 KiB */

/* Array where the disk stores its data */
static u8 *dev_data;

/* Allocates the disk and returns its size in sectors, or a negative error */
int ramdevice_init(void)
{
    dev_data = vmalloc(RB_DEVICE_SIZE * RB_SECTOR_SIZE);
    if (dev_data == NULL)
        return -ENOMEM;
    /* Set up its partition tables */
    copy_mbr_n_br(dev_data);
    return RB_DEVICE_SIZE;
}

void ramdevice_cleanup(void)
{
    vfree(dev_data);
}

void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors)
{
    memcpy(dev_data + sector_off * RB_SECTOR_SIZE, buffer,
        sectors * RB_SECTOR_SIZE);
}
void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors)
{
    memcpy(buffer, dev_data + sector_off * RB_SECTOR_SIZE,
        sectors * RB_SECTOR_SIZE);
}
ram_block.c
/* Disk on RAM Driver */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/types.h>
#include <linux/genhd.h>
#include <linux/blkdev.h>
#include <linux/errno.h>

#include "ram_device.h"

#define RB_FIRST_MINOR 0
#define RB_MINOR_CNT 16

static u_int rb_major = 0;

/*
 * The internal structure representation of our Device
 */
static struct rb_device
{
    /* Size is the size of the device (in sectors) */
    unsigned int size;
    /* For exclusive access to our request queue */
    spinlock_t lock;
    /* Our request queue */
    struct request_queue *rb_queue;
    /* This is kernel's representation of an individual disk device */
    struct gendisk *rb_disk;
} rb_dev;

static int rb_open(struct block_device *bdev, fmode_t mode)
{
    unsigned unit = iminor(bdev->bd_inode);

    printk(KERN_INFO "rb: Inode number is %d\n", unit);

    if (unit > RB_MINOR_CNT)
        return -ENODEV;
    return 0;
}

static int rb_close(struct gendisk *disk, fmode_t mode)
{
    printk(KERN_INFO "rb: Device is closed\n");
    return 0;
}

/*
 * Actual Data transfer
 */
static int rb_transfer(struct request *req)
{
    //struct rb_device *dev = (struct rb_device *)(req->rq_disk->private_data);

    int dir = rq_data_dir(req);
    sector_t start_sector = blk_rq_pos(req);
    unsigned int sector_cnt = blk_rq_sectors(req);

    struct bio_vec *bv;
    struct req_iterator iter;

    sector_t sector_offset;
    unsigned int sectors;
    u8 *buffer;

    int ret = 0;

    //printk(KERN_DEBUG "rb: Dir:%d; Sec:%lld; Cnt:%d\n", dir, start_sector, sector_cnt);

    sector_offset = 0;
    rq_for_each_segment(bv, req, iter)
    {
        buffer = page_address(bv->bv_page) + bv->bv_offset;
        if (bv->bv_len % RB_SECTOR_SIZE != 0)
        {
            printk(KERN_ERR "rb: Should never happen: "
                "bio size (%d) is not a multiple of RB_SECTOR_SIZE (%d).\n"
                "This may lead to data truncation.\n",
                bv->bv_len, RB_SECTOR_SIZE);
            ret = -EIO;
        }
        sectors = bv->bv_len / RB_SECTOR_SIZE;
        printk(KERN_DEBUG "rb: Sector Offset: %lld; Buffer: %p; Length: %d sectors\n",
            sector_offset, buffer, sectors);
        if (dir == WRITE) /* Write to the device */
        {
            ramdevice_write(start_sector + sector_offset, buffer, sectors);
        }
        else /* Read from the device */
        {
            ramdevice_read(start_sector + sector_offset, buffer, sectors);
        }
        sector_offset += sectors;
    }
    if (sector_offset != sector_cnt)
    {
        printk(KERN_ERR "rb: bio info doesn't match with the request info\n");
        ret = -EIO;
    }

    return ret;
}

/*
 * Represents a block I/O request for us to execute
 */
static void rb_request(struct request_queue *q)
{
    struct request *req;
    int ret;

    /* Gets the current request from the dispatch queue */
    while ((req = blk_fetch_request(q)) != NULL)
    {
#if 0
        /*
         * This function tells us whether we are looking at a filesystem request
         * - one that moves block of data
         */
        if (!blk_fs_request(req))
        {
            printk(KERN_NOTICE "rb: Skip non-fs request\n");
            /* We pass 0 to indicate that we successfully completed the request */
            __blk_end_request_all(req, 0);
            //__blk_end_request(req, 0, blk_rq_bytes(req));
            continue;
        }
#endif
        ret = rb_transfer(req);
        __blk_end_request_all(req, ret);
        //__blk_end_request(req, ret, blk_rq_bytes(req));
    }
}

/*
 * The operations supported on the block device files
 */
static struct block_device_operations rb_fops =
{
    .owner = THIS_MODULE,
    .open = rb_open,
    .release = rb_close,
};

/*
 * Registration and initialisation of the driver
 */
static int __init rb_init(void)
{
    int ret;

    /* Set up our RAM Device */
    if ((ret = ramdevice_init()) < 0)
    {
        return ret;
    }
    rb_dev.size = ret;

    /* Get Registered */
    rb_major = register_blkdev(rb_major, "rb");
    if (rb_major <= 0)
    {
        printk(KERN_ERR "rb: Unable to get Major Number\n");
        ramdevice_cleanup();
        return -EBUSY;
    }

    /* Get a request queue (here queue is created) */
    spin_lock_init(&rb_dev.lock);
    rb_dev.rb_queue = blk_init_queue(rb_request, &rb_dev.lock);
    if (rb_dev.rb_queue == NULL)
    {
        printk(KERN_ERR "rb: blk_init_queue failure\n");
        unregister_blkdev(rb_major, "rb");
        ramdevice_cleanup();
        return -ENOMEM;
    }

    /*
     * Add the gendisk structure. Memory allocation is involved here;
     * the minor count is passed because the device will support
     * that many partitions
     */
    rb_dev.rb_disk = alloc_disk(RB_MINOR_CNT);
    if (!rb_dev.rb_disk)
    {
        printk(KERN_ERR "rb: alloc_disk failure\n");
        blk_cleanup_queue(rb_dev.rb_queue);
        unregister_blkdev(rb_major, "rb");
        ramdevice_cleanup();
        return -ENOMEM;
    }

    /* Setting the major number */
    rb_dev.rb_disk->major = rb_major;
    /* Setting the first minor number */
    rb_dev.rb_disk->first_minor = RB_FIRST_MINOR;
    /* Initializing the device operations */
    rb_dev.rb_disk->fops = &rb_fops;
    /* Driver-specific own internal data */
    rb_dev.rb_disk->private_data = &rb_dev;
    rb_dev.rb_disk->queue = rb_dev.rb_queue;
    /*
     * If you do not want partition information to show up in
     * cat /proc/partitions, set this flag
     */
    //rb_dev.rb_disk->flags = GENHD_FL_SUPPRESS_PARTITION_INFO;
    sprintf(rb_dev.rb_disk->disk_name, "rb");
    /* Setting the capacity of the device in its gendisk structure */
    set_capacity(rb_dev.rb_disk, rb_dev.size);

    /* Adding the disk to the system */
    add_disk(rb_dev.rb_disk);
    /* Now the disk is "live" */
    printk(KERN_INFO "rb: Ram Block driver initialised (%d sectors; %d bytes)\n",
        rb_dev.size, rb_dev.size * RB_SECTOR_SIZE);

    return 0;
}
/*
 * Unregistration and clean-up of the driver
 */
static void __exit rb_cleanup(void)
{
    del_gendisk(rb_dev.rb_disk);
    put_disk(rb_dev.rb_disk);
    blk_cleanup_queue(rb_dev.rb_queue);
    unregister_blkdev(rb_major, "rb");
    ramdevice_cleanup();
}

module_init(rb_init);
module_exit(rb_cleanup);

MODULE_LICENSE("GPL");
You can also download the code demonstrated from here. As usual, executing make will build the Disk on RAM driver (dor.ko), combining the three C files. Check out the Makefile to see how.
Makefile
# If called directly from the command line, invoke the kernel build system.
ifeq ($(KERNELRELEASE),)

	KERNEL_SOURCE := /usr/src/linux
	PWD := $(shell pwd)
default: module

module:
	$(MAKE) -C $(KERNEL_SOURCE) M=$(PWD) modules

clean:
	$(MAKE) -C $(KERNEL_SOURCE) M=$(PWD) clean

# Otherwise KERNELRELEASE is defined; we've been invoked from the
# kernel build system and can use its language.
else

	obj-m := dor.o
	dor-y := ram_block.o ram_device.o partition.o

endif
To clean the built files, run the usual make clean. Once built, the following are the experimental steps (refer to Figures 1 to 3).
Figure 2: xxd showing the initial data on the first partition (/dev/rb1)
Please note that all these need to be executed with root privileges:
Load the driver dor.ko using insmod. This would create the block device files representing the disk on 512 KiB of RAM, with three primary and three logical partitions.
Check out the automatically created block device files (/dev/rb*). /dev/rb is the entire disk, which is 512 KiB in size. rb1, rb2 and rb3 are the primary partitions, with rb2 being the extended partition and containing three logical partitions rb5, rb6 and rb7.
Read the entire disk (/dev/rb) using the disk dump utility dd.
Zero out the first sector of the disk's first partition (/dev/rb1), again using dd.
Write some text into the disk's first partition (/dev/rb1) using cat.
Display the initial contents of the first partition (/dev/rb1) using the xxd utility. See Figure 2 for xxd output.
Display the partition information for the disk using fdisk. See Figure 3 for fdisk output.
Quick-format the third primary partition (/dev/rb3) as a vfat filesystem (like your pen drive), using mkfs.vfat (Figure 3).
Mount the newly formatted partition using mount, say at /mnt (Figure 3).
The disk usage utility df would now show this partition mounted at /mnt (Figure 3). You may go ahead and store files there, but remember that this is a disk on RAM, and so is non-persistent.
Unload the driver using rmmod dor after unmounting the partition using umount /mnt. All data on the disk will be lost.
The code in partition.c and partition.h is responsible for the partition information, like the number, type, size, etc., that is shown using fdisk. The ram_block.c file is the core block driver implementation, exposing the disk on RAM (DOR) as the block device files (/dev/rb*) to user-space. In other words, four of the five files (ram_device.* and partition.*) form the horizontal layer of the device driver, and ram_block.c forms the vertical (block) layer of the device driver. So, let's understand that in detail.
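The request queue of the block driver is created using the following API:

struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *);

We provide the request-processing function and the initialised concurrency protection spin-lock as parameters. The corresponding queue clean-up function is given below:

void blk_cleanup_queue(struct request_queue *);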
The request-processing function (rb_request() in ram_block.c) then fetches and processes the requests in a loop:

while ((req = blk_fetch_request(q)) != NULL) /* Fetching a request */
{
    /* Processing the request: the actual data transfer */
    ret = rb_transfer(req); /* Our custom function */
    /* Informing that the request has been processed with return of ret */
    __blk_end_request_all(req, ret);
}
A request is decoded in rb_transfer() using the following macros:

rq_data_dir(req); /* Operation type: 0 - read from device; otherwise write to device */
blk_rq_pos(req); /* Starting sector to process */
blk_rq_sectors(req); /* Total sectors to process */
rq_for_each_segment(bv, req, iter) /* Iterator to extract individual buffers */

rq_for_each_segment() is the special one, which iterates over the struct request (req) using iter, extracting the individual buffer information into the struct bio_vec (bv: basic input/output vector) on each iteration. And then, on each extraction, the appropriate data transfer is done, based on the operation type, invoking one of the following APIs from ram_device.c:
void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors); void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors);
Check out the complete code of rb_transfer() in ram_block.c.
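The sector-to-byte addressing behind these two APIs can be tried out in user space as well. The following is a hedged sketch and not the driver code: struct ram_disk and the ram_disk_*() functions are hypothetical stand-ins that use calloc()/free() in place of vmalloc()/vfree(), and, unlike the listing, they also check the requested range.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical user-space stand-in for the disk on RAM. */
struct ram_disk {
    uint8_t *data;  /* sectors * SECTOR_SIZE bytes */
    size_t sectors; /* disk size in sectors */
};

static int ram_disk_init(struct ram_disk *d, size_t sectors)
{
    d->data = calloc(sectors, SECTOR_SIZE);
    if (d->data == NULL)
        return -1;
    d->sectors = sectors;
    return 0;
}

static void ram_disk_free(struct ram_disk *d)
{
    free(d->data);
    d->data = NULL;
    d->sectors = 0;
}

/* Returns 1 if the sector range [off, off + cnt) lies within the disk. */
static int range_ok(const struct ram_disk *d, size_t off, size_t cnt)
{
    return off <= d->sectors && cnt <= d->sectors - off;
}

/* Copies 'cnt' sectors from buf to the disk, starting at sector 'off'. */
static int ram_disk_write(struct ram_disk *d, size_t off,
                          const uint8_t *buf, size_t cnt)
{
    if (!range_ok(d, off, cnt))
        return -1;
    memcpy(d->data + off * SECTOR_SIZE, buf, cnt * SECTOR_SIZE);
    return 0;
}

/* Copies 'cnt' sectors from the disk to buf, starting at sector 'off'. */
static int ram_disk_read(const struct ram_disk *d, size_t off,
                         uint8_t *buf, size_t cnt)
{
    if (!range_ok(d, off, cnt))
        return -1;
    memcpy(buf, d->data + off * SECTOR_SIZE, cnt * SECTOR_SIZE);
    return 0;
}
```

Writing two sectors at sector offset 3 and reading them back from the same offset returns the same bytes, while a request that runs past the last sector is rejected instead of overrunning the buffer.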
Summing up
With that, we have actually learnt the beautiful block drivers by traversing through the design of a hard disk and playing around with partitioning, formatting and various other raw operations on a hard disk. Thanks for patiently listening. Now, the session is open for questions; please feel free to leave your queries as comments.
After many months, Shweta and Pugs got together for some peaceful technical romancing. All through, they had been using all kinds of kernel windows, especially through the /proc virtual filesystem (using cat), to help them decode various details of Linux device drivers. Here's a non-exhaustive summary listing:
/proc/modules: dynamically loaded modules
/proc/devices: registered character and block major numbers
/proc/iomem: on-system physical RAM and bus device addresses
/proc/ioports: on-system I/O port addresses (especially for x86 systems)
/proc/interrupts: registered interrupt request numbers
/proc/softirqs: registered soft IRQs
/proc/kallsyms: running kernel symbols, including from loaded modules
/proc/partitions: currently connected block devices and their partitions
/proc/filesystems: currently active filesystem drivers
/proc/swaps: currently active swaps
/proc/cpuinfo: information about the CPU(s) on the system
/proc/meminfo: information about the memory on the system, viz., RAM, swap, ...
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/jiffies.h>
#include <linux/uaccess.h> /* copy_from_user() */

static struct proc_dir_entry *parent, *file, *link;
static int state = 0;

int time_read(char *page, char **start, off_t off, int count, int *eof, void *data)
{
    int len, val;
    unsigned long act_jiffies;

    len = sprintf(page, "state = %d\n", state);
    act_jiffies = jiffies - INITIAL_JIFFIES;
    val = jiffies_to_msecs(act_jiffies);
    switch (state)
    {
        case 0:
            len += sprintf(page + len, "time = %ld jiffies\n", act_jiffies);
            break;
        case 1:
            len += sprintf(page + len, "time = %d msecs\n", val);
            break;
        case 2:
            len += sprintf(page + len, "time = %ds %dms\n",
                val / 1000, val % 1000);
            break;
        case 3:
            val /= 1000;
            len += sprintf(page + len, "time = %02d:%02d:%02d\n",
                val / 3600, (val / 60) % 60, val % 60);
            break;
        default:
            len += sprintf(page + len, "<not implemented>\n");
            break;
    }
    len += sprintf(page + len, "{offset = %ld; count = %d;}\n", off, count);

    return len;
}
int time_write(struct file *file, const char __user *buffer, unsigned long count, void *data)
{
    char kbuf[2];

    /* Accept only a single digit, optionally followed by a newline */
    if ((count == 0) || (count > 2))
        return count;
    /* The buffer is a user-space pointer, so copy it in before use */
    if (copy_from_user(kbuf, buffer, count))
        return -EFAULT;
    if ((count == 2) && (kbuf[1] != '\n'))
        return count;
    if ((kbuf[0] < '0') || ('9' < kbuf[0]))
        return count;
    state = kbuf[0] - '0';
    return count;
}

static int __init proc_win_init(void)
{
    if ((parent = proc_mkdir("anil", NULL)) == NULL)
    {
        return -1;
    }
    if ((file = create_proc_entry("rel_time", 0666, parent)) == NULL)
    {
        remove_proc_entry("anil", NULL);
        return -1;
    }
    file->read_proc = time_read;
    file->write_proc = time_write;
    if ((link = proc_symlink("rel_time_l", parent, "rel_time")) == NULL)
    {
        remove_proc_entry("rel_time", parent);
        remove_proc_entry("anil", NULL);
        return -1;
    }
    link->uid = 0;
    link->gid = 100;
    return 0;
}

static void __exit proc_win_exit(void)
{
    remove_proc_entry("rel_time_l", parent);
    remove_proc_entry("rel_time", parent);
    remove_proc_entry("anil", NULL);
}

module_init(proc_win_init);
module_exit(proc_win_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Kernel window /proc Demonstration Driver");
And then Pugs did the following:
- Built the driver file (proc_window.ko) using the usual driver's Makefile.
- Loaded the driver using insmod.
- Showed various experiments using the newly created proc windows. (Refer to Figure 1.)
- And finally, unloaded the driver using rmmod.
gid: Group ID of the file
Additionally, for a regular file, the following two function pointers for reading and writing over the file could be provided, respectively:
int (*read_proc)(char *page, char **start, off_t off, int count, int *eof, void *data)
int (*write_proc)(struct file *file, const char __user *buffer, unsigned long count, void *data)
write_proc() is very similar to the character driver's file operation write(). The above implementation lets the user write a digit from 0 to 9, and accordingly sets the internal state. read_proc() in the above implementation provides the current state, and the time since the system booted up, in different units based on the current state: jiffies in state 0; milliseconds in state 1; seconds and milliseconds in state 2; hours, minutes and seconds in state 3; and <not implemented> in other states. To check the computation accuracy, Figure 2 highlights the system uptime in the output of top. read_proc's page parameter is a page-sized buffer, typically to be filled up with count bytes from the offset off. But more often than not (because there is so little content), just the page is filled up, ignoring all other parameters.
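The state handling described above can be tried out in user space before touching the kernel. The following is a minimal sketch, not the driver code itself: the function names parse_state() and format_time() are hypothetical, the hz parameter stands in for the kernel's HZ, and a plain multiplication replaces jiffies_to_msecs().

```c
#include <stdio.h>

/*
 * Mirror of the write-side rule: accept one digit, optionally followed by
 * a newline. Returns the new state, or the old one if the input is rejected.
 */
int parse_state(const char *buf, unsigned long count, int state)
{
    if (count == 0 || count > 2)
        return state;
    if (count == 2 && buf[1] != '\n')
        return state;
    if (buf[0] < '0' || buf[0] > '9')
        return state;
    return buf[0] - '0';
}

/*
 * Mirror of the read-side formatting: 'ticks' is the number of timer ticks
 * since boot and 'hz' the (assumed) ticks per second.
 */
int format_time(char *out, size_t size, int state, unsigned long ticks, unsigned int hz)
{
    int val = (int)((unsigned long long)ticks * 1000ULL / hz); /* milliseconds */

    switch (state) {
    case 0:
        return snprintf(out, size, "time = %lu jiffies\n", ticks);
    case 1:
        return snprintf(out, size, "time = %d msecs\n", val);
    case 2:
        return snprintf(out, size, "time = %ds %dms\n", val / 1000, val % 1000);
    case 3:
        val /= 1000; /* whole seconds */
        return snprintf(out, size, "time = %02d:%02d:%02d\n",
                        val / 3600, (val / 60) % 60, val % 60);
    default:
        return snprintf(out, size, "<not implemented>\n");
    }
}
```

For instance, with hz taken as 100, 366312 ticks correspond to 3663.12 seconds, which state 2 reports as 3663s 120ms and state 3 as 01:01:03.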
All the /proc-related structure definitions and function declarations are available through <linux/proc_fs.h>. The jiffies-related function declarations and macro definitions are in <linux/jiffies.h>. As a special note, the actual jiffies are calculated by subtracting INITIAL_JIFFIES, since on boot-up, jiffies is initialised to INITIAL_JIFFIES instead of zero.
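Why the subtraction is safe even when the counter wraps around can be seen with a small user-space sketch. It models a 32-bit jiffies counter and assumes a tick rate (HZ) of 250 purely for illustration. The macro mirrors the kernel's choice of starting the counter 300 seconds short of wrap-around, and ticks_since_boot() is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed tick rate for this sketch; in the kernel it is a build-time setting */
#define HZ 250

/* A 32-bit counter that starts 300 seconds (5 minutes) before it wraps to zero */
#define INITIAL_JIFFIES ((uint32_t)(-300 * HZ))

/*
 * Ticks elapsed since boot. Unsigned subtraction is modulo 2^32, so the
 * result stays correct even after the counter has wrapped past zero.
 */
uint32_t ticks_since_boot(uint32_t jiffies_now)
{
    return jiffies_now - INITIAL_JIFFIES;
}
```

In this model, 400 seconds after boot, the raw counter has already wrapped and holds a small number, yet ticks_since_boot() still yields 400 * HZ.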
Summing up
"Hey Pugs! Why did you set the folder name to anil? Who is this Anil? You could have used my name, or maybe yours," suggested Shweta. "Ha! That's a surprise. My real name is Anil; it's just that everyone in college knows me as Pugs," smiled Pugs. Watch out for further technical romancing from Pugs a.k.a. Anil.
Each of these exports the symbol passed as its parameter, additionally putting it in one of the default, _gpl or _gpl_future sections, respectively. Hence, only one of them needs to be used for a particular symbol, though the symbol could be either a variable name or a function name. Here's the complete code (our_glob_syms.c) to demonstrate this:
#include <linux/module.h>
#include <linux/device.h>

/* Not static: both are exported to other modules (see our_glob_syms.h) */
struct class *cool_cl;
struct class *get_cool_cl(void)
{
    return cool_cl;
}

EXPORT_SYMBOL(cool_cl);
EXPORT_SYMBOL_GPL(get_cool_cl);

static int __init glob_sym_init(void)
{
    if (IS_ERR(cool_cl = class_create(THIS_MODULE, "cool")))
    /* Creates /sys/class/cool/ */
    {
        return PTR_ERR(cool_cl);
    }
    return 0;
}

static void __exit glob_sym_exit(void)
{
    /* Removes /sys/class/cool/ */
    class_destroy(cool_cl);
}

module_init(glob_sym_init);
module_exit(glob_sym_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs.com>");
MODULE_DESCRIPTION("Global Symbols exporting Driver");
Each exported symbol also has a corresponding structure placed into each of the kernel symbol table (__ksymtab), kernel string table (__kstrtab), and kernel CRC table (__kcrctab) sections, marking it to be globally accessible. Figure 1 shows a filtered snippet of the /proc/kallsyms kernel window, before and after loading the module our_glob_syms.ko, which has been compiled using the usual driver Makefile.
The following code shows the supporting header file (our_glob_syms.h), to be included by modules using the exported symbols cool_cl and get_cool_cl:
#ifndef OUR_GLOB_SYMS_H
#define OUR_GLOB_SYMS_H

#ifdef __KERNEL__
#include <linux/device.h>

extern struct class *cool_cl;
extern struct class *get_cool_cl(void);
#endif

#endif
Figure 1 also shows the file Module.symvers, generated by compiling the module our_glob_syms. This contains the various details of all the exported symbols in its directory. Apart from including the above header file, modules using the exported symbols should possibly also have this file Module.symvers in their build directory. Note that the <linux/device.h> header in the above examples is being included for the various class-related declarations and definitions, which have already been covered in the earlier discussion on character drivers.
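To peek into Module.symvers programmatically, a line of it can be split into its fields. The sketch below assumes each line carries four whitespace-separated fields (CRC, symbol name, module path and export type); newer kernels may append further fields, which this sketch simply ignores. The struct symver_entry type and the parse_symvers_line() helper are hypothetical names, not part of any kernel tool.

```c
#include <stdio.h>

/* One parsed line of Module.symvers (field sizes are arbitrary sketch limits) */
struct symver_entry {
    unsigned int crc;      /* symbol CRC, written in hex in the file */
    char symbol[64];       /* exported symbol name, e.g. cool_cl */
    char module[256];      /* path of the module exporting it */
    char export_type[32];  /* e.g. EXPORT_SYMBOL or EXPORT_SYMBOL_GPL */
};

/* Returns 1 on success, 0 if the line does not start with the four fields */
int parse_symvers_line(const char *line, struct symver_entry *e)
{
    return sscanf(line, "%x %63s %255s %31s",
                  &e->crc, e->symbol, e->module, e->export_type) == 4;
}
```

Feeding it a made-up line such as "0x1a2b3c4d get_cool_cl /tmp/demo/our_glob_syms EXPORT_SYMBOL_GPL" (the CRC and the path here are invented for illustration) fills in the four fields, after which the export type tells whether the symbol is restricted to GPL-compatible modules.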
Module parameters
Being aware of passing command-line arguments to an application, it would be natural to ask if something similar can be done with a module, and the answer is: yes, it can. Parameters can be passed to a module while loading it, for instance, when using insmod. Interestingly enough, and in contrast to the command-line arguments to an application, these can be modified even later, through sysfs interactions. The module parameters are set up using the following macro (defined in <linux/moduleparam.h>, included through <linux/module.h>):
module_param(name, type, perm)
Here, name is the parameter name, type is the type of the parameter, and perm refers to the permissions of the sysfs file corresponding to this parameter. The supported type values are: byte, short, ushort, int, uint, long, ulong, charp (character pointer), bool or invbool (inverted Boolean). The following module code (module_param.c) demonstrates a module parameter:
#include <linux/module.h>
#include <linux/kernel.h>

static int cfg_value = 3;
module_param(cfg_value, int, 0764);

static int __init mod_par_init(void)
{
    printk(KERN_INFO "Loaded with %d\n", cfg_value);
    return 0;
}

static void __exit mod_par_exit(void)
{
    printk(KERN_INFO "Unloaded cfg value: %d\n", cfg_value);
}

module_init(mod_par_init);
module_exit(mod_par_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Module Parameter demonstration Driver");
Note that before the parameter setup, a variable of the same name and compatible type needs to be defined. Subsequently, the following steps and experiments are shown in Figures 2 and 3:
- Building the driver (module_param.ko file) using the usual driver Makefile
- Loading the driver using insmod (with and without parameters)
- Various experiments through the corresponding /sys entries
- And finally, unloading the driver using rmmod
Note the following:
- Initial value (3) of cfg_value becomes its default value when insmod is done without any parameters.
- Permission 0764 gives rwx to the user, rw- to the group, and r-- to the others on the file cfg_value under the parameters directory of module_param under /sys/module/.
Check for yourself:
- The output of dmesg/tail on every insmod and rmmod, for the printk outputs.
- Try writing into the /sys/module/module_param/parameters/cfg_value file as a normal (non-root) user.
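How an octal permission such as 0764 maps to the rwx triplets can be checked with a few lines of user-space C. This is only an illustrative sketch; perm_to_string() is a hypothetical helper and not part of the module parameter API.

```c
/*
 * Render the low nine permission bits as an ls-style string.
 * 'out' must have room for nine characters plus the terminating NUL.
 */
void perm_to_string(unsigned int perm, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    int i;

    /* Bit 8 is the user's r, bit 0 is the others' x */
    for (i = 0; i < 9; i++)
        out[i] = (perm & (1u << (8 - i))) ? flags[i] : '-';
    out[9] = '\0';
}
```

Calling perm_to_string(0764, buf) leaves "rwxrw-r--" in buf: 7 (binary 111) for the user, 6 (110) for the group and 4 (100) for the others.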
Summing up
With this, the duo have a fairly good understanding of Linux drivers, and are all set to start working on their final semester project. Any guesses what their project is about? Hint: They have picked up one of the most daunting Linux driver topics. Let us see how they fare with it next month.