How to utilize ioctl when writing a Linux kernel module for 3.10

The ioctl command numbers should be unique across the system in order to prevent errors caused by issuing the right command to the wrong device.

To help programmers create unique ioctl command codes, these codes have been
split up into several bitfields.  Such bitfields must only be manipulated with the _IO family of macros (_IO, _IOR, _IOW and _IOWR) defined in linux/ioctl.h.

To devise a new set of ioctl codes, first read Documentation/ioctl/ioctl-number.txt and then check how my example puts the theory into practice (code here).



What is a kernel timer and how to use it in Linux 3.10?

Kernel timers are used to delay a function’s execution until a specified time interval has elapsed. The function will run on the CPU on which it was submitted.

These are the facts to know when we have to use kernel timers:

* Every time a timer interrupt occurs, the value of an internal kernel counter is incremented.

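* The counter is initialized at system boot, so it represents the number of clock ticks since the last boot (conceptually it starts at 0, although the kernel offsets the initial value so that wraparound bugs show up early).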

* Driver writers often need to know the value of that counter. It can be read through the jiffies variable.

* Kernel timers are based on the clock tick.

* The functions scheduled to run almost certainly do not run while the process that registered them is executing. In fact, kernel timers are run as the result of a “software interrupt”.

When we are outside of process context (that is, in software interrupt context), we must observe the following rules:

* No access to user space is allowed.
* The current pointer is not meaningful and cannot be used.
* No sleeping or scheduling may be performed. Do not call schedule, wait_event, semaphores, or any other function that could sleep. For example, calling kmalloc(…, GFP_KERNEL) is against the rules.

An example using the older jiffies-resolution timer API.

What is a completion and how to use one on Linux kernel 3.10

What’s a completion?
A completion is like a kernel semaphore which starts blocked.
But why don’t I use a semaphore that starts blocked?
Because semaphores have been heavily optimized for the “available” case, whereas a task waiting for some other activity to finish will almost always have to block.

kernel example code here

The device node that we are going to use

Note: the major number was found in the output of the dmesg command after I inserted the new kernel module with the insmod command.

[eplauchu@lapMine ~]# mknod /dev/completition_practice_driver c 247 0

Compile & Test

— Step 1
Compile the test program and run it on Linux terminal A, then type the letter ‘r’ (the process will go into a waiting state).

[root@lapMine completition]# g++ testcomp.cpp
[root@lapMine completition]# ./a.out

— Step 2
Run the same program on Linux terminal B.

[root@lapMine completition]# ./a.out

— Step 3
Check how the second process releases the completion that the process from step 1 is waiting on.

[root@lapMine completition]# ./a.out
Reading Data from virtual device 123456789

Hardware IO

Buses: Paths for flow of information between the processor, memory and peripherals.

*** Types of Buses ***

Data bus: a set of lines that transfers data in parallel.
Address bus: transmits addresses.
Control bus: transmits control information (it allows data to go back and forth between the CPU, RAM and IO devices).

*** Controlling the IO ***

The IO bus: the name given to the part of the control bus that connects a CPU to an IO device (through a combination of IO ports, interfaces and device controllers).

IO ports: a peripheral device is controlled by reading and writing a set of consecutive addresses, or registers.  Before a driver can access an IO port, its usage has to be registered with the kernel (for example with request_region()).

IO Ports x86: On the x86 architecture the IO address space is 64 KB in size. IO ports can be addressed as individual 8, 16 or 32 bit ports (16 and 32 bit ports should be aligned).

*** Problems with optimizations ***

Memory barriers: A driver must make sure that no caching is performed and no reordering takes place when it operates on IO ports, because compiler and hardware optimizations that are harmless for normal memory accesses can break such operations. Placing memory barriers between the operations prevents the reordering.

Address Types in Linux Kernel

Different Linux kernel functions require different types of addresses.

The following is a list of the address types used in Linux:

User virtual addresses: the addresses seen by user programs. User addresses are either 32 or 64 bits long, and each process has its own virtual address space.

Physical addresses: (a better definition of physical addresses here)

Bus addresses: These are used between the peripheral buses and memory. Often they are the same as the physical addresses used by the processor, but that’s not necessarily the case. Some architectures can provide an I/O memory management unit (IOMMU) that remaps addresses between a bus and main memory.

Kernel logical addresses: these make up the normal address space of the kernel (a definition of logical address here). All Linux processes running in user mode use the same pair of segments to address instructions and data, and so do all processes running in kernel mode. All these segments start at 0x00000000 and reach the addressing limit of 2^32 - 1, so logical addresses coincide with the corresponding linear addresses.

Kernel virtual addresses: Also known as linear addresses. All logical addresses are kernel virtual addresses, but many kernel virtual addresses are not logical addresses.

Additional details

The Linux kernel on 32-bit x86 splits the 4 GB virtual address space between user space and the kernel, by default 3 GB for user space and 1 GB for the kernel (this split can be changed at kernel configuration time).
[Figure: Addres_space_linux]
The maximum amount of physical memory that can be handled directly by the kernel is the amount that can be mapped into the kernel’s portion of the virtual address space, minus the space needed for the kernel code itself.

Important notes:

Low memory: Memory for which logical addresses exist in kernel space.  The kernel cannot directly manipulate memory that is not mapped into the kernel’s address space.
High memory: Memory for which logical addresses do not exist, because it is beyond the address range set aside for kernel virtual addresses. It tends to be reserved for user land processes. Before accessing a specific high memory page, the kernel must set up an explicit virtual mapping to make that page available in the kernel’s address space.



Contiguous Memory Allocator (or CMA)

The CMA acts as a modular framework for physically contiguous memory management, which is not tied to any particular memory allocation method or strategy. Memory for each device is allocated according to machine-specific configurations, which can also be loaded at run time (avoiding kernel recompilation). Such configurations list the memory regions (even from different memory banks) assigned to each device.


The CMA framework is needed because:
1. Various embedded device drivers have their own memory allocation code, AKA pluggable allocators, so CMA acts as an intermediary between device drivers and pluggable allocators.
2. Various embedded devices have no scatter-gather or I/O map support and require contiguous memory.
3. Various embedded devices impose additional requirements on the buffers (allocation in a particular location or memory bank, alignment to a particular memory boundary, or simply a large buffer size).

Important Facts about Memory Regions: Each region has its own type, size, alignment demand, start address (physical address where it should be placed) and an allocation algorithm assigned.

Important Facts about Pluggable allocators:  One can develop a new algorithm for allocating and managing memory and then plug it in as a pluggable allocator.

Important Facts about CMA API: it operates on pages and page frame numbers (PFNs) and provides no mechanism for maintaining cache coherency, so device drivers should never call this API directly.

The CMA is integrated with the DMA subsystem (the usual calls to the DMA API should work as usual).

Memory addressing – Segmentation in Hardware (THE CRITICAL AND IMPORTANT STUFF)

Physical Page: The basic unit of memory management that the kernel deals with. Each architecture defines its own page size in asm/page.h. There is a struct page for each physical page on the system.

MMU: The hardware in today’s microprocessors (made up of the segmentation and paging units) that manages memory and performs virtual to physical address translation. It performs the translation by consulting the system’s page tables.

Logical Address: the kind of address used in real mode (the mode in which the OS is bootstrapped, although it can only map 1 MB of linear addresses) and in protected mode (the mode that enforces hardware I/O and memory protection and allows working with a 4 GB linear address space). It is based on a segmented architecture, in which x86 DOS developers were forced to divide their programs into segments (click here for a profound explanation why logical addresses were created). This kind of address is included in machine language instructions to specify the address of an operand or of an instruction. In protected mode it consists of a segment selector (16 bits) and an offset (32 bits).

Linear address: also known as virtual address (usually represented by a 32-bit unsigned integer ranging from 0x00000000 to 0xFFFFFFFF). Segment selectors are held in the segmentation registers (cs, ds, ss, plus es, fs and gs, which may refer to arbitrary data segments); these registers live in the processor so that segment selectors can be retrieved quickly. Every time a segment selector is loaded into a segmentation register, the corresponding segment descriptor (an 8-byte object describing the segment characteristics) is loaded from a descriptor table in memory (the GDT or an LDT, whose addresses are held in the gdtr and ldtr registers) into a non-programmable CPU register. The segmentation unit then adds the base address field of the segment descriptor to the offset of the logical address, which turns the logical address into a linear one.

Physical address: Used to address memory cells in memory chips. They correspond to the electrical signals sent along the address pins of the microprocessor to the memory bus.