We finished the last chapter by examining the global interrupt controller. We were able to trace the path of a timer interrupt all the way up to the bcm2836_chained_handle_irq function. The next logical step is to see how the timer driver handles this interrupt. However, before we can do this, you need to familiarize yourself with a few important concepts related to timer functionality. All of them are explained in the official kernel documentation, and I strongly advise you to read that document. But for those who are too busy to read it, I can provide my own brief explanation of the mentioned concepts.
In the next section, we are going to see how the system timer is used to implement clock sources, clock events and sched_clock functionality.
As usual, we start the exploration of a particular device by finding its location in the device tree. The system timer node is defined here. You can keep this definition open for a while because we are going to reference it several times.
Next, we need to use the compatible property to figure out the location of the corresponding driver. The driver can be found here. The first thing we are going to look at is the bcm2835_timer structure.
struct bcm2835_timer {
	void __iomem *control;
	void __iomem *compare;
	int match_mask;
	struct clock_event_device evt;
	struct irqaction act;
};
This structure contains all the state needed for the driver to function. The control and compare fields hold the addresses of the corresponding memory mapped registers, match_mask is used to determine which of the 4 available timer interrupts we are going to use, the evt field contains a structure that is passed to the clock events framework, and act is an irq action that is used to connect the current driver with the interrupt controller.
Next we are going to look at bcm2835_timer_init, which is the driver initialization function. It is large, but not as difficult as it might seem at first glance.
static int __init bcm2835_timer_init(struct device_node *node)
{
	void __iomem *base;
	u32 freq;
	int irq, ret;
	struct bcm2835_timer *timer;

	base = of_iomap(node, 0);
	if (!base) {
		pr_err("Can't remap registers\n");
		return -ENXIO;
	}

	ret = of_property_read_u32(node, "clock-frequency", &freq);
	if (ret) {
		pr_err("Can't read clock-frequency\n");
		goto err_iounmap;
	}

	system_clock = base + REG_COUNTER_LO;
	sched_clock_register(bcm2835_sched_read, 32, freq);

	clocksource_mmio_init(base + REG_COUNTER_LO, node->name,
		freq, 300, 32, clocksource_mmio_readl_up);

	irq = irq_of_parse_and_map(node, DEFAULT_TIMER);
	if (irq <= 0) {
		pr_err("Can't parse IRQ\n");
		ret = -EINVAL;
		goto err_iounmap;
	}

	timer = kzalloc(sizeof(*timer), GFP_KERNEL);
	if (!timer) {
		ret = -ENOMEM;
		goto err_iounmap;
	}

	timer->control = base + REG_CONTROL;
	timer->compare = base + REG_COMPARE(DEFAULT_TIMER);
	timer->match_mask = BIT(DEFAULT_TIMER);
	timer->evt.name = node->name;
	timer->evt.rating = 300;
	timer->evt.features = CLOCK_EVT_FEAT_ONESHOT;
	timer->evt.set_next_event = bcm2835_time_set_next_event;
	timer->evt.cpumask = cpumask_of(0);
	timer->act.name = node->name;
	timer->act.flags = IRQF_TIMER | IRQF_SHARED;
	timer->act.dev_id = timer;
	timer->act.handler = bcm2835_time_interrupt;

	ret = setup_irq(irq, &timer->act);
	if (ret) {
		pr_err("Can't set up timer IRQ\n");
		goto err_iounmap;
	}

	clockevents_config_and_register(&timer->evt, freq, 0xf, 0xffffffff);

	pr_info("bcm2835: system timer (irq = %d)\n", irq);

	return 0;

err_iounmap:
	iounmap(base);
	return ret;
}
Now let's take a closer look at this function.
base = of_iomap(node, 0);
if (!base) {
	pr_err("Can't remap registers\n");
	return -ENXIO;
}
It starts by mapping the memory registers and obtaining the register base address. You should already be familiar with this part.
ret = of_property_read_u32(node, "clock-frequency", &freq);
if (ret) {
	pr_err("Can't read clock-frequency\n");
	goto err_iounmap;
}

system_clock = base + REG_COUNTER_LO;
sched_clock_register(bcm2835_sched_read, 32, freq);
Next, the sched_clock subsystem is initialized. sched_clock needs to read the timer counter register each time it is executed, and bcm2835_sched_read is passed as the first argument to assist with this task. The second argument is the number of bits in the timer counter (in our case, 32). The number of bits is used to calculate how soon the counter is going to wrap to 0. The last argument specifies the timer frequency, which is used to convert timer counter values to nanoseconds. The timer frequency is defined in the device tree at this line.
clocksource_mmio_init(base + REG_COUNTER_LO, node->name,
	freq, 300, 32, clocksource_mmio_readl_up);
The next line initializes the clock source framework. clocksource_mmio_init initializes a simple clock source based on memory mapped registers. The clock source framework duplicates, in some respects, the functionality of sched_clock, and it needs access to the same 3 basic parameters (the location of the timer counter register, the number of valid bits in the counter, and the timer frequency).
The other 3 parameters are the name of the clock source, its rating, which is used to rank clock source devices, and a function that can read the timer counter register.
irq = irq_of_parse_and_map(node, DEFAULT_TIMER);
if (irq <= 0) {
	pr_err("Can't parse IRQ\n");
	ret = -EINVAL;
	goto err_iounmap;
}
This code snippet is used to find the Linux irq number corresponding to timer interrupt number 3 (the number 3 is hardcoded as the DEFAULT_TIMER constant). Just a quick reminder: the Raspberry Pi system timer has 4 independent sets of timer registers, and here the set with index 3 (counting from 0) is used. If you go back to the device tree, you can find the interrupts property. This property describes all interrupts supported by a device and how those interrupts are mapped to interrupt controller lines. It is an array in which each item represents one interrupt. The format of the items is specific to the interrupt controller. In our case, each item consists of 2 numbers: the first one specifies an interrupt bank and the second one specifies the interrupt number inside the bank. irq_of_parse_and_map reads the value of the interrupts property, uses its second argument to select which of the supported interrupts we are interested in, and returns the Linux irq number for the requested interrupt.
timer = kzalloc(sizeof(*timer), GFP_KERNEL);
if (!timer) {
	ret = -ENOMEM;
	goto err_iounmap;
}
Here, memory for the bcm2835_timer structure is allocated.
timer->control = base + REG_CONTROL;
timer->compare = base + REG_COMPARE(DEFAULT_TIMER);
timer->match_mask = BIT(DEFAULT_TIMER);
Next, the addresses of the control and compare registers are calculated, and match_mask is set to a bit mask in which only bit number DEFAULT_TIMER is set.
timer->evt.name = node->name;
timer->evt.rating = 300;
timer->evt.features = CLOCK_EVT_FEAT_ONESHOT;
timer->evt.set_next_event = bcm2835_time_set_next_event;
timer->evt.cpumask = cpumask_of(0);
In this code snippet, the clock_event_device struct is initialized. The most important property here is set_next_event, which points to the bcm2835_time_set_next_event function. This function is called by the clock events framework to schedule the next interrupt. bcm2835_time_set_next_event is very simple: it updates the compare register so that an interrupt will be generated after the desired interval. This is analogous to what we did here for the RPi OS.
timer->act.flags = IRQF_TIMER | IRQF_SHARED;
timer->act.dev_id = timer;
timer->act.handler = bcm2835_time_interrupt;
Next, the irq action is initialized. The most important property here is handler, which points to bcm2835_time_interrupt. This is the function that is called after an interrupt is fired. If you take a look at it, you will see that it redirects all work to the event handler registered by the clock events framework. We will examine this event handler shortly.
ret = setup_irq(irq, &timer->act);
if (ret) {
	pr_err("Can't set up timer IRQ\n");
	goto err_iounmap;
}
After the irq action is configured, it is added to the list of irq actions of the timer interrupt.
clockevents_config_and_register(&timer->evt, freq, 0xf, 0xffffffff);
And finally, the clock events framework is initialized by calling clockevents_config_and_register. The evt structure and the timer frequency are passed as the first 2 arguments. The last 2 arguments are used only in "one-shot" timer mode and are not relevant to our current discussion.
Now we have traced the path of a timer interrupt all the way up to the bcm2835_time_interrupt function, but we still haven't found the place where the actual work is done. In the next section, we are going to dig even deeper and find out how an interrupt is processed when it enters the clock events framework.
In the previous section, we saw that the real work of handling a timer interrupt is outsourced to the clock events framework. This is done in the following few lines.
event_handler = ACCESS_ONCE(timer->evt.event_handler);
if (event_handler)
	event_handler(&timer->evt);
Now our goal is to figure out where exactly event_handler is set and what happens after it is called.
The clockevents_config_and_register function is a good place to start the exploration, because this is where the clock events framework is configured. If we follow the logic of this function, we should eventually find how event_handler is set.
Now let me show you the chain of function calls that leads us to the place we need.
If you take a look at the last function in the call chain, you will see that Linux uses different handlers depending on whether broadcast is enabled or not. Tick broadcast is used to wake up idle CPUs; you can read more about it here. But we are going to ignore it and concentrate on the more general tick handler instead.
In the general case, tick_handle_periodic and then tick_periodic are called. The latter is exactly the function that we are interested in. Let me copy its content here.
/*
 * Periodic tick
 */
static void tick_periodic(int cpu)
{
	if (tick_do_timer_cpu == cpu) {
		write_seqlock(&jiffies_lock);

		/* Keep track of the next tick event */
		tick_next_period = ktime_add(tick_next_period, tick_period);

		do_timer(1);
		write_sequnlock(&jiffies_lock);
		update_wall_time();
	}

	update_process_times(user_mode(get_irq_regs()));
	profile_tick(CPU_PROFILING);
}
A few important things are done in this function:
tick_next_period is calculated so that the next tick event can be scheduled.
do_timer is called, which updates jiffies. jiffies is the number of ticks since the last system reboot. jiffies can be used in the same way as the sched_clock function in cases when you don't need nanosecond precision.
update_process_times is called. This is the place where the scheduler is notified about the tick.

Now you see how long the path of an ordinary timer interrupt is, but we have followed it from the beginning to the very end. Most importantly, we have finally reached the place where the scheduler is called. The scheduler is one of the most critical parts of any operating system, and it relies heavily on timer interrupts. So now that we've seen where the scheduler functionality is triggered, it's time to discuss its implementation. That is what we are going to do in the next lesson.