kexec on iMX7d

So I got to work some more on the kernel, yay! But not so yay after almost a week of struggling with the problem. The problem is with kexec on a custom iMX7d based platform. This is not the first time I am dealing with a kexec problem on this platform; I fixed two earlier ones and learned quite a few things while debugging them. But those were rather simple problems compared to this one.

The first one was about kexec not working at all. There were no logs after kexec printed the famous “Bye!” message before booting into the new kernel image. By enabling earlyprintk and init-level debugging (as I explained in one of my earlier posts), I could figure out the problem: a failure while reading a special OTP entry, caused by a missing clock for that particular module. Good that it was fixed relatively quickly.

The second problem appeared when we updated the kernel version. kexec just stopped working; now I could not even see the “Bye!” message. After debugging for quite some time, it turned out that the kernel was crashing in an IRQ-migration routine executed before doing the soft restart. It took a while to figure out the cause, and I dug through quite a lot of code in the kernel’s IRQ subsystem. Eventually it turned out that an MFD-based IRQ chip driver was failing to register itself due to a missing IRQ entry in the DTS. I had just never paid attention to the boot logs, where the failure to request the IRQ was clearly mentioned. The driver for that MFD-based IRQ chip simply ignored the warning and continued booting – which was done intentionally, but it troubled me a lot. In the end I was rather relieved when the issue was fixed.

But while working on these issues, I sometimes encountered a problem where kexec simply refused to do the soft restart. It happened so rarely that I never paid much attention and considered it low priority.

Suddenly this behavior of kexec failing to reboot appears quite frequently, and I have to debug it. I hate issues which do not reproduce easily, because creating an environment to reproduce the problem eats a lot of time. So far I have not managed to reproduce the problem consistently, but by enabling “debug” and “earlyprintk” I hit another problem, which puts the device into an endless boot loop. The “dmesg” shows something similar to what is seen in this thread:

I had suspected the serial driver code (drivers/tty/serial/imx.c), since it looked like the TTY driver had some buffer corruption, due to which I was seeing the same boot messages again and again. I googled a bit and found a patch which was supposed to fix a TTY-related problem during kexec, but that did not help either; the problem persisted. I kept searching the internet for TTY-related problems, but in vain – I was clearly going in the wrong direction. Then, after searching for quite some time, I stumbled upon the above thread, which has a debug message similar to what I was observing. I cannot remember now how I reached that email thread, but it corrected my direction. By closely reading the thread multiple times, I learned that the problem is related to one of the CPUs (out of the two Cortex-A7 cores in the iMX7d SoC) not coming online. I also noticed that when kexec executes successfully, it always runs on a single core. The second core just fails to come online – which is fine, because the other core can still do its work. And when kexec fails, it keeps trying to boot up the other core and ends up in a boot loop.

Initially I was disheartened: I didn’t know anything about the boot sequence, so how would I debug it? But since I had no other option, I started looking at the code and adding prints. Then I remembered that I had seen this “CPU1: failed to come online” message before, when I used the sysfs interface to bring a core down and up again while we were doing power-consumption measurements. So it was easy to figure out which part of the code gets executed when the CPU fails to come online.

I traced it back using a cscope/ag combination and landed in assembly code. Wow. I never thought I would debug assembly – it was a mixed feeling, exciting and disheartening. ARM assembly is not that difficult, so I just started reading the code to figure out what exactly happens. The branching instructions and the special-register handling for the secondary core gave me some insight into the flow. I looked at the ARM linker script to see what actually calls the start function and prints “Booting Linux on physical CPU…”, and I looked at the iMX7d reset-controller and clock-gating registers to check whether all the required settings were done. Everything looked okay per the datasheet; I just had no clue what the hell was happening.

It reached a point where CPU1 is powered up, successfully runs the assembly routine and eventually ends up in secondary_start_kernel() in arch/arm/kernel/smp.c. In the process of figuring this out, I learned how to print ASCII messages from ARM assembly (some yay moments) – there is an assembly file at arch/arm/kernel/debug.S which contains a lot of helper routines to print strings, chars and hex digits from assembly. From secondary_start_kernel(), the code reports completion (by calling complete(&cpu_running)) and wakes up a process blocking on the calling processor (usually CPU0). But for whatever reason, this completion sometimes never reaches CPU0.
The “struct completion” data structure somehow seems to get messed up between the two CPUs (some caching issue, as far as I can imagine). And even when the completion does wake up the blocking process on CPU0, the CPU online mask bits (cpu_online_mask) are not properly updated. At the moment I have no idea how to debug this further, and I am sad about it :-(

As a workaround to the original kexec problem, we now call kexec with the “nosmp” kernel command line, so that the secondary core is never booted and cannot cause any trouble. But I will keep an eye on this problem and definitely try to figure out what’s going on.

The only good thing that happened while debugging this problem is that I have now become more familiar with a part of the kernel code I had never touched, and I did some debugging in assembly for the first time. Yay!

Happy debugging!


Debugging again…

I got one more kernel-related problem, which made me happy – finally I was debugging something I love. But the fading of that love is directly proportional to the time it takes to solve the problem. And then comes a sudden surge of satisfaction, a kind of kick you get once you have figured out the root cause.

This time the problem showed its symptom in user-space. We recently upgraded the kernel version, and already-working user-space code broke after the upgrade. Basic logic says something broke in kernel space. Since the problem appeared when running our complex audio application, I wanted to get rid of all the complexity and reproduce it with the basic ALSA utilities (aplay, arecord). As usual, the problem did not appear when trying to reproduce it with aplay and arecord. That was enough to make us think something was wrong with the audio application.

The exception reported by the application was about failing to set the ALSA hardware params. I had faced a similar problem a few years back, where setting a hardware param failed due to an invalid argument. But how to identify which parameter was wrong? On the other hand, neither the codec driver nor the application changed across the kernel upgrade. Nevertheless, I brushed the dust off my memory and found the actual place in the kernel (sound/core/pcm_native.c – snd_pcm_hw_refine()) where the hardware params are refined and can fail if passed an invalid value. The entire calculation and parameter-refinement code is still a puzzle to me. Anyway, I could see that the channel count passed by the audio application was 4 for playback, and it turned out to be the culprit. The obvious next step was to check whether the codec supports 4 channels – and it actually supports 6 channels for playback. Hmm, strange. Then I ran aplay with 4 channels and that worked without any problem. Looking at the application logs and the comments from another engineer who had worked on the same problem before me (but could not continue due to other tasks), I figured out that the audio application makes use of an ALSA plugin. The plugin takes 2 channels as input, converts them into 4 and passes them to the actual sound card.
I thought something might be broken in the plugin code, but I was not fully convinced, since nothing had changed in user-space and the code works just fine with the old kernel.
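For reference, a 2-in/4-out conversion like the one the application relies on can be expressed with ALSA’s standard route plugin in ~/.asoundrc. I don’t know which plugin our application actually uses; the device name and routing below are made-up placeholders, just to show the mechanism:

```
# Hypothetical ~/.asoundrc: expose a 4-channel playback device that takes
# 2-channel input and duplicates it onto channels 2 and 3.
# "hw:0,0" is a placeholder for the real sound card.
pcm.quad_out {
    type route
    slave {
        pcm "hw:0,0"
        channels 4
    }
    ttable.0.0 1    # input ch0 -> output ch0
    ttable.1.1 1    # input ch1 -> output ch1
    ttable.0.2 1    # input ch0 -> output ch2
    ttable.1.3 1    # input ch1 -> output ch3
}
```

With such a definition, the application opens “quad_out” with 2 channels while the hardware sees 4 – which is why the channel count reaching snd_pcm_hw_refine() differs from what the application itself requests.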

So I fell into the usual printk-debugging routine and spent the rest of the day without any luck. I spent some time the next day trying to understand the refinement code in pcm_native.c, but the little-to-no documentation makes that difficult. Then I suddenly thought of taking a look at the McASP platform driver (sound/soc/davinci/davinci-mcasp.c) and did a quick diff between the two kernel versions. There were significant changes. I glanced through them and found many changes related to imposing channel-related constraints. That gave me a direction for further debugging, and I found the piece of code which was actually overwriting the channel count and setting a new channel constraint based on the overwritten count – which I thought was the source of the problem. I added some code to bypass that logic and, whoa, the user-space application worked like a charm.

I also figured out why the standalone ALSA apps (aplay and arecord) failed to reproduce the problem. It showed up with the audio application because the application opens the playback and capture devices at the same time. The order also mattered, because the application opens the capture device before the playback device. Had I done the same with the standalone tools, I would have encountered the problem. Now it is on my task list for Tuesday.

Anyway, I thought the channel-constraint code could be fixed by having a per-stream channel count, and I prepared a patch for it, in anticipation of upstreaming one more kernel patch. While writing the commit message I took one more look at the code to formulate the message better. That’s when I stumbled upon the code commentary which describes why the logic was done that way – and I saved myself from embarrassment. Well, I learnt something, and at the end of the day it was greatly satisfying.

Happy debugging!

Playing with rpmsg on iMX7d

Lately it has been rare that I get to work on kernel-related problems. One such task was to get the “rpmsg” sample modules running on a custom iMX7d board. There are many tutorials available, from NXP as well as from other vendors building iMX7d-based SoMs, so I was fairly confident of getting it running pretty soon. But it ended up a bit more complicated than I thought.

First of all, “rpmsg” is a framework which allows the main CPU to communicate with another CPU in the system. For example, the iMX7d SoC has two Cortex-A7 cores and one Cortex-M4. If you want to establish a communication channel between the Cortex-A and the Cortex-M, the “rpmsg” framework is handy. The framework is based on the “virtio” subsystem, which I am not very familiar with at the moment. However, I browsed quite a bit of its code while debugging the problem I ran into.

So, as I said, my task was to run a simple rpmsg ping-pong application (or rather, driver module). The ping-pong application has two parts: 1) running on the Cortex-M and 2) running on the Cortex-A. The part on the Cortex-A is a Linux kernel module. The part on the Cortex-M can either be bare-metal code compiled for the Cortex-M or a full-fledged FreeRTOS application (the entire toolchain for the Cortex-M4, with FreeRTOS and demo examples, is available from NXP). I used the FreeRTOS-based demo example code.

The flow would be something like this:

  1. Initialize the Cortex-M4 before booting the Linux kernel (this can be done from u-boot)
  2. Boot up Linux kernel
  3. Load ping-pong module from Linux user-space.

As soon as the module is loaded from user-space, the rpmsg communication channel is initialized and a sample data transfer starts between the Cortex-A and the Cortex-M.

This looked pretty easy, since most of the pieces are readily available (the Cortex-M application, the rpmsg framework and the ping-pong driver in the Linux kernel, etc.). I enabled the rpmsg and ping-pong modules in the Linux kernel, loaded the modules from user-space and – boom – nothing on the console or in dmesg. Then I figured out that I had missed enabling the “rpmsg” device node in the DTS. I enabled it and rebooted, and wow, the kernel did not even boot. The usual debugging started, enabling debug and earlyprintk for the early init calls.

The culprit was the imx rpmsg platform driver (arch/arm/mach-imx/imx_rpmsg.c). The platform driver maps an address range which is used as shared memory between the Cortex-M and the Cortex-A. So the next logical step was to check why this does not fail on the iMX7D sabresd board (and boards from other vendors). I found that there is a dedicated device tree for the Cortex-M4, and noticed that in that device-tree file the usable memory (RAM) is reduced, with a portion of the left-out memory used as shared memory between the Cortex-A and the Cortex-M. I made a similar change in the device-tree file for our platform, and the crash was gone. But now I could see a warning (a kernel trace caused by WARN_ON) pointing to a failure in ioremap, which was expected since I was using the wrong memory range in the imx rpmsg platform driver. So I made the following change:

--- a/arch/arm/mach-imx/imx_rpmsg.c
+++ b/arch/arm/mach-imx/imx_rpmsg.c
@@ -290,8 +290,8 @@ static int imx_rpmsg_probe(struct platform_device *pdev)
 			ret |= of_device_is_compatible(np, "fsl,imx6sx-rpmsg");
 			if (ret) {
 				/* hardcodes here now. */
-				rpdev->vring[0] = 0xBFFF0000;
-				rpdev->vring[1] = 0xBFFF8000;
+				rpdev->vring[0] = 0x9FFF0000;
+				rpdev->vring[1] = 0x9FFF8000;
 			} else {

With this change the warning was gone, and the rpmsg driver registered successfully, as I could see in dmesg. However, loading the ping-pong module still did not initialize the communication between the two cores. I wondered in disbelief and checked the sysfs entries to see whether the rpmsg device was present. I found that rpmsg was registered as a bus driver as well as a device under the virtio bus – so I could see a device listed under the virtio bus, but no device under the rpmsg bus. Looking somewhat deeper into the code, I saw that a device under the rpmsg bus is registered when an rpmsg endpoint is created and the initial communication is established successfully (by a name-service announcement message exchange). This happens in the rpmsg_create_channel() routine, which is invoked from a callback registered while creating an endpoint. So the problem was now the missing communication between the M4 and the A7. This took a while to figure out, but I was a bit relieved that I was not missing some kernel patch which might have caused the missing device registration.

After wandering here and there in the code for a while, I thought of looking at the FreeRTOS example code – maybe something needed to be changed there as well, related to the memory mapping. And bingo: I had to adjust the memory mapping, which I had changed on the kernel side but not on the FreeRTOS side. The memory addresses are hard-coded and should be changed in the middleware/multicore/open-amp/porting/imx7d_m4/platform_info.c file (VRING0_BASE and VRING1_BASE). With this change I rebuilt the example binary, loaded it via u-boot, rebooted the Linux kernel and – wow – I could see an initial message exchange, but to my surprise the entire buffer was filled with 0x00. I had no clue why, and then I wasted almost an entire day putting debug prints and extra debug code into the rpmsg, virtio and virtio-ring drivers, into the FreeRTOS middleware and into the platform driver, with no luck. In extreme frustration, I decided to make one last change in the device-tree file. Our platform has 512 MB of RAM (starting at address 0x80000000). Since I am using 0x9FFF0000 as the start of the shared buffer between the cores, I had initially limited the usable memory like this:

+/ {
+       memory {
+               linux,usable-memory = <0x80000000 0x10000000>;
+       };
+};

Logically I thought this should work, since I am telling the Linux MM that the system has 256 MB of RAM instead of 512 MB, and I could then safely ioremap addresses starting from 0x9FFF0000. But to my surprise this never worked. I had to change the mapping to the following to get the example running (which was a bit satisfying and a bit discouraging, since I had wasted an entire day):

+/ {
+       memory {
+               linux,usable-memory = <0x80000000 0x1FF00000>;
+       };
+};

This is still a puzzle to me, but I plan to post a question on NXP community to get this answered. If I get the answer, I will update the post.

Overall a good exercise of code browsing and learning something new about rpmsg and virtio framework.

Happy debugging.

Debugging early kernel crash

I was trying to get kexec running on our new platform, but it was failing. The kexec executable reads the new uImage, dtb etc. correctly and also jumps to the new kernel, but then nothing – the system just hangs. There is already a workaround related to system reset on this new platform, so I suspected that the “CPU soft reset” might not be working correctly. This suspicion came to mind while going through the kexec syscall implementation, which is nothing but a special “reboot” case. However, our BSP provider confirmed that was not the case. Since there was nothing on the debug UART console, I was frustrated about how to proceed further. Luckily, our contact person at the platform vendor suggested enabling “earlyprintk”. I knew about that kernel feature, but never thought I would ever need it. But (fortunately) I get to work on so many new things here of which I had no prior experience. So, yeah, why not, let us enable earlyprintk. And wow, I could see the kernel booting, but stalling somewhere after initializing the RAM.

Hmm, something is better than nothing. The next thing I did was take a look at the dmesg log of a regular boot and try to identify what comes after RAM initialization. I just blindly grep’ed for the console message in the kernel source tree and landed in some __initcall – wow, another thing in the kernel I had been ignoring for quite a while. Now it was time to understand what it does and how I can debug it. So naturally it was time to google, and I found an extremely insightful article in a GitHub book here. The entire book is awesome for understanding Linux internals, btw. So then I knew how the calls are made, and the next idea was to see which call was causing the trouble.

So I added a few prints in init/main.c, in the loop that iterates over the __initcalls (do_initcall_level). But this function just calls functions through pointers, so how do I know the name of the function? Luckily I am not the only one who wants to know a function name from a function pointer: the Linux kernel already has a way to do that. I couldn’t recall it right now, so I googled while writing this down and found this. printk can handle much more than extracting a function name from a pointer – it can print IP addresses, UUIDs and what not. So, yeah, with that I printed the current function and the next function in the queue, and got to know the blocking function.

diff --git a/init/main.c b/init/main.c
index 2a89545..ee81844 100644
--- a/init/main.c
+++ b/init/main.c
@@ -841,7 +841,7 @@ static char *initcall_level_names[] __initdata = {
 static void __init do_initcall_level(int level)
 {
-	initcall_t *fn;
+	initcall_t *fn, *fn_next;
 
 	strcpy(initcall_command_line, saved_command_line);
@@ -850,8 +850,14 @@ static void __init do_initcall_level(int level)
 		   level, level,
 		   &repair_env_string);
 
-	for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
-		do_one_initcall(*fn);
+	for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++) {
+		printk(" Calling: %ps\t", *fn);
+		fn_next = fn + 1;
+		if (fn_next < initcall_levels[level+1])
+			printk(" Next: %ps", *fn_next);
+		printk("\n");
+		do_one_initcall(*fn);
+	}
 }


This debugging session is more than a month old, but today I had to debug a similar problem – an early boot-up crash where the kernel just refused to boot – and I had to enable earlyprintk again. I am writing this down because I don’t want to forget how I debugged this kind of issue.

Happy debugging!

Jack Detection on MSM8960

I was wondering how jack-detection events are reported up to user-space. Found something while discussing it with a colleague.

VT: buddy
VT: need your help..
VT: frameworks/base/media/java/android/media/
VT: i observed that this file gets some notification when headphone is connected..
VT: i want to understand how that is happening..
– 12:26:12 IST –
VT: i found it.. :-) same file as you mentioned yesterday, WiredAccessoryObserver fires an Intent if a headset is plugged
TB: it will be like kernel sends the uevent/notification for detection,removal …
TB: wired accessory observer listens those and based on tht sets the current state …
VT: yea..
VT: one more thing.. how do i enable the logging in that file?
VT: services/java/com/android/server/
TB: Its enabled by default …
TB: but you may not get log.v/log.d
TB: so you can convert it to log.e
TB: or either by filtering the log …
TB: using logcat options …
VT: don’t we need to enable it by using some global #defines?
VT: something like ALOG?
VT: i dont know..u r the boss
TB: tht will work in cpp
TB: this is java file :-)
VT: yea right
TB: private static final boolean LOG = true;
VT: there something called Slog.v()
TB: this line is true
TB: change it to Slog.w/e
VT: yea right
VT: then recompile?
TB: yes
VT: and push the .so file, right?
TB: you may need to flash complete image
VT: :-(
VT: but what to build then?
TB: whole source code
VT: man!! how long it will take?
TB: for anything you modify in system services…
TB: ha ha ha
TB: no other option …
VT: sucks man.. can’t i only build “frameworks/base/services/java” ?
VT: it has a top level
TB: normaly it doesn’t work …
VT: anyways, let me rebuild it
VT: thanks!!
TB: we have seen it during VS service changes :-)
TB: you can try logcat options
TB: with your current build …
TB: logs are enabled by default…
VT: yea, but will it also print Slog.v ??
TB: yes …
TB: you need to use something like V:-* with logtag …
– 01:17:31 IST –
VT: buddy. .
TB: tell me…
VT: the dragon folks have not used this WiredAccessory manager
VT: they have something called windowmanager, which is used to detect accessories
VT: but i fail to understand the code of window
VT: i mean, it is a differnt file, let me send u the path
VT: vi ./frameworks/base/policy/src/com/android/internal/policy/impl/ +2646
– 01:26:41 IST –
TB: I looked the code but I am not aware from where its getting called …
VT: i got that
VT: base/services/input/EventHub.cpp
VT: long path.. troublesome
VT: it is using /dev/input/events* file and executes ioctls to detect the events
VT: weird
TB: thts how android implementation at other places as well …
VT: qualcomm guys are really weird.. i am surprised to see the code like this..
VT: just look at that file, EventHub.cpp
VT: function getEvents()
VT: it has an infinite loop
VT: anyways, it looks like i’ve figured out how jack detection works on dragonics
TB: it uses epoll_wait()
TB: so it blocks on there …
TB: whenever event is received, it continues from there …
VT: yea..
VT: for timeoutMillis
TB: you can see similar stuff in system/core/init/ueventd.c
TB: it blocks for indefinite time
VT: hmm..

mmap v/s malloc

I was browsing the LKNB forum and came across a code snippet doing mmap() with the MAP_ANONYMOUS flag. The surprise in this mmap() call was the “fd” argument: it was -1. A quick read of the man page did not help much, though it did say that the fd argument is ignored if the MAP_ANONYMOUS flag is used.

As usual, I googled and came across a nice comp.sys.hp.hpux discussion on Google Groups (topic TNR-GpVK73k) which explains the benefits of using mmap() in special situations.

The Wikipedia page on mmap also has some information on anonymous (and file-backed) mappings.

Having said all this, the following is the code snippet which triggered all the confusion. The beauty of the code is that it tries to map the NULL address to a valid memory location :-)

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
	int *ptr = NULL;

	/* fd is -1 and is ignored because of MAP_ANONYMOUS; MAP_FIXED forces
	 * the mapping to be placed at the requested address (NULL). */
	ptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		   MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
	if (ptr == MAP_FAILED) {
		perror("Error in mapping");
		return 1;
	}

	printf("After mmap\n");
	*ptr = 16;
	printf("Contents of address %p is :: %d\n", (void *)ptr, *ptr);
	return 0;
}

To make the above code work, /proc/sys/vm/mmap_min_addr needs to be set to 0.