Playing with rpmsg on iMX7d

Lately it has been rare that I get to work on kernel-related problems. One such task was to get the “rpmsg” sample modules running on a custom iMX7d board. There are many tutorials available, from NXP as well as from other vendors building iMX7d-based SoMs, so I was fairly confident I would get it running soon. It ended up a bit more complicated than I thought.

First of all, “rpmsg” is a framework which allows the main CPU to communicate with another CPU in a system. For example, the iMX7d SoC has two Cortex-A7 cores and one Cortex-M4 core. If you want to establish a communication channel between Cortex-A and Cortex-M, the “rpmsg” framework is handy. The framework is based on the “virtio” subsystem, which I am not very familiar with at the moment. However, I browsed quite a lot of code while debugging the problem I was running into.

So, as I said, my task was to run a simple rpmsg ping-pong application (or driver module). The ping-pong application has two parts: 1) one running on Cortex-M and 2) one running on Cortex-A. The part on Cortex-A is a Linux kernel module. The part on Cortex-M can either be bare-metal code compiled for Cortex-M or full-fledged FreeRTOS code (the entire toolchain for Cortex-M4 with FreeRTOS and demo examples is available from NXP). I used the FreeRTOS-based demo example code.

The flow would be something like this:

  1. Initialize the Cortex-M4 before booting the Linux kernel (e.g. from u-boot)
  2. Boot up the Linux kernel
  3. Load the ping-pong module from Linux user space.

As soon as the module is loaded from user space, the rpmsg communication channel is initialized and a sample data transfer starts between Cortex-A and Cortex-M.

This looked pretty easy since most of the pieces are readily available (the Cortex-M application, the rpmsg framework, the ping-pong driver in the Linux kernel, etc.). I enabled rpmsg and the ping-pong module in the Linux kernel, loaded the modules from user space and boom, nothing on the console or in dmesg. Then I figured out that I had missed enabling the “rpmsg” device node in the DTS. I enabled it and rebooted, and wow, the kernel did not even boot.

The usual debugging started by enabling early printk and debug output for the early init calls. The culprit was the rpmsg platform driver for imx (arch/arm/mach-imx/imx_rpmsg.c). The platform driver maps an address range which is used as shared memory between Cortex-M and Cortex-A. So the next logical step was to check why this does not fail on the iMX7D sabresd board (and on boards from other vendors). I found that there is a dedicated device tree for the Cortex-M4 use case. In that device tree file, the usable memory (RAM) is reduced, and a portion of the left-out memory is used as shared memory between Cortex-A and Cortex-M.

I made a similar change in the device tree file for our platform and the crash was gone. But now I could see a warning (a kernel trace caused by WARN_ON) pointing to a failure in ioremap, which was to be expected since I was using the wrong memory range in the imx rpmsg platform driver. So I made the following change:

--- a/arch/arm/mach-imx/imx_rpmsg.c
+++ b/arch/arm/mach-imx/imx_rpmsg.c
@@ -290,8 +290,8 @@ static int imx_rpmsg_probe(struct platform_device *pdev)
                        ret |= of_device_is_compatible(np, "fsl,imx6sx-rpmsg");
                        if (ret) {
                                /* hardcodes here now. */
-                               rpdev->vring[0] = 0xBFFF0000;
-                               rpdev->vring[1] = 0xBFFF8000;
+                               rpdev->vring[0] = 0x9FFF0000;
+                               rpdev->vring[1] = 0x9FFF8000;
                } else {

With this change, the warning was gone and the rpmsg driver was registered successfully, as I could see in dmesg. However, loading the ping-pong module still did not initialize the communication between the two cores. I stared in disbelief and checked the sysfs entries to see whether the rpmsg device was present. I found that rpmsg was registered as a bus driver as well as a device under the virtio bus. So I could see a device listed under the virtio bus, but no device under the rpmsg bus.

Looking somewhat deeper into the code, I could see that a device under the rpmsg bus is registered when an rpmsg endpoint is created and the initial communication is established successfully (by the NS Announcement message exchange). This happens in the rpmsg_create_channel() routine, which is invoked from a callback registered while creating an endpoint. So the problem was now the missing message between M4 and A7. This took me a while to figure out, but I was a bit relieved that I was not missing a kernel patch which might have caused the missing device registration.

After wandering around in the code for a while, I thought of looking at the FreeRTOS example, suspecting that something related to the memory mapping needed to change there. And bingo, I had adjusted the memory mapping on the kernel side but not on the FreeRTOS side. The memory addresses are hard-coded and should be changed in the middleware/multicore/open-amp/porting/imx7d_m4/platform_info.c file (VRING0_BASE and VRING1_BASE). With this change, I rebuilt the example binary, loaded it via u-boot, rebooted the Linux kernel and wow, I could see an initial message exchange, but to my surprise the entire buffer was filled with 0x00.

I had no clue why this was the case, and I wasted almost an entire day adding debug prints and more debug code to the rpmsg, virtio and virtio-ring drivers, the FreeRTOS middleware and the platform driver, with no luck. In extreme frustration, I decided to make one last change in the device tree file. Our platform has 512 MB of RAM (starting at address 0x80000000). Since I am using 0x9FFF0000 as the start of the shared buffer between the cores, I had initially limited the usable memory like this:

+/ {
+       memory {
+               linux,usable-memory = <0x80000000 0x10000000>;
+       };

Logically I thought this should work, since I am making the Linux MM believe that the system has 256 MB of RAM rather than 512 MB, so I could safely ioremap addresses starting from 0x9FFF0000. But to my surprise this never worked. I had to change the mapping to the following to get the example running (which was a bit satisfying and a bit discouraging, since I had wasted an entire day):

+/ {
+       memory {
+               linux,usable-memory = <0x80000000 0x1FF00000>;
+       };

This is still a puzzle to me, but I plan to post a question on the NXP community to get it answered. If I get an answer, I will update the post.

Overall, a good exercise in code browsing and in learning something new about the rpmsg and virtio frameworks.

Happy debugging.

Sending attachments via Linux command line

I tried sendmail and sendemail (notice the “e” in sendemail). I could not find an option for sending attachments in sendmail, and sendemail requires a “from” field to send the email.

So here is the result with mutt, nice and dandy:

echo "This is the message body" | mutt -a "/path/to/" -s "subject of message" --

Usual updates

I was on leave for a week and a half, as suggested by doctors, due to back pain. Today is the last day of that boring imposed vacation. In the morning I visited a doctor again, and she suggested another week of rest. I feel bad since I am much better now and could resume work tomorrow. So I have decided to go against this mandatory rest and resume work to see if the back is really better. Otherwise I will visit the doctor again on Friday.

So what have I done during this whole time? I decided to start the Android developer course on Udemy which I registered for a couple of months back. Learning Android programming was my hobby, but I could never start it systematically. (Every time, I had some idea for an app and started programming randomly in Android, learning just the things I required. In that process, I never finished any of my apps and never actually learned real Android programming.) This course from Udemy was on discount and I got to enroll for just 10 Euros. I finished 4 sessions and I feel great. At least I know something very basic, but in a systematic way.

During this process, I encountered so many problems with Linux. First of all, the JDK installation, setting up the correct Android emulator, installing the KVM modules and, on top of all that, the highly unstable Android Studio for Linux. I cannot decide whom to blame, Android Studio or the new “Wayland” (the replacement for Xorg). To point out one of the problems, the newer version of Android Studio has introduced a new layout called ConstraintLayout. I liked it very much since it provides great freedom in arranging the widgets. However, most of the work needs to be done using drag and drop with the mouse. That is where you need a solid native GUI framework (usually from the OS). On Linux, the mouse used to stop working properly as soon as I started playing around with ConstraintLayout in Android Studio. I had no option other than restarting the studio. This is pathetic. As soon as I closed the studio, the mouse started working as usual. Then I thought of giving Windows a try. I installed Android Studio on Windows and voilà, only one executable bundle. Just click next, next and next, and you are done. No need to install a separate JDK or set the path, no need to download special emulator packages. And on top of that, there was no bug in using ConstraintLayout. The mouse just worked as expected. I can now feel the pain of people who want to do real work and don’t want to dive into the hassles of OS issues. After these 15 years, this is the first crack in my love for Linux, just due to this sluggish/buggy/slow GUI and the lack of a standardized GUI framework. I feel that my Windows 8 boots faster than my Fedora 25 (with the latest updates installed, WTF).

Anyway, Android’s latest version (Developers edition) O has been rolled out today. My phone is still running KitKat 4.4.2 (shame shame)! Samsung is announcing its next phone on the 29th of this month, and would be cursing Google for this rollout of O :-O. Samsung is the slowest of all in rolling out updates. I wish to buy some cheaper Motorola device now, mostly likely the g5+, just to get the newest Android, because I like stock Android: OnePlus is a cousin of Samsung in terms of updates, and Pixel is overpriced just like the iPhone. So that’s the Android story.

I started reading Satya na Prayogo, the autobiography of Mahatma Gandhi. I have no words to explain his brutal honesty, his simplicity, his courage, his faith in God and above all his love for truth. My respect for him has grown immensely while reading this book. I always felt that I do not know much about him, despite being a guy sharing a birthplace with him ;-) I have heard lots and lots of people cursing him, abusing him and opposing his views and his methodology for the freedom movement. Sometimes I used to get carried away by those thoughts, but deep inside I always felt that I don’t know much about Gandhiji, so how can I be judgmental about him? Reading his autobiography is an eye-opener. His thoughts are filled with purity and social service was his motto. He was an extremist, I think. His experiments were really absurd and hard to understand, but the intentions were pure. Respect to him!

And finally winter is coming to an end, leaving some signs of the coming spring. Days are longer and brighter now. I get to see the sun more and feel more positive. Hope to get back to work from tomorrow, and finish my Android course soon ;-)

Have a nice day (to?, me ;-)!

Create a .desktop entry for Gnome Shell

I don’t want to forget this, so here is the sample entry which I created for Android Studio.

#!/usr/bin/env xdg-open
[Desktop Entry]
Name=Android Studio

Save this file as android-studio.desktop in ~/.local/share/applications. It will now be visible in the list of applications, and you can also “search” for it by pressing the <win> / <mod> key.


#include <iostream>
using namespace std;
int main() {
    int p = 77778;
    long square = p*p;
    cout << p << " " << square << endl;
    return 0;
}

Let’s waste some time in figuring out the problem. SMH!

Cortex-M4 UART

The i.MX7D SoC has a dual-core Cortex-A7 and an additional Cortex-M4. The iMX7D sabre board exposes two debug UART ports when connected to a PC: 1) one for Cortex-A and 2) one for Cortex-M. By looking at the schematic, it is easy to figure out that the Sabre evaluation board uses UART 2 as the debug port for Cortex-M. However, how do you figure out which UART port a custom SoM based on iMX7d uses for the Cortex-M4? This was the mystery of today, which took me a while to figure out. Our BSP provider told us that the UART port is not fixed for the Cortex-M4 and that any UART out of UART1 to UART7 can be used for the M4. Since one of the UARTs is already in use by the A7, there are six UARTs remaining. Now how do you tell the M4 to use one of these? Initially I thought there might be some setting in the UART block, but that is not the case. I tried to find the answer in the Technical Reference Manual, but no luck there either.

Then I looked at the FreeRTOS sample code, which is the actual code that runs on the M4. And bingo, the FreeRTOS sample code contains the UART port initialization. It configures UART port 2 as the debug console, so it is not at all portable to other custom platforms. There are three things which need to be changed:

  1. Default console macros
  2. pinmux for custom selected UART
  3. clock setting for custom selected UART

With these changes I could get the debug prints from the M4 core on a different UART than UART2.

Again, I am writing this down so that I don’t have to start the exploration session all over again.

Happy debugging!

Debugging early kernel crash

I was trying to get kexec running on our new platform, but it was failing. The kexec executable reads the new uImage, dtb etc. correctly and also jumps to the new kernel, but then nothing: the system just hangs. There is already a workaround related to system reset on our new platform, so I suspected that the “cpu soft reset” might not be working correctly. This suspicion came to mind while going through the kexec syscall implementation, which is nothing but a special “reboot” case. However, our BSP provider confirmed that this was not the problem. Since there was nothing on the debug UART console, I was frustrated about how to proceed. Luckily our contact person at the platform vendor suggested enabling “earlyprintk”. I knew about that kernel feature, but never thought I would ever need it. But (fortunately) I have to work on so many new things here with which I had no prior experience. So, yeah, why not, let us enable earlyprintk. And wow, I could see the kernel was booting, but stalling somewhere after initializing RAM.

Hmm, something is better than nothing. The next thing I did was to take a look at the dmesg log of a regular boot and try to identify what comes next after RAM initialization. I just blindly grepped the kernel source tree for the console message, and landed in some __initcall. Wow, another thing in the kernel which I had been ignoring for quite a while. Now it was time to understand what it does and how I can debug it. So naturally it was time to google, and I found an extremely insightful article in a book on GitHub, here. The entire book is awesome for understanding Linux internals, by the way. So now I knew how the calls are made, and the next idea was to see which call was causing the trouble.

So I added a few prints in init/main.c, where the __initcall loop iterates (do_initcall_level). But this function just calls the functions through pointers, so how do I know the name of each function? Luckily I am not the only one who wants to get a function name from a function pointer: the Linux kernel already has a way to do that. I could not recall it, googled it while writing this down, and found this. The printk format specifiers can do more than extract a function name from a pointer; they can print IP addresses, UUIDs and what not. So, yeah, with that I printed the current function and the next function in the queue, and got to know the blocking function.

diff --git a/init/main.c b/init/main.c
index 2a89545..ee81844 100644
--- a/init/main.c
+++ b/init/main.c
@@ -841,7 +841,7 @@ static char *initcall_level_names[] __initdata = {
 static void __init do_initcall_level(int level)
-	initcall_t *fn;
+	initcall_t *fn, *fn_next;
 	strcpy(initcall_command_line, saved_command_line);
@@ -850,8 +850,14 @@ static void __init do_initcall_level(int level)
 		   level, level,
-	for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++)
+	for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++) {
+		printk(" Calling: %ps\t", fn);
+		fn_next = fn + 1;
+		if (fn_next)
+			printk(" Next: %ps", fn_next);
+		printk("\n");
+	}
 static void __init do_initcalls(void)


This debugging session is more than a month old, but today I had to debug a similar problem, an early boot crash where the kernel just refused to boot, and I had to enable earlyprintk again. I am writing this down because I don’t want to forget how I debugged this kind of issue.

Happy debugging!