Parallel and Concurrent computing part I

Carlos Gustavo Merolla
Apr 4, 2020

Updated: Jun 23, 2020

In this post we will get into some details and code to discover why the Spresense is such a good tool when you need intelligence at the Edge.

As mentioned in the first posts of this series, the Spresense has six cores in the application domain, which means that you can use all of them for your application. In addition, the Spresense uses NuttX as its operating system, which helps you reach an even more extreme level of resource optimization. NuttX is a POSIX-compliant operating system, so you get a set of standard APIs that you can use in your application. The following posts will be a little bit different from the previous ones: from now on, fewer concepts and more code (it will get worse as we advance).

Multi-core programming sounds scary but it really isn't that hard; threads, on the other hand, require a bit more attention. In a way, each Spresense core is isolated from the rest and one core, the main core, is responsible for synchronizing the others. Please refer to the previous posts for the details about what you can and cannot do within the main core and the sub-cores.

Multiple cores and threads will allow you to achieve parallelism and concurrency, and each of these concepts has its advantages. Parallelism means that several tasks are executed at the same instant, for example on different cores, while concurrency means that several portions of code make progress in overlapping periods, in no particular order, to achieve a goal, even when they share a single core.

Main Core

This post (and the following one) will teach you how to create or "boot" cores and handle threads. The Sony Spresense SDK provides a class called MP (for multi-processing) that allows you to activate the sub-cores whenever you need them. The main core is always booted the moment you power the board. We will use the Arduino IDE to do the job and you will see that it is a pretty easy procedure. First of all, you will need a separate sketch for each core, and you will upload them one by one (you cannot flash code for different cores at the same time). The code for the main core looks like this:

/* --- This code is meant to run on the main core --- */

#ifdef SUBCORE
#error "Select main core to upload the code (tools->main core)"
#endif

/* --- The multiprocessing header file --- */
#include <MP.h>

void setup()
{
  int e= 0;
  
  /* --- Serial comm enabling (Console) --- */
  Serial.begin(115200);
  while (!Serial);
 
  e= MP.begin(1); /* -- 1 is the sub core 1 -- */
  if (e < 0) 
  {
        MPLog( "Error booting Sub-core 1 [code: %d]\n", e);
        /* --- Do nothing, this is just an example! --- */
        while(1);
  } 
}

void loop()
{
  MPLog( "I am the main core and sub-core 1 is running\n" );
  sleep(10);

  /* --- Stop sub-core 1 first, then report it --- */
  MP.end(1);
  MPLog( "I am the main core and I stopped sub-core 1\n" );
  sleep(10);

  /* --- Boot sub-core 1 again, then report it --- */
  MP.begin(1);
  MPLog( "I am the main core and I restarted sub-core 1\n" );
}

The code is pretty straightforward: copy it into a sketch, save it, check that the Spresense platform is selected and compile it. When you are about to flash the binary into the board you will have to select which core you want to flash. To do so, click on the IDE Tools menu, set the "Core" option to "Main Core" and flash it. Once you have flashed the code, open the console and you will notice an error message... that's normal, because the sub-core 1 you are trying to enable is not flashed yet. Before we go to that step, some quick explanations on this first sketch.

The first three lines are a preprocessor guard to avoid uploading this code to the wrong core. Each time you change the core option in the Arduino IDE, the definition of the SUBCORE macro changes: if you selected sub-core 1 then the value of SUBCORE will be 1, and if you chose the main core then SUBCORE won't be defined at all. So, if you have selected a sub-core and try to upload this sketch, the #error directive stops the build at compile time, before anything is flashed.
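As a small illustration of the same mechanism, the sketch below uses SUBCORE to pick a label at compile time. The core_label() helper is hypothetical (it is not part of the Spresense SDK), and it is written as plain C++ so that it also builds with a desktop compiler, where SUBCORE is not defined and the main core branch is taken.

```cpp
#include <string>

// Hypothetical helper, not part of the Spresense SDK.
// Returns a label for the core this code was compiled for.
// Assumption: SUBCORE is defined (with the sub-core number) only when
// a sub-core is selected in the IDE, as described above.
std::string core_label()
{
#ifdef SUBCORE
    return "sub-core " + std::to_string(SUBCORE);
#else
    return "main core";
#endif
}
```

Because the decision is taken by the preprocessor, only one of the two branches ever ends up in the binary for a given core.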

The rest of the code is almost self-explanatory. In order to use the multi-processing classes you will have to include the MP.h header file. The MP class is responsible for managing cores; in this case, inside the setup() function, you will find the MP.begin(1) method call that activates sub-core 1. If everything goes fine this method will return 0 and sub-core 1 will be running in parallel alongside the main core. Sub-cores are numbered from 1.

Inside the loop() function we just stop and start the sub-core periodically. Two remarks here. First, MPLog() is a safe way to write log messages to the serial console (using a format string a la the C printf() function). Second, pay attention to the use of sleep() instead of delay(). Why? Because we are going to play with threads... you will also see sleep() used alongside the Spresense Camera class, which uses threads as well. Needless to say, MP.end(1) stops sub-core 1.

Sub-Core 1

For sub-core 1 you will have to create a separate sketch, and we are going to get a little bit fancy here to help you understand the resources at your disposal. While the main core will do almost nothing, sub-core 1 will play with the built-in LEDs, and we will use threads to do so.

While this is an exaggeration for a simple example, you will find some new tools and some good practices to make ultra-reactive and predictable devices. To make it as clear and simple as possible: cores run at the same time, in parallel, while threads run inside a core and take turns on it, so a thread gets to execute when the core has some idle time (while another thread is sleeping or waiting for some resource to be ready). Threads can have priorities and many other attributes, which we will leave for future posts. The basic idea is that threads compete for execution time, and sometimes you will want one thread to have a higher execution priority than the others trying to get the core's attention.

The basic code for the sub-core should look like this:

#if (SUBCORE != 1)
#error "Select Sub-core 1 to upload the code (tools->SubCore1)"
#endif

#include <MP.h>

void setup()
{
      MP.begin();
}

void loop() 
{
  MPLog( "Sub-core 1 Running\n");
  sleep(1); 
}

Write the sketch, select Tools -> Core: Sub-core 1, compile and upload. Now you will see that the error message displayed by the previous sketch is no longer there, because the two cores are now in place. This skeleton is quite simple: it just starts the core and prints a message, so you will see the two cores sending messages to the serial console. With this approach you can optimize many things; for instance, you could take a picture on the main core, split it into 4 pieces and send them to other cores to run pattern detection algorithms, all at the same time. In future posts I will show you how to exchange information between cores.

But for now we are going to use the sub-core 1 idle time to turn LEDs on and off in order to introduce the concept of threads. There are a few interesting things that I want to point out here. As I said in a previous post, the Spresense SDK has its own methods to create tasks and the like; however, I want to show you something that is really important in my opinion. NuttX is a POSIX-compliant RTOS, which means that it implements operating-system-level calls that will compile and run on any other POSIX-compliant operating system, so your code is portable to other platforms (it may behave differently though, because POSIX is implemented by regular operating systems as well, not only by real-time ones). In the context of my project, I need some things to run on different types of machines, so it is a great thing to write POSIX-compliant code: write once, compile everywhere.

So, what is a thread? For our purposes, a thread is a function that runs independently of the rest of your code and gets processor time whenever the core has some to spare, for example while other threads are sleeping or waiting. There are many uses of threads; for instance, one thread could collect notifications and another one could send them over a radio. That makes effective use of every clock cycle in your system, especially if you are using resources with a high latency such as radios, SD cards and other external components in general. For instance, if you are used to ultrasonic sensors, the waiting time between sending the sound and receiving the echo could be used for other things.

A function used as a thread has a particular form:

static void *your_function( void *arg );

The arg parameter is passed in at the moment you create the thread; it could be a pointer to a buffer or whatever you want (you will have to do the proper casting from void *). The function returns a void * as well, which can simply be NULL when there is nothing to report. Apart from that, this function looks like any other function. To run it as an independent thread you will use the POSIX call:
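To make the casting step concrete, here is a minimal sketch of a thread function that receives its parameters through arg. The BlinkArgs struct and the blink_thread() function are hypothetical names chosen for illustration, and a simple counter stands in for the real LED toggling so that the code builds on any POSIX machine.

```cpp
#include <cassert>

// Hypothetical parameter block handed to the thread through arg.
struct BlinkArgs
{
    int led;     // which LED the thread would drive
    int blinks;  // how many times it should toggle
    int done;    // toggles performed so far (stand-in for real LED I/O)
};

// Thread entry point: takes a void * and returns a void *.
static void *blink_thread(void *arg)
{
    assert(arg != nullptr);

    // Recover the real type from the generic pointer.
    BlinkArgs *args = static_cast<BlinkArgs *>(arg);

    for (int i = 0; i < args->blinks; ++i)
    {
        args->done++;  // a real sketch would toggle args->led here
    }
    return nullptr;    // nothing to report back
}
```

Since it is still an ordinary function, you can call blink_thread() directly with the address of a BlinkArgs variable to check its logic before running it as a thread.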

/* --- Declared in <pthread.h> --- */

int pthread_create( pthread_t *th, const pthread_attr_t *attr, 
                    void *(*your_function) (void *), void *arg);

Looks scary, right? Well, it is not. The your_function parameter is the standard way of declaring a function pointer; there, you will place your function. Both attr and arg can be set to NULL if you don't need to specify attributes or an argument for the thread. The attr parameter is used for fine-grained control over the thread, such as priorities, cancellation types, etc.; if you don't specify priorities, threads will typically be executed in a round-robin way, taking turns on the core. arg is the argument that will be passed to your function and, finally, th is a handle for the thread itself; think about it as a file handle. On success this function call will return 0.
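A minimal sketch of the whole call, using only standard POSIX functions, could look like the following. The add_one() and run_in_thread() names are hypothetical; attr is left as a null pointer to accept the default attributes, and pthread_join() is used to wait until the thread has finished.

```cpp
#include <pthread.h>

// Hypothetical thread function: increments the int that arg points to.
static void *add_one(void *arg)
{
    int *value = static_cast<int *>(arg);
    *value += 1;
    return nullptr;
}

// Hypothetical helper: runs add_one() in a new thread and waits for it.
// Returns 0 on success, or the error code from the failing pthread call.
int run_in_thread(int *value)
{
    pthread_t th;  // handle filled in by pthread_create()

    // Default attributes (nullptr), our function, and its argument.
    int rc = pthread_create(&th, nullptr, add_one, value);
    if (rc != 0)
    {
        return rc;
    }

    // Block until the thread returns; its return value is ignored here.
    return pthread_join(th, nullptr);
}
```

Joining before reading the result matters: only after pthread_join() returns is it guaranteed that the thread has finished writing to the shared variable.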

Another great thing about the Spresense SDK is that you can freely mix pthread calls (or any other supported POSIX interface) with the ones provided by the SDK, with no conflicts. In the next example we are going to use a mutex lock to prevent simultaneous access to the board's LEDs. The lock is not strictly necessary here, but it will illustrate the complete idea.

A mutex lock is an atomic guard, a kind of semaphore, for concurrent code. The idea is that each time a thread wants to access a shared resource it first tries to lock the mutex (atomic means that no other thread can get in between checking the mutex and locking it), which prevents other threads from using the shared resource. Once the job is done, the mutex is released and another thread will be able to access the resource. While the POSIX standard has its own mutex functions and types, I will use the one provided by the Spresense SDK just to illustrate that you can work at the level you want.
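For reference, this is roughly what the same idea looks like with the standard POSIX mutex rather than the Spresense SDK one. It is only a sketch: the led_lock, shared_toggles, toggle_worker() and run_two_workers() names are hypothetical, and a shared counter stands in for the LEDs so the code can run on any POSIX system.

```cpp
#include <pthread.h>

// The mutex guarding the shared resource, statically initialized.
static pthread_mutex_t led_lock = PTHREAD_MUTEX_INITIALIZER;

// Stand-in for the shared resource (the LEDs in the real sketch).
static int shared_toggles = 0;

// Each worker bumps the shared counter *arg times, one lock per access.
static void *toggle_worker(void *arg)
{
    int count = *static_cast<int *>(arg);
    for (int i = 0; i < count; ++i)
    {
        pthread_mutex_lock(&led_lock);    // wait until the resource is free
        ++shared_toggles;                 // critical section
        pthread_mutex_unlock(&led_lock);  // let the other thread in
    }
    return nullptr;
}

// Runs two workers at once and returns the final counter, or -1 on error.
int run_two_workers(int per_thread)
{
    pthread_t a, b;
    shared_toggles = 0;

    if (pthread_create(&a, nullptr, toggle_worker, &per_thread) != 0)
    {
        return -1;
    }
    if (pthread_create(&b, nullptr, toggle_worker, &per_thread) != 0)
    {
        pthread_join(a, nullptr);
        return -1;
    }

    pthread_join(a, nullptr);
    pthread_join(b, nullptr);
    return shared_toggles;
}
```

With the lock in place the result is always twice per_thread; without it, the two threads could overwrite each other's increments and the total could come out lower.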

In the next post, I will show you the details of the sub-core 1 code adapted to this scheme. For now, I hope I was able to explain these concepts clearly. If that is not the case and you are not a Wix registered user, feel free to contact me at the email provided on the BCB home page; OK, in case you are too lazy to look for it: dnna.cgm11@gmail.com

To be continued in the following post... A character limit issue!
