Category Archives: Research

Multithreading in a MATLAB mex file

Together with colleagues from the F.C. Donders Centre and the NICI, I am currently working on a project on real-time analysis of EEG and MEG data for Brain-Computer Interfacing (BCI). Since we already have a lot of relevant code in MATLAB and since a lot of the target end-users are familiar with it, we would like to do the real-time processing in MATLAB.

The idea is to read the data that is streaming from the EEG amplifier, do the computation on the data (e.g. spatial filtering, spectral decomposition and classification) and then continue with the next section of the data. The challenge is not to lose any of the streaming EEG data while MATLAB is busy with the computation. A potential solution is to implement the reading/acquisition of the data into MATLAB in a multi-threaded mex file.

For this purpose I wrote some code for testing the multithreading in MATLAB. The test code below demonstrates that it works, and shows how easy it actually is.

I am working on an Apple PowerBook G4 with OS X 10.4.11 and with MATLAB Version 7.4 (R2007a). The example code uses the POSIX Pthread libraries and I would expect the same code to work on most Unix and Linux systems. It will not work on a windows system, but might work with pthreads-win32.

Put the following code in a file with the name threads.c

#include <pthread.h> /* for threading */
#include <unistd.h>  /* for sleep */
#include "mex.h"
#include "matrix.h"

void *do_thread(void *threadid)
{
  int tid;
  tid = (int)threadid;
  mexPrintf("In thread: just started thread #%d\n", tid);
  sleep(4); /* pretend that the code is busy for a few seconds */
  mexPrintf("In thread: about to exit from thread #%d\n", tid);
  pthread_exit(NULL);
}

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
  int num;

  if (nrhs<1)
    mexErrMsgTxt("not enough input arguments");
  else
    num = mxGetScalar(prhs[0]);

  pthread_t threads[num];
  int rc, t;
  for (t=0; t<num; t++) {
    mexPrintf("In main: creating thread %d\n", t);
    rc = pthread_create(&threads[t], NULL, do_thread, (void *)t);
    if (rc) {
      mexErrMsgTxt("problem with return code from pthread_create()");
    }
    sleep(1); /* wait some time before creating the next thread */
  }
  return;
}

Subsequently, in MATLAB you should compile the mex file using the following command:

>> mex threads.c -lpthread

After that, you can run the mex function. It takes a single argument, corresponding to the number of threads to create. Each thread lives for four seconds (pretending that it is busy for some time), and consecutive threads are created one second apart. Some example output is below.

>> threads(6)
In main: creating thread 0
In thread: just started thread #0
In main: creating thread 1
In thread: just started thread #1
In main: creating thread 2
In thread: just started thread #2
In main: creating thread 3
In thread: just started thread #3
In thread: about to exit from thread #0
In main: creating thread 4
In thread: just started thread #4
In thread: about to exit from thread #1
In main: creating thread 5
In thread: just started thread #5
In thread: about to exit from thread #2
>> In thread: about to exit from thread #3
In thread: about to exit from thread #4
In thread: about to exit from thread #5
>>

You can see each thread starting and stopping. The last threads stop after the control has been returned to the MATLAB prompt.

Here you can find some comments from Mathworks, including some Windows-specific example code. Important to note is that MATLAB itself is single-threaded and not thread-safe. Hence you should take care when interfacing the threaded C code in your mex file with the MATLAB core.

I would appreciate some feedback on this. Please leave a comment below if you tested it successfully, or if it does not work for you, with some details on your configuration (like operating system and MATLAB version).

Average reference for dipole fitting

Here is a question that I get asked occasionally. I have slightly edited the question and my answer to it, which both were posted to the EEGLAB email discussion list.

Question: I’m using EEGLAB-DIPFIT to localize independent components using the spherical head model. Apparently the software requires the data to use the average reference. Why is this?

Answer: In principle you could use an arbitrary reference in your source reconstruction. The practical reason to use an average reference over the sampled electrodes in source estimation is that it prevents the solution from being biased by forward-modelling errors at the reference electrode. Let me give a partially intuitive, partially mathematical explanation.

Assume that you would use the left mastoid (M1) as reference. That would mean that the measured value “V” at each electrode “x” is referenced to M1, so the list of all measured values in the N channels is

V_C3-V_M1
V_Cz-V_M1
V_C4-V_M1
...
V_M1-V_M1 (this is zero)
V_M2-V_M1

Those values can be modeled using the source model and the volume conduction model. Now, let's assume a spherical volume conduction model. That is especially inaccurate for low electrodes, and the bony structure of the mastoid is definitely not modelled appropriately in a spherical model. So for the model potential “P” we would have the value at each of the N electrodes, also referenced to the model mastoid electrode:

P_C3-P_M1
P_Cz-P_M1
P_C4-P_M1
...
P_M1-P_M1 (this is zero)
P_M2-P_M1

The source estimation algorithm tries to minimize the quadratic error between model potential distribution and the measurement, so the error term to be minimized is

Total_Error
= sum of quadratic error over all channels
= [(V_C3-V_M1)-(P_C3-P_M1)]^2 + ....
= [(V_C3-P_C3)-(V_M1-P_M1)]^2 + .... (here the terms are re-ordered)

So for each channel the error term consists of a part that corresponds to the potential at the electrode of interest, plus a part that corresponds to the reference electrode. The error term corresponding to the reference electrode is identical over all channels (i.e., it repeats in each channel), hence for each channel you are adding some error term for the reference electrode. Therefore, the minimum-error (“minimum norm”) solution will be one that especially tries to minimize the model error at the reference electrode (since that is included N times). In the case of a mastoid reference we know that there is a large volume conductor model error at M1, hence the source solution would mainly try to minimize that error term. The result is that the source solution would be biased, because it tries to reduce the (systematic) error at the reference.
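The argument can be summarised in a single formula (a sketch; r denotes the reference electrode and x runs over the N channels):

```latex
% Quadratic error with a common reference electrode r:
E_r = \sum_{x=1}^{N} \bigl[ (V_x - V_r) - (P_x - P_r) \bigr]^2
    = \sum_{x=1}^{N} \bigl[ (V_x - P_x) - (V_r - P_r) \bigr]^2
% The common term (V_r - P_r) appears in all N channels, so a large
% forward-model error at r dominates the minimisation. With an average
% reference, V_r and P_r are replaced by averages over all electrodes,
% and if the model errors average to zero this common term vanishes.
```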

The solution is to use an average reference (the average over all measured electrodes). That implicitly assumes that the model error over all electrodes is on average zero, hence the minimum norm solution is not biased towards a specific reference electrode.

P.S. The maths in my explanation above are rather sloppy, but the argument still holds for a more elaborate mathematical derivation which would assume the forward model inaccuracies are uncorrelated over electrode sites.

High-density EEG electrode placement

Some time ago we wrote an article on electrode placement for high-resolution EEG measurement (referred to as the 5% article). After its appearance I have noticed that there is a demand for a concise and methodological overview of electrode placement systems. With this page I want to share some of my knowledge on this subject. This page contains non-technical comments on the different standards for electrode placement. Continue reading

EEProbe

EEProbe is a Linux software package for EEG acquisition and analysis that was originally developed at the Max Planck Institute in Leipzig, Germany (there known as EEP), and that is now further developed as a commercial product by ANT Software in Enschede, The Netherlands.

EEProbe stores its continuous data in a file with the extension *.cnt (not to be confused with the Neuroscan *.cnt format). The *.cnt file is a binary file and the data in it is compressed, which makes reading the data in your own software quite difficult. The file format for averaged ERPs (*.avr) is also binary, but not compressed. Furthermore there are files that contain event information such as triggers (*.trg), rejection marks (*.rej) and some others.

ANT has released a library with C code (cntopenlib) to its customers, containing functions to read the header, read the binary data and decompress the data. I have used that library to make two mex files to import the binary data into Matlab; you can download those functions here. Furthermore, at the bottom of this page you can also download the source code of the mex files.

Note: since the release of EEProbe 3.2.4 in March 2004 the binary format of the *.avr files has been slightly changed (they now also contain history information). No new version of cntopenlib has been released yet, and therefore the mex function cannot always read the averages properly. You can use the “avrstrip” program from EEProbe to remove the history information; the resulting file can be read into Matlab.

Plain Matlab functions:

Mex files for Mac OS X (created using Matlab 6.5):

Mex files for Solaris (created using Matlab 6.1):

Mex files for Linux (created using Matlab 6.1):

Mex files for Windows (created using Matlab 6.1):

Source code of the mex files, released under the BSD license:

Or you can download them all together in a single package here.

EEGLAB workshop

From 17-19 September 2005, the second EEGLAB workshop will be held in Porto, Portugal. I will be giving a lecture on the use of the DIPFIT plugin (i.e., dipole analysis of ICA components). The new version 2 of the DIPFIT plugin will be introduced during the workshop, which means that the link between FieldTrip and EEGLAB is finally official.

For more information about the workshop, you can look here. For more information about the relation between FieldTrip and EEGLAB, you can look here and here.

Matlab compiler #2

I discovered another problem with the Matlab compiler (version 3, which is included with Matlab 6.5), namely that it will not translate an m-file to C code when that m-file tries to read a variable from a mat-file.

Mathworks does specify (see here) that m-files that dynamically name the variables to be loaded or saved are not supported by the compiler. They give this example, which is disallowed by the compiler:

x= 'f';
load('foo.mat',x);

However, this function also will not compile:

function testload(cfg);
b = load(cfg.filename);
c = getfield(b, 'a');
disp(c);

giving the following error

>> mcc -m testload
testload.c: In function `Mtestload':
testload.c:103: error: `mlxLoadStruct' undeclared (first use in this function)
testload.c:103: error: (Each undeclared identifier is reported only once
testload.c:103: error: for each function it appears in.)
mbuild: compile of 'testload.c' failed.

If I rewrite the function slightly, it does work. The alternative function that has the desired functionality and that does compile correctly is

function testload(cfg);
filename = cfg.filename;
b = load(filename);
c = getfield(b, 'a');
disp(c);

At least this gives me a handle on how to modify my code with little effort to make it compatible with the Matlab compiler.

Matlab compiler

Currently I am testing the use of my mentat toolbox for parallel computing within Matlab, in combination with GridEngine on our Linux cluster. Initially it worked fine, but that was because I was testing it with simple built-in Matlab functions like “rand” and “fft”. However, I am now trying to get it to work with more complex real-world scripts for EEG and MEG analysis that use the FieldTrip toolbox, and I am running into a few problems:

I had to edit the file /opt/matlab-6.5.1/toolbox/matlab/iofun/private/readavi.m and rename the output argument “varargin” into “varargout”. I think that this is a plain bug in that function, which only becomes apparent after translating the code to c. The original file resulted in a compilation error, changing the output variable name fixed that.

Somehow, the code wants to include the Matlab function wk1read in the stand-alone executable. I cannot determine where that happens, because all functions are very deeply nested: mcc generates ~350 C files on which my analysis function depends. The problem is that, after compilation, wk1read results in an unresolved symbol error. Since I am not interested in reading Lotus spreadsheet files (which is what that particular function does), I have simply replaced it on my own Matlab search path with an alternative (empty) implementation of wk1read. That solved the unresolved symbol error.

Furthermore, I encountered a problem with the exist function. In the stand-alone executable it does not seem to detect the existence of a variable. Since I am using that function in multiple locations throughout the code, I have to think of a nice workaround. Related to this problem, I found this page which gives some relevant hints on using the Matlab compiler:

  1. The following things cause a warning in compilation and may cause a runtime error if the offending statement is executed: objects, most toolboxes, inline, whos, some graphics functions that are passed functions by string.
  2. Program compiled with graphics will not run without X windows. This includes running on beowulf nodes from PBS scripts.
  3. Program containing graphics commands and compiled without graphics will produce compile time warning but will run fine as long as the graphics commands are not executed.
  4. The following commands work differently: eval, feval, exist, global.
  5. Use handles to pass functions rather than strings.
  6. Do not use exist to check for the presence of arguments in a function.

FieldTrip

FieldTrip is a Matlab toolbox for MEG/EEG analysis that is being developed at the F.C. Donders Centre in Nijmegen, the Netherlands. The toolbox includes algorithms for simple and complex analysis of MEG and EEG data, such as time-frequency analysis, source analysis and non-parametric statistical testing. It contains high-level functions that you can use to construct your own analysis protocol in Matlab. It supports various file formats, and new formats can be added easily.

FieldTrip has its own website where you can download the code and documentation.

Mentat: parallel computing using the Matlab compiler

The Mentat toolbox is a collection of Matlab functions that enables you to perform parallel computations from within Matlab on a Beowulf-style cluster. The toolbox was developed on Linux with Matlab 6.5, but will probably also work on other platforms.

I have evaluated various open source parallel computing toolboxes for Matlab, but found that none of them was suitable for my specific needs. Therefore I decided to implement one myself…

The most important problem that I faced is that the parallel computations are performed in separate Matlab sessions. That means that each node in the cluster has to be running its own Matlab session, which requires a Matlab license for each node. Furthermore, when specialized Matlab toolboxes are used in the computation (e.g., signal processing, image processing, optimization, statistics), a separate license is also required for each of these toolboxes on every node.

Mathworks recently released their commercial distributed computing toolbox. I have no experience with it, but it appears to me that my license problem still would not be solved with that toolbox.

The goal of the Mentat toolbox is:

  • evaluate Matlab code, not low-level c-code
  • work from within the Matlab environment, i.e., normal users should be able to use it
  • the Matlab code should be “unaware” of it being evaluated in parallel

Furthermore, I made use of the following restrictions when designing the toolbox:

  • the computational problem (in our case data processing) should be separable into chunks
  • each chunk is evaluated in a separate job, independently from all other chunks
  • the chunks should be computationally large enough to justify the overhead of sending the data over the network

Since I want my computations to simply scale with the number of available cluster nodes, without me having to buy additional licenses, I implemented a solution based on the Matlab compiler toolbox. Let me give an example: Assume that you are running an interactive Matlab session on the master node of the cluster, then you can type something like

a = rand(1000,1000,30);
pfor(1:30, 'b(:,:,%d) = fft(a(:,:,%d))');

which is equivalent to executing

for i=1:30
b(:,:,i) = fft(a(:,:,i));
end

The pfor function is the main interface to my toolbox, and it takes care of the parallelization. What happens is that the fft function, or any other function in its place, is wrapped into an m-function that is compiled into a stand-alone executable. Subsequently, the data for each job is written to a network drive that is common to all nodes, and all jobs are executed remotely on the cluster. There is also a peval function, which takes multiple strings as input and evaluates them in parallel.

The only requirements are that Matlab and the compiler toolbox should be present on the master node, that there should be a way to remotely start jobs (e.g. ssh/rsh), and there should be a common network disk.

The mentat toolbox is released under the GPL license, you can download it here: mentat_050418.tgz.

The toolbox is still in a very experimental stage, just as this webpage. I hope to develop it further and to improve the documentation, to make it more generally usable. Please contact me if you have any questions, remarks or suggestions.