Category Archives: Computing

Use SSH tunnel to access remote linux computer

This is a very short reminder for myself on how to set up a reverse SSH tunnel between two computers, using an intermediate Raspberry Pi that can be accessed from both.

This assumes that you have three computers:

  1. A remote Linux computer that you want to tunnel to. You should be able to execute commands on it, for example through VPN or TeamViewer.
  2. A Raspberry Pi or similar Linux computer at home that is in the DMZ of your home network, i.e., it can be directly accessed from the internet.
  3. Your MacBook, from which you want to connect to the remote computer.

The schematic connection setup is like this:

remote -> raspberry -> macbook
macbook -------------> remote
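
A minimal sketch of the two commands behind this schematic, using standard OpenSSH options. The host name, user names and port number are hypothetical placeholders, and the `-J` option assumes OpenSSH 7.3 or newer on the macbook. The functions are only defined here; call `open_tunnel` on the remote computer and `connect_remote` on the macbook.

```shell
# Hypothetical placeholders: replace them with your own values.
PI_HOST="pi.example.org"   # public name of the Raspberry Pi (assumption)
PI_USER="pi"               # account on the Raspberry Pi (assumption)
REMOTE_USER="me"           # account on the remote computer (assumption)
TUNNEL_PORT=2222           # any free port on the Raspberry Pi (assumption)

# Run on the remote computer: port TUNNEL_PORT on the Pi is forwarded
# back to the SSH server (port 22) of the remote computer.
open_tunnel() {
  ssh -N -R "${TUNNEL_PORT}:localhost:22" "${PI_USER}@${PI_HOST}"
}

# Run on the macbook: jump through the Pi and connect to the forwarded
# port, where "localhost" is resolved on the Pi.
connect_remote() {
  ssh -J "${PI_USER}@${PI_HOST}" -p "${TUNNEL_PORT}" "${REMOTE_USER}@localhost"
}
```

Since `-R` binds the forwarded port to the loopback interface of the Pi by default, jumping through the Pi avoids having to change `GatewayPorts` in its sshd configuration.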


First steps to realtime EEG and BCI on Raspberry Pi

I just compiled the FieldTrip realtime EEG interface on the Raspberry Pi. The code compiled out of the box; not a single line needed to be changed, thanks to the existing cross-platform support for the old Apple PPC-G4 and the Neuromag HPUX-RISC MEG system. Streaming data to and from the FieldTrip buffer over TCP/IP works like a charm.

I’ll add my binaries for the Raspberry Pi to the regular FieldTrip release.

The next step will be to compile some of the EEG acquisition drivers, e.g. for OpenEEG and BrainVision.

Eventually it would be nice to also get BCI2000 to work on the Pi. According to Juergen, large parts of BCI2000v3 should compile on the ARM… I look forward to giving it a try.

Torque batch queue system for mentat

I have installed the Torque batch queue system on our 50-node (~300 core) mentat cluster. Here are some useful PBS commands that can be used with Torque.

qsub script
Submit a job script for execution.
qstat
Show status of running and pending jobs.
tracejob
Display historical information about your jobs.
qdel
Kill a job.
qhold
Hold a job.
qstat -Q
qstat -Qf
Show configuration of queues.
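
As an illustration of what `qsub script` expects, here is a minimal job script sketch. The job name and resource requests are arbitrary example values, not the settings of the mentat cluster.

```shell
#!/bin/bash
#PBS -N example_job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -j oe

# Torque sets PBS_O_WORKDIR to the directory from which qsub was called;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-$PWD}"

# PBS_JOBID is set by Torque; "local" is used when it is missing.
msg="job ${PBS_JOBID:-local} running on $(uname -n)"
echo "$msg"
```

If this is saved as `example.sh` (a hypothetical file name), `qsub example.sh` prints the job identifier, which can then be passed to `qstat`, `qhold` or `qdel`.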

Peer-to-peer distributed Matlab computing

In a recent meeting with the SPM developers, we discussed parallel computing using the Matlab distributed computing toolbox, Star-P, Sun Grid Engine, and other batch systems that can be linked to Matlab. These are all limited in their usefulness for the typical neuroimaging research setting in that they are based on a centralized job distribution system. That may work fine on a large cluster with a centralized configuration and system administration, but even then the usefulness is limited because all input and output data (which are typically large) have to be sent over the network twice: first to the job manager, then to the compute node (and vice versa for the results).

To resolve some of these problems, I came up with the idea of peer-to-peer distributed computing in Matlab. The full description can be found on http://fieldtrip.fcdonders.nl/development/peer

Matlab compiler

Currently I am testing my mentat toolbox for parallel computing within Matlab in combination with GridEngine on our Linux cluster. Initially it worked fine, but that was because I was testing it with simple Matlab functions like “rand” and “fft”. However, I am now trying to get it to work with more complex real-world scripts for EEG and MEG analysis that use the FieldTrip toolbox, and I am running into a few problems:

I had to edit the file /opt/matlab-6.5.1/toolbox/matlab/iofun/private/readavi.m and rename the output argument “varargin” to “varargout”. I think that this is a plain bug in that function, which only becomes apparent after translating the code to C. The original file resulted in a compilation error; changing the output variable name fixed that.

Somehow, the code wants to include the Matlab function wk1read in the stand-alone executable. I cannot determine where that happens, because all functions are very deeply nested: mcc generates ~350 C files on which my analysis function depends. The problem is that, after compilation, wk1read results in an unresolved symbol error. Since I am not interested in reading Lotus 1-2-3 spreadsheet files (which is what that particular function does), I have simply replaced it on my own Matlab search path with an alternative (empty) implementation of wk1read. That solved the unresolved symbol error.

Furthermore, I encountered a problem with the exist function. In the stand-alone executable, it does not seem to detect the existence of a variable. Since I am using that function in multiple locations throughout the code, I have to think of a nice workaround. Related to this problem, I found this page which gives some relevant hints on using the Matlab compiler:

  1. The following things cause a warning in compilation and may cause a runtime error if the offending statement is executed: objects, most toolboxes, inline, whos, some graphics functions that are passed functions by string.
  2. Program compiled with graphics will not run without X windows. This includes running on beowulf nodes from PBS scripts.
  3. Program containing graphics commands and compiled without graphics will produce compile time warning but will run fine as long as the graphics commands are not executed.
  4. The following commands work differently: eval, feval, exist, global.
  5. Use handles to pass functions rather than strings.
  6. Do not use exist to check for the presence of arguments in a function.

Mentat: parallel computing using the Matlab compiler

The Mentat toolbox is a collection of Matlab functions that enables you to perform parallel computations from within Matlab on a Beowulf-style cluster. The toolbox was developed on Linux with Matlab 6.5, but will probably also work on other platforms.

I have evaluated various open source parallel computing toolboxes for Matlab, but found that none of them was suitable for my specific needs. Therefore I decided to implement one myself…

The most important problem that I faced is that the parallel computations are performed in separate Matlab sessions. That means that each node in the cluster has to be running its own Matlab session, which requires a Matlab license for each node. Furthermore, when using specialized Matlab toolboxes in the computation (e.g., signal processing, image processing, optimization, statistics), a separate license is also required for each of these toolboxes on every node.

Mathworks recently released their commercial distributed computing toolbox. I have no experience with it, but it appears to me that my license problem still would not be solved with that toolbox.

The goals of the Mentat toolbox are:

  • evaluate Matlab code, not low-level C code
  • work from within the Matlab environment, i.e., normal users should be able to use it
  • the Matlab code should be “unaware” of it being evaluated in parallel

Furthermore, I made use of the following restrictions when designing the toolbox:

  • the computational problem (in our case data processing) should be separable in chunks
  • each chunk is evaluated in a separate job, independently from all other chunks
  • the chunks should be computationally large enough to justify the overhead of sending the data over the network

Since I want my computations to simply scale with the number of available cluster nodes, without me having to buy additional licenses, I implemented a solution based on the Matlab compiler toolbox. Let me give an example: Assume that you are running an interactive Matlab session on the master node of the cluster, then you can type something like

a = rand(1000,1000,30);
pfor(1:30, 'b(:,:,%d) = fft(a(:,:,%d))');

which is equivalent to executing

for i=1:30
  b(:,:,i) = fft(a(:,:,i));
end

The pfor function is the main interface to my toolbox, and it takes care of the parallelization. What happens is that the fft function, or any other function in its place, is wrapped into an m-function that is compiled into a standalone executable. Subsequently, the data for each job is written to a network drive that is common to all nodes, and all jobs are remotely executed on the cluster. There is also a peval function, which takes multiple strings as input and evaluates them in parallel.

The only requirements are that Matlab and the compiler toolbox should be present on the master node, that there should be a way to remotely start jobs (e.g. ssh/rsh), and there should be a common network disk.
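
To illustrate the last two requirements, here is a rough shell sketch of how jobs could be started remotely over ssh, with their data on a shared disk. It is not the code used inside the toolbox: the node names, the shared path, the `job_wrapper` executable and the `job_N.mat` file names are all hypothetical.

```shell
# Hypothetical names throughout; this is not the toolbox's own code.
NODES="node01 node02 node03"   # cluster nodes reachable over ssh (assumption)
SHARED="/shared/mentat"        # network disk visible on all nodes (assumption)
LAUNCH="${LAUNCH:-ssh -n}"     # set LAUNCH=echo for a dry run

# Start job number $1 on a node picked round-robin from NODES.
start_job() {
  job=$1
  set -- $NODES                # positional parameters now hold the node names
  shift $(( (job - 1) % $# ))  # move the node for this job to the front
  $LAUNCH "$1" "$SHARED/job_wrapper $SHARED/job_${job}.mat"
}

# Start jobs 1..$1 in the background and wait until all have finished.
run_all() {
  njobs=$1
  i=1
  while [ "$i" -le "$njobs" ]; do
    start_job "$i" &
    i=$((i + 1))
  done
  wait
}
```

With `LAUNCH=echo`, calling `run_all 30` only prints the node and command for each job, which is a convenient way to check the distribution before running anything on the cluster.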

The mentat toolbox is released under the GPL license. You can download it here: mentat_050418.tgz.

The toolbox is still in a very experimental stage, just as this webpage. I hope to develop it further and to improve the documentation, to make it more generally usable. Please contact me if you have any questions, remarks or suggestions.