Backing up my iPod mp3’s

Note: prior to applying the script below, read all the way through to the bottom. There you will find an important update!

I recently bought a 60GB iPod photo, and encountered the problem that the iTunes “synchronisation” does not really work for me: I can store many more files on my iPod than I have space available on the hard disk of my PowerBook. Therefore, I decided not to use the “synch” option in iTunes and to do it manually … so far, so easy…

Recently I accidentally pressed the “automatically update” button in the iPod preferences in iTunes. To my horror, it suddenly started deleting 20GB of mp3’s that I had copied onto my iPod and that I no longer had on my hard disk 🙁

To prevent this accidental deletion of the files on my iPod, I made the following backup script. It uses low-level unix command line tools and creates a “hard link” for each file on the iPod. The hard link does not prevent the file from being deleted through iTunes, but it does ensure that the mp3 data itself is not erased, which makes it simple to add the mp3 back to the iTunes music collection on your iPod.

A hard link is similar to a copy of the file, except that only the pointer to the file is duplicated; the data itself is left alone. Therefore a hard link does not occupy additional space on your iPod. If you edit the ID3 tags of a file, the ID3 content of both the original file and of the hard-linked copy will be updated, since both refer to the same data on disk.
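
If you want to see this behaviour for yourself, you can try something like the following in the Terminal (the file names are just made-up examples, not actual files on the iPod):

# create a second name (hard link) for the same data
ln song.mp3 song_backup.mp3

# both names now show the same inode number and a link count of 2
ls -li song.mp3 song_backup.mp3

# removing one name leaves the data reachable through the other name
rm song.mp3
ls -li song_backup.mp3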

The following script creates the hard links in a subdirectory with today’s date as the name. Copy and paste it into a text file with the name “ipodbackup”, set the permissions to “execute” (e.g. chmod +x ipodbackup) and you can execute it from the terminal command line to make a backup of all files on your iPod prior to adding new mp3’s.

#!/bin/sh

IPOD="/Volumes/Robert Oostenveld’s iPod"
SOURCE="$IPOD"/iPod_Control/Music
TARGET="$IPOD"/Backup/`date +'%Y%m%d'`

# create the directory that will contain the backup
mkdir -p "$TARGET"
cd "$TARGET"

# create the subdirectories that contain the mp3's
find "$SOURCE" -mindepth 1 -type d -exec basename {} \; | xargs mkdir

# create the hard links in each subdirectory
for subdir in F* ; do
  echo linking `ls "$SOURCE"/$subdir | wc -l` music files in directory $subdir
  ln "$SOURCE"/$subdir/* "$TARGET"/$subdir
done

Important update (12 Aug 2005): After applying the script and making all the hard links, I discovered that they confuse the software on the iPod: it will not play the songs any more, although they still appear correctly both in iTunes and on the iPod when it is disconnected, and iTunes itself can still play them. To solve this, I had to copy all the songs off the iPod and reimport them into iTunes and onto the iPod. So the backup script does not work as I had hoped 🙁

Matlab compiler #2

I discovered another problem with the Matlab compiler (version 3, which is included with Matlab 6.5): it will not translate an m-file to c-code when that m-file tries to read a variable from a mat-file.

Mathworks does specify that one of the things that is not supported by the compiler (see here) is m-files that dynamically name variables to be loaded or saved. They give this example, which is disallowed by the compiler:

x= 'f';
load('foo.mat',x);

However, this function also will not compile:

function testload(cfg);
b = load(cfg.filename);
c = getfield(b, 'a');
disp(c);

giving the following error:

>> mcc -m testload
testload.c: In function `Mtestload':
testload.c:103: error: `mlxLoadStruct' undeclared (first use in this function)
testload.c:103: error: (Each undeclared identifier is reported only once
testload.c:103: error: for each function it appears in.)
mbuild: compile of 'testload.c' failed.

If I rewrite the function slightly, it does work. The alternative function that has the desired functionality and that compiles correctly is:

function testload(cfg);
filename = cfg.filename;
b = load(filename);
c = getfield(b, 'a');
disp(c);

At least this gives me a handle on how to modify my code with little effort to make it compatible with the Matlab compiler.

Matlab compiler

Currently I am testing the use of my mentat toolbox for parallel computing within Matlab in combination with GridEngine on our Linux cluster. Initially it worked fine, but that was because I was testing it with simple built-in Matlab functions like “rand” and “fft”. However, I am now trying to get it to work with more complex real-world scripts for EEG and MEG analysis that use the FieldTrip toolbox, and I am running into a few problems:

I had to edit the file /opt/matlab-6.5.1/toolbox/matlab/iofun/private/readavi.m and rename the output argument “varargin” to “varargout”. I think that this is a plain bug in that function, which only becomes apparent after translating the code to c. The original file resulted in a compilation error; changing the output variable name fixed that.

Somehow, the code wants to include the Matlab function wk1read in the stand-alone executable. I cannot determine where that happens, because all functions are very deeply nested: mcc generates ~350 c-files on which my analysis function depends. The problem is that, after compilation, wk1read results in an unresolved symbol error. Since I am not interested in reading Lotus 1-2-3 spreadsheet files (which is what that particular function does), I have simply replaced it on my own Matlab search path with an alternative (empty) implementation of wk1read. That solved the unresolved symbol error.
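
Such an empty stand-in can be as minimal as the following sketch (the exact contents do not matter much, as long as the function shadows the original wk1read on the Matlab path and is never actually called):

function data = wk1read(varargin)
% empty stand-in that shadows the original Lotus 1-2-3 reader; it only
% exists to keep the compiler and linker happy and should never be called
data = [];
warning('wk1read is not available in this stand-alone application');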

Furthermore, I encountered a problem with the exist function. In the stand-alone executable, it does not seem to detect the existence of a variable. Since I am using that function in multiple locations throughout the code, I have to think of a nice workaround; one possible direction is sketched after the list below. Related to this problem I found this page, which gives some relevant hints on using the Matlab compiler:

  1. The following things cause a warning in compilation and may cause a runtime error if the offending statement is executed: objects, most toolboxes, inline, whos, some graphics functions that are passed functions by string.
  2. Program compiled with graphics will not run without X windows. This includes running on beowulf nodes from PBS scripts.
  3. Program containing graphics commands and compiled without graphics will produce compile time warning but will run fine as long as the graphics commands are not executed.
  4. The following commands work differently: eval, feval, exist, global.
  5. Use handles to pass functions rather than strings.
  6. Do not use exist to check for the presence of arguments in a function.
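
Following hints 4 and 6, one workaround would be to replace the calls to exist with nargin and isfield where possible. A minimal sketch of that kind of rewrite (the field name “channel” is just a made-up example):

function testexist(cfg)
% use nargin instead of exist('cfg', 'var') to check whether the
% argument was supplied to the function
if nargin<1
  cfg = [];
end

% use isfield instead of exist to check for an optional field
if ~isfield(cfg, 'channel')
  cfg.channel = 'all';
end

disp(cfg);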

FieldTrip

FieldTrip is a Matlab toolbox for MEG/EEG analysis that is being developed by the F.C. Donders Centre in Nijmegen, the Netherlands. The toolbox includes algorithms for simple and complex analysis of MEG and EEG data, such as time-frequency analysis, source analysis and non-parametric statistical testing. It contains high-level functions that you can use to construct your own analysis protocol in Matlab. It supports various file formats, and new formats can be added easily.

FieldTrip has its own website where you can download the code and documentation.

Mentat: parallel computing using the Matlab compiler

The Mentat toolbox is a collection of Matlab functions that enables you to perform parallel computations from within Matlab on a Beowulf-style cluster. The toolbox was developed on Linux with Matlab 6.5, but will probably also work on other platforms.

I have evaluated various open source parallel computing toolboxes for Matlab, but found that none of them was suitable for my specific needs. Therefore I decided to implement one myself…

The most important problem that I faced is that the parallel computations are performed in separate Matlab sessions. That means that each node in the cluster has to be running its own Matlab session, which requires a Matlab license for each node. Furthermore, when using specialized Matlab toolboxes in the computation (e.g., signal processing, image processing, optimization, statistics), a separate license is also required for each of these toolboxes on every node.

Mathworks recently released their commercial Distributed Computing Toolbox. I have no experience with it, but it appears to me that it would not solve my license problem either.

The goals of the Mentat toolbox are:

  • evaluate Matlab code, not low-level c-code
  • work from within the Matlab environment, i.e., normal users should be able to use it
  • the Matlab code should be “unaware” that it is being evaluated in parallel

Furthermore, I imposed the following restrictions when designing the toolbox:

  • the computational problem (in our case data processing) should be separable into chunks
  • each chunk is evaluated in a separate job, independently from all other chunks
  • the chunks should be computationally large enough to justify the overhead of sending the data over the network

Since I want my computations to simply scale with the number of available cluster nodes, without having to buy additional licenses, I implemented a solution based on the Matlab compiler toolbox. Let me give an example: assume that you are running an interactive Matlab session on the master node of the cluster; you can then type something like

a = rand(1000,1000,30);
pfor(1:30, 'b(:,:,%d) = fft(a(:,:,%d))');

which is equivalent to executing

for i=1:30
  b(:,:,i) = fft(a(:,:,i));
end

The pfor function is the main interface to my toolbox, and it takes care of the parallelization. What happens is that the fft function, or any other function in its place, is wrapped into an m-function that is compiled into a standalone executable. Subsequently, the data for each job is written to a network drive that is common to all nodes and all jobs are remotely executed on the cluster. There is also a peval function, which takes multiple strings as input and evaluates them in parallel.
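
To give an idea of what happens under the hood, the m-function that gets compiled for a single job could conceptually look like the sketch below (simplified, with made-up file and variable names, not the actual code that mentat generates):

function job_wrapper(jobdir)
% read the input chunk for this job from the shared network disk
input = load(fullfile(jobdir, 'input.mat'));

% evaluate the wrapped expression on this chunk, here the fft from the
% pfor example above
b = fft(input.a);

% write the result back to the shared disk for the master to collect
save(fullfile(jobdir, 'output.mat'), 'b');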

The only requirements are that Matlab and the compiler toolbox are present on the master node, that there is a way to start jobs remotely (e.g. ssh/rsh), and that there is a common network disk.

The mentat toolbox is released under the GPL license; you can download it here: mentat_050418.tgz.

The toolbox is still in a very experimental stage, just like this webpage. I hope to develop it further and to improve its documentation, to make it more generally usable. Please contact me if you have any questions, remarks or suggestions.

Warping toolbox

The warping toolbox is a collection of 3-dimensional linear and non-linear warping functions written in Matlab that operate on point clouds (e.g., vertices, electrodes, dipoles). You can download the toolbox here. The toolbox functions are modelled after Roger P. Woods’ AIR 3.08 package; see its documentation for more background details.
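
To give an impression of what such a warp does, a linear (affine) warp of an Nx3 point cloud is simply a matrix multiplication in homogeneous coordinates. The snippet below is a generic illustration and does not use the actual toolbox function names:

% a point cloud with one xyz-triplet per row (random example data)
pnt = randn(100, 3);

% a 4x4 homogeneous transformation: a rotation of 30 degrees around the
% z-axis combined with a translation
phi = 30*pi/180;
T = [ cos(phi) -sin(phi) 0  5
      sin(phi)  cos(phi) 0 10
      0         0        1  0
      0         0        0  1 ];

% apply the warp: append a column of ones, multiply, drop the column again
warped = [pnt ones(size(pnt,1),1)] * T';
warped = warped(:, 1:3);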

The warping toolbox is released under the GPL license.

Neuroscan

C-code to read/convert Neuroscan 3.x (and partially 4.x?) AVG, EEG and CNT datafiles

This page gives the c-code for some applications and functions that read the Neuroscan EEG data format. Part of this code comes from the Neuroscan website, part was written by myself. The code is completely free (as in free beer) and it comes with no license or conditions of use. I give it to you in the hope that it will be useful. However, I do not accept any responsibility for the correctness and usefulness of this code.

You might also be interested in reading the information on different versions of the Neuroscan data format on their site.

Header files:

Source code files:

You can also download all of the code in a single zip file.