Electron Incantation: Reverse engineering a motion-capture file format (or, the answer to my prayers... a week ago)

So, in a surprising turn of events, I am posting about something that I actually did for my research. Part of the work I do involves motion capture; I use cameras and strobes and markers affixed to bony landmarks on the rat hindlimb to record the motion of the limb in space during behavior. One of the motion capture files that I recorded was corrupted with noise, and could not be un-corrupted using the programs and tools from the system manufacturer. Being the clever and industrious fellow that I am (read: I didn't want to do the analysis that I actually had scheduled), I spent a day reverse-engineering the motion capture data file format, then used that knowledge to write a program that removed the corruption from the file in question and allowed normal data analysis to proceed.

The Problem

As I said above, part of the analysis I am doing for my research involves recording the kinematics of hindlimb locomotion. The system that the lab purchased to get this data is passive and camera-based; that is, bits of shiny stuff (markers) are affixed to the subject and illuminated, and the grayscale images of the shiny stuff are used to infer the location of bony landmarks in space over time.

Since the shiny stuff is so shiny, it's usually easy to set a threshold on the grayscale images to get rid of non-marker sources in the image. However, sometimes there is something else similarly shiny in the image; in those cases you 'mask' the offending pixels (always setting them to zero). Of course, that also means that, if a legitimate marker moves into the masked area, it will not be detected.
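To make the threshold-then-mask idea concrete, here is a minimal Python sketch. It is an illustration only, not anything from the Vicon software; the function name, the `>=` comparison, and the tiny example image are all made up for the example.

```python
def threshold_and_mask(image, threshold, masked_pixels):
    """Return a copy of `image` with dim or masked pixels set to zero.

    image: list of rows of grayscale values (0-255).
    masked_pixels: set of (x, y) positions that are always zeroed.
    """
    cleaned = []
    for y, row in enumerate(image):
        new_row = []
        for x, value in enumerate(row):
            keep = value >= threshold and (x, y) not in masked_pixels
            new_row.append(value if keep else 0)
        cleaned.append(new_row)
    return cleaned


image = [
    [10, 250, 12],  # bright marker at (x=1, y=0)
    [11, 13, 240],  # bright reflection at (x=2, y=1)
]
# The reflection is as bright as a marker, so only the mask can remove it.
cleaned = threshold_and_mask(image, threshold=200, masked_pixels={(2, 1)})
```

Note that the mask is blind: a real marker that wandered into (2, 1) would be zeroed just the same.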

My situation was that I had masked the offending reflections in the image... but then had to shift the treadmill around a bit. As a result, some large reflections were present in the data. They were such that the post processing (converting the grayscale images into labeled 3-D trajectories) just wasn't working; bits of whiteness from the reflections were being erroneously labeled.

Unfortunately, the system we use does not have a native facility for re-masking data after it's been recorded. So, I needed to roll my own. To do this, I needed to understand the native data file format.

DISCLAIMER: The system we use is the Vicon Nexus. This is NOT RECOMMENDED by them. DO NOT USE THIS INFORMATION TO DO ANYTHING. In fact, stop reading now. I make no guarantees as to the usability or safety of the software provided here.

The Solution

I opened my trusty hex editor (HxD by Maël Hörz) and took a look at a couple of the raw data files. Long story short, they all shared almost identical initial segments (the first 770 bytes, specifically), which I assume are header information and which contain ASCII sub-strings with the camera type and specs in plain text. There were also two 4-byte-long sections of this header which described A: the number of image frames in the file and B: the offset (number of bytes from the beginning of the file) at which the 'index' began. This header was followed by the second section (the largest by far), which contained the grayscale image and blob center data. The final section was the 'index', which contained a series of 12-byte-long records describing the frame numbers (first four bytes) and the offset at which each frame began in the file (last eight bytes).
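The three-section layout can be sketched in a few lines of Python. This is not the MATLAB code from the archive, and several details are assumptions the post does not state: the byte order (little-endian here) and the positions of the two 4-byte header fields (`FRAME_COUNT_POS` and `INDEX_OFFSET_POS` are placeholders). Only the 770-byte header, the 4-byte fields, and the 12-byte index records come from the description above.

```python
import struct

# ASSUMPTIONS: little-endian byte order and both header positions below.
FRAME_COUNT_POS = 0x10   # hypothetical position of the 4-byte frame count
INDEX_OFFSET_POS = 0x14  # hypothetical position of the 4-byte index offset
INDEX_RECORD = struct.Struct("<IQ")  # 4-byte frame number + 8-byte offset


def read_frame_index(data):
    """Return a list of (frame_number, frame_offset) pairs from the index."""
    (n_frames,) = struct.unpack_from("<I", data, FRAME_COUNT_POS)
    (index_offset,) = struct.unpack_from("<I", data, INDEX_OFFSET_POS)
    return [
        INDEX_RECORD.unpack_from(data, index_offset + i * INDEX_RECORD.size)
        for i in range(n_frames)
    ]


def build_demo_file():
    """Build a tiny synthetic file: 770-byte header, fake data, 2-record index."""
    header = bytearray(770)
    body = bytes(30)  # stand-in for the grayscale data section
    struct.pack_into("<I", header, FRAME_COUNT_POS, 2)
    struct.pack_into("<I", header, INDEX_OFFSET_POS, len(header) + len(body))
    index = INDEX_RECORD.pack(1, 770) + INDEX_RECORD.pack(2, 785)
    return bytes(header) + body + index


frame_index = read_frame_index(build_demo_file())
```

Reading the index first means any single frame can be reached with one seek instead of a walk through the whole data section.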

Sections of the data segment were arranged hierarchically; each object on a given level started with two bytes of 'start sentinel', four bytes describing the length of the object, and four bytes giving some other important number (e.g., camera number, number of blobs in a frame, number of grayscale scan lines in a blob). The top level for each frame was, well, the frame; that is, all of the data taken during one sample. Below the frame level was the camera subframe level; each of those contained the data for the given frame from one of the cameras.
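A reader for that common 10-byte object header might look like the Python sketch below. The sentinel value `0xABCD` is invented for the demo, and two points are assumptions the post does not settle: little-endian byte order, and the length field counting only the bytes that follow the header.

```python
import io
import struct

# 2-byte sentinel, 4-byte length, 4-byte level-specific number (byte order assumed).
OBJECT_HEADER = struct.Struct("<HII")


def read_object_header(stream, expected_sentinel):
    """Read one object header and return (length, number)."""
    raw = stream.read(OBJECT_HEADER.size)
    if len(raw) != OBJECT_HEADER.size:
        raise EOFError("truncated object header")
    sentinel, length, number = OBJECT_HEADER.unpack(raw)
    if sentinel != expected_sentinel:
        raise ValueError(f"unexpected sentinel 0x{sentinel:04X}")
    return length, number


def skip_object(stream, length):
    """Jump over an object's payload (ASSUMES length excludes the header)."""
    stream.seek(length, io.SEEK_CUR)


DEMO_SENTINEL = 0xABCD  # made-up value, not a real sentinel from the format
stream = io.BytesIO(
    OBJECT_HEADER.pack(DEMO_SENTINEL, 3, 7) + b"xyz"
    + OBJECT_HEADER.pack(DEMO_SENTINEL, 0, 2)
)
first = read_object_header(stream, DEMO_SENTINEL)   # (3, 7)
skip_object(stream, first[0])
second = read_object_header(stream, DEMO_SENTINEL)  # (0, 2)
```

Checking the sentinel at every level is a cheap way to notice that a guessed field width is wrong, since the next header will not line up.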

A bit of indirection below the camera subframe, and we come to the meat of the file: the grayscale image data. Each subframe specifies how many bytes long it is, and how many 'blobs' it contains. Blobs are just contiguous regions of non-zero pixels in the grayscale image. Each blob then specifies how many bytes it contains, and how many horizontal scan lines of grayscale data it is made up of. Each line specifies the X- and Y-coordinates of its left-most pixel and how many pixels long it is, followed by the grayscale data itself. Using the file read and write commands makes traversing this hierarchy simpler, because the file pointer helps to keep track of where you are.
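The scan-line level can be sketched in Python as follows. The field widths are assumptions (2 bytes each for X, Y, and pixel count, 1 byte per pixel, little-endian); the post gives only the order of the fields. The dict-based raster is a stand-in for a sparse matrix, not the code from the archive.

```python
import struct

# ASSUMED widths: x of left-most pixel, y, pixel count, 2 bytes each.
LINE_HEADER = struct.Struct("<HHH")


def parse_scan_lines(payload, n_lines):
    """Split a blob payload into a list of (x_left, y, pixel_bytes) tuples."""
    lines = []
    pos = 0
    for _ in range(n_lines):
        x_left, y, n_pixels = LINE_HEADER.unpack_from(payload, pos)
        pos += LINE_HEADER.size
        pixels = bytes(payload[pos:pos + n_pixels])
        if len(pixels) != n_pixels:
            raise ValueError("scan line runs past the end of the blob")
        lines.append((x_left, y, pixels))
        pos += n_pixels
    return lines


def rasterize(lines):
    """Map (x, y) -> gray value for every pixel in the given scan lines."""
    raster = {}
    for x_left, y, pixels in lines:
        for i, value in enumerate(pixels):
            raster[(x_left + i, y)] = value
    return raster


# Synthetic blob: a 3-pixel line at y=2 and a 1-pixel line at y=3.
payload = (LINE_HEADER.pack(5, 2, 3) + bytes([200, 255, 210])
           + LINE_HEADER.pack(6, 3, 1) + bytes([180]))
raster = rasterize(parse_scan_lines(payload, 2))
```

Storing only the runs of non-zero pixels is what keeps the files small: most of each camera image is black.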

I keep things vague, because I don't want to ruin the fun for anyone else, and because I am a coward.

The Goods

Using my detailed notes on the structure of the data file and its many headers and start sentinel codes, I implemented several useful functions to make it possible to quickly and painlessly re-mask my data. These functions are included in this .zip archive; the files are MATLAB m-files and use mostly generic, easily-ported syntax. One of the functions uses the MATLAB sparse matrix data type; I have kept the non-sparse version of the code in the comments, so porting should be straightforward.

I reiterate from above: DO NOT USE THIS UNLESS YOU KNOW WHAT YOU ARE DOING. This is in no way endorsed by Vicon. Always make a backup. Et cetera. Contact me if you have concerns and absolutely need to re-mask some bad Vicon data. I post this only in the spirit of giving, in the hope that someone in the future, in the same spot that I was in a week ago, will be helped by my efforts.

The functions are as follows:

extractFrameIndices: This function extracts the offsets for all of the frames in the record.
makeSparseFrameRaster: This function extracts a specified frame from the record, for viewing.
remaskInPlace: This function dances through the record, zeroing out every pixel that the user's masking function selects.
testFunc: An example masking function that I used to test this out (also, coincidentally, exactly the masking function that I needed applied to my data).

The re-masking function is designed to be as flexible as possible; the form of the masking can be as complicated as the user desires, since a function handle (MATLAB's version of a function pointer) to the masking function is passed to remaskInPlace, rather than parameters defining a restricted domain of masks. The masking function receives as inputs from the calling function the X- and Y-location of the pixels in question as well as the index of the current frame, allowing the masks to be functions of time as well as space. Additionally, user-specified parameters can be passed transparently through remaskInPlace to the masking function.
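The callback design can be illustrated with a short Python sketch. It is not the MATLAB remaskInPlace from the archive, whose exact signature is not given here; `remask_lines`, `moving_strip`, and the in-memory scan-line representation are all hypothetical names and choices made for the example.

```python
def remask_lines(lines, frame_index, mask_func, *mask_args):
    """Zero every pixel for which mask_func(x, y, frame_index, *mask_args) is true.

    lines: list of (x_left, y, pixels), with pixels a mutable bytearray.
    Extra arguments are passed straight through to mask_func.
    Returns the number of pixels that were zeroed.
    """
    zeroed = 0
    for x_left, y, pixels in lines:
        for i in range(len(pixels)):
            if pixels[i] and mask_func(x_left + i, y, frame_index, *mask_args):
                pixels[i] = 0
                zeroed += 1
    return zeroed


def moving_strip(x, y, frame_index, x_start, width, speed):
    """Hypothetical time-varying mask: a vertical strip that drifts right."""
    left = x_start + speed * frame_index
    return left <= x < left + width


lines = [(10, 4, bytearray([200, 210, 220, 230]))]
# Frame 0: the strip covers x = 10..11, so the first two pixels are zeroed.
n_first = remask_lines(lines, 0, moving_strip, 10, 2, 1)
# Frame 2: the strip has drifted to x = 12..13 and removes the rest.
n_second = remask_lines(lines, 2, moving_strip, 10, 2, 1)
```

Because pixels are overwritten rather than removed, every length field and index offset stays as it was, which is presumably what makes an in-place edit safe.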

Friday, March 30, 2012
