Parsing Superbase SBF files

Monday, March 27th, 2006 at 11:58 am

For those of you who read my posts ages ago about parsing Superbase files with Perl (Parsing Progress and Parsing Binary data with Perl), here’s an update.

I finally found some documentation of the SBF file format. It has made the work a lot easier.

I had realised before that the data was split into blocks in multiples of 128 bytes, but I didn’t realise that this block size was specified in the document info at the beginning of the file (I had no idea what that data was previously). My previous attempts at parsing relied on 0x80 appearing at the start of new records. Unfortunately, it also sometimes appeared as part of the block data, so any attempt to split the data that way resulted in some records being cut in half when they shouldn’t have been.
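To make that failure concrete, here is a small sketch in Python (not the Perl I actually used). The two "records" are made-up bytes, not real SBF data, and `naive_split` is a hypothetical helper that only mimics the old heuristic of starting a new record at every 0x80 byte.

```python
def naive_split(data):
    """Start a new chunk at every 0x80 byte (the flawed heuristic)."""
    chunks = []
    start = 0
    for pos in range(1, len(data)):
        if data[pos] == 0x80:
            chunks.append(data[start:pos])
            start = pos
    chunks.append(data[start:])
    return chunks


# Made-up sample bytes: each record starts with 0x80, but the first one
# also happens to contain 0x80 inside its data.
record_a = b"\x80\x00\x00\x00" + b"name=caf\x80!"
record_b = b"\x80\x00\x00\x00" + b"name=tea"

chunks = naive_split(record_a + record_b)
print(len(chunks))  # 3 chunks from 2 records: record_a was cut in half
```

Any byte value that can legally appear in the data is unsafe as a record separator, which is why the fixed block size from the file info is the more reliable way to find block boundaries.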

The new script reads the first 60 bytes of the file to get the file info (such as block size) and then reads the file in block by block. The first 4 bytes of each block are a pointer to the next block in the record, along with some bit flags showing whether the block is the first block of a record (this was what I saw as 0x80); the rest of the block is data.
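The block-by-block approach can be sketched roughly as follows, in Python rather than Perl. This is only an illustration under several assumptions that are not taken from the format documentation: the 4-byte value is big-endian, its top bit is the "first block of a record" flag, the remaining 31 bits are the index of the next block (counting from the first block after the header), and 0 means the record ends here. The position of the block size inside the 60-byte file info isn't covered here, so `block_size` is simply passed in. The names `parse_blocks` and `read_records` are hypothetical.

```python
import struct

HEADER_SIZE = 60                # from the post: file info is the first 60 bytes
FIRST_BLOCK_FLAG = 0x80000000   # ASSUMPTION: top bit marks the first block of a record
POINTER_MASK = 0x7FFFFFFF       # ASSUMPTION: remaining bits are the next-block index
END_OF_CHAIN = 0                # ASSUMPTION: pointer 0 means no further block


def parse_blocks(data, block_size):
    """Split everything after the header into (is_first, next_index, payload)."""
    body = data[HEADER_SIZE:]
    blocks = []
    for offset in range(0, len(body) - block_size + 1, block_size):
        block = body[offset:offset + block_size]
        (word,) = struct.unpack(">I", block[:4])  # ASSUMPTION: big-endian
        blocks.append((bool(word & FIRST_BLOCK_FLAG), word & POINTER_MASK, block[4:]))
    return blocks


def read_records(data, block_size):
    """Follow the next-block pointers from each first block and join the payloads."""
    blocks = parse_blocks(data, block_size)
    records = []
    for index, (is_first, next_index, payload) in enumerate(blocks):
        if not is_first:
            continue
        parts = [payload]
        seen = {index}  # guard against pointer loops in damaged files
        while (next_index != END_OF_CHAIN
               and next_index < len(blocks)
               and next_index not in seen):
            seen.add(next_index)
            _, following, more = blocks[next_index]
            parts.append(more)
            next_index = following
        records.append(b"".join(parts))
    return records
```

Because block boundaries come from the block size and not from the contents, a 0x80 byte inside the data no longer matters. The joined payloads still include whatever padding fills the last block of each record; trimming that depends on details of the format not covered here.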

After struggling to extract data from an unknown binary file format, I now realise how important open and well-documented file formats are.

3 Responses to “Parsing Superbase SBF files”

  1. Dave M Says:

    Hey –

    Just found your blog on parsing superbase files. Nice job.

    Sadly, I, too, need to extract data from a bunch of superbase files — even slightly more sad, I am attempting to use windoze tools to do the dirty deeds. =o[

    The link to the sbf file format redirects to a wiki page that does not contain any info about superbase files. Would you be able to post up links/refs to the superbase info you were able to find? (and possibly any other useful sb links too? I need to extract an entire sb database, not just specific sbf files data)

    thanks, and congrats on extracting your data from the sbf’s.
    Dave <— total newb to SB databases, files, formats

  2. dave Says:

    after digging around the non-functioning website ‘http://isrc.interapps.com’, I found the direct link to the dox. here’s the working link for sbf files:
    http://isrc.interapps.com/file_formats/hh_start.htm

    dave

    are there any other sbf “insider info” type sites? My googles are coming up blank. Is there an odbc driver (or any sort of driver) for superbase files? That would make extracting jobs most flexible.

  3. Gem Says:

    Source code can be found here.

    http://gemmapeter.co.uk/projects/parse_sbf/