For those of you who read my posts ages ago about parsing Superbase files with Perl (Parsing Progress and Parsing Binary data with Perl), here’s an update.
I finally found some documentation of the SBF file format. It has made the work a lot easier.
I had realised before that the data was split into blocks sized in multiples of 128 bytes, but I didn’t realise that the block size was specified in the document info at the beginning of the file (I had no idea what that data was previously). My previous attempts at parsing relied on 0x80 appearing at the start of new records. Unfortunately, it also sometimes appeared as part of the block data, so any attempt to split the data that way cut some records in half when it shouldn’t have.
The new script reads the first 60 bytes of the file to get the file info (such as the block size) and then reads the file in block by block. The first 4 bytes of each block are a pointer to the next block in the record, along with some bit flags showing whether the block is the first block of a record (this was what I saw as 0x80). The rest of the block is data.
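To give a rough idea of the approach, here is a minimal sketch rather than my actual script. The 60-byte header and the 4-byte block prefix come from the description above, but the details are assumptions made up for illustration: the offset and width of the block size field, the byte order, and which bits hold the flag and the pointer (the constants BLOCK_SIZE_OFFSET, FIRST_BLOCK_FLAG and POINTER_MASK, and the sub name read_blocks, are all hypothetical). Check them against the format documentation before relying on them.

```perl
use strict;
use warnings;

use constant HEADER_SIZE       => 60;           # from the description above
use constant BLOCK_SIZE_OFFSET => 8;            # ASSUMPTION: where the block size lives
use constant FIRST_BLOCK_FLAG  => 0x80000000;   # ASSUMPTION: top bit = first block of a record
use constant POINTER_MASK      => 0x00FFFFFF;   # ASSUMPTION: low 24 bits = next block pointer

# Read the header, then return the block size and a list of parsed blocks.
sub read_blocks {
    my ($fh) = @_;
    binmode $fh;

    my $got = read($fh, my $header, HEADER_SIZE);
    die "Short header\n" unless defined $got && $got == HEADER_SIZE;

    # ASSUMPTION: block size stored as a 16-bit little-endian integer.
    my $block_size = unpack 'v', substr($header, BLOCK_SIZE_OFFSET, 2);
    die "Implausible block size\n" if $block_size <= 4;

    my @blocks;
    while (1) {
        my $n = read($fh, my $block, $block_size);
        die "Read error: $!\n" unless defined $n;
        last if $n < $block_size;    # end of file, or a trailing partial block

        # ASSUMPTION: the 4-byte prefix is big-endian; the rest is data.
        my ($prefix, $data) = unpack 'N a*', $block;
        push @blocks, {
            is_first => ($prefix & FIRST_BLOCK_FLAG) ? 1 : 0,
            next     => $prefix & POINTER_MASK,
            data     => $data,
        };
    }
    return ($block_size, \@blocks);
}
```

Because the file is read in fixed-size blocks and only the first 4 bytes of each block are inspected, a stray 0x80 byte inside the data can no longer be mistaken for the start of a record.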
After struggling to extract data from an unknown binary file format, I now realise how important open and well-documented file formats are.