Laurent_R:

I'm dismantling a large (5GB) binary file archive, and the first 36 bytes of each file entry are stuff whose purpose I haven't determined yet. Then come the filename (variable length) and the data. The filename appears to be unicodey and terminated by a 0, so it looks like: (letter, 0, letter, 0, ..., letter, 0, 0, 0). Since the filename is variable length, a regex felt like the simplest tool for pulling it apart.
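To make that layout concrete, here is a minimal sketch of the idea in Python. It is not the code actually in use, and the names `HEADER_LEN`, `NAME_RE` and `split_entry` are made up for illustration. It assumes every filename character fits in one byte followed by a 0 (i.e. plain ASCII stored as UTF-16LE), which is only a guess based on the pattern described above.

```python
import re

# From the description above: 36 bytes of unknown purpose open each entry.
HEADER_LEN = 36

# One or more (non-zero byte, 0x00) pairs, then a 0x00 0x00 terminator.
# Assumption: every filename character has a zero high byte.
NAME_RE = re.compile(rb'((?:[^\x00]\x00)+)\x00\x00')


def split_entry(buf, offset=0):
    """Return (header, filename, data_offset) for the entry at `offset`."""
    name_start = offset + HEADER_LEN
    header = buf[offset:name_start]
    # match() with a position anchors the pattern at name_start.
    m = NAME_RE.match(buf, name_start)
    if m is None:
        raise ValueError("no 0-terminated name at offset %d" % name_start)
    filename = m.group(1).decode('utf-16-le')
    # m.end() is the first byte after the terminator, where the data begins.
    return header, filename, m.end()


if __name__ == '__main__':
    # Synthetic entry: 36 placeholder header bytes, a name, a terminator, data.
    entry = bytes(HEADER_LEN) + 'a.txt'.encode('utf-16-le') + b'\x00\x00' + b'DATA'
    header, filename, data_offset = split_entry(entry)
    print(filename, data_offset, entry[data_offset:])
```

The sketch only finds where the data starts. Where it ends (and the next entry begins) presumably depends on something in those 36 undeciphered bytes, so that part is left open.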

Normally when exploring things like this, I take things apart, and as I find the patterns, I improve the parsing. This file seems to mix binary, Unicode and normal ASCII freely, so I'm still thinking about how best to dismantle it. I also don't know much about the internal structure of the file yet, beyond a very rough overview. I could look it up on the 'net, but I like to figure out as much as I can myself before looking at the answer in the back of the book.

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Re^3: Regex trouble w/ embedded 0s?
by Laurent_R (Canon) on Jul 11, 2014 at 17:31 UTC
    OK, roboticus, thank you for answering, I now understand your context.