I have a binary file with "records" of variable lengths. Each record ends with a '\n'.
This makes no sense. Let's say that your records consist of a number of unsigned longs; it doesn't matter how many. When these are packed to their 4-byte binary representations, there are ~67,000,000 legitimate values where one or more bytes of the four will be "\n" (0x0A). These include the numbers 10, 266, 522, 778, 1034, 1290, 1546, 1802, 2058, 2314, 2560, 2561, 2562, 2563, 2564, 2565, 2566, 2567, 2568 ...
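For anyone who wants to check that figure, here is a rough sketch of the arithmetic. It assumes 32-bit unsigned values packed with 'N' (big-endian); the byte order only changes where the newline byte lands, not whether one is present. The variable names are just for illustration.

```perl
use strict;
use warnings;

# Each byte has 255 values that are not 0x0A, so 255**4 of the 2**32
# possible 4-byte patterns contain no newline byte at all.
my $with_newline = 2**32 - 255**4;

# List the affected values in 0..2568 by packing each one and
# looking for a "\n" byte in the result.
my @affected = grep { index(pack('N', $_), "\n") >= 0 } 0 .. 2568;

print "$with_newline\n";
print "@affected\n";
```

The first line printed is the exact count behind the "~67,000,000" above; the second reproduces the list of values.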
When you try to read a record containing one of these values back expecting a "\n" as the delimiter, the IO code will encounter the "\n" embedded within one of your binary values and truncate the record.
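A minimal sketch of that truncation, assuming a record of three unsigned longs (1, 10, 3, picked purely for illustration) and using an in-memory filehandle as a stand-in for the real data file:

```perl
use strict;
use warnings;

# One record holding three unsigned longs, terminated by "\n".
# The middle value, 10, packs to "\x00\x00\x00\x0A".
my $record = pack('N*', 1, 10, 3) . "\n";

# Write the record to an in-memory "file".
my $file = '';
open my $out, '>', \$file or die "open for write: $!";
binmode $out;
print {$out} $record;
close $out;

# Read it back the usual line-oriented way.
open my $in, '<', \$file or die "open for read: $!";
binmode $in;
my $line = <$in>;    # stops at the first "\n", which sits inside the value 10
close $in;

chomp $line;
my @values = unpack 'N*', $line;    # only whole 4-byte groups are unpacked

printf "wrote %d bytes, got %d bytes back, values: %s\n",
    length($record), length($line), "@values";
```

Thirteen bytes go out, but only the first seven survive the readline and chomp, so only the first value can be unpacked.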
And you will have the same problem with all types of packed binary numbers: shorts and longs, signed and unsigned, and floating point as well.
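A few hand-picked examples of other packed types that happen to contain a newline byte. This is only a sketch: the values are chosen for illustration, the '>' modifiers force big-endian output, and the float and double cases assume IEEE 754 floating point.

```perl
use strict;
use warnings;

# Label => packed bytes; the expected hex is noted alongside each.
my %packed = (
    'unsigned short 10' => pack('n',  10),      # 00 0A
    'signed long -246'  => pack('l>', -246),    # FF FF FF 0A
    'float 8.625'       => pack('f>', 8.625),   # 41 0A 00 00
    'double 3.25'       => pack('d>', 3.25),    # 40 0A 00 00 00 00 00 00
);

for my $label (sort keys %packed) {
    my $hex = join ' ', map { sprintf '%02X', $_ } unpack 'C*', $packed{$label};
    my $hit = index($packed{$label}, "\n") >= 0 ? 'contains' : 'has no';
    print "$label => $hex ($hit newline byte)\n";
}
```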
So, if you are writing raw binary values to a file and trying to use "\n" as a delimiter--or any other character--you will not be able to read that data back successfully. Period.
If you are not writing raw binary values to the file, then you are going to have to clarify what you mean by "binary data", because your question as asked simply doesn't make any sense.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
How many edits per file? If it's a one-off, then doing as you suggest is probably optimal. If there are multiple edits per file, but it is otherwise a one-off, then order the edits by file position, run through the source file, and output to an edited version, applying the edits as you go. If this is an ongoing issue, use a database (DBI and DBD::SQLite is pretty much a no-brainer to get going).
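The ordered-edits pass could be sketched roughly as follows. The edit format (byte offset, number of bytes to replace, replacement bytes) and the sub names apply_edits and copy_bytes are made up for illustration, and in-memory filehandles stand in for the real source and output files.

```perl
use strict;
use warnings;

# Copy exactly $count bytes from $in to $out, in chunks.
sub copy_bytes {
    my ($in, $out, $count) = @_;
    while ($count > 0) {
        my $want = $count < 65536 ? $count : 65536;
        my $got  = read $in, my $chunk, $want;
        die "unexpected end of input\n" unless $got;
        print {$out} $chunk;
        $count -= $got;
    }
}

# Hypothetical edit format: [ byte offset, bytes to replace, replacement ].
sub apply_edits {
    my ($in, $out, @edits) = @_;
    my $pos = 0;
    for my $edit (sort { $a->[0] <=> $b->[0] } @edits) {
        my ($offset, $length, $replacement) = @$edit;
        die "overlapping edit at offset $offset\n" if $offset < $pos;
        copy_bytes($in, $out, $offset - $pos);    # untouched bytes before the edit
        my $skipped = read $in, my $discard, $length;
        die "edit runs past end of input\n"
            unless defined $skipped && $skipped == $length;
        print {$out} $replacement;
        $pos = $offset + $length;
    }
    # Copy whatever follows the last edit.
    while (read $in, my $chunk, 65536) { print {$out} $chunk }
}

my $source = 'AAAA....BBBB....CCCC';
my $edited = '';
open my $in,  '<', \$source or die "open source: $!";
open my $out, '>', \$edited or die "open output: $!";
binmode $_ for $in, $out;
apply_edits($in, $out, [ 16, 4, 'cc' ], [ 0, 4, 'aaaaaa' ]);
close $out;
print "$edited\n";
```

Because the edits are sorted first and the source is only read forwards, replacements can be longer or shorter than what they replace without any seeking back.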
DWIM is Perl's answer to Gödel
Tie::File maybe? I can't say for sure since I still don't know all the details.
Slightly off-topic: For variable-length binary files, you should consider having each record start with something fairly unique (like an SOH) followed by the length of the record.
This doesn't answer your question, but it will make it easier to determine that you're processing a valid record. (One organism's record break is another's data.)
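A minimal sketch of such a layout, assuming a one-byte SOH marker followed by the payload length as a 32-bit big-endian integer. The header layout and the sub names write_record and read_record are my own choices for illustration, not anything from the original question.

```perl
use strict;
use warnings;

use constant SOH => "\x01";

# Header layout (an assumption): 1 marker byte, then the payload
# length as a 32-bit unsigned big-endian integer.
sub write_record {
    my ($fh, $payload) = @_;
    print {$fh} SOH, pack('N', length $payload), $payload;
}

# Returns the next payload, or undef (in scalar context) at end of file.
sub read_record {
    my ($fh) = @_;
    my $got = read $fh, my $header, 5;
    return unless $got;                       # end of file
    die "truncated header\n" if $got != 5;
    my ($marker, $length) = unpack 'a1 N', $header;
    die "bad record marker\n" if $marker ne SOH;
    my $payload = '';
    $got = read $fh, $payload, $length;
    die "truncated record\n" unless defined $got && $got == $length;
    return $payload;
}

# Payloads may now contain any bytes at all, including "\n" and SOH.
my @records = (pack('N*', 1, 10, 3), "line one\nline two", '');

my $file = '';
open my $out, '>', \$file or die "open for write: $!";
binmode $out;
write_record($out, $_) for @records;
close $out;

open my $in, '<', \$file or die "open for read: $!";
binmode $in;
my @back;
while (defined(my $payload = read_record($in))) {
    push @back, $payload;
}
close $in;

print scalar(@back), " records read back\n";
```

The reader never scans for a delimiter, so embedded newlines are harmless; the marker byte is only a sanity check that the read position really is at the start of a record.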