Re: Thoughts on designing a file format.

by radiantmatrix (Parson)
on Sep 14, 2005 at 17:17 UTC


in reply to Thoughts on designing a file format.

I've done a number of file formats as well, and there are two pieces of advice I'd like to add to your excellent list:

  1. Explicitly specify your escape methodology: if you are creating a CSV file, how will a comma in the data be escaped?
  2. If possible, use record and unit separators that are unlikely to exist in your data: for example, I like to use the ASCII chars \x1E\x0A ("Record Separator" + newline) and \x1F ("Unit Separator") to separate records and elements, respectively. These are unlikely to appear in text data (unlike commas, tabs, etc.) and reduce the complexity of the escaping strategy that will be required.

In many cases, combining these can result in "the record-separator and element-separator chars are not allowed in text data" as an escaping strategy. This means you can use code like:

    open my $F_data, '<', 'filename.dat' or die("bad open: $!");
    local $/ = "\x1E\x0A";   # input record separator ($/, not $\)
    while (<$F_data>) {
        chomp;               # strip the trailing "\x1E\x0A"
        my @row = split(/\x1F/, $_);
        process(\@row);
    }
instead of relying on (admittedly excellent) modules like Text::CSV_XS. Using these chars tremendously simplifies one's life!
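
For the writing side, here is a minimal sketch of the "separator chars are not allowed in text data" rule, assuming the same \x1E\x0A and \x1F conventions as above. The helper names encode_record and decode_record are made up for illustration; they are not from any module.

```perl
use strict;
use warnings;

my $RS = "\x1E\x0A";   # record terminator: ASCII RS + newline
my $US = "\x1F";       # element separator: ASCII US

# Join one row into a record, refusing data that contains a separator char.
# (Hypothetical helper: rejecting is the whole "escaping strategy" here.)
sub encode_record {
    my @fields = @_;
    for my $field (@fields) {
        die "field contains a reserved separator character\n"
            if $field =~ /[\x1E\x1F]/;
    }
    return join($US, @fields) . $RS;
}

# Split one record back into elements; the -1 limit keeps trailing empty ones.
sub decode_record {
    my ($record) = @_;
    $record =~ s/\Q$RS\E\z//;   # drop the record terminator if present
    return split /\x1F/, $record, -1;
}

my $record = encode_record('Smith, John', "tab\there", '');
my @fields = decode_record($record);
print scalar(@fields), " fields; first is '$fields[0]'\n";
```

Commas, tabs and even bare newlines pass through untouched, since only \x1E and \x1F are reserved. Whether to die, strip, or substitute when a reserved char does show up is a design choice the format spec should state explicitly.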

<-radiant.matrix->
Larry Wall is Yoda: there is no try{} (ok, except in Perl6; way to ruin a joke, Larry! ;P)
The Code that can be seen is not the true Code
"In any sufficiently large group of people, most are idiots" - Kaa's Law
