note
radiantmatrix
<p>I've designed a number of file formats as well, and there are two pieces of advice I'd like to add to your excellent list:<ol>
<li><b>Explicitly specify your escape methodology</b>: if you are creating a CSV file, how will a comma in the data be escaped?
<li><b>If possible, use record and unit separators that are unlikely to exist in your data</b>: for example, I like to use the ASCII chars <tt>\x1E\x0A</tt> ("Record Separator" + newline) and <tt>\x1F</tt> ("Unit Separator") to separate records and elements, respectively. These are unlikely to appear in text data (unlike commas, tabs, etc.), which reduces the complexity of the escaping strategy you will need.
</ol>
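<p>To make the first point concrete, here is a minimal sketch of one escaping rule you might write down for a CSV format: wrap a field in double quotes when it contains a comma, a quote, or a line break, and double any embedded quotes. The helper name <tt>csv_escape</tt> is made up for illustration; this is only one common convention, not the only valid one.</p>
<code>
use strict;
use warnings;

# Hypothetical helper: quote one field using a common CSV convention
# (wrap in double quotes, double any embedded double quotes).
sub csv_escape {
    my ($field) = @_;
    if ($field =~ /[",\r\n]/) {
        (my $quoted = $field) =~ s/"/""/g;
        return qq{"$quoted"};
    }
    return $field;    # nothing special in the field, emit it as-is
}

# prints: "Smith, John","said ""hi""",plain
print join(',', map { csv_escape($_) } 'Smith, John', 'said "hi"', 'plain'), "\n";
</code>
<p>Whatever rule you pick, the point is that it is written into the format spec, so readers and writers agree on it.</p>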
<p>In many cases, combining these two pieces of advice reduces the escaping strategy to a single rule: "the record-separator and element-separator chars are not allowed in text data". This means you can use code like:
<code>
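open my $F_data, '<', 'filename.dat' or die("bad open: $!");
local $/ = "\x1E\x0A";              # input record separator ($/, not $\): RS + newline
while (<$F_data>) {
    chomp;                          # strip the trailing record separator
    my @row = split("\x1F", $_);    # split the record on the unit separator
    process(\@row);
}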
</code>
instead of relying on (admittedly excellent) modules like [cpan://Text::CSV_XS]. Using these chars simplifies one's life tremendously!</p>
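<p>The writing side is just as short. Below is a rough sketch, assuming the "separators are not allowed in data" rule: the writer refuses any field containing <tt>\x1E</tt> or <tt>\x1F</tt>, and the reader splits on <tt>\x1F</tt>. The names <tt>format_record</tt> and <tt>parse_record</tt> are hypothetical helpers, not part of any module.</p>
<code>
use strict;
use warnings;

my $RS = "\x1E\x0A";   # record separator + newline
my $US = "\x1F";       # unit separator

# Hypothetical helper: serialize one row, refusing data that contains
# either separator character (the "not allowed" escaping strategy).
sub format_record {
    my @fields = @_;
    for my $f (@fields) {
        die "field contains a separator character\n" if $f =~ /[\x1E\x1F]/;
    }
    return join($US, @fields) . $RS;
}

# Hypothetical helper: the inverse, for one record as read with $/ = $RS.
sub parse_record {
    my ($record) = @_;
    $record =~ s/\Q$RS\E\z//;            # drop the trailing separator, if any
    return split /\x1F/, $record, -1;    # -1 keeps trailing empty fields
}

# Commas and embedded newlines need no escaping at all.
my $line = format_record('Smith, John', "two\nlines", '');
my @row  = parse_record($line);
print scalar(@row), " fields\n";         # prints: 3 fields
</code>
<p>Note the <tt>-1</tt> limit on <tt>split</tt>: without it, Perl silently drops trailing empty fields, which is rarely what you want in a data file.</p>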
<div class="pmsig"><div class="pmsig-375088">
<small>
<small><-</small><b>radiant</b>.<b>matrix</b><small>-></small>
<br>Larry Wall is Yoda: there is no <tt>try{}</tt> (ok, except in Perl6; way to ruin a joke, Larry! ;P)
<br><em>The Code that can be seen is not the true Code</em>
<br><em>"In any sufficiently large group of people, most are idiots"</em> - <strong>Kaa's Law</strong>
</small>
</div></div>
491321
491321