harleypig has asked for the wisdom of the Perl Monks concerning the following question:
I have a really nasty legacy flat file I'm trying to parse. I can convert the file, but right now my employer doesn't want to do this.
It appears that DBD::CSV or Text::CSV_XS (I'm thinking the latter, actually) will only support a single-character field separator and escape character. The current field separator is ' _ ' (that's space underscore space), with _UNSC_ being the escape string (the original code really just replaces _ with _UNSC_, but it amounts to the same thing).
Any pointers on making this work? Do I need to replace Text::CSV_XS with my own module (please, no) or can I make this work somehow?
Upon doing a quick search, it appears that Text::xSV is a pure-Perl solution, but it has a hard-coded limitation that restricts the separator to one character as well. I'll see if I can figure out why, but my time is limited--my boss wants this soonest. If I can't make this work then I'll have to go back to the old code *shudder*
Update: It appears I wasn't as clear as I thought I was.
Sample data:
12345 _ value1 _ value2 _ value3 _ ...
12346 _ value1 _ value2 _ value3 _ ...
12347 _ valuewith_UNSC_ _ value2 _ value3 _ ...
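To make the format concrete, here is a minimal sketch of reading and writing one such row in plain Perl, with no CSV module involved. The helper names parse_row and format_row are hypothetical, and the sketch assumes the escape rule is exactly "replace _ with _UNSC_" as described above.

```perl
use strict;
use warnings;

# Hypothetical helpers; the names are illustrative, not from any module.

# Split one legacy row on the three-character separator ' _ ' and
# turn the _UNSC_ escape back into a literal underscore.
sub parse_row {
    my ($line) = @_;
    chomp $line;
    # Limit -1 keeps trailing empty fields instead of dropping them.
    my @fields = split / _ /, $line, -1;
    s/_UNSC_/_/g for @fields;
    return @fields;
}

# Inverse operation: escape underscores, then join with the separator.
sub format_row {
    my @fields = @_;
    s/_/_UNSC_/g for @fields;
    return join ' _ ', @fields;
}

my $line   = '12347 _ valuewith_UNSC_ _ value2 _ value3';
my @fields = parse_row($line);
print join( '|', @fields ), "\n";
print format_row(@fields), "\n";
```

Because an escaped underscore is always followed by "UNSC", an escaped field should never contain the separator itself, which is why a plain split is enough here.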
I can convert this data easily enough to "standard" csv:
if ( open my $FH, "<dbfile" ) {
    my @newrows;
    for my $row ( <$FH> ) {
        chomp $row;
        push @newrows, ( join ',', map { s/"/""/g ; $_ = "\"$_\"" } split / _ /, $row );
    }
    # print @newrows to file
}
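The conversion above leaves the _UNSC_ escape in the output. A variant that also undoes it might look like the sketch below; legacy_to_csv is a hypothetical name, and it assumes every _UNSC_ in the file stands for a literal underscore.

```perl
use strict;
use warnings;

# Hypothetical helper: convert one legacy row to a quoted CSV row.
sub legacy_to_csv {
    my ($row) = @_;
    chomp $row;
    return join ',', map {
        my $field = $_;
        $field =~ s/_UNSC_/_/g;    # undo the legacy underscore escape
        $field =~ s/"/""/g;        # double any embedded double quotes
        qq{"$field"};              # wrap the field in double quotes
    } split / _ /, $row, -1;       # -1 keeps trailing empty fields
}

print legacy_to_csv('12347 _ valuewith_UNSC_ _ value2 _ value3'), "\n";
```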
My ultimate goal is to convince my boss to move to an RDBMS of some kind. He *loves* being able to just open the data file and modify something quickly and easily, and he sees no reason to learn SQL or hide his data in a binary format. So I need to get DBD::CSV working with the original file, so that he can switch between my new code and the live code and see the same data. I can't convince him to change the format; he's used to working with it the way it is. Once I show him how much simpler the code is with DBD::CSV and SQL (albeit simplified via SQL::Statement), I can convince him to move to an RDBMS.