Are you able to know in advance, for a given input file, what sort of format it contains (e.g. based on the file name or which directory it's in)? Or does your task include discovering what format is being used in each file, and then parsing it accordingly?
Assuming that each of the input files is pretty small, I would "slurp" the full file into a single scalar variable rather than read it line-by-line:

```perl
$_ = '';
if ( open( my $in, "<", $file )) {
    local $/;        # undef the input record separator: slurp the whole file
    $_ = <$in>;
    close $in;
}
if ( ! $_ ) {
    warn "No data found in $file\n";
    next;
}
# now it will be easier to categorize/parse $_ ...
```

Starting like that, you should be able to create a suitable subroutine for each file type, such that the sub returns a list or hash structure that will go directly into the desired XML output. The sub could be just a regex match, or something more complicated (a series of matches, and/or a split on "\n", and/or a Text::xSV parse, or whatever).
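As a sketch of what one of those per-format subs might look like (the sub name and the colon-delimited layout here are just assumptions for illustration, not your actual format):

```perl
use strict;
use warnings;

# Hypothetical parser for a simple "key: value" format. Each such sub
# takes the slurped file content and returns a hash reference that can
# feed directly into the XML-writing stage.
sub parse_colon_pairs {
    my ($content) = @_;
    my %record;
    for my $line ( split /\n/, $content ) {
        next unless $line =~ /^(\w+)\s*:\s*(.*)$/;
        $record{$1} = $2;
    }
    return \%record;
}

my $data = parse_colon_pairs("name: widget\nqty: 42\n");
print "$data->{name} $data->{qty}\n";   # prints "widget 42"
```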
As for your code, I strongly recommend that you start with use strict; (and use warnings;), and you should add a lot more error checking on things like opening files and doing chdir.
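A minimal sketch of what that error checking looks like ($0, this script itself, is just a stand-in for one of your input files):

```perl
use strict;
use warnings;

# Check every system call that can fail, and report *why* it failed
# via $! instead of silently carrying on with no data.
chdir '.'
    or die "Can't chdir: $!\n";
open( my $in, '<', $0 )
    or die "Can't open $0: $!\n";
print "opened OK\n";
close $in
    or die "Can't close $0: $!\n";
```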
I think your handling of the config file might not provide the right sort of flexibility. If each run only applies to a particular "scanpath" directory, and all such directories are always organized the same way, and only contain files of a particular type, then sure, your approach would be workable. But don't you want a single process that can be run once and cover all types of input, rather than having to run it several times with a different config file each time?
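One way to get that single-run flexibility is a dispatch table that maps a filename pattern to the sub that knows how to parse it (the sub names and the extension-to-parser mapping below are purely hypothetical):

```perl
use strict;
use warnings;

# Hypothetical per-format parsers; each returns a hash ref.
sub parse_csv   { my ($c) = @_; return { type => 'csv',   raw => $c } }
sub parse_pairs { my ($c) = @_; return { type => 'pairs', raw => $c } }

# One table instead of one config file per run: the first pattern
# that matches the filename decides which parser handles the content.
my @parser_for = (
    [ qr/\.csv$/, \&parse_csv   ],
    [ qr/\.txt$/, \&parse_pairs ],
);

sub parse_file_content {
    my ($name, $content) = @_;
    for my $entry (@parser_for) {
        my ($pat, $sub) = @$entry;
        return $sub->($content) if $name =~ $pat;
    }
    warn "Don't know how to parse $name\n";
    return undef;
}

my $rec = parse_file_content( 'data.csv', "a,b,c\n" );
print $rec->{type}, "\n";   # prints "csv"
```

With that in place, adding support for a new input format is just one new sub plus one new table entry; the main loop never changes.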
In reply to Re: Perl Parser to Handle Any File Format
by graff
in thread Perl Parser to Handle Any File Format
by Anonymous Monk