Re: framework for data parsing

To begin with your concrete example ...

The specification looks something like -- fieldA is from character 9 to 14

my $TEMPLATE = '@8A6'; # oops, originally posted without the quotes  
while (<DATA>) {
    my @fields = unpack $TEMPLATE, $_;
 
    # for output try pack or printf
}

This generalizes. The template for unpack should be machine-generated, to avoid off-by-one errors and other typos.
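As a small illustration of generating the template instead of typing it, here is a minimal sketch. It assumes the specification gives 1-based, inclusive character positions, as in "character 9 to 14"; the helper name template_for and the sample line are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 1-based, inclusive character positions
# into a fragment of an unpack template.
sub template_for {
    my ($first, $last) = @_;
    my $offset = $first - 1;            # '@' expects a 0-based offset
    my $width  = $last - $first + 1;    # the range is inclusive
    return "\@${offset}A${width}";
}

my $template = template_for(9, 14);     # yields '@8A6'

# Made-up sample record: 'fieldA' occupies characters 9 to 14.
my $line = '01234567fieldA and more';
my ($field_a) = unpack $template, $line;

print "$template => '$field_a'\n";
```

Doing the arithmetic in one place means an off-by-one mistake only has to be fixed once, not in every hand-written template.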

In what follows, let's suppose you have collected a list of column specifications. Each specification tells you

a field name,
an offset, and
the width of the field in your fixed-width extract.

You might get this from a config file of some sort, or as the result set from a database query, if you happen to have saved your parse specifications in a database table.

use DBI;
my $dbh = ...
my $sth = $dbh->prepare(
    'SELECT field, offset, width'
    . ' FROM Source_Field'
    . ' WHERE source = ?;'
);
my $source = 'input_file.txt';
$sth->execute($source);
my $template;
my @fields;
while (my $column_spec = $sth->fetchrow_hashref() ) {
    my ($field, $offset, $width) 
        = @$column_spec{qw(field offset width)};
    $template .= "\@${offset}A$width";
    push @fields, $field;
}
open my $reader, '<', $source
    or die "Cannot open $source: $!";
while (<$reader>) {
    my %value_of;
    my @values = unpack($template, $_);
    @value_of{@fields} = @values;

    # you've got your current record in a hash
    # print it or save it somewhere
}
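If you want to try the idea without a database, the following is a self-contained sketch under stated assumptions: the column specifications and the two sample records are invented, and the specs sit in an in-memory array standing in for whatever your config file or query returns. It also shows the output direction with pack, driven by the same template.

```perl
use strict;
use warnings;

# Hypothetical column specifications (0-based offsets), standing in
# for rows read from a config file or a database table.
my @specs = (
    { field => 'id',   offset => 0,  width => 4  },
    { field => 'name', offset => 4,  width => 10 },
    { field => 'city', offset => 14, width => 8  },
);

# Build the unpack template and the parallel list of field names.
my $template = join '', map { "\@$_->{offset}A$_->{width}" } @specs;
my @fields   = map { $_->{field} } @specs;

# Made-up fixed-width records matching the specs above.
my @lines = (
    '0001Alice     Santiago',
    '0002Bob       Valencia',
);

my @records;
for my $line (@lines) {
    my %value_of;
    @value_of{@fields} = unpack $template, $line;
    push @records, \%value_of;
}

# Writing goes the other way: the same template drives pack,
# which pads each 'A' field with spaces to its width.
my @output = map { pack $template, @{$_}{@fields} } @records;
print "$_\n" for @output;
```

One caveat with this design: when pack has to skip forward to an '@' position, it fills the gap with null bytes rather than spaces, so reusing the template for output works best when the fields are contiguous, as they are here.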


Re^2: framework for data parsing by ikegami (Patriarch) on Jun 20, 2008 at 04:30 UTC
Bareword found where operator expected at 693061.pl line 1, near "@8A6"
        (Missing operator before A6?)
syntax error at 693061.pl line 1, near "@8A6"
Execution of 693061.pl aborted due to compilation errors.

(I'm paraphrasing your earlier reply. Normally, I would just have sent a message for such a small oversight.)
