in reply to Re: Google has failed me! Using pack and substr for fixed width file output

Hi Ken,

That's a really elegant solution to something I do almost every day especially as I was about to start coding up 119 fields for the data rows :-)

I may be forced to nick the above, thanks very much! Rest assured I shall put a "#Ken" in my code ;-)

Cheers, Jake

Re^3: Google has failed me! Using pack and substr for fixed width file output
by kcott (Archbishop) on Apr 10, 2014 at 01:40 UTC
    "That's a really elegant solution to something I do almost every day especially as I was about to start coding up 119 fields for the data rows :-)"

    It sounds like you really need to take that abstraction one step further and make a module for all your scripts to use. You will already have saved yourself a lot of work by reducing all those hand-crafted substr statements to a single add_field() subroutine. What you want to avoid is copying that function into every new script you write.

    Consider a situation where you find a problem with add_field() or you want to extend its functionality. If you've just pasted copies of add_field() into multiple scripts, you'll need to fix or modify every one of those; if you've used a module, you'll only need to make changes in one place.

    Here's an example of how the code I posted earlier could be put into a module:

        package PM::FixedWidthFile;

        use strict;
        use warnings;
        use autodie;

        use Exporter qw{import};
        our @EXPORT_OK = qw{populate_file};

        use Carp;

        sub populate_file {
            my ($fh, $record_length, $field_data) = @_;

            for my $fields (@$field_data) {
                my $line = pack 'A' . $record_length;
                my $offset = 0;
                _add_field(\$line, \$offset, @$_) for @$fields;
                print {$fh} $line, "\n";
            }

            return;
        }

        sub _add_field {
            my ($line_ref, $offset_ref, $length, $data, $r_align) = @_;

            if ($length < length $data) {
                croak "Data [$data] too large for field of length [$length]";
            }

            my @dat_pad = ($data, ' ' x ($length - length $data));
            substr($$line_ref, $$offset_ref, $length)
                = join '' => @dat_pad[$r_align ? (1, 0) : (0, 1)];
            $$offset_ref += $length;

            return;
        }

        1;

        =head1 NAME

        PM::FixedWidthFile - TODO (for Jake): module documentation in POD form at

    add_field() is now the (pseudo-)private routine _add_field(). I've added an optional, boolean argument ($r_align) to right-align field data. There's also some validation code.
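
    As a rough, standalone illustration of the alignment and validation logic, here is a minimal sketch. The name pad_field() is hypothetical and not part of the module; it uses plain concatenation rather than the array slice shown above, and die rather than croak.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not part of PM::FixedWidthFile): pad $data with
    # spaces to exactly $length characters. Left-aligned by default;
    # right-aligned when $r_align is true.
    sub pad_field {
        my ($length, $data, $r_align) = @_;

        # Same validation idea as in _add_field(): refuse oversized data.
        die "Data [$data] too large for field of length [$length]\n"
            if $length < length $data;

        my $padding = ' ' x ($length - length $data);
        return $r_align ? $padding . $data : $data . $padding;
    }

    printf "[%s]\n", pad_field(10, 123);       # prints "[123       ]"
    printf "[%s]\n", pad_field(10, 123, 1);    # prints "[       123]"
    ```

    Passing the third argument only where it's needed keeps the common, left-aligned case short.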

    populate_file() is the public API. For each record in $field_data, it creates a blank line of the desired length ($record_length), calls _add_field() to populate that line with the record's fields, and prints the line to $fh. It doesn't need to know anything about which file is involved, or whether it's writing to a new file or appending to an existing one.
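
    To illustrate why taking a filehandle (rather than a filename) is convenient, here is a small sketch that writes to an in-memory handle. write_lines() is a hypothetical stand-in for populate_file() so that the example is self-contained; it is not the module's code.

    ```perl
    use strict;
    use warnings;

    # Hypothetical stand-in for populate_file(): prints each value as a
    # space-padded line of $record_length characters to whatever handle
    # it is given.
    sub write_lines {
        my ($fh, $record_length, @values) = @_;
        print {$fh} pack('A' . $record_length, $_), "\n" for @values;
        return;
    }

    # An in-memory handle: the routine neither knows nor cares.
    my $buffer = '';
    open my $mem_fh, '>', \$buffer
        or die "Cannot open in-memory handle: $!";
    write_lines($mem_fh, 8, 'abc', 'de');
    close $mem_fh;

    print $buffer;    # two lines, each padded to 8 characters
    ```

    The same call works unchanged with a handle opened using '>' (new file) or '>>' (append); only the open statement in the calling script differs.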

    In most cases, your scripts will need little more than:

        use PM::FixedWidthFile qw{populate_file};
        ...
        my $record_length = ...;
        my $outfile = ...;
        my $file_data = ...;

        open my $fh, '>', $outfile;
        populate_file($fh, $record_length, $file_data);

    Here's an actual example with dummy test data:

        #!/usr/bin/env perl

        use strict;
        use warnings;
        use autodie;

        use PM::FixedWidthFile qw{populate_file};

        my $fixed_width_file_base = './pm_fixed_width_file.out_';
        my $record_length = 32; # not including line terminator

        my @multi_file_data = (
            [
                [ [10, ''], [10, 123], [10, 456] ],
            ],
            [
                [ [10, ''], [10, 123], [10, 456] ],
                [ [10, ''], [10, 123, 1], [10, 456] ],
                [ [10, ''], [10, 123], [10, 456], [2, 78] ],
            ],
            [
                [ [10, ''], [10, 123], [10, 456], [2, 78] ],
                [ [10, ''], [10, 123], [10, 456], [2, 789] ],
            ],
        );

        for my $i (0 .. $#multi_file_data) {
            my $outfile = $fixed_width_file_base . $i;
            print "Populating: $outfile\n";

            open my $fh, '>', $outfile;
            populate_file($fh, $record_length, $multi_file_data[$i]);
            close $fh;

            system qw{cat -vet}, $outfile;
            unlink $outfile; # my housekeeping
        }

    [In case you didn't know, cat -vet filename prints filename, using visible symbols for characters that you can't normally see or that may cause display problems (e.g. whitespace and characters outside the 7-bit ASCII range). The only symbol of interest here is the $ sign, which marks the newline at the end of each line. See the cat manpage for more information.]
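
    If cat isn't available on your system, the part of that display that matters here can be approximated in Perl. This is only a sketch: show_line_ends() is a hypothetical name, and it handles just newlines and tabs, not the other non-printing characters that cat -v covers.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: mark each newline with '$' and show tabs as '^I',
    # roughly what `cat -vet` does for those two characters.
    sub show_line_ends {
        my ($text) = @_;
        $text =~ s/\t/^I/g;       # tab     -> ^I
        $text =~ s/\n/\$\n/g;     # newline -> $ followed by the newline
        return $text;
    }

    print show_line_ends("abc  \nde\t\n");
    ```

    Trailing spaces then show up as the gap between the last visible character and the $.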

    Output:

        Populating: ./pm_fixed_width_file.out_0
                  123       456         $
        Populating: ./pm_fixed_width_file.out_1
                  123       456         $
                         123456         $
                  123       456       78$
        Populating: ./pm_fixed_width_file.out_2
        Data [789] too large for field of length [2] at ./pm_fixed_width_file.pl line 32.

    "I may be forced to nick the above, thanks very much! Rest assured I shall put a "#Ken" in my code ;-)"

    Help yourself to the code. Attribution is courteous but not required. A link to the node where you got the code may be useful for subsequent maintainers and could possibly save you having to redocument what's already been written here (e.g. rationale for changes you implement).

    -- Ken

      It sounds like you really need to take that abstraction one step further and make a module for all your scripts to use.

      Gee, I wonder if such a library might already exist...

        "Gee, I wonder if such a library might already exist..."

        Given the plethora of modules available on CPAN, this is quite likely.

        However, the main thrusts of my post were:

        • To explain why a module would be a good idea.
        • To show how existing code could be used as the basis for creating a module.
        • To show how existing code might be rewritten to use this module.

        -- Ken