Bilbo has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a short program to convert some data in a rather bizarre format to something more useful.

The file contains n fixed width data fields (where n is known from the header of the file). Each line of the file contains up to three fields, so one record can run over more than one line, but each record starts on a new line. For example, if n=4, part of the file would look like this:

           654           -0.1192052729835885D-03 -0.1074533611698108D+02
  0.2511576310952854D+05
           655           -0.1173814075917466D-03 -0.1074770905898570D+02
  0.2511659074419869D+05
           656           -0.1169565901326905D-03 -0.1074824015295493D+02
  0.2511670643648702D+05
           657           -0.1169057197333296D-03 -0.1074839256994435D+02
  0.2511658884706037D+05
           658           -0.1229838335208557D-03 -0.1074184682541694D+02
  0.2511451830545289D+05

I want to extract this data and write it out with one record per line. All the data I have so far have n=4, so for now I am reading the entire multiline file into a single string and breaking it up like this:

    # $data contains the entire contents of the input file
    while ($data =~ /\s+ (\d+) \s+ (\S+) \s+ (\S+) \s+ (\S+)/gx) {
        print "$1 $2 $3 $4\n";
    }

This works, but I don't see how to generalise it for variable values of n. Can anyone suggest how I might modify this fragment of code, or suggest an entirely different way of solving the problem?

Thanks in advance

Replies are listed 'Best First'.
Re: Converting data from file with multiline records
by marvell (Pilgrim) on Jun 06, 2002 at 15:48 UTC
    How about ...
        my $n = 3; # number of fields after the leading integer
        my $re = '(\d+)'.('\s+(\S+)' x $n);
        my $qre = qr/$re/; # save it being compiled more than once

        # no need to load it all in, we can just split it differently
        # (assuming we can define a record separator)
        # here a new record begins at a newline followed by a space
        $/ = "\n "; # might want to save old value or reset later
                    # or make a local block

        while(<DATA>) {
            my @things = /$qre/;
            print "++ @things\n";
        }
        __DATA__
         654 -0.1192052729835885D-03 -0.1074533611698108D+02
        0.2511576310952854D+05
         655 -0.1173814075917466D-03 -0.1074770905898570D+02
        0.2511659074419869D+05
         656 -0.1169565901326905D-03 -0.1074824015295493D+02
        0.2511670643648702D+05
         657 -0.1169057197333296D-03 -0.1074839256994435D+02
        .2511658884706037D+05
         658 -0.1229838335208557D-03 -0.1074184682541694D+02
        0.2511451830545289D+05

    --
    ¤ Steve Marvell
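
The comments in marvell's reply suggest saving $/ or changing it inside a local block. A minimal sketch of that idea follows, assuming the separator can be chosen to match the file. In the sample from the question both record lines and continuation lines are indented, so the sketch passes a newline plus three spaces, which only the more deeply indented record lines match. The sub name read_records and its parameters are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read multiline records from a
# filehandle using a custom record separator. $n_after_id is the number of
# fields that follow the leading integer.
sub read_records {
    my ($fh, $n_after_id, $separator) = @_;

    # Build the pattern once: an integer, then $n_after_id further fields.
    my $pattern = '(\d+)' . ('\s+(\S+)' x $n_after_id);
    my $re      = qr/$pattern/;

    # local restores the previous value of $/ when the sub returns.
    local $/ = $separator;

    my @rows;
    while (defined(my $chunk = <$fh>)) {
        my @fields = $chunk =~ $re;
        push @rows, [@fields] if @fields;   # skip chunks that do not match
    }
    return @rows;
}

# Two records laid out as in the question (assumed indentation).
my $sample = <<'END';
           654           -0.1192052729835885D-03 -0.1074533611698108D+02
  0.2511576310952854D+05
           655           -0.1173814075917466D-03 -0.1074770905898570D+02
  0.2511659074419869D+05
END

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$sample or die "cannot open in-memory handle: $!";
print "@$_\n" for read_records($fh, 3, "\n   ");
close $fh;
```

Because $/ is set with local, it reverts to its previous value when the sub returns, so later reads in the same program are unaffected.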

Re: Converting data from file with multiline records
by Joost (Canon) on Jun 06, 2002 at 15:24 UTC
    try this:
        # assuming a record starts with a space...
        my @records = split '\n ',$data;
        my $header = shift @records;
        # determine $num_of_records here ...
        # (it is the count of fields that follow the leading integer)
        my $re_string = '\s+ (\d+) \s+ '.('(\S+)\s*' x $num_of_records);
        my $re = qr/$re_string/x;
        for (@records) {
            my @values = /$re/;
        }
    Update: fixed small typo
    -- Joost
    downtime n. The period during which a system is error-free and immune from user input.

Re: Converting data from file with multiline records
by Aristotle (Chancellor) on Jun 06, 2002 at 21:50 UTC
    Sounds like a job for redo.
        #!/usr/bin/perl -w
        use strict;

        my $fields_per_record = 4;

        while(<>) {
            chomp;
            my @records = split " ", $_;
            if(@records < $fields_per_record) {
                last if not defined (my $next = <>);
                $_ .= $next;
                redo;
            }
            print "@records\n";
        }
    ____________
    Makeshifts last the longest.
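
The accumulate-until-complete idea from Aristotle's reply can also be written with an explicit buffer instead of redo. The following is only a sketch: join_records is a hypothetical name, and the field count is passed as an argument because the thread does not show how n is stored in the file header.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collect whitespace-separated
# fields line by line and emit one joined record once $n fields are present.
sub join_records {
    my ($n, @lines) = @_;
    my (@records, @fields);

    for my $line (@lines) {
        # split ' ' ignores leading whitespace and the trailing newline.
        push @fields, split ' ', $line;

        if (@fields >= $n) {
            push @records, join ' ', @fields;
            @fields = ();   # the next line starts a new record
        }
    }

    # Leftover fields mean the input ended in the middle of a record.
    warn "incomplete trailing record dropped\n" if @fields;
    return @records;
}

# Two records from the question's sample, with n = 4.
my @lines = (
    "           654           -0.1192052729835885D-03 -0.1074533611698108D+02\n",
    "  0.2511576310952854D+05\n",
    "           655           -0.1173814075917466D-03 -0.1074770905898570D+02\n",
    "  0.2511659074419869D+05\n",
);
print "$_\n" for join_records(4, @lines);
```

Since a record is emitted as soon as a line brings the buffer up to n fields, this relies on the property stated in the question that every record starts on a new line.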