Angharad has asked for the wisdom of the Perl Monks concerning the following question:

Hi there
I have a text file that looks something like this
filename =

without012.csv

pred =

-0.0105
0.0645
0.1989
0.0151
0.0289
0.0400
0.5313

filename =

without013.csv

pred =

-0.0080
0.0663
0.1911
0.0075
0.0288
0.0302
0.5234

filename =

without014.csv

pred =

-0.0023
0.0595
0.2582
0.0323
0.0285
0.0578
0.5807
And so on. This text file was produced from 120 smaller .csv files, as you can probably guess from the names .. lol. What I need to do is to save each individual 'block' of numeric values .. for example
-0.0023 0.0595 0.2582 0.0323 0.0285 0.0578 0.5807
Into separate files, with each file being saved according to the smaller files from which these results originate. So, for the example below
filename = without014.csv pred = -0.0023 0.0595 0.2582 0.0323 0.0285 0.0578 0.5807
I would like to save the numeric block in a small file called without014.csv.
With this in mind I started writing a perl script as follows:
#!/usr/bin/perl
@file = <>;
for($i = 0; $i < @file; $i+4)
{
    print "$file[$i]";
}
I was hoping that it would print out just the file names initially and then I was going to use the same technique for extracting the numeric block but all I am getting is blank lines when I run it (and yes, I'm checking that I am attempting to print off 'the right line' lol).
Anyway .. it's obvious that I am not getting very far .. so any suggestions/tips would be much appreciated.

Replies are listed 'Best First'.
Re: extracting data and saving into seperate files
by wfsp (Abbot) on Jan 09, 2006 at 12:15 UTC
    Here's one way that may help you get started.

    It uses a hash of arrays.

    #!/bin/perl5
    use strict;
    use warnings;
    use Data::Dumper;

    my %data;
    my $file;
    while (my $line = <DATA>){
      if ($line =~ /([^.]+\.csv)/){
        $file = $1;
        $data{$file} = [];
      }
      elsif ($line =~ /(-?[\d.]+)/){
        push @{$data{$file}}, $1;
      }
    }

    for my $file (keys %data){
      print "file: $file\n";
      # open $file
      print "data:\n";
      print "\t$_\n" for @{$data{$file}};
      print "\n";
      # print data to file
    }

    #print Dumper \%data;

    __DATA__
    filename =

    without012.csv

    pred =

    -0.0105
    0.0645
    0.1989
    0.0151
    0.0289
    0.0400
    0.5313

    filename =

    without013.csv

    pred =

    -0.0080
    0.0663
    0.1911
    0.0075
    0.0288
    0.0302
    0.5234

    filename =

    without014.csv

    pred =

    -0.0023
    0.0595
    0.2582
    0.0323
    0.0285
    0.0578
    0.5807

    output:

    ---------- Capture Output ----------
    > "C:\Perl\bin\perl.exe" _new.pl
    file: without012.csv
    data:
        -0.0105
        0.0645
        0.1989
        0.0151
        0.0289
        0.0400
        0.5313

    file: without013.csv
    data:
        -0.0080
        0.0663
        0.1911
        0.0075
        0.0288
        0.0302
        0.5234

    file: without014.csv
    data:
        -0.0023
        0.0595
        0.2582
        0.0323
        0.0285
        0.0578
        0.5807

    > Terminated with exit code 0.

    Hope this helps
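
The two placeholder comments in the loop above (`# open $file` and `# print data to file`) could be filled in along these lines. This is only a sketch: `write_blocks` and its directory argument are names made up for illustration, and the sample hash merely mimics the shape of `%data`.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not from the post above): write each array of numbers
# to a file named after its hash key, inside the directory $dir.
sub write_blocks {
    my ($data, $dir) = @_;
    for my $file (sort keys %$data) {
        my $path = "$dir/$file";
        open my $out, '>', $path or die "Can't write $path: $!";
        print {$out} "$_\n" for @{ $data->{$file} };
        close $out or die "Can't close $path: $!";
    }
    return scalar keys %$data;    # number of files written
}

# Sample data in the same shape as %data (shortened for the example).
my %sample = (
    'without012.csv' => [ '-0.0105', '0.0645', '0.1989' ],
    'without013.csv' => [ '-0.0080', '0.0663', '0.1911' ],
);

my $dir   = tempdir(CLEANUP => 1);    # assumed scratch directory
my $count = write_blocks(\%sample, $dir);
print "wrote $count files\n";
```

The values are kept as strings so that trailing zeros such as `-0.0080` survive unchanged. For real use you would pass the directory where the small files should end up instead of a temporary one.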

Re: extracting data and saving into seperate files
by pKai (Priest) on Jan 09, 2006 at 12:55 UTC

    Actually with a few small modifications your starter code as shown in the OP is able to do what you expected from it (showing you the file names):

    #!/usr/bin/perl
    $/ = ''; # read input by paragraphs
    @file = <>;
    for($i = 1; $i < @file; $i+=4) # from field 1 (2nd field); and += for proper loop
    {
        print "$file[$i]";
    }

    Edit: Oops, it's $/ = '';  # not undef
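
To see the two changes at work without touching the real file, here is a small sketch that reads from an in-memory string. It assumes that the parts of each record are separated by blank lines, which is what the stride of 4 relies on. `every_fourth_paragraph` and the shortened sample text are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return every 4th paragraph of $text, starting at
# index 1, which is where the file names sit in the assumed layout.
sub every_fourth_paragraph {
    my ($text) = @_;
    open my $in, '<', \$text or die "Can't read string: $!";
    local $/ = '';                  # paragraph mode: blank lines end a record
    my @para = <$in>;
    close $in;

    my @names;
    for (my $i = 1; $i < @para; $i += 4) {     # "$i + 4" alone never changes $i
        (my $name = $para[$i]) =~ s/\s+\z//;   # trim the trailing newlines
        push @names, $name;
    }
    return @names;
}

# Assumed layout: "filename =", the name, "pred =", then the numeric block.
my $sample = "filename =\n\nwithout012.csv\n\npred =\n\n-0.0105\n0.0645\n\n"
           . "filename =\n\nwithout013.csv\n\npred =\n\n-0.0080\n0.0663\n\n";
print "$_\n" for every_fourth_paragraph($sample);
```

Starting the loop at index 3 instead of 1 would pick out the numeric blocks in the same way.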

Re: extracting data and saving into seperate files
by explorer (Chaplain) on Jan 09, 2006 at 12:49 UTC
    Perl can resolve your problem with one line:
    perl -ne 'if(/(\w+\.csv)/){close $fh if defined $fh;open $fh,">$1"} elsif(/([0-9.-]+)/){print $fh "$1\n"}' file.txt
    or a few more:
    open FILE, "<file.txt";             # Open the file
    while( <FILE> ) {                   # read a line
        if ( /(\w+\.csv)/ ) {           # If the line is a filename...
            close $fh if defined $fh;   # close previous
            open $fh,">$1";             # and open new for writing
        }
        elsif ( /([0-9.-]+)/ ) {        # If the line is a number...
            print $fh "$1\n";           # write it
        }
    }
    close $fh;                          # close the files opened
    close FILE;
      I would be more inclined to check the result of open, even in a one-liner. If I wanted to be lazy, I'd leave out the close instead, because perl takes care of that.
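
A sketch of that suggestion, under the assumption that each file name and each number sits on its own line: every open is checked, and the explicit close calls are dropped because re-opening a handle closes the previous file and a lexical handle is closed when it goes out of scope. `split_records` and the sample text are hypothetical names for illustration, and the number pattern is slightly stricter than the one in the parent post.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: read records from the handle $in and write each
# numeric block to a file named after the record, inside $dir.
sub split_records {
    my ($in, $dir) = @_;
    my $out;                                  # current output handle, if any
    while (my $line = <$in>) {
        if ($line =~ /(\w+\.csv)/) {
            my $path = "$dir/$1";
            # Re-opening $out closes the previously opened file implicitly.
            open $out, '>', $path or die "Can't write $path: $!";
        }
        elsif ($out and $line =~ /(-?\d+(?:\.\d+)?)/) {
            print {$out} "$1\n";
        }
    }
    return;    # the last file is closed when $out goes out of scope
}

# Shortened sample input, read from a string instead of file.txt.
my $sample = "filename =\nwithout012.csv\npred =\n-0.0105\n0.0645\n";
open my $in, '<', \$sample or die "Can't read sample: $!";
split_records($in, tempdir(CLEANUP => 1));
```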
Re: extracting data and saving into seperate files
by radiantmatrix (Parson) on Jan 09, 2006 at 22:16 UTC

    Hm, I'd separate these into "chunks" that are each a file, and worry those apart later. Something like this:

    open INFILE, '<', 'infile.txt' or die("Can't read: $!");
    # each 'line' is a chunk with 'filename =' delimiting.
    local $/ = 'filename =';
    while (<INFILE>) {
        chomp;       # remove the trailing 'filename =' delimiter
        s/^\s+//;    # remove starting whitespace.
        # use a regex capture -- slow but simple.
        my ($filename, $data) = m/^([\w.]+)\s+pred\s*=\s+(.*?)\s*$/s
            or next; # skip chunks that don't match (e.g. the empty first one)
        open my $OUT, '>', $filename or die("Can't write $filename: $!");
        foreach ( split(/\s+|\n/, $data) ) {
            next unless m/^[-.0-9][0-9.]+/; # skip any empty or malformed lines
            print $OUT "$_\n";
        }
    }

    Untested, but the principle is sound and somewhat reusable.

    <-radiant.matrix->
    A collection of thoughts and links from the minds of geeks
    The Code that can be seen is not the true Code
    "In any sufficiently large group of people, most are idiots" - Kaa's Law
Re: extracting data and saving into seperate files
by Anonymous Monk on Jan 09, 2006 at 15:26 UTC
    Strangely enough, I did almost exactly the same thing a few minutes ago, but with one of my really grotty one-liners.

    The program you are looking for might be this:

    perl -ne '/(\S+\.csv)/ and open X, ">$1" and next; next if /=/ or /^\s*$/; s/^\s+//; print X' inputfile

    Of course, you don't want to do it like that :)

    The key things to keep in mind are triggering an output destination swap when you see a filename, and triggering an output print when you see output you intend to store. Nothing else matters.

    You don't need an array of files. You don't even need to hold it in memory for any operations.

    If the same filename could come up more than once, use ">>" instead of ">" to open the files.
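
Spelled out as a script, the same idea might look like the sketch below. It streams the input line by line, swaps the output file whenever a file name shows up, and takes the open mode as a parameter so the difference between ">" and ">>" is easy to try. `route_lines` and the sample text are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: stream $in line by line, switching the output file on
# each file name and copying data lines to it. $mode is '>' or '>>'.
sub route_lines {
    my ($in, $dir, $mode) = @_;
    my $out;                                   # current output handle, if any
    while (my $line = <$in>) {
        if ($line =~ /(\S+\.csv)/) {
            open $out, $mode, "$dir/$1" or die "Can't open $dir/$1: $!";
            next;
        }
        # Skip everything before the first file name, label lines, blank lines.
        next if !$out or $line =~ /=/ or $line =~ /^\s*$/;
        $line =~ s/^\s+//;                     # drop leading whitespace
        print {$out} $line;
    }
    return;
}

# The same file name appears twice in this made-up input.
my $sample = "filename =\nsame.csv\npred =\n 0.1\nfilename =\nsame.csv\npred =\n 0.2\n";
open my $in, '<', \$sample or die "Can't read sample: $!";
my $dir = tempdir(CLEANUP => 1);
route_lines($in, $dir, '>>');    # same.csv ends up holding both 0.1 and 0.2
```

With '>' the second occurrence truncates the file, so only 0.2 would be left in same.csv.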

Re: extracting data and saving into seperate files
by Happy-the-monk (Canon) on Jan 09, 2006 at 11:50 UTC

    First thing to notice is you forgot to initially open the original text file. So whether or not the count of line numbers is right, there is no data to work with.

    Cheers, Sören

      He's using <> so the original text file is implicitly opened if given on the cmd line. I guess you already knew, but you may have just overlooked it.

      To be sure, just take his code and modify it like this:

      #!/usr/bin/perl -l
      my @file = <>;
      print 0+@file;

      Indeed

      $ ./bar.pl data.txt
      46
      $ wc data.txt
      46 36 346 data.txt