in reply to copying records from one file to another with filter.

If you know the start and end points, use the range operator.

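Note: In this context "eof" is not a real end-of-file marker; it is just a string that never matches before the end of the file, so the range stays open until the input runs out. It works in this situation, but beware of using it in the general case, since a line that happens to start with "eof" would close the range early. The character at ordinal 4, "\x04" (EOT, Ctrl-D), is what a terminal uses to signal end of input, but ordinary files do not contain it, so a pattern such as /\x04/ works the same way: it simply never matches. To test for the real end of a file, use Perl's eof function.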

use warnings;
use strict;

my $start = '4-Dec-2009';
my $end   = "eof";

print "Date,Total\n";
while (<DATA>) {
    if ( /^$start/ .. /^$end/ ) {
        chomp;
        my (@items) = split /,/;
        printf "%s,%d\n", $items[0], $items[1] + $items[2];
    }
}
__DATA__
Date,Expense,Income
1-Dec-2009,12,87
2-Dec-2009,54,204
3-Dec-2009,75,214
4-Dec-2009,78,198
5-Dec-2009,98,155
6-Dec-2009,10,180
7-Dec-2009,51,91
8-Dec-2009,32,130
9-Dec-2009,29,207

produces:

Date,Total
4-Dec-2009,276
5-Dec-2009,253
6-Dec-2009,190
7-Dec-2009,142
8-Dec-2009,162
9-Dec-2009,236
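If you would rather close the range on the real end of the input than on a string that never matches, eof() can serve as the right-hand operand. The following is only a sketch under that assumption: the helper name totals_from and the in-memory sample data are made up for illustration, and the records are assumed to have the same three-column layout as above.

```perl
use strict;
use warnings;

# Hypothetical helper: sum columns 2 and 3 for every record from the first
# line starting with $start through the real end of the input.
sub totals_from {
    my ( $start, $text ) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!";
    my @out;
    while (<$fh>) {
        # eof($fh) becomes true once the last line has been read,
        # so the range closes on the last line of the input.
        if ( /^\Q$start\E/ .. eof($fh) ) {
            chomp;
            my @items = split /,/;
            push @out, sprintf "%s,%d", $items[0], $items[1] + $items[2];
        }
    }
    close $fh;
    return @out;
}

my $data = "Date,Expense,Income\n"
         . "3-Dec-2009,75,214\n"
         . "4-Dec-2009,78,198\n"
         . "5-Dec-2009,98,155\n";
print "$_\n" for totals_from( '4-Dec-2009', $data );
```

The \Q...\E quoting is an extra precaution so that characters in the start string are matched literally.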

Re^2: copying records from one file to another with filter.
by avanta (Beadle) on Jan 11, 2010 at 20:22 UTC
    Your code seems more relevant to my problem. I implemented the technique, and here's my edited code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    # source file directory
    my $srcdir = "../source";
    # source file name
    my $srcfile = $srcdir."/vol.dat";

    # Open source file.
    open (READ, "< $srcfile") || die "Can't find the DAT file\n";

    my $epochToday = time;
    $epochToday = $epochToday - 2592000;
    my ($year, $month, $day) = (localtime($epochToday))[5,4,3];
    $month++;
    $year+=1900;
    my $startdate = $year."-".$month."-".$day;

    my $x=0;
    my $info;
    my @input;
    while ($info = <READ>) {
        chomp $info;
        my @data = split (/,/, $info);
        push @input, [@data];
        $x++;
    }
    close READ || die "Couldn't close the DAT file";
    @ordered_input);

    my $desdir = "../target";
    my $desfile = $desdir."/total_volume.csv";
    open (WRITE, "> $desfile") || die "Can't find the CSV file.\n";
    my @headers = ("Date",",","Total_Volume");
    print WRITE @headers,"\n";

    my $printout;
    while (@input) {
        my $start = $startdate;
        my $end = "eof";
        if ( /^$start/../^$end/ ) {
            chomp;
            my (@items) = split /,/;
            $printout .= "%s,%d\n", $items[0], $items[1] + $items[2];
        }
    }
    print WRITE $printout;
    close WRITE || die "Couldn't close the CSV file";
    exit 0;
    but I'm getting an infinite loop that keeps printing this warning:
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Use of uninitialized value $_ in pattern match (m//) at basic.pl line 63.
    Terminating on signal SIGINT(2)
    Terminating on signal SIGINT(2)
    Kindly help with this matter.

    Thanks
    AvantA

      A few observations and comments:

      You are reading in the entire file and saving it in an array of arrays, but then don't do anything with it. (Well, other than use the size of the array as a never terminating while() condition.) For that situation your loop probably should be

      for (@input)
      rather than
      while (@input)
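To make the difference concrete, here is a small sketch with made-up rows (the two sample records are assumptions, not your data). for visits each element once and stops, while a while loop over the array only ends if the body shrinks the array.

```perl
use strict;
use warnings;

# Illustrative rows in the same shape as the array of arrays built above.
my @input = ( [ '4-Dec-2009', 78, 198 ], [ '5-Dec-2009', 98, 155 ] );

# for visits each element exactly once; $row is set on every pass.
my @totals;
for my $row (@input) {
    push @totals, "$row->[0]," . ( $row->[1] + $row->[2] );
}
print "$_\n" for @totals;

# while (@input) only asks "is the array non-empty?", so it terminates
# only if the body removes elements, for example with shift.
my $passes = 0;
while ( my $row = shift @input ) {
    $passes++;
}
print "while loop ran $passes times\n";
```

Note that neither loop sets $_ here, so a bare /pattern/ inside them would still match against an undefined $_.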

      You are searching for a YYYY-MM-DD pattern when your example data is in DD-MMM-YYYY format. You'll never find anything.

      You should probably use the 3-argument form of open and lexical file handles. What you have isn't wrong, particularly, but it isn't considered best practice.

      The way you are calculating 30 days is easy but fragile. You may be better off using one of the date calculation packages. It may be good enough here though, so take that with a grain of salt.
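If you want to stay with core modules, one possible way to make the 30-day calculation less sensitive to daylight-saving shifts is to do the arithmetic on the calendar day and let POSIX::mktime normalize it. This is only a sketch: the helper name days_before is made up, and the D-Mon-YYYY output assumes the date format of the example data.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

my @months = qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/;

# Hypothetical helper: the local date $days days before $epoch,
# formatted as D-Mon-YYYY (no zero padding on the day).
sub days_before {
    my ( $epoch, $days ) = @_;
    my ( $mday, $mon, $year ) = ( localtime($epoch) )[ 3, 4, 5 ];

    # Anchor at local noon and subtract whole days from the day of month;
    # mktime normalizes an out-of-range day into the right month and year.
    my $then = mktime( 0, 0, 12, $mday - $days, $mon, $year );

    ( $mday, $mon, $year ) = ( localtime($then) )[ 3, 4, 5 ];
    return sprintf '%d-%s-%d', $mday, $months[$mon], $year + 1900;
}

print days_before( time, 30 ), "\n";
```

Working from noon means a one-hour clock change cannot push the result onto the neighbouring day.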

      Here's a minor reworking of your posted script to be more efficient (and work, for certain values of "work").

      #!/usr/bin/perl
      use strict;
      use warnings;

      my @months = ( qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/);

      # source file directory
      my $srcdir = '../source';
      # source file name
      my $srcfile = $srcdir . '/vol.dat';

      my $desdir = '../target';
      my $desfile = $desdir . '/total_volume.csv';

      # Open source file.
      open my $read, '<', $srcfile or die "Can't open the DAT file $!\n";
      open my $write, '>', $desfile or die "Can't open the CSV file. $!\n";

      print $write "Date,Total_Volume\n";

      my ( $year, $month, $day ) = ( localtime( time() - 2592000 ) )[ 5, 4, 3 ];
      $year += 1900;
      my $startdate = "$day-".$months[$month]."-$year";

      while (<$read>) {
          if ( /^$startdate/ .. /\x4/ ) {
              chomp;
              my (@items) = split /,/;
              printf $write "%s,%d\n", $items[0], $items[1] + $items[2];
          }
      }

      # probably not necessary but not a bad idea
      close $read or die "Couldn't close the DAT file: $!\n";
      close $write or die "Couldn't close the CSV file: $!\n";
        First of all, thanks a lot for the help. Also, I'm really sorry I didn't update the source file format. Actually, I was already done with the code (with some glitches). I haven't used your code yet, but I will in another script.


        New Source format:

        2009-12-22,174865.6853,171
        2009-12-23,158423.1442,155
        2009-12-24,146650.9855,143
        2009-12-25,127228.4832,124
        2009-12-26,179032.6644,175
        2009-12-27,179770.0221,176
        2009-12-28,153049.9829,149
        2009-12-29,159508.811,155
        2009-12-30,75322.9348,143
        2009-12-31,184494.3142,124
        2010-01-01,88085.89262,87
        2010-01-02,157525.6179,204
        2010-01-03,213673.8187,214
        2010-01-04,190080.1713,198
        2010-01-05,139624.0644,155
        2010-01-06,159684.3982,180
        2010-01-07,159508.811,91
        2010-01-08,75322.9348,130
        2010-01-09,174867.4572,207
        2010-01-10,206403.5704,86
        2010-01-11,121876.6863,154
        2010-01-12,89091.60969,209
        2010-01-13,159684.3982,180
        2009-01-14,153049.9829,149


        New code (which works):

        #!/usr/bin/perl
        use strict;
        use warnings;

        # source file directory.
        my $srcdir = "../source";
        # source file name.
        my $srcfile = $srcdir."/vol.dat";

        # Open source file in READ mode.
        open (READ, "< $srcfile") || die "Can't find the DAT file\n";

        ################## Start date time i.e. the first day of the last 30 days, is calculated by this code.
        my $epochToday1 = time;
        $epochToday1 = $epochToday1 - 2592000;
        my ($year1, $month1, $day1) = (localtime($epochToday1))[5,4,3];
        $month1++;
        $year1+=1900;
        my $startdate = $year1."-".$month1."-".$day1;
        ########################

        ################## End date time i.e. the current day, is calculated by this code.
        my $epochToday2 = time;
        my ($year2, $month2, $day2) = (localtime($epochToday2))[5,4,3];
        $month2++;
        $year2+=1900;
        my $enddate = $year2."-".$month2."-".$day2;
        ########################

        # Destination Directory.
        my $desdir = "../target";
        #destination File name.
        my $desfile = $desdir."/total_volume_last30days.csv";

        #Open target file in Write mode.
        open (WRITE, "> $desfile") || die "Can't find the CSV file.\n";

        # Print the headers for the Report in CSV.
        my @headers = ("Date",",","Total_Volume");
        print WRITE @headers,"\n";

        my $printout;    #declaration of variable which prints in the Target file.
        while (<READ>)
        {
            my $start = $startdate;
            my $end = $enddate;
            if ( /^$start/ .. /^$end/ )    #range to be checked and written to Target file.
            {
                chomp;
                my (@items) = split (/,/,$_);
                my $tot = $items[1]+ $items[2];    #sum of incoming an outgoing bytes.
                $printout .= "$items[0],$tot \n";    #variable which stores the Data to be written in target file.
            }
        }
        print WRITE $printout;    #write $printout values to Target file.

        close READ || die "Couldn't close the DAT file";    #close input file.
        close WRITE || die "Couldn't close the CSV file";    #close target file.
        exit 0;


        In this new code I have used today's date as the ending parameter of the range operator, which is what I actually need. The glitch I mentioned is that if the input contains dates after today's date, the output file gets that data too. But as you can see, I have set $end to $enddate, which is the current system date. How can I prevent this?
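One likely cause, assuming the dates in the file are always zero-padded as in the sample: the code builds the dates from unpadded numbers, so in January the end date comes out as something like 2010-1-12, which never matches 2010-01-12, and the range never closes. A sketch of an alternative that does not depend on an exact line being present is to compare the date strings directly, since YYYY-MM-DD strings sort in date order. The helper name records_between and the sample values in the usage lines are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: return "date,total" for every record whose date
# lies between $start and $end inclusive (all dates as YYYY-MM-DD).
sub records_between {
    my ( $start, $end, @lines ) = @_;
    my @out;
    for my $line (@lines) {
        chomp( my $record = $line );
        my ( $date, $incoming, $outgoing ) = split /,/, $record;
        next unless defined $outgoing;    # skip headers and short lines

        # Zero-padded ISO dates compare correctly as plain strings.
        next if $date lt $start or $date gt $end;
        push @out, sprintf "%s,%s", $date, $incoming + $outgoing;
    }
    return @out;
}

# strftime pads month and day with zeros, unlike joining the numbers by hand.
my $now   = time;
my $end   = strftime( '%Y-%m-%d', localtime($now) );
my $start = strftime( '%Y-%m-%d', localtime( $now - 30 * 24 * 60 * 60 ) );

print "$_\n" for records_between( $start, $end, "$end,100.5,86\n" );
```

With this approach, records dated after today are dropped even when today's date itself is missing from the file.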

        Apart from this, I want to create another script, or edit this one, to give the current month's data. My idea was to drop the 30 days' worth of seconds I was subtracting for $startdate and hardcode $day1 as '01', but just changing that produces an error:
        Use of uninitialized value $printout in print at total_volume_last30.pl line 67, <READ> line 122.
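That warning usually means no line ever entered the range, so nothing was appended to $printout before it was printed. A possible sketch for the current-month case, assuming zero-padded YYYY-MM-DD dates in the file: build the year-month prefix with strftime and keep the lines that start with it. The helper names month_prefix and current_month_records are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: "YYYY-MM" for the month that contains $epoch.
sub month_prefix {
    my ($epoch) = @_;
    return strftime( '%Y-%m', localtime($epoch) );
}

# Hypothetical helper: the lines whose date falls in that month.
sub current_month_records {
    my ( $epoch, @lines ) = @_;
    my $prefix = month_prefix($epoch);
    return grep { /^\Q$prefix\E-/ } @lines;
}

my @lines = ( "2009-12-31,1,2\n", "2010-01-01,3,4\n" );

# join always returns a defined string, so printing it is safe
# even when no record matches.
my $printout = join '', current_month_records( time, @lines );
print $printout;
```

Initializing the variable (here via join, or simply with my $printout = '';) removes the warning; the prefix match removes the need for a start line that must exist in the file.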


        If you can suggest some other way, I would be grateful.

        Thanks
        AvantA
Re^2: copying records from one file to another with filter.
by avanta (Beadle) on Jan 12, 2010 at 07:34 UTC
    if( /^$start/ .. /^$end/ )
    Sir, I am having trouble interpreting this line. There is an error:

    Use of uninitialized value $_ in pattern match (m//) at range.pl line 12
    Can you please help me? It's urgent.

    Avanta
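A likely cause, assuming the loop around that line does not set $_: a bare /pattern/ always matches against $_, which is filled in by while (<FH>) or a plain for (@list), but not by a loop that uses a named variable. In that case each pattern can be bound explicitly with =~. The sketch below uses a made-up helper name, in_range_lines, and made-up sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines from the first one starting with
# $start through the first one starting with $end.
sub in_range_lines {
    my ( $start, $end, @lines ) = @_;
    my @out;
    for my $line (@lines) {
        # $_ is not set in this loop, so bind both patterns to $line.
        if ( $line =~ /^\Q$start\E/ .. $line =~ /^\Q$end\E/ ) {
            push @out, $line;
        }
    }
    return @out;
}

my @picked = in_range_lines( '2010-01-02', '2010-01-03',
    '2010-01-01,1,2', '2010-01-02,3,4', '2010-01-03,5,6', '2010-01-04,7,8' );
print "$_\n" for @picked;
```

If the loop is while (<FH>) with no named variable, the original line works unchanged, because that form assigns each line to $_.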