Hi,

I have put together this script with help from previous posts on Perl Monks. It reads large tab- and comma-delimited text files, picks out the rows of data I need, and writes them to new files that are much smaller and easier to manage.

However, I have not been able to write a version that does this in bulk. I run it as follows:

 perl midasproc2.pl midas_wxhrly_199901-199912.txt 19260 "1999-12-15 11:00"

This means that a different value must be entered manually each time in place of "19260". Is it possible to alter the script so that I can supply several values in addition to 19260?

1999-01-01 00:00, EGPK, ICAO, METAR, 1, 1006, 1001, , , 120, 13, 00, , , , , , , , , 1000, , 1, , 96, 6, , $
1999-01-01 00:00, EGSS, ICAO, METAR, 1, 484, 1001, , , 130, 8, 00, , , , , , , , , 1000, , 1, , 120, , , , $
1999-01-01 01:00, 03002, WMO, SYNOP, 1, 12, 1011, 4, 6, 160, 20, , , , , , , , , 20, 440, 1004.1, 7, , 24, $

Below is the script I pieced together: ------------------------------------------

#!/usr/bin/perl

use strict;
use warnings;
my ($record, $date, $outfile, $station, $sstation, $linecnt, $pcntg, @values);

print "Processing \"$ARGV[0]\"...\n";
$date = $ARGV[2];
$station = $ARGV[1];
$outfile = $date.".txt";
$outfile =~ s/ /-/;

print "For Date: $date and Station: $station\n\n";
$linecnt = `wc -l < $ARGV[0]`;
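# Three-argument open with error checks, so a bad path fails loudly
open (INFILE, "<", $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
open (OUTFILE, ">>", $outfile) or die "Cannot open $outfile: $!";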

while (<INFILE>) {
   $pcntg = int (($. / $linecnt ) * 100) ;
   print "$pcntg %\r";
   chomp();
   $record = $_;
   @values = split (',',$record);
   $sstation = $values[5];
   $sstation =~ s/ //g;
   if ($values[0] eq $date && $sstation eq $station ) {
      print OUTFILE $record."\n";
#      print "xx".$values[5]."xx\n";
      print "Record written to $outfile...\n";
      last;
   }
}

print "\nFinished\n";
close (INFILE);
close (OUTFILE);

-----------------------------------------------
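For illustration, here is a minimal sketch of one way a bulk version might look. It is not tested against the real MIDAS files. It assumes the station IDs are passed as a single comma-separated argument, and the names filter_records and midasproc_bulk.pl are hypothetical. It keeps the same field positions as the script above (timestamp in field 0, station in field 5).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of midasproc2.pl): copy every row whose
# timestamp equals $date and whose station field is a key of %$wanted.
# Returns the number of rows written.
sub filter_records {
    my ($in, $out, $date, $wanted) = @_;
    my $written = 0;
    while (my $record = <$in>) {
        chomp $record;
        my @values = split /,/, $record;
        next unless @values > 5;                 # skip short or empty lines
        (my $sstation = $values[5]) =~ s/ //g;   # strip padding spaces
        next unless $values[0] eq $date && $wanted->{$sstation};
        print {$out} "$record\n";
        $written++;
    }
    return $written;
}

# Assumed usage:
#   perl midasproc_bulk.pl FILE "19260,1006,484" "1999-12-15 11:00"
if (@ARGV == 3) {
    my ($file, $stations, $date) = @ARGV;

    # Build a lookup hash so each row needs only one hash test,
    # no matter how many stations are requested.
    my %wanted = map { $_ => 1 } split /\s*,\s*/, $stations;

    (my $outfile = "$date.txt") =~ s/ /-/;
    open(my $in,  '<',  $file)    or die "Cannot open $file: $!";
    open(my $out, '>>', $outfile) or die "Cannot open $outfile: $!";
    my $n = filter_records($in, $out, $date, \%wanted);
    close $in;
    close $out;
    print "$n record(s) written to $outfile\n";
}
```

Unlike the original, this sketch does not stop at the first match, because several stations may appear anywhere in the file. It therefore reads the whole input once and writes all matching rows to the same output file.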

Any help anyone can offer will be greatly appreciated. I also hope this code helps others filter large delimited files.

In reply to Bulk Reading and Writing of Large Text Files by Sterling_Malory
