Hi, I'm a bit rusty at this so I thought I'd seek advice from a higher source =). I've put together a script whose aim is to read in a data file and split it into multiple files, grouping lines by their first 6 characters. In my case I'm processing NMEA data, e.g.:

$GPVTG,156.08,T,,M,0.08,N,0.15,K,D*3E
$GPGGA,181908.20,3809.22198,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,7.2,0138*73
$GPVTG,156.13,T,,M,0.05,N,0.09,K,D*34
$GPGGA,181908.40,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,7.4,0138*7C
$GPVTG,284.88,T,,M,0.06,N,0.11,K,D*30
$GPGGA,181908.60,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,6.6,0138*7D
$GPVTG,1.72,T,,M,0.01,N,0.02,K,D*3F
$GPGGA,181908.80,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,6.8,0138*7D
$GPVTG,175.67,T,,M,0.06,N,0.11,K,D*3C
$GPGGA,181909.00,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,7.0,0138*7D
$GPVTG,357.02,T,,M,0.11,N,0.21,K,D*38
$GPZDA,181909.00,24,07,2008,00,00*65
$GPGGA,181909.20,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,7.2,0138*7D
$GPVTG,25.22,T,,M,0.06,N,0.11,K,D*09
$GPGGA,181909.40,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,7.4,0138*7D
$GPVTG,157.60,T,,M,0.06,N,0.12,K,D*38
$GPGGA,181909.60,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,3.6,0138*79
$GPVTG,49.76,T,,M,0.09,N,0.17,K,D*0B
$GPGGA,181909.80,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,3.8,0138*79
$GPVTG,304.77,T,,M,0.08,N,0.15,K,D*33
$GPGGA,181910.00,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,4.0,0138*76
$GPVTG,168.33,T,,M,0.08,N,0.15,K,D*3B
$GPGGA,181910.20,3809.22197,N,09726.10823,W,2,10,0.9,453.7,M,-27.1,M,4.2,0138*76
$GPVTG,202.08,T,,M,0.16,N,0.29,K,D*3C

The above example would make 3 files (one each for $GPGGA, $GPVTG and $GPZDA).

I want the files produced to be named in the format: <orig_filename_prefix>_<first 6 chars>.txt

So the file mydata.txt might split into mydata_$GPGGA.txt and mydata_$GPVTG.txt.

The script WORKS for <first 6 chars>.txt format but when I add the filename prefix it all goes to hell and gobbles up my main file.

I'd appreciate some hints on what obvious n00b mistake I've made this time.

Thanks
Paul =)

p.s. Here's the script:
#!/usr/bin/perl

my $file = shift;
bad_format() if ($file eq "" );
open FILE, $file or die "Could not open file [$file]\n";
print "file : $file\n";

#Uncomment this section and things go goofy
#$file =~ m/(\w+)\..*/;
#my $fname = $1;
###

my %files = (); # Hash with file prefix and handle
my $line;

while ($line = <FILE>) {
    $line =~ m/^(.{6}).*/;
    my $ffc = $1;
    if ($ffc ne "" ) {
        my $check = 0;
        foreach my $key(%files) {
            $check = 1 if ($ffc eq $key);
        }
        if ($check == 0) {
            print "Adding new handle : $ffc\n";
            local *FH;
            open (FH, ">$ffc.txt") or die;
            #open (FH, ">$fname_$ffc.txt") or die; # I want to save the file as this format
            $files{$ffc} = *FH;
        }
        my $f = $files{$ffc};
        print $f $line;
        #print "writing to $key\n";
    }
}

while (my ($key, $value) = each (%files)) {
    print "Closing $key\n";
    close $value;
}
close FILE;

sub bad_format {
    print "\nformat: split <file>\n\n";
    exit;
}

UPDATE:

Thanks all for your comments. There were a number of mistakes I'd made and some nice alternate methods for doing things I hadn't seen. I hadn't worked with hashes of file handles before and used an older PM search to integrate the method I had, but Ikegami's direct assignment of the handle into the hash is much more elegant. Thanks for the help, here is the final script I ended up with.
#!/usr/bin/perl
# Splits a data file into unique files based on each lines first 6 characters
use warnings;
use strict;

my $file = shift;
bad_format() if ($file eq "" );
open FILE, $file or die "Could not open file [$file]\n";
my ($fname) = $file =~ m/(\w+)\..*/;
my %files = ();

while (my $line = <FILE>) {
    if ($line !~ /^\s*$/) {
        my $fc = substr($line, 0, 6); # first characters
        if (!exists $files{$fc}) {
            open ($files{$fc}, ">$fname\_$fc.txt") or die;
        }
        print {$files{$fc}} $line;
    }
}

while (my ($key, $value) = each (%files)) {
    print "Created $fname\_$key.txt\n";
    close $value;
}
close FILE;
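
For anyone who lands here later: as far as I can tell, two separate traps combined to clobber the input file in the first version. The sketch below reproduces each in isolation. It is only an illustration, not part of the final script; the values (mydata.txt, a blank line, $GPGGA) are assumed for the demo, and the "no strict 'vars'" block is there solely so the broken interpolation compiles.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative values only; not taken from a real run.
my $file = 'mydata.txt';
my $line = "\n";            # a blank line, too short for /^(.{6})/

# Trap 1: a failed match leaves $1 untouched.
$file =~ m/(\w+)\..*/;      # succeeds, $1 is now "mydata"
$line =~ m/^(.{6}).*/;      # fails, $1 still holds "mydata"
my $stale = $1;

# Safer: capture in list context, so a failed match yields undef.
my ($safe) = $line =~ m/^(.{6})/;

# Trap 2: in "$fname_$ffc" Perl looks for a variable named $fname_.
my $fname = 'mydata';
my $ffc   = '$GPGGA';
my $wrong = do {
    no strict 'vars';       # only so the broken form compiles here
    no warnings;
    "$fname_$ffc.txt";      # $fname_ is empty, the prefix is lost
};
my $right = "${fname}_$ffc.txt";

print "stale \$1 : $stale\n";
print "safe     : ", (defined $safe ? $safe : 'undef'), "\n";
print "wrong    : $wrong\n";
print "right    : $right\n";
```

Put together, a stale $1 of "mydata" plus an empty $fname_ gives an output name of mydata.txt, which would explain the main file being overwritten. The final script sidesteps both: substr() does not depend on $1, and "$fname\_$fc.txt" stops the variable name at $fname, just as "${fname}_$fc.txt" would.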

In reply to RegExp eating my $1 - FIXED! by thekestrel
