in reply to Re: Using the second word in a line to split a file into multiple files
in thread Using the second word in a line to split a file into multiple files

Hello, and thank you so much for your help and guidance; I am very new to this forum. I was able to get this working. You are correct that filecount was left over from the code I started with; I had focused mainly on the regex problems. The script does exactly what is needed on the test data I presented. Then I realized why it wasn't working on the actual data file I need to parse: I have .s in the second word, and that second word always ends with a dot. This works for the original test data:
    use warnings;
    use strict;

    my $infn = '/Users/azeller/Documents/Rogers_import/20190822_RR_export-nrcmd.txt';
    open(my $infh, '<', $infn) or die "$infn: $!";
    my $outfh;
    my $filecount = 0;
    while ( my $line = <$infh> ) {
        if ( $line =~ /^zone\s+(\w+)\W+\w+\s*$/ ) {
            close $outfh if $outfh;
            my $outfn = sprintf '%sdb', $1;
            open($outfh, '>', $outfn) or die "$outfn: $!";
        }
        if ($outfh) {
            print {$outfh} $line or die "print: $!";
        }
    }
    close($outfh) if $outfh;
    close($infh);
But it doesn't work for the actual data I am parsing. My data file format is actually more like the following:
    zone 1file1.nest. 1ss record1a record1b record1c record 1d 2 record empty endoffile
    zone 2file2.egg. 1ss record1a record1b record1c record 1d 2 record empty endoffile

Re^3: Using the second word in a line to split a file into multiple files
by haukex (Archbishop) on Aug 26, 2019 at 14:42 UTC

    Please use <code> tags to format your code and sample input and output.

    I have .s in the second word, and that second word always ends with a dot.

    Now might be a good time to look at perlretut, as jcb suggested, or perhaps perlrequick. The \w+ will only match Word characters (normally [a-zA-Z0-9_] plus Unicode "Word" characters), but not including the dot. Perhaps you want to say "word characters plus dot", i.e. [\w.]+, or simply "any non-whitespace characters", i.e. \S+.
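
    As a rough sketch of the difference, here is the pattern from the posted script next to one possible adjustment. The sample lines, the $old and $new names, and the second_word helper are made up for illustration; the exact pattern you want depends on what your real data looks like.

```perl
use strict;
use warnings;

# Pattern from the posted script: the second word is limited to \w characters.
my $old = qr/^zone\s+(\w+)\W+\w+\s*$/;

# One possible adjustment (an assumption, not the only option):
# allow dots in the second word and require whitespace before the third word.
my $new = qr/^zone\s+([\w.]+)\s+\w+\s*$/;

# Hypothetical helper: return the captured second word, or undef if no match.
sub second_word {
    my ($re, $line) = @_;
    return $line =~ $re ? $1 : undef;
}

for my $line ("zone 1file1 1ss\n", "zone 1file1.nest. 1ss\n") {
    for my $pair ([old => $old], [new => $new]) {
        my ($label, $re) = @$pair;
        my $got = second_word($re, $line);
        chomp(my $shown = $line);
        printf "%s pattern, '%s': %s\n", $label, $shown,
            defined $got ? "captured '$got'" : 'no match';
    }
}
```

    Note that the capture keeps the trailing dot, so with the sprintf '%sdb' from the posted script a second word of 1file1.nest. would give the output file name 1file1.nest.db. Swapping [\w.]+ for \S+ would accept any non-whitespace characters in that position instead.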

    Update: Edited first sentence that was accidentally cut off.