in reply to Re^2: Write to multiple files according to multiple regex
in thread Write to multiple files according to multiple regex

I can suggest several improvements to the code you have posted.

Declare all variables in the smallest possible scope. Your declaration of all variables at the start of the file largely defeats your use of strict.
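As a small illustration of the scoping advice (the variable names here are made up, not taken from the posted code), a variable declared inside the loop body exists only there:

```perl
use strict;
use warnings;

my @squares;    # needed after the loop, so declared outside it
for my $n ( 1 .. 3 ) {
    my $square = $n * $n;    # exists only inside this iteration
    push @squares, $square;
}

# $square is not visible here; under strict, referring to it at this
# point would be a compile-time error instead of a silent bug.
print "@squares\n";
```

With everything declared at the top of the file, the same typo or stale value would pass strict unnoticed.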

Lexical file handles are much easier to manage than globs.

The three argument form of open would make the intention clearer.
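A minimal sketch of both points, assuming a scratch directory created with File::Temp purely for the example (the path and file name are hypothetical):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch location, used only for this illustration.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/example.txt";

{
    # Three-argument open: the mode is a separate argument, so the
    # file name can never be mistaken for part of the mode.
    # The low-precedence "or" makes die apply to open itself; with
    # "||" it would bind to $path and never fire.
    open my $out, '>', $path or die "can't write $path: $!";
    print {$out} "first line\n";
    close $out or die "can't close $path: $!";
}    # $out no longer exists past this point

open my $in, '<', $path or die "can't read $path: $!";
my $first = <$in>;
close $in;
print $first;
```

A lexical handle can also be stored in a hash or passed to a subroutine like any other scalar, which a bareword glob does not allow so easily.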

Storing your file data in an array of hashes rather than in parallel arrays probably would not make any difference in speed, but it would help your readers by keeping related data together.
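For comparison, here is a rough sketch of the two layouts. The file names and patterns are invented for the example:

```perl
use strict;
use warnings;

# Parallel arrays: the name and its regex are related only by index.
my @names   = ( 'alpha.txt', 'beta.txt' );
my @regexes = ( qr/^A/,      qr/^B/ );

# Array of hashes: each record carries its own name and regex.
my @dispatch = (
    { name => 'alpha.txt', regex => qr/^A/ },
    { name => 'beta.txt',  regex => qr/^B/ },
);

# Collect the names whose regex matches a sample string.
my @matched;
for my $entry (@dispatch) {
    push @matched, $entry->{name} if 'Banana' =~ $entry->{regex};
}
print "@matched\n";
```

Adding a third field later (say, a match counter) means one more hash key rather than a third array that has to be kept in step.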

Store your regexes as regexes (use qr//) rather than strings. It is probably faster, and it certainly makes the intention clearer.
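A short sketch of the difference, using an invented pattern and invented sample lines:

```perl
use strict;
use warnings;

my $pattern = '^UT\s+\d+';     # a pattern held as a plain string
my $regex   = qr/$pattern/;    # compiled once into a regex object

my @lines = ( 'UT 123456789', 'some additional string information' );
my @hits  = grep { $_ =~ $regex } @lines;

# ref() returns 'Regexp' for a qr// object, so it can be told apart
# from an ordinary string at a glance or in code.
my $kind = ref $regex;
print "$kind: @hits\n";
```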

Note: $INPUT_RECORD_SEPARATOR ($/) is a string, not a regex.
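To see what "string, not regex" means in practice, this sketch reads made-up data from an in-memory file handle. Because the separator is matched literally, it also fires inside a longer word:

```perl
use strict;
use warnings;

# Made-up input: "SENDER" happens to contain the separator text.
my $data = "a\nSENDER\nb\nEND\n";
open my $fh, '<', \$data or die "can't open in-memory file: $!";

my @records;
{
    local $/ = 'END';    # literal terminator; anchors such as ^ have no meaning here
    while ( defined( my $record = <$fh> ) ) {
        push @records, $record;
    }
}
close $fh;

# Print each record with its newlines made visible.
for my $record (@records) {
    ( my $visible = $record ) =~ s/\n/\\n/g;
    print "[$visible]\n";
}
```

Here the first record ends in the middle of "SENDER", so the separator has to be a string that cannot occur anywhere else in the data.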

UNTESTED

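```perl
#!perl
use strict;
use warnings;
use FindBin;

my $dir = "$FindBin::Bin/../rxo";
opendir( my $dh, $dir ) or die "can't opendir $dir: $!";
# Skip the '.' and '..' entries by name; readdir does not promise
# that they come first.
my @inputs = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir $dh;

# One entry per pattern file: the output handle and the compiled regex.
my @dispatch;
foreach (@inputs) {
    my $outfile = "$FindBin::Bin/../blocks/$_";
    open my $ofh, '>', $outfile or die "can't write $outfile: $!";
    my $file = "$FindBin::Bin/../rxo/$_";
    open my $fh, '<', $file or die "can't read $file: $!";
    my $regex = <$fh>;    # the first line of the file is the pattern
    close $fh;
    die "no pattern in $file" unless defined $regex;
    chomp $regex;         # drop the newline so it is not part of the pattern
    push @dispatch, { file => $ofh, regex => qr/$regex/ };
}

# Read the input one block at a time; $/ is a literal string.
while ( defined( my $line = do { local $/ = 'END'; <> } ) ) {
    foreach (@dispatch) {
        print { $_->{file} } $line if $line =~ $_->{regex};
    }
}
```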
Bill

Re^4: Write to multiple files according to multiple regex
by Foodeywo (Novice) on Jul 21, 2015 at 18:46 UTC
    Thank you very much! This runs and is much faster. However, I have problems with $/. It stops after the first match was found. So I get 1 entry in 1 file, and the rest of the file remains empty.
      It stops after the first match was found. So I get 1 entry in 1 file, and the rest of the file remains empty.

      Are all the output files created, but empty? Are you sure that the one entry is correct? If your input is not being broken into blocks correctly, the value of $/ is wrong. The scheme works only if every end-of-block marker is exactly the same (remember: $/ is a string). Please post a few (three to ten) blocks of realistic data. Use code tags so we can download it exactly. For security, you can use made-up data, but the format must be exact.
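One way to check how the input is being broken up is to print every record with its newlines made visible. This is only a sketch on made-up data. The patterns in the rxo files are not shown in this thread, so the ^UT pattern below is an assumption used to illustrate one possible cause.

```perl
use strict;
use warnings;

# Made-up sample input; the real data and patterns are not shown here.
my $data = "UT 1\nfoo\nEND\nUT 2\nbar\nEND\n";
open my $fh, '<', \$data or die "can't open in-memory file: $!";

my @records;
{
    local $/ = 'END';
    while ( defined( my $record = <$fh> ) ) {
        push @records, $record;
    }
}
close $fh;

# Show every record with its newlines made visible.
for my $i ( 0 .. $#records ) {
    ( my $visible = $records[$i] ) =~ s/\n/\\n/g;
    print "record $i: [$visible]\n";
}

# Count matches for a hypothetical pattern anchored with ^.
my $anchored  = grep { /^UT/ } @records;     # ^ matches at start of string only
my $multiline = grep { /^UT/m } @records;    # /m lets ^ match after any newline
print "anchored: $anchored, multiline: $multiline\n";
```

Every record after the first begins with the newline that followed the previous terminator. A pattern anchored at the start of the string would therefore match only the first block unless it uses /m or the separator includes the trailing newline.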

      Bill
        yes they are all created and empty.

        the single match that is found is only found if the first block is a match. else everything is empty.

        The data looks like this:
        UT 123456789 1234 9876 1234 some additional string information THE_END UT 987654321 1234 2345 some additional string information THE_END UT 1928374756 4321 2567 1234 THE_END some additional string information UT 5647382910 1234 2435 5678 some additional string information THE_END

        Notice that I changed END to THE_END to make it more unique, since other lines may accidentally contain the string "END" and I can't use the regex "^END".

        the current code is:

        ```perl
        #!perl
        use strict;
        use warnings;
        use FindBin;

        my $dir ="$FindBin::Bin/../rxo";
        opendir(my $dh, $dir) || die "can't opendir $dir: $!";
        my @inputs = readdir($dh);
        closedir $dh;
        splice @inputs, 0, 2;

        my @dispatch;
        foreach(@inputs) {
            my $outfile = "$FindBin::Bin/../blocks/$_";
            open my $ofh, '>', $outfile || die;
            my $file = "$FindBin::Bin/../rxo/$_";
            open my $fh, '<', $file || die;;
            my $regex = <$fh>;
            close $fh;
            push @dispatch, { file => $ofh, regex => qr/$regex/ };
        }

        while(my $line = do { local $/ = 'THE_END'; <> }) {
            foreach (@dispatch) {
                print { $_->{file} } $line if $line =~ $_->{regex};
            }
        }
        ```