Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I have a question about substituting an array value into a regular expression.

I have an array with any number of values in it.
Now I want to search a file, and if any value in the array matches any line in the file, I want to replace the match with the word "pagingRAC".
So far I am doing the following, without much success:
@final_array is the array containing any number of values.
I open a file and search it for any line matching any array value.
If a match is found I replace it with the word "pagingRAC".
My code for the above task is:
    open(DOC,'sample_testing');
    while(<DOC>) {
        $_.=<DOC>;
        foreach my $value (@final_array) {
            s/$value/pagingRAC/g;
        }
        print $_;
    }
My basic aim is to get the array value substituted into the regular expression.
Thank you!

Replies are listed 'Best First'.
Re: Substitute array value in regular expression
by Roger (Parson) on Dec 30, 2003 at 13:07 UTC
    There are many ways to do this of course. The following are a few that I normally use.

    # method 1
    my $value = join '|', @final_array;
    while (<DOC>) {
        s/$value/pagingRAC/g;
        print
    }

    # method 2
    my $doc = do { local $/; <DOC> };
    my $value = join '|', @final_array;
    $doc =~ s/$value/pagingRAC/gm;
    print $doc;

    # method 3
    my $value = join '|', @final_array;
    print for map { s/$value/pagingRAC/g; $_ } <DOC>;
Re: Substitute array value in regular expression
by maa (Pilgrim) on Dec 30, 2003 at 12:43 UTC

    Hi,
    It seems to do what you want, although I don't think $_.=<DOC>; is really what you want (append to $_?). Better to use while(defined($_=<DOC>))

    #!C:/Activeperl/bin/perl.exe
    use strict;
    use warnings;

    my @final_array=qw/test1 test2 fred/;
    #open(DOC,'sample_testing');
    while(defined($_=<DATA>)){
        #$_.=<DOC>; <- why are you appending?
        foreach my $value (@final_array) {
            s/$value/pagingRAC/g;
        }
        print $_;
    }
    __DATA__
    this is a test
    this line contains test1 and test2
    fred bassett is a dog but I can't spell his name.

    Prints

    this is a test
    this line contains pagingRAC and pagingRAC
    pagingRAC bassett is a dog but I can't spell his name.

    HTH - mark

    Edit by tye: Change PRE to CODE around not-short lines

      while(defined($_=<DATA>))
      can be this:
      while (<DATA>)

      also check open() for error:

      open(DOC,'sample_testing') or die "Can't open input file: $!";
      If you want to get rid of the for loop that runs on every pass through the while, you could build a combined regular expression before entering the loop, something like:
      use strict;
      use warnings;

      my @final_array=qw/test1 test2 fred/;
      my $regexp = "(" . join("|", @final_array) . ")";
      while(<DATA>){
          s/$regexp/pagingRAC/g;
          print $_;
      }
      __DATA__
      this is a test
      this line contains test1 and test2
      fred bassett is a dog but I can't spell his name.
      This makes it a little cleaner for the eye.
      Hi,
      my @final_array=qw/test1 test2 fred/;
      How can I define the array this way when I am building it dynamically?
      I search a "ps" file and then fill the array from the results of that search.
      So the search pattern is based on the array values, and I can't set it up in advance using qw.
      How can I declare the pattern dynamically using qw?
      Thanks for your previous reply.
        Looks like you need to read the perlop documentation on CPAN and understand the meaning of qw (Quoted Words). What qw does is to build a list, not to construct search patterns.

        my @array = qw/ element1 element2 element3 /;
        is equivalent to
        my @array = ();
        push @array, 'element1';
        push @array, 'element2';
        push @array, 'element3';
        Given that your @final_array is already set up earlier, all you need to do is build the combined search pattern with join:
        my $pattern = join '|', @final_array;
        Why do I want to join the patterns with '|'? Because effectively I want to build a regular expression like below:
        s/pattern1|pattern2|pattern3/replace/g;

        # which is equivalent to
        my $patterns = "pattern1|pattern2|pattern3";
        s/$patterns/replace/g;
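
        A minimal sketch of the same idea for an array that is filled at run time rather than with qw. The values and the sample lines are made up for illustration; in the real script @final_array would be filled from the search of the "ps" file.

```perl
use strict;
use warnings;

# Hypothetical values, pushed one at a time as a search might produce them.
my @final_array;
push @final_array, $_ for ('test1', 'test2', 'fred');

# Build one alternation from whatever the array holds at this point.
my $pattern = join '|', @final_array;

# Illustrative input lines standing in for the file contents.
my @lines = (
    "this line contains test1 and test2\n",
    "fred bassett is a dog\n",
);

my @out;
for my $line (@lines) {
    (my $copy = $line) =~ s/$pattern/pagingRAC/g;
    push @out, $copy;
}
print @out;
```

        If the array can end up empty, skip the substitution altogether, because an empty pattern does not mean "match nothing".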
Re: Substitute array value in regular expression
by davido (Cardinal) on Dec 30, 2003 at 16:36 UTC
    pilot_vijay: This is an adaptation on the method I proposed last night in the Chatterbox when you asked this question. Also remember the suggestion to read perlop, perlretut and perlre. Therein you will gain a greater understanding of the concepts at work.

    The method I proposed last night was to put your RE patterns as values in a hash, "quoted" using the qr// mechanism so that possibly the RE's could be precompiled for efficiency. My earlier suggestion was based on the idea that the keys to the hash would be the replacement value. Iterating over the hash would produce replacement/pattern pairs. But that was based on the idea that you had multiple replacement values. Now that the question is better defined, I see that there is little need for a hash, and an array is a good choice.

    That said, here's my recommendation:

    # Prebuild the RE patterns for efficiency's sake.
    my @patterns = (
        qr(pattern1),
        qr(pattern2),
        qr(pattern3),
        qr(pattern_n)
    );

    # Use paragraph mode in case your pattern spans multiple lines.
    # This may need to be combined with the /s switch on your
    # regexp. Since we don't know what the patterns look like,
    # I'll use this mechanism to be on the safe side.
    {
        local $/ = "\n\n";
        while ( my $paragraph = <FILE> ) {
            $paragraph =~ s/$_/pagingRAC/gs foreach @patterns;
            print TEMP_OUT $paragraph;
        }
    }

    My example prints the results of the substitution out to a temporary output file that presumably you open earlier in the script. And again, paragraph mode is used so that patterns that span linebreaks will work ok. But you may not need that part. Also, if your list of RE's is short, you may just join them all together into a big alternation chain. And if the file you're running through is small enough, you may just want to slurp it in, though that doesn't scale well.
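
    The "big alternation chain" variant might look like the sketch below. The two patterns, the sample text and the in-memory filehandle are assumptions for illustration only; the record separator is the same "\n\n" used above.

```perl
use strict;
use warnings;

# Hypothetical patterns; the second one can span a line break.
my @patterns = ( qr/foo\d+/, qr/start.*?end/s );

# qr// objects stringify to self-contained groups, so they can be joined.
my $combined = join '|', @patterns;
my $re       = qr/$combined/;

# In-memory stand-in for the input file (illustrative data).
my $text = "foo42 here\nstart of\nsomething end\n\nsecond paragraph foo7\n";
open(my $in, '<', \$text) or die "Can't open in-memory file: $!";

my $out = '';
{
    local $/ = "\n\n";    # read one blank-line-separated chunk at a time
    while ( my $paragraph = <$in> ) {
        $paragraph =~ s/$re/pagingRAC/g;
        $out .= $paragraph;
    }
}
close $in;
print $out;
```

    One pass over each paragraph replaces matches of any pattern, at the cost of a single larger regex.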

    Is this question related to the barrage of questions you posted a month or so ago about converting PDF files to HTML, DOC files to HTML, and so on? If so, that might provide additional context that we could relate to in giving helpful answers.

    Update: Fixed the foreach. Thanks Roy Johnson for the correction. The foreach was in my head but failed to make it to my fingertips. ;)


    Dave

      Shouldn't the substitution line be
      $paragraph =~ s/$_/pagingRAC/gs for @patterns;
      ?

      The PerlMonk tr/// Advocate
        update: deleted

        --
        TTTATCGGTCGTTATATAGATGTTTGCA

Re: Substitute array value in regular expression
by ysth (Canon) on Dec 30, 2003 at 21:14 UTC
    This is a wonderful example of a Seeker of Perl Wisdom posting his/her trial code. What's missing is example input and output that shows what is expected vs. what is produced.

    Looks like you are trying to operate on 2 lines at a time (at least that's the effect of your while(<DOC>){$_.=<DOC>). Is this correct?
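
    A small sketch of that effect, using an in-memory filehandle with made-up contents in place of DOC:

```perl
use strict;
use warnings;

# In-memory stand-in for the DOC filehandle (illustrative data).
my $text = "line1\nline2\nline3\nline4\n";
open(my $doc, '<', \$text) or die "Can't open in-memory file: $!";

my @chunks;
while (<$doc>) {
    $_ .= <$doc>;        # appends the NEXT line to the one just read
    push @chunks, $_;
}
close $doc;

# Four input lines are consumed in two passes through the loop.
print scalar(@chunks), " iterations\n";
print "[$_]" for @chunks;
```

    With an odd number of lines, the last append reads past the end of the file and warns about an uninitialized value.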

    I don't see anything wrong with the rest of the code (modulo others' suggestions of using join or qr to avoid the inner loop or recompiling the pattern each time).

    Is it possible you have special characters in @final_array that you want to match literally? If so, you need s/\Q$value\E/... or to preprocess your array (once, not in the while loop) with $_ = quotemeta($_) for @final_array or something similar.
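
    A minimal sketch of the preprocessing approach, combined with the join and qr suggestions from elsewhere in the thread. The array values and the sample line are hypothetical, chosen to contain regex metacharacters:

```perl
use strict;
use warnings;

# Hypothetical values containing regex metacharacters.
my @final_array = ('a.b', 'c+d', '(x)');

# Escape each value once, then precompile a single alternation.
my $alternation = join '|', map { quotemeta($_) } @final_array;
my $re          = qr/$alternation/;

# "axb" and "ccd" would match the unescaped patterns but must not match here.
my $line = "a.b axb c+d ccd (x) x\n";
(my $result = $line) =~ s/$re/pagingRAC/g;
print $result;
```

    Only the literal strings are replaced; "axb", "ccd" and the bare "x" are left alone.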