in reply to search pattern and arrays

I think this is what you are after, although I may be misreading your post. If I understand correctly, you search for your phrases in a certain order and you want to find the first occurrence of each phrase, ignoring any phrases that are out of sequence, e.g. a "phrase2" that comes before the first "phrase1" is ignored.

To do this I read the lines into an array. If your file is very large this might not be feasible. The adjustment to $lineNo is because arrays are zero-based and I'm assuming you number your lines from 1.

use strict;
use warnings;

use List::Util q{first};
use Data::Dumper;

open my $inFH, q{<}, \ <<EOD or die qq{open: $!\n};
1:gash line
2:phrase4
3:akjdakj
4:fwefkwe
5:phrase5
6:phrase1
7:adsfwfw
8:phrase3
9:phrase5
10:jkjd wsekjw wiu
11:phrase2
12:wewefwefwf
13:another line
14:dsjwjk
15:adsfwfw
16:phrase3
17:another line
18:adsfwfw
19:phrase5
20:phrase6
21:ertgerher
EOD

my @lines = <$inFH>;
close $inFH or die qq{close: $!\n};

my @phrases = map { qq{phrase$_} } 1 .. 6;
my $cumulativeOffset = 0;

foreach my $phrase ( @phrases )
{
    my $rxPhrase = qr{$phrase};
    my $lineNo = first { $lines[ $_ ] =~ $rxPhrase } 0 .. $#lines;
    unless ( defined $lineNo )
    {
        print qq{$phrase: not found in sequence\n};
        next;
    }
    $lineNo ++;
    $cumulativeOffset += $lineNo;
    print qq{$phrase: $cumulativeOffset\n};
    splice @lines, 0, $lineNo;
}

The output.

phrase1: 6
phrase2: 11
phrase3: 16
phrase4: not found in sequence
phrase5: 19
phrase6: 20
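If the file is too large to read into an array, one possible variant reads it a line at a time instead. The sketch below is only an illustration: the name find_in_sequence is hypothetical, it assumes the filehandle is seekable (a regular file or an in-memory file), and it uses tell and seek to rewind when a phrase is not found in sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: for each phrase, in order, return [ phrase, line number ],
# with an undefined line number when the phrase is not found in sequence.
# Assumes $fh is seekable.
sub find_in_sequence {
    my ( $fh, @phrases ) = @_;
    my @results;
    my $lineNo = 0;    # number of the last line consumed so far

    foreach my $phrase ( @phrases ) {
        my $startPos  = tell $fh;    # where to rewind to on failure
        my $startLine = $lineNo;
        my $foundAt;

        while ( defined( my $line = <$fh> ) ) {
            $lineNo ++;
            if ( index( $line, $phrase ) >= 0 ) {    # literal substring match
                $foundAt = $lineNo;
                last;
            }
        }

        unless ( defined $foundAt ) {
            # Ran off the end: go back so the next phrase starts
            # from the same place this one did.
            seek $fh, $startPos, 0 or die qq{seek: $!\n};
            $lineNo = $startLine;
        }
        push @results, [ $phrase, $foundAt ];
    }
    return @results;
}

# Small in-memory example (made-up data, not the original poster's file).
my $data = join qq{\n}, qw{ alpha phrase2 phrase1 beta phrase2 };
open my $inFH, q{<}, \ $data or die qq{open: $!\n};
foreach my $result ( find_in_sequence( $inFH, qw{ phrase1 phrase2 phrase3 } ) ) {
    my ( $phrase, $lineNo ) = @$result;
    my $where = defined( $lineNo ) ? $lineNo : q{not found in sequence};
    print qq{$phrase: $where\n};
}
close $inFH or die qq{close: $!\n};
```

Only one line is held in memory at a time, at the cost of re-reading part of the file whenever a phrase is missing.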

I hope I have guessed right and this is of use.

Cheers,

JohnGG

Re^2: search pattern and arrays
by Anonymous Monk on Jan 23, 2008 at 21:30 UTC
    That's great John, thanks a lot.
    You almost got my question,
    but here are a few elaborations on my posting:

    1. The file should not be inside the program; it is read from a given location, as in my program.

    2. The file doesn't contain the line numbers.

    3. We don't know the exact number of arguments that will be passed (1..6 in your program).

    4. Phrase 1, Phrase 2... was just an example; it could be anything. For example:

    total Laptops produced: 60
    total mice produced: 40
    total cpu sold : 57
    total printers produced: 98
    total monitors produced: 10

    .......................

    Phrases like these could be present any number of times in the file, but our function should search in the order the phrases are passed.

    Actually the phrase passed will be like, for example, "total cpu sold :", but it should return the whole line, because the value 57 is important.

    To give you a clear idea,
    the file "test.txt" contains:

    total Laptops produced: 60
    total cpu sold : 57
    total mice produced: 40
    total cpu sold : 45
    total Laptops produced: 68
    total mice produced: 48
    total cpu sold : 51
    total printers produced: 19
    total monitors produced: 149

    -------

    For example, this is given:


    $a= "total Laptops produced:";
    $b ="total mice produced:";
    $c = "total cpu produced:";

    &search_phrase($filename, $a, $b, $c);

    This function should return 45.

    I thank you again for your support.
      To take your points in order:

      1. I put the file inside the script to keep everything together. Another way of doing this would be to place the data at the end of the script after a __END__ or __DATA__ tag and read the DATA filehandle that the interpreter opens for you. However, I wanted to show you how to use the three-argument form of open which is considered best practice these days. Just substitute your variable containing your file to be read for the \ <<EOD ...

      2. I just put the line numbers in the data to show that the script was giving the "right" answers. Having them there did not affect how the script ran.

      3. Put something like my ( $file, @phrases ) = @ARGV; at the top of your script so that you don't have to worry how many phrases are being sought.

      4. If you are calling your script from the command line with a file and a series of phrases then I imagine you will enclose each phrase in single-quotes. To avoid the problem where the phrase might contain regex metacharacters change the line compiling the regex to my $rxPhrase = qr{\Q$phrase\E};
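Points 1, 3 and 4 above might combine roughly as in the sketch below. It is an illustration only: the helper name phrase_to_regex and the per-phrase count it prints are hypothetical, and the script does nothing when run without arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a literal phrase into a regex, with \Q...\E
# escaping any regex metacharacters the phrase contains.
sub phrase_to_regex {
    my ( $phrase ) = @_;
    return qr{\Q$phrase\E};
}

# The file name comes first, then however many phrases were supplied.
my ( $file, @phrases ) = @ARGV;

if ( defined $file ) {
    # Three-argument open on the named file instead of the here-document.
    open my $inFH, q{<}, $file or die qq{open: $file: $!\n};
    my @lines = <$inFH>;
    close $inFH or die qq{close: $file: $!\n};

    foreach my $phrase ( @phrases ) {
        my $rxPhrase = phrase_to_regex( $phrase );
        my $count    = grep { $_ =~ $rxPhrase } @lines;
        print qq{$phrase: $count matching line(s)\n};
    }
}
```

Saved under a name of your choosing, it would be called with the file first and each phrase in single quotes, e.g. 'total cpu sold :'. Without the \Q...\E, a phrase containing characters such as ( ) + or ? would be treated as a pattern rather than as literal text.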

      I'm not sure how you arrive at an answer of 45; did you mean to say $c = "total cpu sold:"?

      Cheers,

      JohnGG

        Yes John,


        total cpu sold: 45

        I need that 45.
        Even though that phrase was present before, we should ignore it and follow the order in which the phrases are passed as arguments, and return the last phrase's value (i.e. 45).


        If the last phrase was not found in that file then the last but one phrase; if that is not found, the previous one, and so on.

        That is why the order in which we search is very important.

        If you still have questions, I can explain further.

        thanks,
        Mercury.
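
The behaviour described in this subthread could be sketched as follows. This is a guess at the requirements rather than a definitive implementation: search_phrase here returns the whole line matched by the last phrase found in sequence, a phrase that is absent or out of sequence is skipped and the search for the next phrase continues from the same place, and the third phrase is written as "total cpu sold :" to match the file as posted. A reference to a string stands in for test.txt so the example is self-contained; a real file name can be passed instead.

```perl
use strict;
use warnings;

use List::Util q{first};

# Sketch of search_phrase: returns the line matched by the last phrase that
# could be found in sequence, or undef if no phrase was found at all.
sub search_phrase {
    my ( $file, @phrases ) = @_;

    open my $inFH, q{<}, $file or die qq{open: $!\n};
    chomp( my @lines = <$inFH> );
    close $inFH or die qq{close: $!\n};

    my $start = 0;    # index of the first line still eligible to match
    my $lastMatch;    # line matched by the most recent phrase found

    foreach my $phrase ( @phrases ) {
        my $rxPhrase = qr{\Q$phrase\E};
        my $idx = first { $lines[ $_ ] =~ $rxPhrase } $start .. $#lines;
        next unless defined $idx;    # absent or out of sequence: skip it
        $lastMatch = $lines[ $idx ];
        $start     = $idx + 1;
    }
    return $lastMatch;
}

# Stand-in for test.txt: opening a reference to a string gives an in-memory file.
my $testData = join qq{\n},
    q{total Laptops produced: 60},
    q{total cpu sold : 57},
    q{total mice produced: 40},
    q{total cpu sold : 45},
    q{total Laptops produced: 68},
    q{total mice produced: 48},
    q{total cpu sold : 51},
    q{total printers produced: 19},
    q{total monitors produced: 149};

my $line = search_phrase(
    \ $testData,
    q{total Laptops produced:},
    q{total mice produced:},
    q{total cpu sold :},
);
my ( $value ) = $line =~ m{(\d+)\s*\z};    # trailing number on the line
print qq{$value\n};                        # prints 45
```

Because $lastMatch is only overwritten when a phrase is found, a missing final phrase falls back to the previous phrase that was found, and so on.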