venkat1312 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I would like to know how to grep the first occurrence of a string above (upwards from) a reference string.

For example, in the set below, I would like to get the first occurrence of "AAA" above "XXX".

AAA BBB AAA CCC XXX
Thanks,
Venkat

Replies are listed 'Best First'.
Re: Grep the first occurrence of a string above a reference string
by Discipulus (Canon) on May 05, 2016 at 07:18 UTC
    Welcome to the monastery, venkat1312.

    Have you tried anything so far? The answer heavily depends on the rules of the pattern and the composition of the string.

    But something fairly simple like this may suffice (pay attention to the [^AX] part: it may need to be more elaborate).

    # warning: win32 double-quoted one-liner
    perl -e " print $1 if $ARGV[0]=~/(AAA)[^AX].+XXX/" ABBBAAACCCXXX
    AAA

    which, as explained by YAPE::Regex::Explain, becomes:

    perl -MYAPE::Regex::Explain -e " print YAPE::Regex::Explain->new(qr/(AAA)[^AX].+XXX/)->explain(); "

    The regular expression:

    (?-imsx:(AAA)[^AX].+XXX)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        AAA                      'AAA'
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      [^AX]                    any character except: 'A', 'X'
    ----------------------------------------------------------------------
      .+                       any character except \n (1 or more times
                               (matching the most amount possible))
    ----------------------------------------------------------------------
      XXX                      'XXX'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
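
    Since the right regex depends so much on the shape of the data, here is a minimal sketch of a regex-free alternative for the single-line case. It assumes both strings are plain literals and that "nearest" means the last AAA that ends before the first XXX. The helper name nearest_before is made up for illustration; index, rindex and substr are Perl built-ins.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the 0-based offset of the
# last $sought that lies entirely before the first $reference, or -1 if none.
sub nearest_before {
    my ($string, $sought, $reference) = @_;

    my $ref_pos = index($string, $reference);   # offset of the first reference
    return -1 if $ref_pos < 0;                   # no reference string at all

    # Search backwards, but only in the text in front of the reference string.
    return rindex(substr($string, 0, $ref_pos), $sought);
}

my $line = 'AAA BBB AAA CCC XXX';
my $pos  = nearest_before($line, 'AAA', 'XXX');
print "nearest AAA before XXX starts at offset $pos\n";   # offset 8
```

    The offset tells you which AAA was hit, something $1 alone cannot, because the captured text is always just "AAA".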

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Grep the first occurrence of a string above a reference string
by ww (Archbishop) on May 05, 2016 at 11:46 UTC

    Your example suggests that the entire search is to be performed on data appearing once in a single line. If so, you have a fine suggestion above.

    But if your problem case involves data that is multi-line with more than one instance of each target and reference datum (or similarly, if the data is all on a single line but may contain repetitions of the target and reference data) -- perhaps like this:

    YYY AAA BBB CCC AAA freida XXX GGG FFF AAA. XXX ozymandis BBB...

    ... then you'll need to inspect each line for your initial string and save it (and perhaps its index, which seems likely to be useful given how you've stated your question) or a reference to it, before continuing to look for your "reference data", discarding any intervening data.

    Then process the saved hit (AAA) before continuing the search for additional instances of AAA before XXX.
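
    The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the data is already in an array with one item per line, both strings are treated as literals (hence \Q...\E), and the helper name hits_above and its return layout are invented here.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread). For every line containing
# $reference, report the closest preceding line that contained $sought.
# Returns a list of [sought_line_number, sought_text, reference_line_number].
sub hits_above {
    my ($lines, $sought, $reference) = @_;
    my (@hits, $saved);              # $saved is [line number, text] or undef
    my $line_no = 0;

    for my $line (@$lines) {
        $line_no++;

        # Save the hit and its index; a later hit replaces an earlier one.
        $saved = [$line_no, $line] if $line =~ /\Q$sought\E/;

        if ($line =~ /\Q$reference\E/) {
            push @hits, [@$saved, $line_no] if $saved;   # process the saved hit
            undef $saved;                                # then search on
        }
    }
    return @hits;
}

my @data = qw(YYY AAA BBB CCC AAA freida XXX GGG FFF AAA XXX ozymandis BBB);
for my $hit (hits_above(\@data, 'AAA', 'XXX')) {
    my ($found_at, $text, $ref_at) = @$hit;
    print "line $found_at ($text) is the nearest hit above line $ref_at\n";
}
```

    Keeping the line number alongside the text makes it easy to tell identical hits apart later.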


    $anecdote ne $data

    And please note: we expect SOPW to seek wisdom, not to ask us to do so for them.

      Your example suggests that the entire search is to be performed on data appearing once in a single line.

      This is due to missing <code> tags. As evidenced by the HTML source code, the OP is as follows:


      Hi,
      
      I would like to know how to grep the first occurrence of a string above (upwards from) a reference string.
      
      For example, in the set below, I would like to get the first occurrence of "AAA" above "XXX".
      
      AAA
      BBB
      AAA
      CCC
      XXX
      
      Thanks,
      Venkat
      
      


      To the OP - if you rephrase your problem, you have the solution almost there:

      I want to keep track of a pattern I am looking for, and output the line it matched once I find a reference string

      which is what ww did for you above already. Hence a possible solution is

      #!/usr/bin/perl
      # file grep.pl
      my $sought    = shift;   # from @ARGV
      my $reference = shift;
      $sought && $reference
          or die "usage: $0 soughtstring refstring files\n";

      my $found;
      while (<>) {
          chomp;               # strip newline
          /$sought/ and $found = $_;
          # or, if you want the very first occurrence, don't
          # overwrite the variable (see perlop for '||='):
          # /$sought/ and $found ||= $_;
          if (/$reference/) {
              print "$ARGV: '$found'\n" if $found;
              $found = '';
          }
      }

      to be used as

      $ perl grep.pl AAA XXX example.txt

      which, invoked upon this example.txt

      1 YYY
      2 first AAA
      3 BBB
      4 CCC
      5 second AAA
      6 freida
      7 XXX
      8 GGG
      9 FFF
      10 third AAA
      11 XXX
      12 ozymandis
      13 BBB
      14 blorflydick
      15 XXX
      16 fourth AAA

      produces this:

      example.txt: ' 5 second AAA'
      example.txt: ' 10 third AAA'

      Lines 2 and 16 are not printed. If you use the commented alternative, line 2 is printed instead of line 5.
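
      To see the difference between '=' and '||=' side by side, here is a small sketch that runs both variants over the same lines in memory. The helper name reported and its $keep_first switch are invented for this illustration and are not part of grep.pl; the sample lines are the ones above without their line numbers.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of grep.pl): returns the line reported for
# each reference line. A true $keep_first uses '||=' instead of plain '='.
sub reported {
    my ($keep_first, $sought, $reference, @lines) = @_;
    my (@out, $found);

    for (@lines) {
        if (/$sought/) {
            if ($keep_first) { $found ||= $_ }   # assign only while $found is false
            else             { $found   = $_ }   # always overwrite with the latest hit
        }
        if (/$reference/) {
            push @out, $found if $found;
            $found = '';                         # reset before the next block
        }
    }
    return @out;
}

my @lines = ('YYY', 'first AAA', 'BBB', 'CCC', 'second AAA', 'freida', 'XXX',
             'GGG', 'FFF', 'third AAA', 'XXX', 'ozymandis', 'BBB',
             'blorflydick', 'XXX', 'fourth AAA');

print join(', ', reported(0, 'AAA', 'XXX', @lines)), "\n";  # second AAA, third AAA
print join(', ', reported(1, 'AAA', 'XXX', @lines)), "\n";  # first AAA, third AAA
```

      The third XXX reports nothing in either mode, because $found was reset after the second XXX and no AAA appears in between.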

      perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'