
Thanks for the answer, but I now have to make a small change.
I only want to extract the lines on condition that the
line directly above starts and ends with the
following 5 characters: "xyzdf".
Basically, I'm looking for a two-line match.
Thanks.
Re^4: looking for speed!! large file search and extract
by Tanktalus (Canon) on Jan 12, 2005 at 17:07 UTC

    A couple minor points (and maybe I'm a bit too new to PM to make the comments):

    1. You probably should have this in a new question, not a reply on the previous question, since it's now a new question.
    2. You probably should try something yourself, and then come back if it doesn't work. Or even if it does - share your answer and get feedback on it.
    One WTDI is to use a simplified state machine:
    my $match;
    while (<FH>) {
        # print only if the previous line matched the marker
        print C $_ if $match and /^abcde.*PARTNAME$/;
        # remember whether this line starts and ends with xyzdf
        $match = /^xyzdf.*xyzdf$/;
    }
    This sets $match to true if the current line starts and ends with xyzdf, false otherwise. The next time through the loop, we only check your second-line regexp if $match is already true (that is, the previous line matched the other regexp).
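
To make the state-machine idea concrete, here is a minimal sketch of it as a standalone subroutine. It assumes the stricter rule from the question (the previous line must both start and end with xyzdf); the name extract_following, the $armed flag, and the sample lines are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line that matches $want and whose
# immediately preceding line matched $marker.
sub extract_following {
    my ($marker, $want, @lines) = @_;
    my $armed = 0;    # true when the previous line matched $marker
    my @out;
    for my $line (@lines) {
        push @out, $line if $armed && $line =~ $want;
        $armed = $line =~ $marker ? 1 : 0;    # state for the next line
    }
    return @out;
}

# Illustrative data only
my @lines = (
    'xyzdf111xyzdf',
    'abcde first PARTNAME',
    'abcde no marker above PARTNAME',
    'xyzdf222',
    'abcde marker incomplete PARTNAME',
);

# prints: abcde first PARTNAME
print "$_\n"
    for extract_following(qr/^xyzdf.*xyzdf$/, qr/^abcde.*PARTNAME$/, @lines);
```

Only one flag has to be carried between lines, so this still reads the input once and never needs more than the current line in memory, which matters for the large files this thread is about.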

      It's not uncommon for people to modify their requirements. I think it belongs in the same thread as long as it's pretty close to the original problem.

      Caution: Contents may have been coded under pressure.
Re^4: looking for speed!! large file search and extract
by kutsu (Priest) on Jan 12, 2005 at 17:08 UTC

    Then change holli's command-line statement from
    perl -n -e "print if /^abcde/ && /PARTNAME$/" c:\somefile.txt>k:\1\somefile.txt
    to
    perl -n -e "print if /^xyzdf/ && /xyzdf$/" c:\somefile.txt>k:\1\somefile.txt

    If you don't understand this, I really recommend you read perlre.

    Update: Somehow I missed reading "line directly above", so ignore the rest... except for reading perlre; that's always a good idea if you haven't.

    "Cogito cogito ergo cogito sum - I think that I think, therefore I think that I am." Ambrose Bierce

Re^4: looking for speed!! large file search and extract
by holli (Abbot) on Jan 12, 2005 at 19:19 UTC
    if i get your comment right, this could be:
    c:\> perl -n -e "print $last, $_ if /xyzdf$/ && $last; $last= /^xyzdf +/ ? $_ : ''" file1>file2
    Assuming file1 looks like
    abc
    xyzdf def
    hij xyzdf
    klm
    xyzdf nop
    qrs xyzdf
    file2 will end up as
    xyzdf def
    hij xyzdf
    xyzdf nop
    qrs xyzdf
    Is that what you want?

    Update:
    if not, post some sample data and the desired output.
      Thanks, but that's not what I'm after.
      Input file

      xyzdfhhlhlljjlxyzdf
      PARTNAMEhjjhhjhjkjkjkjkjPARTNAME
      hjill''';
      hgkjlklj
      xyzdfhhlhll666666jlxyzdf
      PARTNAMEhjjh88888888888jkjkjkjkjPARTNAME
      xyzdfh
      PARTNAMEh_not_to_be_extracted_jkjkjkjPARTNAME
      ghghjhj
      jlkjpkj
      xyzdfhhlh888888888ljjlxyzdf
      PARTNAMEhjjh8888iiiiiiiiiiiii888jkjkjkjkjPARTNAME

      Output file

      PARTNAMEhjjhhjhjkjkjkjkjPARTNAME
      PARTNAMEhjjh88888888888jkjkjkjkjPARTNAME
      PARTNAMEhjjh8888iiiiiiiiiiiii888jkjkjkjkjPARTNAME

      Only 3 lines are extracted, because for the remaining PARTNAME line
      the condition on the line above is not met.
        c:\> perl -n -e "print if /^PARTNAME/ && /PARTNAME$/ && $last; $last = /^xyzdf/ && /xyzdf$/ ? $_ : ''" file1>file2
        # file2:
        #PARTNAMEhjjhhjhjkjkjkjkjPARTNAME
        #PARTNAMEhjjh88888888888jkjkjkjkjPARTNAME
        #PARTNAMEhjjh8888iiiiiiiiiiiii888jkjkjkjkjPARTNAME
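        Update: This will also match a pair of lines like the following, where the line above is just a single "xyzdf" (so the same five characters count as both start and end):
        xyzdf
        PARTNAMEXXXXPARTNAME
        If that is unwanted, you should use: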
        c:\> perl -n -e "print if /^PARTNAME/ && /PARTNAME$/ && $last; $last = /^xyzdf/ && /.xyzdf$/ ? $_ : ''" file1>file2
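
For anyone who would rather have this as a script than a one-liner, the following is a rough sketch of the same logic, including the /.xyzdf$/ variant. The subroutine name filter_partname and the in-memory demo data are assumptions for illustration; in real use you would pass filehandles opened on your own input and output files.

```perl
use strict;
use warnings;

# Hypothetical helper: copies from $in to $out every line that starts and
# ends with PARTNAME, provided the line directly above it starts with
# "xyzdf" and ends with a second, separate "xyzdf".
sub filter_partname {
    my ($in, $out) = @_;
    my $prev_ok = 0;    # did the previous line satisfy the xyzdf condition?
    while (my $line = <$in>) {
        print {$out} $line
            if $prev_ok && $line =~ /^PARTNAME/ && $line =~ /PARTNAME$/;
        # /.xyzdf$/ needs at least one character before the trailing
        # "xyzdf", so a line that is only "xyzdf" does not qualify
        $prev_ok = ($line =~ /^xyzdf/ && $line =~ /.xyzdf$/) ? 1 : 0;
    }
    return;
}

# Demo on in-memory filehandles (illustrative data only)
my $input = join "\n",
    'xyzdf', 'PARTNAMEXXXXPARTNAME',
    'xyzdfabcxyzdf', 'PARTNAMEYYYYPARTNAME', '';
open my $in, '<', \$input or die "cannot open in-memory input: $!";
my $result = '';
open my $out, '>', \$result or die "cannot open in-memory output: $!";
filter_partname($in, $out);
close $out;
print $result;    # prints: PARTNAMEYYYYPARTNAME
```

Reading line by line keeps memory use flat regardless of file size, same as the -n one-liner.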