alienhuman has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I've got a string that looks like this:

    blah


    @QUERY
    meaningful data
    meaningful data
    ... etc...
    @ENDQUERY


    @QUERY2
    more data
    more data
    ... etc ...
    @ENDQUERY2

What I'd like to do is strip out all those blank lines before @QUERY (note I don't care about the blanks elsewhere).

As an added wrinkle, these lines can either have \r\n or \n at the end of 'em.

Your help appreciated,

AH

----------
Using perl 5.6.1 unless otherwise noted. Apache 1.3.27 unless otherwise noted. Redhat 7.1 unless otherwise noted.

Replies are listed 'Best First'.
Re: strip out lines until match
by bart (Canon) on Apr 15, 2004 at 22:30 UTC
    I'd be tempted to use
    while (<>) {
        tr/\r//d;    # get rid of the CR
        print if /^\@QUERY/ .. /^\@ENDQUERY/;
    }
    assuming '@QUERY' and '@ENDQUERY' are prefixes that appear on every delimiter line. This prints only the data lines and the delimiter lines, not the junk in between.

    The flip-flop operator, .. in scalar context, is a special one with built-in memory of its previous state. Its initial state is false. It returns true the first time the expression on its left-hand side returns true, and it stays true on every later test, up to and including the first time the expression on its right-hand side becomes true. After that it returns false again, and the cycle can start over from the top.
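
For anyone who wants to poke at the flip-flop behaviour in isolation, here is a minimal sketch. It assumes the input is already split into lines; the sub name between_markers and the sample data are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines from one starting with @QUERY
# through the next one starting with @ENDQUERY, inclusive.
sub between_markers {
    my @kept;
    for (@_) {
        # Scalar-context .. is false until the left pattern matches, then
        # true up to and including the line that matches the right pattern.
        push @kept, $_ if /^\@QUERY/ .. /^\@ENDQUERY/;
    }
    return @kept;
}

my @lines = ('blah', '', '@QUERY', 'data', '@ENDQUERY', '', 'junk',
             '@QUERY2', 'more data', '@ENDQUERY2');
print "$_\n" for between_markers(@lines);
```

With this sample, 'blah', 'junk' and the empty lines are skipped, while both marker-delimited blocks are printed, because /^\@QUERY/ also matches '@QUERY2'.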

    If you really want to do it your way, try

    while (<>) {
        tr/\r//d;
        print if /^\@QUERY/ .. eof;
    }
    in which the expression will evaluate to true from the first time '@QUERY' is encountered, up to the end of the current file.
Re: strip out lines until match
by muba (Priest) on Apr 15, 2004 at 22:10 UTC
    You say you don't care about the blanks elsewhere. So why not delete them too?

    Put the lines in an array @lines. Now:
    @lines = grep { $_ !~ /^[ \t]*\n?\r?$/ } @lines
    And all blanks will be gone. Note that this one will harm your data sections if they contain and need blank lines.
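
A small sketch of that filter on mixed line endings, in case it helps to see it run. The sub name strip_blank_lines and the sample lines are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the grep shown above.
sub strip_blank_lines {
    # Drop lines that hold only spaces/tabs followed by a line ending.
    return grep { $_ !~ /^[ \t]*\n?\r?$/ } @_;
}

# Sample mixing \n and \r\n endings; the @ is escaped because of the double quotes.
my @lines = ("blah\n", "\n", " \t\r\n", "\@QUERY\r\n", "data\n", "\r\n", "\@ENDQUERY\n");
my @kept  = strip_blank_lines(@lines);
print scalar(@kept), " of ", scalar(@lines), " lines kept\n";
```

This reports 4 of 7 lines kept: the three whitespace-only lines go, whichever line ending they carry.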

    Update: Changed regexp according to Enlil's reply.
      I am guessing you meant
      @lines = grep { $_ !~ /^\s*\n?\r?$/ } @lines
      instead of
      @lines = grep { $_ != /^\s*\n?\r?$/ } @lines
      (i.e., s/!=/!~/).

      Also, as a nitpick, \s is shorthand for the character class [\ \t\r\n\f] (reference: perlretut), so the \n? and the \r? never get to match anything: the greedy * has already consumed those characters (i.e. they are unnecessary).

      -enlil

Re: strip out lines until match
by BrowserUk (Patriarch) on Apr 15, 2004 at 23:15 UTC

    You did say this was in a string (scalar)?

    print $string;
    blah


    @QUERY
    meaningful data
    meaningful data
    ... etc...
    @ENDQUERY


    @QUERY2
    more data
    more data
    ... etc ...
    @ENDQUERY2

    $string =~ s[(?:\r?\n\s*)+(\@QUERY)][\n$1]g;

    print $string;
    blah
    @QUERY
    meaningful data
    meaningful data
    ... etc...
    @ENDQUERY
    @QUERY2
    more data
    more data
    ... etc ...
    @ENDQUERY2
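
As a self-contained sketch of the same substitution, the script below builds a sample with both \n and \r\n endings. The sub name squeeze_before_query and the sample string are made up for illustration; only the s[...][...] line comes from the post above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub squeeze_before_query {
    my ($string) = @_;
    # Collapse each run of line endings (plus any other whitespace)
    # in front of an @QUERY marker into a single "\n".
    $string =~ s[(?:\r?\n\s*)+(\@QUERY)][\n$1]g;
    return $string;
}

# Single quotes keep the markers literal; line endings are joined in
# separately so that both \n and \r\n appear in the sample.
my $string = join '',
    'blah', "\r\n", "\r\n", "\n",
    '@QUERY', "\n", 'meaningful data', "\n", "\n", '@ENDQUERY', "\n", "\n",
    '@QUERY2', "\n", 'more data', "\n", '@ENDQUERY2', "\n";

print squeeze_before_query($string);
```

The blank line inside the data section survives, since only whitespace directly in front of a marker starting with @QUERY is touched.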

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

      Yes, it's a string, so I ought to be able to do what you suggest... however that regexp doesn't seem to match my string. I threw together the following to test it:

      $string = "
      blah


      @QUERY
      meaningful data
      meaningful data
      ... etc...
      @ENDQUERY


      @QUERY2
      more data
      more data
      ... etc ...
      @ENDQUERY2";
      print "string1: $string\n\n";
      $string =~ s[(?:\r?\n\s*)+(\@QUERY)][\n$1]g;
      print "string2: $string\n\n";

      And my output is something like this. Can you help me tweak the regexp?

      string1:
      blah



      meaningful data
      meaningful data
      ... etc...




      more data
      more data
      ... etc ...


      string2:
      blah



      meaningful data
      meaningful data
      ... etc...




      more data
      more data
      ... etc ...


      Thanks,

      AH

      P.S. bart, I'm looking at ".." and it seems very powerful... I may end up using it, once I get my head around it.


        If you turned on warnings (or "use strict"), you'd see the problem.

        $string = "
        blah


        @QUERY
        meaningful data
        meaningful data
        ... etc...
        @ENDQUERY


        @QUERY2
        more data
        more data
        ... etc ...
        @ENDQUERY2";
        Possible unintended interpolation of @QUERY in string at (eval 1) line 1, <> line 16.
        Possible unintended interpolation of @ENDQUERY in string at (eval 1) line 1, <> line 16.
        Possible unintended interpolation of @QUERY2 in string at (eval 1) line 1, <> line 16.
        Possible unintended interpolation of @ENDQUERY2 in string at (eval 1) line 1, <> line 16.

        Replace the "s with 's when you initialise $string and you will see the correct result.
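
To make the quoting difference concrete, here is a short sketch with made-up variable names. The block that switches off strict and warnings is there only so the deliberately wrong double-quoted line compiles; it is not a recommendation.

```perl
use strict;
use warnings;

# Single quotes: no interpolation, so the marker survives as typed.
my $single = 'blah @QUERY data';

# Double quotes: @QUERY is treated as an (empty, undeclared) array.
my $double;
{
    no strict 'vars';
    no warnings;
    $double = "blah @QUERY data";
}

# Double quotes are fine once the @ is escaped.
my $escaped = "blah \@QUERY data";

print "single:  [$single]\n";
print "double:  [$double]\n";
print "escaped: [$escaped]\n";
```

In the unescaped double-quoted version the marker is gone before any regex sees the string, which is why the substitution had nothing to match.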

        Also, as Enlil points out elsewhere in the thread, my regex was more complex than necessary. It can be replaced with s[\s+(\n\@QUERY)][$1]g;.


Re: strip out lines until match
by ccn (Vicar) on Apr 15, 2004 at 22:40 UTC
    $string =~ s/\A\s+\@QUERY/\@QUERY/;
    or
    $string = substr($string, index($string, '@QUERY'));
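
One caveat with the substr/index version: index returns -1 when the marker is missing, and substr($string, -1) would then return only the last character. A guarded sketch follows; the sub name from_first_query is hypothetical. Note that this approach discards everything before the first marker, including non-blank text such as 'blah'.

```perl
use strict;
use warnings;

# Hypothetical helper: drop everything before the first @QUERY marker.
sub from_first_query {
    my ($string) = @_;
    my $pos = index($string, '@QUERY');
    # Leave the string untouched when the marker is absent.
    return $pos < 0 ? $string : substr($string, $pos);
}

print from_first_query("\n\nblah\n\n\@QUERY\ndata\n\@ENDQUERY\n");
```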