Match the starting \>

by sandy1028 (Sexton)
on May 15, 2009 at 05:14 UTC ( [id://764185] : perlquestion )

sandy1028 has asked for the wisdom of the Perl Monks concerning the following question:

Can anyone please tell me how to check whether $lines matches \> or \< in a single statement?

    if($lines =~ m/\>/){
        $lines =~ s/^\>\s//gi;
    }
    if($lines =~ m/\</){
        $lines =~ s/^\<\s//gi;
    }

Replies are listed 'Best First'.
Re: Match the starting \>
by CountZero (Bishop) on May 15, 2009 at 06:19 UTC
    There is no reason to escape < or >. They are not special in a regex.

    This is the list of the so-called meta-characters: {}[]()^$.|*+?\

    If you want to match one of those you must escape it.
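    A minimal sketch of the point above, assuming a made-up sample line and a made-up literal string (neither comes from the original post). It shows that escaping > changes nothing in Perl, while a real metacharacter does need escaping, which quotemeta can do for you:

```perl
use strict;
use warnings;

# Hypothetical sample line, not taken from the original post.
my $line = "> quoted text";

# In Perl an escaped and an unescaped '>' match the same literal character.
my $escaped   = ($line =~ /^\>/) ? 1 : 0;
my $unescaped = ($line =~ /^>/)  ? 1 : 0;
printf "escaped: %d, unescaped: %d\n", $escaped, $unescaped;

# Real metacharacters such as '.' and '+' must be escaped to match literally;
# quotemeta (or \Q...\E inside a pattern) does that for an arbitrary string.
my $literal = "1.5+2";
my $pattern = quotemeta $literal;

my $matches_itself = ($literal =~ /\A$pattern\z/) ? 1 : 0;
my $matches_other  = ("1x5+2"  =~ /\A$pattern\z/) ? 1 : 0;
printf "pattern: %s, matches itself: %d, matches 1x5+2: %d\n",
    $pattern, $matches_itself, $matches_other;
```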


    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      They are not special in a regex.

      That is true for Perl, but some regex engines (GNU) use \< and \> as word boundary anchors, where \< is beginning of word and \> is end of word, whereas Perl only has \b. The argument against \b is that it is easily confused with a backspace character, and that it does not differentiate between start and end of word.

      So, using \< and \> could be doubly confusing to anyone coming from other RE regimes. Matching a string starting with a word-ending?
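      For anyone who wants the start-of-word and end-of-word distinction in Perl, one possible approach (a sketch, not a built-in feature) is to combine \b with a lookaround. The variable names and sample text below are invented for illustration:

```perl
use strict;
use warnings;

# Assumed stand-ins for GNU's \< and \>, built from \b plus a lookaround.
my $word_start = qr/\b(?=\w)/;    # boundary followed by a word character
my $word_end   = qr/\b(?<=\w)/;   # boundary preceded by a word character

my $text = "cat concat bobcat cats";   # hypothetical sample text

# In list context, /g with no capture groups returns every matched string.
my @starts = $text =~ /${word_start}cat/g;   # "cat" beginning a word
my @ends   = $text =~ /cat${word_end}/g;     # "cat" ending a word

printf "%d word-initial, %d word-final\n", scalar @starts, scalar @ends;
```

      Here "cat" and "cats" count as word-initial, while "cat", "concat" and "bobcat" count as word-final, which a bare \b cannot tell apart on its own.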
Re: Match the starting \>
by almut (Canon) on May 15, 2009 at 05:45 UTC

    Not entirely sure I understand, but maybe you want a character class [...], i.e.

    for ("< foo\n", "> bar\n") {
        my $lines = $_;
        if ($lines =~ m/[<>]/) {
            $lines =~ s/^[<>]\s//;
            # do something else...
        }
        print $lines;
    }
    __END__
    foo
    bar

    Update: as grizzley suggested, you could also use /<|>/ (alternation). While that would even be one char less to type in your match, it would become slightly more unwieldy in case the regex needs to contain other stuff, like in the substitution:

    $lines =~ s/^(?:<|>)\s//;

    (the (?:...) groups without capturing)
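    To illustrate why the grouping matters once the pattern contains more than the alternation itself, here is a small sketch with invented input lines. The third variant deliberately leaves the group out to show how the pattern is then parsed:

```perl
use strict;
use warnings;

# Hypothetical input lines; the third has no leading marker.
my @input = ("< foo", "> bar", "baz > qux");

my (@class, @alt, @loose);
for my $line (@input) {
    (my $c = $line) =~ s/^[<>]\s//;      # character class
    (my $a = $line) =~ s/^(?:<|>)\s//;   # grouped alternation, same meaning
    (my $l = $line) =~ s/^<|>\s//;       # ungrouped: parsed as (^<) or (>\s)
    push @class, $c;
    push @alt,   $a;
    push @loose, $l;
    printf "class=[%s] grouped=[%s] ungrouped=[%s]\n", $c, $a, $l;
}
```

    The first two forms behave identically. Without the group, the anchor binds only to the first alternative, so the substitution can leave the whitespace behind or remove a ">" in the middle of a line.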

Re: Match the starting \>
by grizzley (Chaplain) on May 15, 2009 at 05:48 UTC
    Just if($lines =~ m/<|>/)
Re: Match the starting \>
by ww (Archbishop) on May 15, 2009 at 11:16 UTC

    Given lack of clarity in OP and the possibility that the OP is trying to convey the notion of word boundaries, it seems fair to ask:

    sandy1028: do you mean text which actually includes literal instances of backslash followed by a greater_than or less_than sign? (and what sort of textual material uses such a notation?)

    ...or did you use well-intentioned (but incorrect, see CountZero's and cdarke's notes above) code notation in the title and text?
    ...or do you mean something else entirely?

    Note that most of the regexen assume OP unnecessarily backwhacked < and > in confusion over Perl's regex syntax; if the text does indeed include sequences like those shown in OP and greater_than or less_than symbols which are not preceded by backslashes, you will have to modify the regexen to:

    • use alternate delimiters (such as m!/[<>]!)
    • escape the backslashes.
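    If the text really does contain a backslash in front of the sign, one way to write the match is sketched below. This assumes input lines of my own invention; whether the OP's data looks like this is exactly the open question:

```perl
use strict;
use warnings;

# Hypothetical input in which the backslash is part of the text.
# Single quotes keep it literal, so '\>' is two characters here.
my @lines = ('\> quoted', '\< other', '> plain');

my @stripped;
for my $line (@lines) {
    # \\ matches one literal backslash; [<>] matches either sign.
    (my $copy = $line) =~ s/^\\[<>]\s//;
    push @stripped, $copy;
    printf "%s -> %s\n", $line, $copy;
}
```

    The last line is left alone because its ">" has no backslash in front of it.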

    and, just BTW, since I don't see it mentioned elsewhere (...cleans coke bottle lenses), your /i modifier is useless: case insensitivity is not a relevant concept for the chars you discuss.

Re: Match the starting \>
by Crian (Curate) on May 15, 2009 at 08:47 UTC

    You do not have to test for the presence of < and > before deleting them.

    if (/</) { s/<//; }

    does nothing different to

    s/<//;

      ...does nothing different

      it could do something different, though, if you stick more closely to what the original code was:

      if (/</) {
          s/^<\s//;   # remove "< " at beginning of line
          # something with lines that do/did contain a '<' anywhere
      }

      (not sure whether that is what the OP intended, though...)
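      Both readings can be sketched side by side. The input lines and the @had_lt array are invented stand-ins; the push merely represents whatever extra work the OP might want to do:

```perl
use strict;
use warnings;

my @input = ("< foo", "bar < baz", "plain");   # hypothetical input

# Unguarded: s/// leaves the string alone (and returns false) on a non-match,
# so no separate test is needed just to strip the leading marker.
my @unguarded = @input;
s/^<\s// for @unguarded;

# Guarded: the test only matters because extra work depends on it.
my @guarded = @input;
my @had_lt;
for my $line (@guarded) {
    if ($line =~ /</) {
        $line =~ s/^<\s//;      # remove "< " at beginning of line
        push @had_lt, $line;    # stand-in for work on lines containing '<'
    }
}

print join(", ", @unguarded), "\n";
print join(", ", @guarded),   "\n";
print join(", ", @had_lt),    "\n";
```

      The stripped lines come out the same either way; only the side work on lines containing "<" anywhere depends on the test.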