in reply to Phone Number Regular Expression

/800-555-(?!9999)[0-9]{4}/ See also Using Look-ahead and Look-behind.
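
For anyone following along, here is a minimal sketch of how the negative look-ahead in that pattern behaves. The sample numbers are made up for illustration, and the pattern is unanchored, so it will also match inside longer strings.

```perl
use strict;
use warnings;

# (?!9999) is a zero-width negative look-ahead: the match fails if the next
# four characters are "9999", but the look-ahead itself consumes nothing.
my $re = qr/800-555-(?!9999)[0-9]{4}/;

# Hypothetical sample numbers, not taken from the thread.
my @samples = ('800-555-1234', '800-555-9999', '800-555-9998');

my @results;
for my $s (@samples) {
    push @results, ($s =~ $re ? 1 : 0);
}

for my $i (0 .. $#samples) {
    print "$samples[$i]: ", ($results[$i] ? "match" : "no match"), "\n";
}
```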

Caution: Contents may have been coded under pressure.

Re^2: Phone Number Regular Expression
by lev36 (Sexton) on Mar 16, 2006 at 17:41 UTC

    Thanks, Roy, that's most helpful!

    Of course, it's not quite as simple as all that - I want to substitute a different area code, and make sure to capture things that are formatted slightly differently. So I tested out this substitution, and I'm having a little trouble with the scalars:

    $_ = "800 555-9998"; s/800( |-|.)555-(?!9999)([0-9]{4})/888$1555-$2/; print $_;

    This doesn't seem to work, when I try to include $1 (the separator), though the second scalar comes through fine. I'm guessing I need something to set off $1 from '555'?

    Apologies if I'm missing some basic syntax rules; I'm fairly new to perl's regexp lingo.

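      s/.../888${1}555-$2/ will do it for you. Without the braces, Perl reads $1555 as a single variable name (capture group 1555), which is empty. The braces around the variable name, specifically, are what you're looking for.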

      Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
      How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

        Righteous! That should give me all the building blocks I need for my script. Thanks again to all!

        And thanks also to whoever gave the thread a more useful name...

      Unrelated to the immediate problem, but you probably want ( |-|\.) instead of ( |-|.) near the start of your match, since the dot has special meaning within a regex and will match almost any character (see perlre).

      This means that currently that part of your regex is equivalent to just (.), and will happily match something like 8003555-1234.
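
A quick sketch of the difference, in case it helps. The variable names and the character-class variant are illustrative additions, not something from the original post.

```perl
use strict;
use warnings;

my $loose  = qr/800( |-|.)555-/;     # unescaped dot: matches any character except newline
my $strict = qr/800( |-|\.)555-/;    # escaped dot: matches only a literal "."
my $class  = qr/800([ .-])555-/;     # same idea as $strict, written as a character class

my $bad = '8003555-1234';            # a "3" where the separator should be

my $loose_hit  = $bad =~ $loose  ? 1 : 0;
my $strict_hit = $bad =~ $strict ? 1 : 0;
my $class_hit  = '800.555-1234' =~ $class ? 1 : 0;

print "unescaped dot matches '$bad': $loose_hit\n";
print "escaped dot matches '$bad':   $strict_hit\n";
print "class matches '800.555-1234': $class_hit\n";
```

Inside a character class the dot is literal, and a hyphen placed last is literal too, so no escaping is needed there.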

      You can set off variables from their surrounding text by enclosing the variable name in braces:
      s/800( |-|.)555-(?!9999)([0-9]{4})/888${1}555-$2/;
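
To see the effect side by side, here is a small sketch that runs the substitution with and without the braces on the number from the question. It uses the escaped dot suggested elsewhere in the thread; the variable names are just for illustration.

```perl
use strict;
use warnings;

# Without braces: Perl parses "$1555" as capture group 1555, which is undefined,
# so the separator and the "555" both vanish from the result.
my $broken = "800 555-9998";
{
    no warnings 'uninitialized';    # silence the warning about the undefined $1555
    $broken =~ s/800( |-|\.)555-(?!9999)([0-9]{4})/888$1555-$2/;
}

# With braces: ${1} is capture group 1 (the separator), followed by the literal "555".
my $fixed = "800 555-9998";
$fixed =~ s/800( |-|\.)555-(?!9999)([0-9]{4})/888${1}555-$2/;

print "without braces: $broken\n";
print "with braces:    $fixed\n";
```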

Re^2: Phone Number Regular Expression
by lev36 (Sexton) on Mar 16, 2006 at 20:00 UTC
    One more question: what if the phone number is split over two lines? Is there any way to easily deal with that?
      Sure. If you are saying that $_ has newlines in it, you can just remove them with tr/\n//d. Then process as before.
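
A minimal sketch of that approach, assuming the line break falls directly inside the number with nothing else inserted (no hyphenation, no indentation). The sample text is made up.

```perl
use strict;
use warnings;

# Hypothetical input with a newline in the middle of the phone number.
my $text = "Call 800 555-\n9998 today";

# Copy the string, then delete every newline from the copy.
(my $joined = $text) =~ tr/\n//d;

# Process as before.
$joined =~ s/800( |-|\.)555-(?!9999)([0-9]{4})/888${1}555-$2/;

print "$joined\n";
```

Note that tr/\n//d removes every newline, so words that were separated only by a line break end up glued together. If that matters, you may prefer to work on a copy and keep the original text for output.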

      It may be slightly trickier if you're saying the number is split across lines in an input file that you're reading one line at a time. Basically, you'll want to remember some of the previous line while you read the next line, stick them together, then look for the phone number. Exactly how you work that out depends on what you know about your input.
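
One way the line-at-a-time case might be sketched: carry the unmatched tail of each line forward and prepend it to the next. This only finds the numbers rather than rewriting the file. It assumes the break adds nothing but a newline and that a number is exactly 12 characters long. The @lines array stands in for a real filehandle, and all names here are hypothetical.

```perl
use strict;
use warnings;

my $re = qr/800( |-|\.)555-(?!9999)([0-9]{4})/;

# A 12-character number split across lines leaves at most 11 characters behind.
my $max_partial = 11;

# Stand-in for lines read from a file.
my @lines = ("Call 800 555-\n", "9998 or 800-555-1234\n", "now\n");

my $carry = '';
my @found;
for my $line (@lines) {
    chomp(my $text = $line);
    my $window = $carry . $text;    # tail of the previous line plus the current line

    my $last_end = 0;
    while ($window =~ /($re)/g) {
        push @found, $1;
        $last_end = pos($window);   # offset just past this match
    }

    # Keep only the unmatched tail, so a number is never counted twice.
    my $rest = substr($window, $last_end);
    $carry = length($rest) > $max_partial ? substr($rest, -$max_partial) : $rest;
}

print "$_\n" for @found;
```

If the input hyphenates or indents at the break, the carry logic would need to strip that before joining, which is where knowing your input comes in.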

