Re: Phone Number Regular Expression
by Roy Johnson (Monsignor) on Mar 16, 2006 at 16:55 UTC
|
Thanks, Roy, that's most helpful!
Of course, it's not quite as simple as all that - I want to substitute a different area code, and make sure to capture things that are formatted slightly differently. So I tested out this substitution, and I'm having a little trouble with the scalars:
$_ = "800 555-9998";
s/800( |-|.)555-(?!9999)([0-9]{4})/888$1555-$2/;
print $_;
This doesn't seem to work, when I try to include $1 (the separator), though the second scalar comes through fine. I'm guessing I need something to set off $1 from '555'?
Apologies if I'm missing some basic syntax rules; I'm fairly new to perl's regexp lingo.
|
s/.../888${1}555-$2/ will do it for you. The braces around the variable name, specifically, are what you're looking for.
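To see why the braces matter, here is a minimal sketch using the test string and pattern from the question. Without braces, Perl reads `$1555` as capture variable number 1555, which is undefined here, so both the separator and the 555 vanish from the result. The variable names `$without` and `$with` are just for illustration.

```perl
use strict;
use warnings;

# Test string and pattern taken from the question above.
my $without = "800 555-9998";
{
    # $1555 is undef in the replacement; silence that warning for the demo.
    no warnings 'uninitialized';
    $without =~ s/800( |-|.)555-(?!9999)([0-9]{4})/888$1555-$2/;
}

my $with = "800 555-9998";
# ${1} ends the variable name before the literal 555.
$with =~ s/800( |-|.)555-(?!9999)([0-9]{4})/888${1}555-$2/;

print "$without\n";   # prints 888-9998
print "$with\n";      # prints 888 555-9998
```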
|
Unrelated to the immediate problem, but you probably want
( |-|\.) instead of ( |-|.) near the start of your match, since the dot has special meaning within a regex and will match almost any character (see perlre).
This means that currently that part of your regex is equivalent to just (.), and will happily match something like 8003555-1234
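A quick way to check this is to run the patterns side by side. The sketch below compares the original pattern, the escaped version, and a character class that should behave the same as the escaped alternation. The names `$loose`, `$escaped` and `$class` are made up for the demo.

```perl
use strict;
use warnings;

my $loose   = qr/800( |-|.)555-(?!9999)([0-9]{4})/;    # unescaped dot: any character
my $escaped = qr/800( |-|\.)555-(?!9999)([0-9]{4})/;   # space, dash, or a literal dot
my $class   = qr/800([ .-])555-(?!9999)([0-9]{4})/;    # same idea as a character class

for my $number ('800 555-1234', '800.555-1234', '8003555-1234') {
    printf "%-14s loose=%s escaped=%s class=%s\n", $number,
        ($number =~ $loose   ? 'yes' : 'no'),
        ($number =~ $escaped ? 'yes' : 'no'),
        ($number =~ $class   ? 'yes' : 'no');
}
```

Only the unescaped version should report a match for 8003555-1234.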
|
You can set off variables from their surrounding text by enclosing the variable name in braces:
s/800( |-|.)555-(?!9999)([0-9]{4})/888${1}555-$2/;
Caution: Contents may have been coded under pressure.
|
One more question: what if the phone number is split over two lines? Is there any way to easily deal with that?
|
Sure. If you are saying that $_ has newlines in it, you can just remove them with tr/\n//d. Then process as before.
It may be slightly trickier if you're saying the number is split across lines in an input file that you're reading one line at a time. Basically, you'll want to remember some of the previous line while you read the next line, stick them together, then look for the phone number. Exactly how you work that out depends on what you know about your input.
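One possible shape for the line-at-a-time case is sketched below. It is only a sketch under some assumptions: a number is split across at most two adjacent lines, every line except possibly the last keeps its trailing newline, and each separator is a single space, dot, dash or newline. The sub name `replace_across_lines` is invented for this example. Because the separators are captured and put back, the joined text splits into the same lines afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: takes lines (with their newlines) and returns the
# same lines with 800-555-XXXX rewritten to 888-555-XXXX, except XXXX = 9999.
sub replace_across_lines {
    my @lines = @_;
    my @out;
    my $prev = '';                      # the line we are still holding back
    for my $line (@lines) {
        my $window = $prev . $line;     # previous line and current line together
        # \s lets a newline act as a separator; ${1} keeps it in place.
        $window =~ s/\b800([\s.-])555([\s.-])(?!9999)(\d{4})\b/888${1}555$2$3/g;
        if ($prev ne '') {
            # Emit the first line of the window, keep the rest for next time.
            my ($first, $rest) = $window =~ /\A(.*?\n)(.*)\z/s;
            push @out, $first;
            $prev = $rest;
        }
        else {
            $prev = $window;
        }
    }
    push @out, $prev if $prev ne '';
    return @out;
}

my @fixed = replace_across_lines("call 800\n", "555-1234 now\n", "or 800 555-9999\n");
print @fixed;
```

With real files you would feed it the lines read from the input handle and print the returned lines to the new file.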
Caution: Contents may have been coded under pressure.
Re: Phone Number Regular Expression
by ikegami (Patriarch) on Mar 16, 2006 at 18:27 UTC
|
There's also
/800-555-(?!9999)/
and
/(\d+)-(\d+)-(\d+)/ && $1 == 800 && $2 == 555 && $3 != 9999
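The second form can be wrapped in a small sub, roughly like this. Note the changes from the one-liner, which are my assumptions rather than part of the suggestion: the pattern is anchored and uses fixed digit counts, because an unanchored (\d+)-(\d+)-(\d+) run against 1-800-555-1234 would capture 1, 800 and 555. The sub name `is_target_number` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 for 800-555-XXXX where XXXX is not 9999, else 0.
# Assumes the whole string is one dash-separated number.
sub is_target_number {
    my ($number) = @_;
    return $number =~ /\A(\d{3})-(\d{3})-(\d{4})\z/
        && $1 == 800 && $2 == 555 && $3 != 9999 ? 1 : 0;
}

for my $number ('800-555-1234', '800-555-9999', '801-555-1234') {
    print "$number: ", (is_target_number($number) ? "match" : "no match"), "\n";
}
```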
Re: Phone Number Regular Expression
by injunjoel (Priest) on Mar 16, 2006 at 17:02 UTC
|
Re: Phone Number Regular Expression
by TedPride (Priest) on Mar 16, 2006 at 20:39 UTC
|
use strict;
use warnings;
my $area = 800;
my $pre = 555;
my $not = 9999;
$_ = join '', <DATA>;
print "$area-$pre-$1\n" while m/$area(?:\)?\s|\.|-)${pre}[\s.-]((?!$not)\d{4})/g;
__DATA__
1-800-555-9999
1-800-555-3456
(800) 555-3456
800 555 3456
800 555-3456
800.555.3456
800
555.5555
I think this covers all eventualities.
|
Elegant, thanks!
What I'm doing is editing a bunch of files on a web server to reflect the new area code - so I had planned to read each line of the input file, make the replacement, and then save the new version (and a backup of the old version). Then I thought about the line-break issue.
So I think Roy's solution of grabbing a couple of lines together, checking them, and then splitting them again before writing to the new file is what I need to do, in order to catch numbers that split over two lines. I see some stuff in your example that can help me with that.