Simple string parsing help with regex

shadowfox has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perl Monks! I seek some knowledge in the form of regex. I have input lines, read from a file, that look something like this:

<A_1234>1234/12/12345</A_1234>

I'd like to strip the slashes from the middle digits to get output in this format:

<A_1234>12341212345</A_1234>

I'm using the code below, which is close, but it also removes the slash from my closing tag, giving me:

<A_1234>201111097260<A_1234>

if (/<A_1234>\d+\/\d+\/\d+(.*)<\/A_1234>/) {
    s/\///g;
}

I tried writing the match and substitution parts several different ways, always with the same result, so I come to smarter people!

Replies are listed 'Best First'.
Re: Simple string parsing help with regex by choroba (Cardinal) on Dec 22, 2011 at 14:47 UTC
Use a negative look-behind assertion (documented in perlre): `s%(?<!<)/%%g` No `if` needed.
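A minimal sketch of that substitution applied to the sample line from the question. The helper name `strip_slashes` and the variable `$line` are made up for illustration; only the `s%(?<!<)/%%g` substitution comes from the reply itself.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every "/" that is not immediately preceded by "<".
sub strip_slashes {
    my ($text) = @_;
    # (?<!<) is a zero-width negative look-behind: it asserts that the
    # character before the slash is not "<", without consuming that character.
    $text =~ s%(?<!<)/%%g;
    return $text;
}

my $line = '<A_1234>1234/12/12345</A_1234>';   # sample input from the question
print strip_slashes($line), "\n";
```

On the sample input this prints `<A_1234>12341212345</A_1234>`: the two slashes between the digits are removed, while the one in `</A_1234>` survives because it directly follows `<`.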
Re^2: Simple string parsing help with regex by cavac (Prior) on Dec 22, 2011 at 16:11 UTC
"Negative look-behind"? This is (except for the name) a rather cool feature I just now learned. I'd give you more than one "++", but I'm not allowed to... BREW /very/strong/coffee HTTP/1.1 Host: goodmorning.example.com 418 I'm a teapot
Re^2: Simple string parsing help with regex by shadowfox (Beadle) on Dec 22, 2011 at 15:10 UTC
Now that is very cool, I can see using that a lot! Somehow I've never run across that ability before, despite skimming perlre countless times. Thanks very much!
Re^2: Simple string parsing help with regex by aaron_baugher (Curate) on Dec 23, 2011 at 23:43 UTC
The various look-ahead/behind assertions are very cool, but I'll admit I haven't used them enough to find them handy yet. So in a case like this, I still think, "I want to get rid of every slash that follows a character other than a less-than, so I'll capture that character and replace them both with it," leading to this: `s|([^<])/|$1|g;` However, I just benchmarked that against your look-behind method, and yours is 75% faster. Guess I need to start learning those assertions, and not just for when it's impossible to do something the old way! Aaron B.
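For anyone who wants to repeat the comparison, here is a rough sketch using the core Benchmark module. The sub names and the test string are assumptions, not the code that produced the 75% figure, and the timings will vary with Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = '<A_1234>1234/12/12345</A_1234>';   # sample input from the question

# Look-behind version: delete slashes not preceded by "<".
sub strip_lookbehind {
    my ($text) = @_;
    $text =~ s%(?<!<)/%%g;
    return $text;
}

# Capture version: match a non-"<" character plus a slash, put the character back.
sub strip_capture {
    my ($text) = @_;
    $text =~ s|([^<])/|$1|g;
    return $text;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    lookbehind => sub { strip_lookbehind($line) },
    capture    => sub { strip_capture($line) },
});
```

Each sub works on its own copy of the string, so every iteration does the same amount of substitution work.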
Re: Simple string parsing help with regex by umasuresh (Hermit) on Dec 22, 2011 at 17:20 UTC
I find this site very useful for testing a regex before trying it in code: regexpal
Re: Simple string parsing help with regex by lepht (Novice) on Jan 03, 2012 at 07:28 UTC
The first thing that came to my mind when reading your description, "I'd like to strip the slashes from the middle digits to get this format output," was this simple substitution: `s{(\d)/(\d)}{$1$2}g;` It will work on older versions of Perl that don't support the negative look-behind syntax, but it may be less robust than the other solutions, depending on the rest of the input file's contents.
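One concrete way the two-capture form can be less robust, sketched below: because the second `(\d)` consumes its digit, a single digit between two slashes cannot also serve as the leading digit of the next match. The sub names and the `1/2/3` test string are made up for illustration, and the look-ahead variant is an alternative added here, not something proposed in the thread.

```perl
use strict;
use warnings;

# The substitution from the reply: both neighbouring digits are captured and consumed.
sub strip_two_captures {
    my ($text) = @_;
    $text =~ s{(\d)/(\d)}{$1$2}g;
    return $text;
}

# Assumed alternative: capture only the leading digit and merely peek at the
# following one with a look-ahead, so it remains available for the next match.
sub strip_lookahead {
    my ($text) = @_;
    $text =~ s{(\d)/(?=\d)}{$1}g;
    return $text;
}

for my $input ('<A_1234>1234/12/12345</A_1234>', '1/2/3') {
    printf "%-32s two captures: %-30s look-ahead: %s\n",
        $input, strip_two_captures($input), strip_lookahead($input);
}
```

On the sample line from the question both subs return `<A_1234>12341212345</A_1234>`. On `1/2/3` the two-capture form returns `12/3`, while the look-ahead form returns `123`.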