shadowfox has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perl Monks! I seek some knowledge in the form of regex. I have input lines read from a file that look something like

<A_1234>1234/12/12345</A_1234>

I'd like to strip the slashes from the middle digits to get output in this format:

<A_1234>12341212345</A_1234>

I'm using the code below, which is close, but it also takes away the slash from my closing tag, giving me

<A_1234>201111097260<A_1234>

if (/<A_1234>\d+\/\d+\/\d+(.*)<\/A_1234>/) { s/\///g; }

I tried writing the match and regex parts several different ways with unchanging results, so I come to smarter people!

Replies are listed 'Best First'.
Re: Simple string parsing help with regex
by choroba (Cardinal) on Dec 22, 2011 at 14:47 UTC
    Use negative look-behind assertion (documented in perlre):
    s%(?<!<)/%%g
    No if needed.
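
A minimal sketch of why the original attempt loses the closing-tag slash and how the look-behind avoids it, using the sample line from the question. The variable names ($line, $naive, $fixed) are made up for illustration. The if only tests whether the line matches; the s///g that follows still runs over the whole of $_, so every slash goes, including the one in the closing tag.

```perl
use strict;
use warnings;

# Sample line from the question.
my $line = '<A_1234>1234/12/12345</A_1234>';

# Original attempt: the match only tests the line; the s///g then
# runs over all of $_, so the slash in the closing tag is removed too.
my $naive = $line;
for ($naive) {
    if (/<A_1234>\d+\/\d+\/\d+(.*)<\/A_1234>/) { s/\///g; }
}
print "naive:       $naive\n";

# Look-behind version: drop only slashes not immediately preceded by "<".
(my $fixed = $line) =~ s%(?<!<)/%%g;
print "look-behind: $fixed\n";
```

This assumes the only slash that must survive is the one directly after a "<"; a slash anywhere else on the line would be removed as well.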

      "Negative look-behind"? This is (except for the name) a rather cool feature I just now learned. I'd give you more than one "++", but I'm not allowed to...

      BREW /very/strong/coffee HTTP/1.1
      Host: goodmorning.example.com
      
      418 I'm a teapot
      Now that is very cool, I can see using that a lot! Somehow I've never run across that ability before, despite skimming perlre countless times.

      Thanks very much!

      The various new look-ahead/behind assertions from 5.10 are very cool, but I'll admit I haven't used them enough to find them handy yet, so in a case like this, I still think, "I want to get rid of every slash that follows a character other than a less-than, so I'll capture that character and replace them both with it," leading to this:

      s|([^<])/|$1|g;

      However, I just benchmarked that compared to your look-behind method, and yours is 75% faster. Guess I need to start learning those newer assertions, and not just for when it's impossible to do something the old way!

      Aaron B.
      My Woefully Neglected Blog, where I occasionally mention Perl.
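
For anyone who wants to repeat that comparison, here is one way it might be set up with the core Benchmark module. This is a sketch, not the script used for the figure above: the sub names strip_capture and strip_lookbehind and the iteration count are assumptions, and the sub-call overhead is included in the timings, so the numbers will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = '<A_1234>1234/12/12345</A_1234>';

# Capture-and-restore: consume the character before the slash, put it back.
sub strip_capture {
    my ($s) = @_;
    $s =~ s|([^<])/|$1|g;
    return $s;
}

# Negative look-behind: match the slash alone, consume nothing else.
sub strip_lookbehind {
    my ($s) = @_;
    $s =~ s%(?<!<)/%%g;
    return $s;
}

# Both give the same result on the sample line from the thread.
print strip_capture($line), "\n";
print strip_lookbehind($line), "\n";

# They are not equivalent on doubled slashes: the capture version
# consumes the first slash's predecessor and then skips the second slash.
print strip_capture('a//b'), "\n";
print strip_lookbehind('a//b'), "\n";

# Fixed iteration count (an arbitrary choice for this sketch).
cmpthese(500_000, {
    capture    => sub { strip_capture($line) },
    lookbehind => sub { strip_lookbehind($line) },
});
```

The two substitutions agree on the sample input but are not interchangeable in general: the capture form needs a character before each slash and consumes it, so it leaves a slash at the very start of the string and the second of two adjacent slashes in place.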

Re: Simple string parsing help with regex
by umasuresh (Hermit) on Dec 22, 2011 at 17:20 UTC
    I find this site very useful for testing my regex first, before trying it in code: regexpal
Re: Simple string parsing help with regex
by lepht (Novice) on Jan 03, 2012 at 07:28 UTC

    The first thing that came to my mind when reading your description:

    I'd like to strip the slashes from the middle digits to get this format output.

    Was this simple substitution:

    s{(\d)/(\d)}{$1$2}g;

    Which will work on older versions of Perl that don't support the negative look-behind syntax, but may be less robust than those solutions depending on the rest of the input file's contents.
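
A short sketch of that substitution on the sample line, plus one case where it may fall short. The input '1/2/3' is invented here to illustrate the point and does not come from the question. Because the capture form consumes the digit after each slash, a one-digit field between two slashes leaves the second slash behind; a look-around variant, offered only as a possible alternative, consumes nothing but the slash.

```perl
use strict;
use warnings;

my $line = '<A_1234>1234/12/12345</A_1234>';

# Digit-slash-digit: works on the sample, leaves the closing tag alone.
(my $captured = $line) =~ s{(\d)/(\d)}{$1$2}g;
print "$captured\n";

# Hypothetical input with a single-digit middle field: the "2" is
# consumed by the first match, so the second slash is never matched.
(my $short = '1/2/3') =~ s{(\d)/(\d)}{$1$2}g;
print "$short\n";

# Look-around variant: the digits are asserted, not consumed.
(my $around = '1/2/3') =~ s{(?<=\d)/(?=\d)}{}g;
print "$around\n";
```

On the data shown in the question the middle field has two digits, so the capture form is sufficient there.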