tshabet has asked for the wisdom of the Perl Monks concerning the following question:

For my daily installment of "I'm doing something dumb, I can just feel it" I present the following conundrum to the helpful Monks:
I am working with some XML, and I have several instances of lines like these:
<heading> level=2, Introduction to Arguments</heading>
or
<index> primary-key="listed-arguments", secondary-key="passing-functions", <paragraph> Arguments allow you to pass information to functions. There are two categories of arguments: </paragraph> </index>
In these snippets, I'd like to turn the
<foo> something=bar, Code is cool </foo>
into
<foo something=bar> Code is cool </foo>
that is, turn the something=bar into an attribute of the tag. So in order to do this, I implemented this regex:
$text =~ s/>\s?([\w\-]*?)\=([\w\-\"]*?)\,?/ $1\=$2>/gixs;
which, in my mind anyway, looks for the tag-ending > followed by an optional space, then a "word" (letters, numerals, or dashes) followed by a mandatory equals sign, followed by another word/dash/quotation mark combination, followed by an optional comma. If the space or comma is found, it's eaten. The other stuff is replaced as an attribute.
Well, if it was working I wouldn't be asking, would I? :-)
So if anyone can give me a hint as to where my regex goes wrong, I'd appreciate it a lot. I have to buy that "Mastering Regular Expressions" book and get it over with :-) Anyway, any and all help appreciated. The folks on this board are probably my most invaluable Perl tool... thanks Monks!

Replies are listed 'Best First'.
Re: Something's awry in my regex...
by MZSanford (Curate) on Aug 22, 2001 at 19:26 UTC
    I am SURE there is a better way, but this seems to work (it has not been tested with all possible strings):
    while ($text =~ s/>\s+([\w\-]*?)\=([\w\-\"]*)\,?/ $1\=$2>/ixs) {}
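For anyone wondering why the original misbehaves, here is a small sketch comparing the two substitutions on the sample lines from the question. The sub names lazy_rewrite and looped_rewrite are made up for illustration. As far as I can tell, the trouble is the lazy ([\w\-\"]*?): since the only thing after it is an optional comma, it is happy to match zero characters, so $2 comes back empty and the > lands right after the equals sign. The loop is needed because /g resumes scanning after the text it just replaced, so the freshly written > is never seen as the start of the next match.

```perl
use strict;
use warnings;

# The regex from the question: lazy second capture plus /g.
sub lazy_rewrite {
    my ($text) = @_;
    $text =~ s/>\s?([\w\-]*?)\=([\w\-\"]*?)\,?/ $1\=$2>/gixs;
    return $text;
}

# The regex from this reply: greedy second capture, repeated until no match.
sub looped_rewrite {
    my ($text) = @_;
    while ($text =~ s/>\s+([\w\-]*?)\=([\w\-\"]*)\,?/ $1\=$2>/ixs) {}
    return $text;
}

my $heading = '<heading> level=2, Introduction to Arguments</heading>';
my $index   = '<index> primary-key="listed-arguments", '
            . 'secondary-key="passing-functions", <paragraph> text </paragraph> </index>';

print lazy_rewrite($heading), "\n";    # <heading level=>2, Introduction to Arguments</heading>
print looped_rewrite($heading), "\n";  # <heading level=2> Introduction to Arguments</heading>
print looped_rewrite($index), "\n";    # both keys end up inside the <index ...> tag
```

Both attributes on the index line are picked up only because each pass of the loop starts over from the beginning of the string.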

    can't sleep clowns will eat me
    -- MZSanford
      a tiny bit cleaner... although I could swear there was a way to do it without the while...
      one of merlyn's doings, I could swear :) I could be smokin crack though

      while ($text =~ s/>\s*([\w-]*?="[\w-]*?"),?/ $1>/s) {}
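On doing it without the while: one possible sketch (not necessarily the trick being half-remembered above) is to match the bare opening tag together with the whole run of name=value pairs after it, and rebuild the tag in a single s///ge pass. The sub name hoist_attributes is hypothetical, and the pattern assumes the tag has no attributes yet and that values are either double-quoted or plain word/dash characters. It also accepts unquoted values such as level=2, which the quoted-only pattern just above would skip.

```perl
use strict;
use warnings;

# Hypothetical helper: move leading "name=value," pairs into the opening tag.
sub hoist_attributes {
    my ($text) = @_;
    $text =~ s{
        <([\w-]+)>                               # bare opening tag, no attributes yet
        ((?:\s*[\w-]+=(?:"[^"]*"|[\w-]+),?)+)    # one or more pairs, optional commas
    }{
        my ($tag, $attrs) = ($1, $2);
        # Pull out each pair on its own; this drops the separating commas
        # without touching anything inside quoted values.
        my @pairs = $attrs =~ /([\w-]+=(?:"[^"]*"|[\w-]+))/g;
        "<$tag " . join(' ', @pairs) . ">";
    }gex;
    return $text;
}

print hoist_attributes('<heading> level=2, Introduction to Arguments</heading>'), "\n";
# <heading level=2> Introduction to Arguments</heading>
```

Because every pair is consumed inside one match, /g is enough here and nothing has to be rescanned. For anything beyond this narrow input shape, a real XML parser is probably the safer bet.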