in reply to Preferred Methods (again)

I guess my only comments on the regexp would be to sprinkle it with \s* (so whitespace around the = is tolerated), and that "(\S+?)" is a weird way to capture the content of an attribute. Something like:

m#\s(\S+)\s*=\s*"([^"]*)"#g
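
For what it's worth, here is a minimal sketch of that pattern in use. The sample tag and the variable names are made up for illustration and are not from the original code; the tag deliberately has spaces around one =, a value containing a space, and an empty value.

```perl
use strict;
use warnings;

# Hypothetical sample tag (an assumption, not from the original thread).
my $tag = '<img src = "a b.png" alt="" width="10">';

# In list context, m//g returns every (name, value) capture pair in order,
# so the result can be assigned straight into a hash.
my %attr = $tag =~ m#\s(\S+)\s*=\s*"([^"]*)"#g;

for my $name (sort keys %attr) {
    print "$name => '$attr{$name}'\n";
}
```

All three attributes should come out, including the empty alt. This is still only a sketch: it ignores single-quoted values and entities, among other things.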

Re(2): Preferred Methods (again)
by FoxtrotUniform (Prior) on Jan 17, 2002 at 03:53 UTC

    "(\S+?)" is a broken way to capture an attribute. (Hint: what happens if an attribute contains whitespace chars?) Consider using "([^"]+)" instead. Even better, consider profiling to make sure that using a proper XML-parsing module (whose author has already gone looking for this sort of bug) is enough of a slow-down to merit going to hard regex-based chunking.

    Update: Yeah, if you care about empty attributes (debatable; I usually don't), "([^"]*)" is the way to go. Thanks Matts!
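
    To make the difference between the three captures concrete, here is a rough sketch that runs each one against the same input. The input string and variable names are hypothetical, and the attribute-name prefix is borrowed from the parent post's regex.

    ```perl
    use strict;
    use warnings;

    # Hypothetical input: one value containing a space, one empty value.
    my $tag = '<a title="two words" rel="">';

    # Same attribute-name prefix each time; only the value capture differs.
    my @candidates = ('\S+?', '[^"]+', '[^"]*');

    my %found;
    for my $value_re (@candidates) {
        # List-context m//g yields (name, value) pairs; keep just the names.
        my %attr = $tag =~ m#\s(\S+)\s*=\s*"($value_re)"#g;
        $found{$value_re} = [ sort keys %attr ];
        printf "%-6s matched: %s\n", $value_re,
            join(', ', @{ $found{$value_re} }) || '(nothing)';
    }
    ```

    On this input, \S+? finds nothing at all, [^"]+ finds title but skips the empty rel, and [^"]* finds both.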

    --
    :wq
      That would be "([^"]*)", otherwise you miss empty attributes!