in reply to Conditional regex

I am parsing a webpage right now to make it 508 compliant.

Doesn't your parser have a means of locating IMG elements and, for each of them, adding an attribute if the attribute doesn't already exist? That looks like 3-4 lines of code.

Update: For XML::LibXML, it would be something similar to the following snippets. I expect something similar from HTML parsers.

for my $ele ($doc->findnodes('//img')) {
    if (!defined($ele->getAttribute('alt'))) {
        $ele->setAttribute(alt => ...);
    }
}
or even
for my $ele ($doc->findnodes('//img[count(@alt)=0]')) {
    $ele->setAttribute(alt => ...);
}
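For an HTML page rather than an XML document, a fuller sketch along the same lines might look like the following. It assumes XML::LibXML's HTML mode (load_html) copes with the page in question. The sample markup, the add_missing_alt helper and the placeholder text are all made up for illustration; none of them come from the original question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample input; the real page is not shown in the thread.
my $html = <<'HTML';
<html><body>
<img src="a.png" alt="Logo">
<img src="b.png">
</body></html>
HTML

# Parse with libxml2's HTML parser; recover => 1 tolerates sloppy markup.
my $doc = XML::LibXML->load_html(string => $html, recover => 1);

# Hypothetical helper: give every img that lacks an alt attribute a
# placeholder value, and return the number of elements changed.
sub add_missing_alt {
    my ($doc, $placeholder) = @_;
    my $changed = 0;
    for my $img ($doc->findnodes('//img[not(@alt)]')) {
        $img->setAttribute(alt => $placeholder);
        $changed++;
    }
    return $changed;
}

# The placeholder text is an assumption; real alt text needs a human.
my $changed = add_missing_alt($doc, 'FIXME: describe image');

# Serialize the modified tree back to HTML.
print $doc->toStringHTML();
```

Images that already carry an alt attribute never match the XPath expression, so they are left alone.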

Re^2: Conditional regex
by sherab (Scribe) on May 08, 2009 at 20:22 UTC
    I see your point but it's a regex question.

      Not really. Like you said so well, your real problem is

      I am parsing a webpage right now to make it 508 compliant. If an IMG tag has an alt element, I need to leave it alone but if it doesn't, it needs one added

      I see writing a regexp-based parser as your (broken) solution, not your problem. I'm not gonna make a lot of work for myself reinventing an HTML parser when I can skip that step and go straight to changing the HTML you want changed.

      You could do some nested matching to make this "work" with regular expressions, as you were trying above, but it is the wrong way to solve the problem for two reasons: it's as difficult as the related parser code, and unless you're Jeffrey Friedl, it will never work as well as the related parser code.