Well, actually not so obfuscated. The classic thing just happened: I wrote this piece of code months ago and now I've completely forgotten how or why it actually works. Oh, well. It doesn't look complex, but I just can't get hold of the idea anymore.
# A simple script to extract HTML/XML tag options
use strict;
use warnings;

foreach my $file (@ARGV) {
    open (FILE, "< $file") or die "No such file: $file";
    my @d = <FILE>;
    close FILE;
    my $t = join(' ', @d);
    $t =~ m/\<(\?)?
        ([a-zA-Z0-9]+)(?:\s+)
        (?{ print "<$^N>\n"; })
        (
            ((?>[a-zA-Z0-9]+)=(?>(".*?")|([^ "<>]+)))
            (?{
                my ($opt, $val) = split(\/=\/, $^N);
                print "\t$opt\t\t$val\n";
            })
        \s*?)*
        ([ ]*\/)?>
        (.*?)
        (<\/\2>)?
        \1
    /isx;
}