If you want to create LaTeX files, I think you should consider LaTeX::Parser, LaTeX::TOM, or another module written specifically for this kind of work. HTML is not the same as XML, but you might still find it useful to take a look at gnuhtml2latex, a Perl script that converts HTML to LaTeX.