can consume a lot of memory as it first builds a representation of the document using oodles of objects. It also burdens the CPU more.
on the other hand parses the HTML input stream as it comes along, reducing the overhead quite a bit.
is easy for a quick one-shot script, and nice when you intend to do heavy-duty transformation or mangling of the input document's structure, but it is not a good choice to build an all-purpose tool on top of.
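
To make the streaming approach concrete, here is a minimal sketch. It is not the module discussed above: as an assumption for illustration it uses Python's standard-library `html.parser.HTMLParser`, which is also event-driven. The names `LinkCollector` and `collect_links` are hypothetical. The parser reacts to each start tag as it arrives and keeps only the data it cares about, so no tree of objects is ever built.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical example class: records href values as tags stream past."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once per start tag; attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(chunks):
    """Feed the input piece by piece, as it would arrive from a file or socket."""
    parser = LinkCollector()
    for chunk in chunks:
        # Incomplete markup at the end of a chunk is buffered until more arrives.
        parser.feed(chunk)
    parser.close()
    return parser.links
```

Memory use here depends on what the handlers choose to keep (a list of links), not on the size of the document, which is the overhead difference described above. The trade-off is that you cannot walk back up to a parent or sibling element, which is why a tree is still the nicer tool for restructuring a document.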