What you are trying to capture is called the DOM (Document Object Model). In Perl you can build it with one of the HTML parsers on CPAN, e.g. HTML::TreeBuilder. In JavaScript you can access it natively. So you have two options: fetch the file and parse it with Perl, or use JavaScript to serialize the DOM tree as a JSON string, send it to a Perl script, and use one of the CPAN JSON modules to turn it into a Perl data structure.
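
For the JavaScript route, a rough sketch of the serialization step might look like the following. The function name domToObject and the output shape (tag, attributes, children, text) are my own choices for illustration, not a standard format. It keeps only element and text nodes and relies on the standard DOM Node properties nodeType, nodeName, nodeValue, attributes and childNodes.

```javascript
// Convert a DOM node into a plain object that JSON.stringify can handle.
// domToObject and the output shape are illustrative assumptions.
function domToObject(node) {
  // nodeType 3 is a text node: keep just its text.
  if (node.nodeType === 3) {
    return { text: node.nodeValue };
  }
  // Anything that is not an element (nodeType 1) is skipped,
  // e.g. comments and the doctype.
  if (node.nodeType !== 1) {
    return null;
  }
  // Copy attributes into a simple name -> value map.
  const attributes = {};
  for (const attr of Array.from(node.attributes || [])) {
    attributes[attr.name] = attr.value;
  }
  // Recurse into the children, dropping the skipped nodes.
  const children = [];
  for (const child of Array.from(node.childNodes || [])) {
    const converted = domToObject(child);
    if (converted !== null) {
      children.push(converted);
    }
  }
  return { tag: node.nodeName.toLowerCase(), attributes, children };
}

// In a browser you would then do something like:
//   const json = JSON.stringify(domToObject(document.documentElement));
// and send `json` to your Perl script in the body of a POST request.
```

On the Perl side, decode_json from a module such as JSON::PP should turn that string into nested hashes and arrays. How you receive the POST body depends on your setup, so I have left that part out.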