How do we remove specific HTML element

abdan has asked for the wisdom of the Perl Monks concerning the following question:

Re: How do we remove specific HTML tag by haukex (Archbishop) on Nov 07, 2021 at 05:35 UTC
"a certain html element located by its nth order from the top/start of file"

The CSS `:nth-of-type()` selector might be what you are looking for; it is supported by Mojo::DOM::CSS. Try changing the `1` to a `2` in the following to see the effect:

```perl
use Mojo::DOM;

my $dom = Mojo::DOM->new($html);
$dom->at('nav:nth-of-type(1)')->remove;
print $dom->to_string;
```
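For anyone who wants to try the selector without a file at hand, here is a minimal self-contained sketch. The sample markup is borrowed from the test data posted elsewhere in this thread, and the variable `$n` is a hypothetical stand-in for "which `<nav>` to drop". Both are assumptions for illustration, not part of the original reply.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Sample input (assumption: borrowed from the test data in this thread)
my $html = <<'HTML';
<body>
<nav a=b>
<div>
</div>
</nav>
<div>
</div>
<nav c=d>
<li>
</li>
</nav>
</body>
HTML

my $n   = 2;                                   # hypothetical: which <nav> to remove
my $dom = Mojo::DOM->new($html);

# at() returns undef when nothing matches, so guard before calling remove
my $nav = $dom->at("nav:nth-of-type($n)");
$nav->remove if $nav;

my $out = $dom->to_string;
print $out;
```

Note that `:nth-of-type()` counts among siblings that share a parent, not across the whole document. Here both `<nav>` elements are children of `<body>`, so the two notions agree. If the elements can sit at different depths, something like `$dom->find('nav')->[$n - 1]` may be closer to "nth from the top of the file".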
Re: How do we remove specific HTML tag by choroba (Cardinal) on Nov 07, 2021 at 21:40 UTC
XML::LibXML can open HTML.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

my $html = 'XML::LibXML'->load_html(location => 'file.html', recover => 1);
my $nav2 = $html->find('(//nav)[2]')->[0];
$nav2->parentNode->removeChild($nav2);
print $html->toString;  # c="d" is gone.
```

Or, more succinctly in XML::XSH2:

```
open :r :F html file.html ;
rm (//nav)[2] ;
ls / ;
```
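A variant of the same idea that parses from a string and guards against a missing match might look like the sketch below. The sample markup, the variable `$n`, the use of `recover => 2` (to recover silently on tags the libxml2 HTML parser may not recognise), and `toStringHTML` instead of `toString` are all my assumptions, not part of the reply above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample input (assumption: same test data as used elsewhere in this thread)
my $source = <<'HTML';
<body>
<nav a=b>
<div>
</div>
</nav>
<div>
</div>
<nav c=d>
<li>
</li>
</nav>
</body>
HTML

my $n = 2;    # hypothetical: which <nav>, counted in document order

# recover => 2 asks the parser to recover without reporting errors
my $doc = XML::LibXML->load_html(string => $source, recover => 2);

# (//nav)[N] selects the Nth <nav> in the whole document,
# unlike //nav[N], which counts per parent
my ($nav) = $doc->findnodes("(//nav)[$n]");
$nav->parentNode->removeChild($nav) if $nav;

my $out = $doc->toStringHTML;
print $out;
```

The parentheses in the XPath expression matter: they make the position apply to the document-wide list of `<nav>` elements rather than to each parent's children.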
Re: How do we remove specific HTML tag by Marshall (Canon) on Nov 07, 2021 at 03:16 UTC
I don't know about a completely general HTML solution because I am not an HTML expert. However, it could be that something simple would work ok? Here is some code that stops printing `<nav>` sections after it has seen the first one. You could adapt this to your desired nth parameter functionality.

```perl
use strict;
use warnings;

my $nav_seen = 0;
while (<DATA>) {
    # if inside of <nav> section, print it
    # unless we have seen a <nav> section before
    if (my $status = /<nav/ ... /<\/nav/) {
        print unless $nav_seen;
        $nav_seen++ if $status =~ /E/;
    }
    else {print}
}

=PRINTOUT
<body>
<nav a=b>
<div>
</div>
</nav>
<div>
</div>
</body>
=cut

__DATA__
<body>
<nav a=b>
<div>
</div>
</nav>
<div>
</div>
<nav c=d>
<li>
</li>
</nav>
</body>
```

To understand how this works, I direct you to "Flipin good, or a total flop?".
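For readers unfamiliar with the range ("flip-flop") operator used above, the following small sketch records what it returns on each line. In scalar context it yields the empty string while false and a sequence number while true, with `E0` appended on the line where the right-hand pattern matches. That suffix is why the test `$status =~ /E/` detects the end of a section. The `@lines` and `@trace` names are only for this illustration.

```perl
use strict;
use warnings;

# A few input lines, one tag per line (illustrative data only)
my @lines = ('<body>', '<nav a=b>', '<div>', '</div>', '</nav>', '<div>');

my @trace;
for (@lines) {
    # Three dots: the right operand is not tested on the line
    # where the left operand first matches
    my $status = /<nav/ ... /<\/nav/;
    push @trace, $status eq '' ? 'false' : $status;
}

print join(', ', @trace), "\n";    # prints: false, 1, 2, 3, 4E0, false
```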
Re^2: How do we remove specific HTML tag by haukex (Archbishop) on Nov 07, 2021 at 05:36 UTC
"What'd be reliable perl lib / module ..."

"... it could be that something simple would work ok?"

No. See: Why a regex really isn't good enough for HTML and XML, even for "simple" tasks.
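One concrete way a line-based regex filter can go wrong, sketched under the assumption that the same markup arrives on a single line (which is perfectly legal HTML): the range operator switches on at the first `<nav`, the line is emitted as a whole, and the second `<nav>` survives. The input string here is made up for illustration.

```perl
use strict;
use warnings;

# Assumption: the same document as the sample data, but without line breaks
my @lines = ("<body><nav a=b><div></div></nav><div></div><nav c=d><li></li></nav></body>\n");

my $nav_seen = 0;
my $out      = '';
for (@lines) {
    # Same logic as the line-based filter: keep only the first <nav> section
    if (my $status = /<nav/ ... /<\/nav/) {
        $out .= $_ unless $nav_seen;
        $nav_seen++ if $status =~ /E/;
    }
    else { $out .= $_ }
}

print $out;    # the whole line is kept, including <nav c=d>
```

Because the filter works on whole lines, it cannot separate two elements that share one. An HTML parser does not have that limitation.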
Re^3: How do we remove specific HTML tag by Marshall (Canon) on Nov 07, 2021 at 07:16 UTC
We don't really have any idea how general-purpose the OP's function needs to be. The OP's test input is very simple and doesn't demo anything complex. It would be appropriate for the OP to post an extended test case. I like your link+ and the discussion therein. I certainly don't propose my simple code to be anything other than perhaps a "hack" to deal with one particular webpage.
Re^4: How do we remove specific HTML tag by Fletch (Bishop) on Nov 07, 2021 at 09:38 UTC
Re^5: How do we remove specific HTML tag by Bod (Parson) on Nov 07, 2021 at 12:58 UTC