PerlMonks  

Re^2: How do we remove specific HTML tag

by haukex (Bishop)
on Nov 07, 2021 at 05:36 UTC (#11138539)


in reply to Re: How do we remove specific HTML tag
in thread How do we remove specific HTML element

What'd be reliable perl lib / module ...
... it could be that something simple would work ok?

No.

Why a regex *really* isn't good enough for HTML and XML, even for "simple" tasks.

Replies are listed 'Best First'.
Re^3: How do we remove specific HTML tag
by Marshall (Canon) on Nov 07, 2021 at 07:16 UTC
    We don't really know how general-purpose the OP's function needs to be.
    The OP's test input is very simple and doesn't demo anything complex.
    It would be appropriate for the OP to post an extended test case.
    I like your link+ and the discussion therein.
    I certainly don't propose my simple code to be anything other than perhaps a "hack" to deal with one particular webpage.

      That's kind of the point. He might have matching cruft in a CDATA section, or (more likely) inside a comment because the web designer decided to move something around but left the old location in place for reference and never cleaned up afterwards. You've handed him a ticking bomb, possibly starting him off with a bad habit, and sooner rather than later that's going to go boom.[1] You don't know what his actual data is, so the best answer is the most generally correct one: don't try and wing it handling HTML with regexen, use a proper parser.

      [1] – "No boom today. Boom Tomorrow. There's always a boom tomorrow."
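
      To make the comment case concrete, here is a minimal sketch. It is written in Python with only the standard library so that it runs as-is; the same contrast should hold for a Perl regex versus a real parser module from CPAN (HTML::TreeBuilder, for instance). The sample markup, the id "promo", and the PromoFinder class are all made up for illustration, not taken from the OP's data.

```python
import re
from html.parser import HTMLParser

# Hypothetical page: the designer moved a block but left the old copy
# behind inside a comment "for reference".
HTML = """<p>Keep me</p>
<!-- old location, kept for reference: <div id="promo">stale</div> -->
<div id="promo">live</div>
"""

# Naive approach: the regex only sees characters, so it has no idea that
# the first match sits inside a comment.
regex_hits = re.findall(r'<div id="promo">(.*?)</div>', HTML, flags=re.S)


class PromoFinder(HTMLParser):
    """Collects the text of real <div id="promo"> elements only."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while inside a matching div (counts nested divs)
        self.hits = []   # text content of each matching element

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            if tag == "div" and dict(attrs).get("id") == "promo":
                self.depth = 1
                self.hits.append("")
        elif tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.hits[-1] += data

    # Comments are delivered to handle_comment(), which is a no-op by
    # default, so markup inside them never reaches handle_starttag().


finder = PromoFinder()
finder.feed(HTML)
finder.close()

print(regex_hits)    # ['stale', 'live']
print(finder.hits)   # ['live']
```

      The regex reports the stale copy as if it were live markup; the parser reports only the element that is actually part of the document.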

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        No boom today. Boom Tomorrow. There's always a boom tomorrow

        As it is 6th November, the Boom was yesterday here in the UK!
