
#!/usr/bin/perl
while ($line = <DATA>) {
    @valid_entities = ('<a>','<abbr>','<acronym>');
    my %htmlenties = map { $_ => 1 } @valid_entities;
    @valid_entities = ('a','abbr','acronym','br'); # remove tags
    my %htmlenties = map { $_ => 1 } @valid_entities;
    $line =~ s/<(\/?([^>]*?)\/?)>/exists $htmlenties{$2} ? "<$1>" : defined ($1) ? "&lt;$1&gt;" : "&lt;"/xeg;
    print $line;
}
__DATA__
<helloe>How r u <a> www.google.com</a> <hi>How r u </hi><et,-2><>
The output I get from the above code is:
&lt;helloe&gt;How r u <a> www.google.com</a> &lt;hi&gt;How r u &lt;/hi&gt;&lt;et,-2&gt;&lt;&gt;
But the expected output is:
&lt;helloe&gt;How r u <a> www.google.com</a> <hi>How r u </hi>&lt;et,-2&gt;&lt;&gt;
Since '<hi>' has a matching closing tag '</hi>', it should not be replaced.

Re^3: convert characters
by Utilitarian (Vicar) on Aug 27, 2009 at 12:39 UTC
    Then add 'hi' to your valid entities array.
    Did you write any of this code yourself?

    Update: My addition is by no means a complete solution. For example, it will fail to correctly interpret the following (reorganising the code so that it does could be a worthwhile exercise for you):

    <hi > <a href="http://permonks.org">Link</a> <input value="Next>" type="submit">
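
To make the suggestion and its limits concrete, here is a minimal sketch of the whitelist approach with 'hi' added. The sub name escape_unlisted is made up for this illustration, and the always-true defined($1) branch from the original substitution is dropped; everything else follows the posted code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tag names to leave untouched; 'hi' is the addition suggested above.
my %htmlenties = map { $_ => 1 } qw(a abbr acronym br hi);

# Escape every <...> whose content is not exactly a whitelisted name.
# (escape_unlisted is a hypothetical helper name, not from the original post.)
sub escape_unlisted {
    my ($line) = @_;
    $line =~ s{<(/?([^>]*?)/?)>}{ exists $htmlenties{$2} ? "<$1>" : "&lt;$1&gt;" }eg;
    return $line;
}

# The original test line now gives the expected output:
# &lt;helloe&gt;How r u <a> www.google.com</a> <hi>How r u </hi>&lt;et,-2&gt;&lt;&gt;
print escape_unlisted('<helloe>How r u <a> www.google.com</a> <hi>How r u </hi><et,-2><>'), "\n";

# The tricky line from the update is still handled badly.
print escape_unlisted('<hi > <a href="http://permonks.org">Link</a> <input value="Next>" type="submit">'), "\n";
```

On the second line, '<hi >' is escaped because the captured name is 'hi ' with a trailing space, the '<a href=...>' opener is escaped while its '</a>' is kept, and the '>' inside the value attribute ends the match early. A hash lookup on the raw tag content cannot deal with whitespace, attributes, or quoted '>' characters.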
      $line =~ s/<(\/?([^>]*?)\/?)>/exists $htmlenties{$2} ? "<$1>" : defined ($1) ? "&lt;$1&gt;" : "&lt;"/xeg;
      How can I just check whether a closing tag exists for a tag and, if it does, not replace that tag's < and > with &lt; and &gt;?
        You are getting around to writing a complete parser; your original script seemed aimed only at not converting specified tags.
        You will need to keep a list of the opening tags you encounter and, on encountering a closing tag, pop tags off that array, converting them unless they match the closing tag.
        On reaching end of file, you will have to pop all remaining tags off and convert them.

        Have a go at writing that yourself and when you hit an issue get back to us.
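
For readers who want to compare notes afterwards, the stack idea described above could be sketched roughly as follows. This is one possible reading of the advice, not a full parser: the sub names escape_tag and escape_unpaired are invented here, tags are tokenised naively with <[^>]*> (so a '>' inside an attribute value still breaks it), and a closing tag with no opener on the stack is escaped rather than popping everything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape a raw tag such as "<hi>" into "&lt;hi&gt;".
sub escape_tag {
    my ($tag) = @_;
    $tag =~ s/</&lt;/g;
    $tag =~ s/>/&gt;/g;
    return $tag;
}

# Keep only tags that have a matching closing tag; escape all the others.
sub escape_unpaired {
    my ($text) = @_;
    my @out;     # output pieces, in input order
    my @open;    # stack of [tag name, index into @out] for unclosed openers

    # split with a capturing group returns text and tags alternately.
    for my $piece (split /(<[^>]*>)/, $text) {
        if ($piece =~ m{^</\s*([^\s>]+)\s*>$}) {           # closing tag
            my $name = $1;
            if (grep { $_->[0] eq $name } @open) {
                # Pop and escape unclosed openers until the matching one.
                while (my $top = pop @open) {
                    last if $top->[0] eq $name;
                    $out[ $top->[1] ] = escape_tag($out[ $top->[1] ]);
                }
                push @out, $piece;
            }
            else {
                push @out, escape_tag($piece);             # no opener was seen
            }
        }
        elsif ($piece =~ m{^<([^\s>/]+)[^>]*>$}) {         # opening tag
            push @out, $piece;
            push @open, [ $1, $#out ];
        }
        else {
            # Plain text stays as-is; nameless tag-like pieces ("<>") are escaped.
            push @out, $piece =~ /^<[^>]*>$/ ? escape_tag($piece) : $piece;
        }
    }

    # End of input: anything still on the stack never got a closing tag.
    $out[ $_->[1] ] = escape_tag($out[ $_->[1] ]) for @open;
    return join '', @out;
}

# Prints: &lt;helloe&gt;How r u <a> www.google.com</a> <hi>How r u </hi>&lt;et,-2&gt;&lt;&gt;
print escape_unpaired('<helloe>How r u <a> www.google.com</a> <hi>How r u </hi><et,-2><>'), "\n";
```

Storing the output index next to each tag name lets an opener be escaped after the fact, once it turns out that no closing tag follows. Self-closing tags such as '<br/>' are treated as unclosed openers in this sketch and end up escaped; whether that is wanted depends on the requirements.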