in reply to convert characters

Substitute the following:

    @valid_entities = ('a','abbr','acronym','br');   # tags to leave unescaped
    my %htmlenties = map { $_ => 1 } @valid_entities;
    $line =~ s/
        <               # a tag open
        (               # begin capture group 1
            \/?         # an optional slash
            (           # begin capture group 2
                [^>]*?  # any number of characters that aren't closing tags
            )           # end capture 2
            \/?         # an optional slash (xhtml and all that)
        )               # end capture group 1
        >               # a closing tag
    /exists $htmlenties{$2} ? "<$1>" : defined ($1) ? "&lt;$1&gt;" : "&lt;"/xeg;   # different captures => different process

Re^2: convert characters
by Anonymous Monk on Aug 27, 2009 at 12:37 UTC
    #!/usr/bin/perl
    while ($line = <DATA>) {
        @valid_entities = ('a','abbr','acronym','br');   # tags to leave unescaped
        my %htmlenties = map { $_ => 1 } @valid_entities;
        $line =~ s/<(\/?([^>]*?)\/?)>/exists $htmlenties{$2} ? "<$1>" : defined ($1) ? "&lt;$1&gt;" : "&lt;"/xeg;
        print $line;
    }
    __DATA__
    <helloe>How r u <a> www.google.com</a> <hi>How r u </hi><et,-2><>
    The output I got from the above code is:
    &lt;helloe&gt;How r u <a> www.google.com</a> &lt;hi&gt;How r u &lt;/hi&gt;&lt;et,-2&gt;&lt;&gt;
    But the expected output is:
    &lt;helloe&gt;How r u <a> www.google.com</a> <hi>How r u </hi>&lt;et,-2&gt;&lt;&gt;
    Since '<hi>' has a matching '</hi>', it shouldn't be replaced.
      Then add 'hi' to your valid entities array.
      Did you write any of this code yourself?

      Update: My addition is not a complete solution by any means. For example, it will fail to correctly interpret the following (organising it so that it does could be a worthwhile exercise for you):

      <hi > <a href="http://permonks.org">Link</a> <input value="Next>" type="submit">
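One possible way to cope with the input above is to capture the tag name on its own, so that trailing whitespace and attributes do not take part in the whitelist lookup, and to let quoted attribute values contain ">". This is only a sketch under assumptions: the names %allowed, esc and escape_unknown_tags are hypothetical, 'hi' is added to the list as suggested above, and a real HTML parser would be more robust than any single regex.

```perl
use strict;
use warnings;

# Hypothetical whitelist: the thread's list plus 'hi'.
my %allowed = map { $_ => 1 } qw(a abbr acronym br hi);

# Escape every angle bracket in a string.
sub esc {
    my ($s) = @_;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

sub escape_unknown_tags {
    my ($line) = @_;
    $line =~ s{
        (                                      # capture 1: a whole tag
            < /?                               #   open bracket, optional slash
            ( [A-Za-z][A-Za-z0-9]* )           #   capture 2: the tag name only
            (?: "[^"]*" | '[^']*' | [^>"'] )*  #   attributes, quoted values allowed
            >
        )
      | ( [<>] )                               # capture 3: a stray bracket
    }{
        defined $3        ? esc($3)   # lone bracket: always escape
      : $allowed{ lc $2 } ? $1        # whitelisted tag: keep untouched
      :                     esc($1)   # any other tag: escape all of it
    }xge;
    return $line;
}

print escape_unknown_tags(
    '<hi > <a href="http://permonks.org">Link</a> <input value="Next>" type="submit">'
), "\n";
```

With this sketch the <hi > and <a href=...> tags pass through unchanged, while the <input ...> tag is escaped as a whole, including the ">" inside its value attribute.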
        $line =~ s/<(\/?([^>]*?)\/?)>/exists $htmlenties{$2} ? "<$1>" : defined ($1) ? "&lt;$1&gt;" : "&lt;"/xeg;
        How can I check whether an ending tag exists for a tag and, if it does, not replace it with &lt; and &gt;?
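One way to approach the question above might be a two-pass scheme: first collect every name that occurs as a closing tag in the line, then run the same substitution with one extra test. This is a minimal sketch, not a complete solution: %valid, %has_close and escape_unclosed_tags are hypothetical names, the check only looks for a closing tag somewhere in the same line (it ignores order and nesting), and tags with attributes are still not handled.

```perl
use strict;
use warnings;

# The whitelist from the thread (hypothetical name %valid).
my %valid = map { $_ => 1 } qw(a abbr acronym br);

sub escape_unclosed_tags {
    my ($line) = @_;

    # Pass 1: remember every name that appears as a closing tag, e.g. </hi>.
    my %has_close = map { $_ => 1 } $line =~ m{</([^<>/\s]+)\s*>}g;

    # Pass 2: keep a tag if it is whitelisted or has a closing tag,
    # otherwise escape its angle brackets.
    $line =~ s{<(/?([^>]*?)/?)>}{
        ( $valid{$2} || $has_close{$2} ) ? "<$1>" : "&lt;$1&gt;"
    }ge;

    return $line;
}

while ( my $line = <DATA> ) {
    print escape_unclosed_tags($line);
}

__DATA__
<helloe>How r u <a> www.google.com</a> <hi>How r u </hi><et,-2><>
```

For the sample line this leaves <hi> and </hi> alone because a closing </hi> is present, while <helloe>, <et,-2> and <> are still escaped.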