yskmonk:
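#!/usr/bin/perl
use strict;
use warnings;

my $outfile = "out.html";
my $final   = "final.txt";

# Read the HTML output and write the filtered lines to final.txt.
open my $in,  '<', $outfile or die "Cannot open $outfile: $!";
open my $out, '>', $final   or die "Cannot open $final: $!";

while (<$in>) {
    s/<br>//gi;    # strip <br> tags, case-insensitively
    # Keep lines where "lineN=" is followed by a digit, or by three letters and whitespace (a date).
    print {$out} $_ if /(line\d*=(\d|[A-Za-z]{3}\s).)/;
}

close $in;
close $out or die "Cannot close $final: $!";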
Input File: (out.html)
line1=10
line3=20
line3=30
line4=Mon May 18 02:28:58 EDT 2009
line5=60
line6=Mon May 18 02:28:58 EDT 2009
line7=Mon May 18 02:28:58 EDT 2009
line8=Mon May 18 02:28:58 EDT 2009
line20=Mon May 18 02:28:58 EDT 2009
line30=60
line40=Jambalaya #erroneous input, should not print to final.txt
line100=45
line 200=Mon May 18 02:27:58 EDT 2009
line1000=Mon May 18 02:28:58 EDT 2009
line1001=90
line 2000=Mon May 18 02:28:58 EDT 2009 #erroneous input, should not print to final.txt
line2001=100
line10000=50
line10001=Mon May 18 02:28:58 EDT 2009
Output file (final.txt):
line1=10
line3=20
line3=30
line4=Mon May 18 02:28:58 EDT 2009
line5=60
line6=Mon May 18 02:28:58 EDT 2009
line7=Mon May 18 02:28:58 EDT 2009
line8=Mon May 18 02:28:58 EDT 2009
line20=Mon May 18 02:28:58 EDT 2009
line30=60
line100=45
line1000=Mon May 18 02:28:58 EDT 2009
line1001=90
line2001=100
line10000=50
line10001=Mon May 18 02:28:58 EDT 2009
Note: This does not remove duplicates.
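
If duplicates do need to go, one common approach is a %seen hash. The following is only a sketch, and it rests on an assumption the post does not state: that a "duplicate" is a repeated key (the part before the "="), as with the two line3 entries in the sample, and that the first occurrence is the one to keep. The helper name dedupe_by_key is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): keep only the
# first line seen for each key, where the key is the text before "=".
# Assumption: "duplicate" means a repeated key, not a repeated whole line.
sub dedupe_by_key {
    my @lines = @_;
    my %seen;
    my @kept;
    for my $line (@lines) {
        # Lines without an "=" have no key and are passed through untouched.
        my ($key) = $line =~ /^([^=]+)=/;
        next if defined $key && $seen{$key}++;
        push @kept, $line;
    }
    return @kept;
}

my @sample = ("line1=10", "line3=20", "line3=30", "line5=60");
print "$_\n" for dedupe_by_key(@sample);
```

With the four sample lines this prints line1=10, line3=20 and line5=60. To treat only fully identical lines as duplicates instead, use the whole line as the hash key.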

In reply to Re^4: removing <br> from my output by raisputin
in thread removing <br> from my output by yskmonk
