in reply to Re^3: removing <br> from my output
in thread removing <br> from my output

yskmonk:
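#!/usr/bin/perl
use strict;
use warnings;

my $outfile = "out.html";    # input file
my $final   = "final.txt";   # filtered output

open(OUTFILE, "<", $outfile) or die "Cannot open $outfile: $!";
open(FINAL,   ">", $final)   or die "Cannot open $final: $!";

while (<OUTFILE>) {
    s/<br>//gi;    # strip <br> tags, case-insensitively
    # keep lines whose value starts with a digit or a three-letter word plus a space
    print FINAL if /(line\d*=(\d|[A-Za-z]{3}\s).)/;
}

close(OUTFILE);
close(FINAL) or die "Cannot close $final: $!";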
Input file (out.html):
line1=10
line3=20
line3=30
line4=Mon May 18 02:28:58 EDT 2009
line5=60
line6=Mon May 18 02:28:58 EDT 2009
line7=Mon May 18 02:28:58 EDT 2009
line8=Mon May 18 02:28:58 EDT 2009
line20=Mon May 18 02:28:58 EDT 2009
line30=60
line40=Jambalaya #erroneous input, should not print to final.txt
line100=45
line 200=Mon May 18 02:27:58 EDT 2009
line1000=Mon May 18 02:28:58 EDT 2009
line1001=90
line 2000=Mon May 18 02:28:58 EDT 2009 #erroneous input, should not print to final.txt
line2001=100
line10000=50
line10001=Mon May 18 02:28:58 EDT 2009
Output file (final.txt):
line1=10
line3=20
line3=30
line4=Mon May 18 02:28:58 EDT 2009
line5=60
line6=Mon May 18 02:28:58 EDT 2009
line7=Mon May 18 02:28:58 EDT 2009
line8=Mon May 18 02:28:58 EDT 2009
line20=Mon May 18 02:28:58 EDT 2009
line30=60
line100=45
line1000=Mon May 18 02:28:58 EDT 2009
line1001=90
line2001=100
line10000=50
line10001=Mon May 18 02:28:58 EDT 2009
Note: this does not remove duplicate lines.
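
If duplicates do need to go, a common Perl idiom is a %seen hash. This is only a minimal sketch, assuming "duplicate" means an identical whole line and that keeping the first occurrence is what's wanted; unique_lines is a hypothetical helper name, not something from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the lines in their
# original order, keeping only the first occurrence of each one.
sub unique_lines {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample data, just to show the call.
my @lines = ("line1=10\n", "line3=20\n", "line1=10\n", "line5=60\n");
print unique_lines(@lines);
```

Inside the filtering loop the same idea could probably be applied line by line, by adding a `!$seen{$_}++` check to the print condition.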

Re^5: removing <br> from my output
by yskmonk (Initiate) on May 20, 2009 at 18:47 UTC
    Thank you! It worked. I could even get rid of the duplicate lines.

    One more question: can I use the same approach if the input file has lines like the one below?

    line50=false
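      Is that in the same file, or in a different file that has line50=false? Either way, the regexp would need to be modified to account for that, with false added as a literal alternative (not as a character class like [false], which matches any single one of those letters). Because the trailing . requires one more character after the value, it has to move inside the other alternatives. Maybe something like print FINAL if /(line\d*=(\d.|false\b|[A-Za-z]{3}\s.))/;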
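
One way to check a candidate pattern before running it on the real file is to try it against a few sample lines. The sketch below assumes false should be accepted as a literal value. Both the pattern and the is_wanted helper are hypothetical variants for illustration, not code taken from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of the thread's pattern. It accepts:
#   - a digit followed by one more character,
#   - a three-letter word, whitespace, and one more character (start of a date),
#   - the literal word "false".
# As with the thread's pattern, a lone single-digit value does not match.
my $wanted = qr/line\d*=(?:\d.|[A-Za-z]{3}\s.|false\b)/;

# Hypothetical helper: returns 1 if the line would be kept, 0 otherwise.
sub is_wanted {
    my ($line) = @_;
    return $line =~ $wanted ? 1 : 0;
}

for my $line ("line50=false",
              "line30=60",
              "line40=Jambalaya",
              "line4=Mon May 18 02:28:58 EDT 2009",
              "line 200=Mon May 18 02:27:58 EDT 2009") {
    printf "%-40s %s\n", $line, is_wanted($line) ? "keep" : "skip";
}
```

If other words (true, yes, no) can show up as values, they would need their own alternatives in the same way.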