PerlMonks  

Re: parsing with regex

by mkmcconn (Chaplain)
on Nov 17, 2001 at 01:02 UTC ( [id://125916]=note )


in reply to parsing with regex

Everyone is so much quicker than I am. Oh well, here's my attempt, for what it's worth. It is built to handle some (by no means all) of the legal HTML variations in the text: all-uppercase or all-lowercase <hr> and <br> tags, and the <br /> form.

#!/usr/bin/perl -w
use strict;

# Paragraph mode: records end at blank lines. The data below has none,
# so the whole block is read as a single record.
$/ = '';
my %h;
while (<DATA>){
    # Cut the next "N is good ... <hr>" record out of $_ :
    # $1 is the record text, $2 is the number N.
    while ( s/((\d+) is good.+?)<(?:hr|HR)>//s ){
        my $good = $1;
        my $key = $2;
        # Turn every <BR>, <br> or <br /> (plus adjacent newline) into a "|"
        $good =~ s/\n?\s?<(?:BR|br).?.?>\n?/|/g;
        my @pot = split /\|/, $good;
        shift @pot;    # drop the "N is good" header itself
        $h{$key} = [@pot];
    }
}

use Data::Dumper;
print Data::Dumper->Dump([\%h],[qw(*h)]);

__DATA__
<HR>
1 is good<BR>
useless data<BR>useless data<BR>
useless data <BR>useless data<BR>
<hr>
2 is good<BR>
useless data<br>
useless data<BR>
useless data<br>
useless data<BR>
<hr>
3 is not good <BR>
useless data <br />useless data<br />useless data<BR>
useless data<BR>
<HR>
4 is good<BR>
useless data<BR>useless data<BR>
useless data<br>useless data<BR>
<HR>
5 is not good <BR>
useless data<BR>
useless data<BR>
useless data<BR>
useless data<BR>
<HR>
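For what it's worth, the same idea can be made a little more tolerant by splitting on the tags case-insensitively instead of listing the upper- and lowercase spellings. This is only a sketch under some assumptions of mine: the helper name parse_good_records and the sample data are made up for illustration, a record counts only if "N is good" is the first text after an <hr>, and (unlike the code above) a final record does not need a closing <hr>. It is still regex matching, not a real HTML parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): returns a hash ref
# mapping each "N is good" record number to the fields that follow it.
sub parse_good_records {
    my ($html) = @_;
    my %records;

    # Records are separated by <hr> in any case, with optional
    # attributes or a trailing slash: <HR>, <hr />, <Hr width="50%">.
    for my $chunk ( split /<hr\b[^>]*>/i, $html ) {

        # Assumption: a wanted record starts with "N is good".
        next unless $chunk =~ /\A\s*(\d+) is good\b/;
        my $key = $1;

        # Fields are separated by <br> in any form: <BR>, <br/>, <br />.
        my @fields = split /\s*<br\b[^>]*>\s*/i, $chunk;
        shift @fields;    # drop the "N is good" header itself

        # Trim surrounding whitespace and discard empty pieces.
        s/\A\s+|\s+\z//g for @fields;
        $records{$key} = [ grep { length } @fields ];
    }
    return \%records;
}

# Made-up sample data, only to exercise the mixed tag spellings.
my $sample = <<'END_HTML';
<HR>
1 is good<BR>
alpha<br />beta<Br>
gamma
<hr>
2 is not good<BR>
delta<BR>
<hr />
3 is good<br/>
epsilon<BR>
<HR>
END_HTML

my $parsed = parse_good_records($sample);
for my $key ( sort { $a <=> $b } keys %$parsed ) {
    print "$key: ", join( ', ', @{ $parsed->{$key} } ), "\n";
}
```

With the sample above this prints "1: alpha, beta, gamma" and "3: epsilon"; record 2 is skipped because it is "not good". Splitting first and then filtering avoids having to consume the closing <hr> with a non-greedy match, at the cost of being stricter about where "N is good" may appear.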

By the way, you asked your question very well, complete with a good data example. It's appreciated.
(Better Data::Dumper usage, thanks to sacked and tilly.)
mkmcconn
