
Re: parsing with regex

by mkmcconn (Chaplain)
on Nov 17, 2001 at 01:02 UTC (#125916)

in reply to parsing with regex

Everyone is so much quicker than me. Oh well, here's my attempt for what it's worth, constructed to handle some (by no means all) of the HTML-legal variations in the text, such as upper- or lower-case tags and the <br /> form.

#!/usr/bin/perl -w
use strict;

$/ = '';
my %h;
while (<DATA>){
    while ( s/((\d+) is good.+?)<(?:hr|HR)>//s ){
        my $good = $1;
        my $key  = $2;
        $good =~ s/\n?\s?<(?:BR|br).?.?>\n?/|/g;
        my @pot = split /\|/, $good;
        shift @pot;
        $h{$key} = [@pot];
    }
}
use Data::Dumper;
print Data::Dumper->Dump([\%h],[qw(*h)]);
__DATA__
<HR>
1 is good<BR>
useless data<BR>useless data<BR>
useless data
<BR>useless data<BR>
<hr>
2 is good<BR>
useless data<br>
useless data<BR>
useless data<br>
useless data<BR>
<hr>
3 is not good
<BR>
useless data
<br />useless data<br />useless data<BR>
useless data<BR>
<HR>
4 is good<BR>
useless data<BR>useless data<BR>
useless data<br>useless data<BR>
<HR>
5 is not good
<BR>
useless data<BR>
useless data<BR>
useless data<BR>
useless data<BR>
<HR>
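For comparison, here is a rough sketch of the same idea done with split instead of destructive substitution. It is a variant, not the code above: parse_good is a made-up name, the /i flag is used so mixed-case tags like <Hr> also match, and (unlike the original) each field is trimmed and empty fields are dropped. It assumes the record header is exactly "N is good".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): returns a hash ref
# mapping each "good" record number to the list of fields that follow it.
sub parse_good {
    my ($html) = @_;
    my %h;
    # Records are separated by <hr> in any letter case.
    for my $rec ( split /<hr\s*\/?>/i, $html ) {
        my @fields;
        # Fields are separated by <br>, <BR> or <br />.
        for my $f ( split /<br\s*\/?>/i, $rec ) {
            $f =~ s/^\s+//;    # trim leading whitespace
            $f =~ s/\s+$//;    # trim trailing whitespace
            push @fields, $f if length $f;
        }
        next unless @fields;
        my $head = shift @fields;
        # Keep only records whose first field reads "N is good".
        next unless $head =~ /^(\d+) is good$/;
        $h{$1} = [@fields];
    }
    return \%h;
}

# Small demonstration on made-up input.
my $demo = parse_good("<HR>\n7 is good<BR>\nfoo<br />bar<BR>\n<hr>\n8 is not good<BR>baz<BR>\n<HR>");
print "$_: @{ $demo->{$_} }\n" for sort keys %$demo;
```

Whether trimming the fields is wanted depends on the real data; the substitution version above leaves most inner whitespace alone.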

By the way, you asked your question very well, complete with a good data example. It's appreciated.
(better Data::Dumper, thanks to sacked and tilly).
