Re: HTML Formattting

Update: Ah, I see. Then look at "grab only if pattern matches previous line"; there are a bunch of solutions to your problem there.

~~I really don't understand what you're trying to do, but from what I do understand, the following should work (untested; I simply assume you want capturing parentheses in your regexp):~~

my $test = ";Town ; London;";
if ($test =~ /Town ;([^;]+);/) {
    print $1; # $1 contains ' London' (note the leading space)
}

regards,
tomte

An intellectual is someone whose mind watches itself.
-- Albert Camus

Re: Re: HTML Formattting by minixman (Beadle) on May 13, 2004 at 10:31 UTC
Sorry. Well, the HTML file that comes back to my program, let's say test.html, has certain entries like location, country, etc., so the file looks like this:

;Cost Center Desc. ;FX IT EFX; ;Town ;London ;

So what I want to do is search for Town and then grab the answer on the next line. Something like $line =~ /Town/, and the answer I am looking for is the next line, which is London. These came from a table in HTML format, and I stripped out the HTML so I just have text; this is what I am left with:

    #!/home/cuthbe/bin/perl
    open(T,"test.html") || die ("Unable to open file. $!\n");
    foreach (<T>) {
        s/<[^>]+>//g;
        s/&nbsp//g;
        s/<!//g;
        if($_ =~ /Town/) {
            printf $$;
        }
        #printf;
    }

Edited by Chady -- added code tags.
Re: Re: Re: HTML Formattting by muntfish (Chaplain) on May 13, 2004 at 11:09 UTC
How about something like this:

    open(T,"test.html") || die ("Unable to open file. $!\n");
    while(<T>) {
        # preprocessing with s/.../ as above
        if (/^;Town/) {
            $town = <T>;
            chomp $town;
            $town =~ s/;//g;
            print "Found a town: $town\n";
        }
    }

Note, I've only tested this very quickly. It could probably do with better error handling etc...
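
A slightly more defensive take on the same read-the-next-line idea might look like the sketch below. It assumes the stripped file has the label on one line and the value on the next. The sub name `value_after_label` and the in-memory sample data are made up for illustration and are not from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line that follows the first line
# containing $label, with ';' and surrounding whitespace removed.
# Returns undef if the label is never seen or sits on the last line.
sub value_after_label {
    my ($fh, $label) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /\Q$label\E/;   # \Q..\E: treat label literally
        my $value = <$fh>;                   # read the following line
        return unless defined $value;        # label was the last line
        $value =~ s/;//g;                    # drop field separators
        $value =~ s/^\s+|\s+$//g;            # trim blanks and the newline
        return $value;
    }
    return;                                  # label not found
}

# Sample data with an assumed layout: one field per line.
my $sample = ";Cost Center Desc.\n;FX IT EFX;\n;Town\n;London\n;\n";
open(my $fh, '<', \$sample) or die "Unable to open in-memory file: $!\n";
my $town = value_after_label($fh, 'Town');
close($fh);
print defined $town ? "Found a town: $town\n" : "No town found\n";
```

Reading from a real file only needs the `open` line changed to a file name; the lexical filehandle and three-argument `open` keep the handle scoped and avoid surprises with odd file names.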
Re: Re: Re: Re: HTML Formattting by minixman (Beadle) on May 13, 2004 at 14:35 UTC
I did find a quick way that works very well. If you have a file that looks something like this:

; Notes ID ; bhenry

this will work:

    open(R,"file.html") || die("error, $!\n");
    while (<R>) {
        $pattern = "Notes ID";
        next if ! /$pattern/;
        die "End of EOF \n" if ! defined ($_=<R>);
        $_ =~ s/<[^>]+>//g;   # This just removes HTML tags
        $_ =~ s/<!//g;        # This removes more HTML tags
        $_ =~ s/&nbsp//g;     # Remove HTML spaces
        $_ =~ s/;//g;         # Remove ;
        print "$pattern = $_ \n";   # Print pattern
    }
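
Another option, closer to the capturing-parentheses idea at the top of the thread, is to hold the whole text in one string and let a single regex span the line break. This is only a sketch: the sub name `field_after` and the sample text are hypothetical, and it assumes the label and its value sit on consecutive lines.

```perl
use strict;
use warnings;

# Hypothetical helper: find $label in $text and capture the content of
# the next line, minus ';' separators and surrounding blanks.
sub field_after {
    my ($text, $label) = @_;
    # [^\n]*\n   skips the rest of the label line
    # [; \t]*    skips leading separators and blanks on the value line
    # ([^;\n]*?) captures the value lazily, so trailing blanks are left out
    if ($text =~ /\Q$label\E[^\n]*\n[; \t]*([^;\n]*?)\s*(?:;|\n|\z)/) {
        return $1;
    }
    return;   # label not found, or no line after it
}

# Sample text with an assumed layout: label line, then value line.
my $text = "; Notes ID\n; bhenry\n";
my $id = field_after($text, 'Notes ID');
print "Notes ID = $id\n" if defined $id;
```

This trades the line-by-line loop for one pattern, which is convenient for small files but means the whole file has to be in memory first.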