in reply to HTML Formattting

Update: Ah, I see. Then have a look at "grab only if pattern matches previous line"; there are a bunch of solutions to your problem there.

I don't really understand what you're trying to do, but from what I do understand, the following should work (untested; I simply assume you want capturing parentheses in your regexp):

my $test = ";Town ; London;";
if ($test =~ /Town ;([^;]+);/) {
    print $1;    # $1 contains ' London' (with the leading space)
}
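
If the stray spaces around the value matter, a variant like the one below trims them inside the pattern itself. This is only a sketch: the `\s*` handling and the variable name `$town` are my assumptions about what is wanted, not something taken from the real data.

```perl
use strict;
use warnings;

my $test = ";Town ; London;";

# \s* swallows whitespace on both sides of the value; the non-greedy
# ([^;]+?) keeps trailing blanks out of the capture.
my $town;
if ($test =~ /Town\s*;\s*([^;]+?)\s*;/) {
    $town = $1;
    print "$town\n";    # prints "London"
}
```

Copying `$1` into a named variable right away avoids surprises if another match runs later and overwrites it.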

regards,
tomte


An intellectual is someone whose mind watches itself.
-- Albert Camus

Re: Re: HTML Formattting
by minixman (Beadle) on May 13, 2004 at 10:31 UTC
    Sorry,

    Well, the HTML file that comes back to my program, let's say test.html, has certain entries like location, country, etc.

    So the file looks like this:

    ;Cost Center Desc.
    ;FX IT EFX;

    ;Town
    ;London ;

    So what I want to do is search for Town and then grab the answer on the next line.

    Something like $line =~ /Town/, and the answer I am looking for is on the next line, which is London. These entries came from a table in HTML format, and I stripped out the HTML so that I just have text; this is what I am left with:

    #!/home/cuthbe/bin/perl
    open(T,"test.html") || die ("Unable to open file. $!\n");
    foreach (<T>) {
        s/<[^>]+>//g;
        s/&nbsp//g;
        s/<!//g;
        if($_ =~ /Town/) {
            printf $$;
        }
        #printf;
    }

    Edited by Chady -- added code tags.

      How about something like this:

      open(T,"test.html") || die ("Unable to open file. $!\n");
      while(<T>) {
          # preprocessing with s/.../ as above
          if (/^;Town/) {
              $town = <T>;
              chomp $town;
              $town =~ s/;//g;
              print "Found a town: $town\n";
          }
      }

      Note, I've only tested this very quickly. It could probably do with better error handling etc...
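
On the error handling point, here is one way it could look. It is a sketch under assumptions, not a tested drop-in: `value_after` is a hypothetical helper name, the sample text is an in-memory stand-in for the stripped test.html, and trimming the surrounding blanks is my guess at the desired output.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value on the line after the first line
# that starts with ";$label", or undef if the label or the following line
# is missing.
sub value_after {
    my ($fh, $label) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /^\s*;\s*\Q$label\E/;
        my $value = <$fh>;
        return undef unless defined $value;   # label was the last line
        chomp $value;
        $value =~ s/;//g;                     # drop the separators
        $value =~ s/^\s+|\s+$//g;             # trim surrounding blanks
        return $value;
    }
    return undef;                             # label never seen
}

# In-memory stand-in for the stripped test.html, so the sketch runs as is.
my $sample = ";Cost Center Desc.\n;FX IT EFX;\n\n;Town\n;London ;\n";
open(my $fh, '<', \$sample) or die "Unable to open sample: $!\n";
my $town = value_after($fh, 'Town');
close $fh;

print defined $town ? "Found a town: $town\n" : "No town found\n";
```

The `defined` check is what guards against the label being the very last line of the file, which would otherwise leave `chomp` working on an undefined value.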

        I did find a quick way that works very well.

        If you have a file that looks something like this:
        ; Notes ID
        ; bhenry

        this will work:

        open(R,"file.html") || die("error, $!\n");
        $pattern = "Notes ID";
        while (<R>) {
            next if ! /$pattern/;
            die "Unexpected end of file\n" if ! defined ($_=<R>);
            $_ =~ s/<[^>]+>//g;    # This just removes HTML tags
            $_ =~ s/<!//g;         # This removes more HTML tags
            $_ =~ s/&nbsp//g;      # Remove HTML spaces
            $_ =~ s/;//g;          # Remove ;
            print "$pattern = $_ \n";    # Print pattern
        }
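
The same label-then-value idea can probably be stretched to collect several fields in one pass. The following is a rough sketch with assumptions spelled out: the `@labels` list, the `%field` hash, and the in-memory sample (with made-up `<td>` tags around the data) are all hypothetical and only there so the code runs on its own.

```perl
use strict;
use warnings;

# Labels to look for. 'Notes ID' and 'Town' appear in this thread;
# treating them as one list is an assumption.
my @labels = ('Notes ID', 'Town');

# In-memory stand-in for the HTML file, so the sketch runs without one.
my $sample = "<td>; Notes ID</td>\n<td>; bhenry</td>\n\n;Town\n;London ;\n";
open(my $in, '<', \$sample) or die "error, $!\n";

my %field;
while (my $line = <$in>) {
    for my $label (@labels) {
        next unless $line =~ /\Q$label\E/;
        my $value = <$in>;
        die "Unexpected end of file after '$label'\n" unless defined $value;
        $value =~ s/<[^>]+>//g;      # remove HTML tags
        $value =~ s/&nbsp;?//g;      # remove non-breaking spaces
        $value =~ s/;//g;            # remove the separators
        $value =~ s/^\s+|\s+$//g;    # trim blanks and the newline
        $field{$label} = $value;
        last;                        # this line is handled, read the next
    }
}
close $in;

print "$_ = $field{$_}\n" for sort keys %field;
```

Keeping the results in a hash means the rest of the program can ask for `$field{'Town'}` directly instead of scanning the file once per label.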