What you're attempting as a first program is, IMHO, too tough for a complete beginner. So, like kcott, I suggest you read perlintro or some of these Learning Perl links.

Then write some simpler programs first, to gain some confidence. Feel free to ask more questions if you get stumped. Once you've done that (it will probably take a week or two), return to your original problem.

That said, I can see you're very determined to try to solve your real world problem immediately! If so, try running this simple program:

    use strict;
    use warnings;

    my $ca = "california.html";
    open(my $f1, "<", $ca) or die "Can't open file '$ca': $!";
    while ( my $line = <$f1> ) {
        print "line: $line";
        if ( $line =~ m{Employee +([^<]+)</th><th>([^<]+)} ) {
            my $name = $1;
            my $two  = $2;
            print " name='$name' two='$two'\n";
        }
    }
    close($f1);

on your original test california.html file:
    </tr></table></body><body bgcolor="black"><h1>
    Summary</h1><table border="1"><tr><th>Employee A</th><th>-0.82</th>
    </tr><tr><th>Employee B</th><th>-5.02</th>
    </tr><tr><th>Employee C</th><th>19</th>
    </tr></table></body><body bgcolor="black"><h1>
    Summary</h1><table border="1"><tr><th>Employee A</th><th></th>
    </tr><tr><th>Employee B</th><th></th>
    </tr><tr><th>Employee C</th><th></th>

which should produce the following output:
    line: </tr></table></body><body bgcolor="black"><h1>
    line: Summary</h1><table border="1"><tr><th>Employee A</th><th>-0.82</th>
     name='A' two='-0.82'
    line: </tr><tr><th>Employee B</th><th>-5.02</th>
     name='B' two='-5.02'
    line: </tr><tr><th>Employee C</th><th>19</th>
     name='C' two='19'
    line: </tr></table></body><body bgcolor="black"><h1>
    line: Summary</h1><table border="1"><tr><th>Employee A</th><th></th>
    line: </tr><tr><th>Employee B</th><th></th>
    line: </tr><tr><th>Employee C</th><th></th>

Now, take the time to understand how the above program works by reading the introductory Perl links above. Feel free to ask any questions about it.
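
If the regex is the part that looks most cryptic, it may help to see the same pattern spread out with the /x modifier so each piece can carry a comment. This is only a sketch: the helper name parse_line and the two sample strings are mine, made up for illustration, and it works on in-memory strings rather than on california.html.

```perl
use strict;
use warnings;

# The same pattern as in the program above, written with /x so that
# whitespace is ignored and each part can be commented.
my $re = qr{
    Employee [ ]+     # the literal word "Employee", then one or more spaces
    ( [^<]+ )         # capture 1: one or more characters that are not '<'
    </th><th>         # the literal tags that separate the two cells
    ( [^<]+ )         # capture 2: again, one or more characters that are not '<'
}x;

# Hypothetical helper (not part of the program above):
# returns (name, value) on a match, or an empty list otherwise.
sub parse_line {
    my ($line) = @_;
    return $line =~ $re ? ($1, $2) : ();
}

for my $line (
    'Summary</h1><table border="1"><tr><th>Employee A</th><th>-0.82</th>',
    '</tr><tr><th>Employee B</th><th></th>',
) {
    my @got = parse_line($line);
    print @got ? "matched: name='$got[0]' two='$got[1]'\n" : "no match\n";
}
```

The second sample line does not match because capture 2 needs at least one character before the next '<', which is why the rows with an empty second cell print no name/two line in the output above.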

Please note that I am NOT endorsing the above program as a sound way to solve your real world problem. It is just a simple program, directly related to your real world problem, to help motivate you to learn some Perl basics. For a sound solution to your problem, I suspect HTML-Parser is the way to go.
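
To give a rough idea of what the HTML::Parser route might look like, here is a minimal sketch, not a finished solution. It assumes HTML::Parser is installed from CPAN, that every value you care about sits in a th cell, and that each tr holds one name cell followed by one value cell, as in your sample. The helper name extract_rows is hypothetical.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: returns a list of rows, each row an array ref
# holding the text of every <th> cell found in that <tr>.
sub extract_rows {
    my ($html) = @_;
    my @rows;
    my $cell;    # undef outside a <th>; accumulated text while inside one

    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag) = @_;
            push @rows, [] if $tag eq 'tr';    # a new row starts
            $cell = ''     if $tag eq 'th';    # a new cell starts
        }, 'tagname' ],
        text_h => [ sub {
            my ($text) = @_;
            $cell .= $text if defined $cell;   # keep text only inside <th>
        }, 'dtext' ],
        end_h => [ sub {
            my ($tag) = @_;
            return unless $tag eq 'th' && defined $cell;
            push @rows, [] unless @rows;       # tolerate a <th> with no <tr>
            push @{ $rows[-1] }, $cell;
            undef $cell;
        }, 'tagname' ],
    );
    $parser->parse($html);
    $parser->eof;
    return @rows;
}

my $html = '<table border="1"><tr><th>Employee A</th><th>-0.82</th> </tr>'
         . '<tr><th>Employee B</th><th></th> </tr></table>';

for my $row ( extract_rows($html) ) {
    next unless @$row == 2;                    # skip rows of unexpected shape
    my ($name, $value) = @$row;
    print "name='$name' two='$value'\n";
}
```

Unlike the regex version, this does not care where the line breaks fall in the file, and an empty second cell comes back as an empty string instead of being silently skipped. Note that the name here is the whole cell text ("Employee A"), so you would still need to strip the "Employee " prefix yourself if you only want the letter.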


In reply to Re: HTML::Parser / Regex by eyepopslikeamosquito
in thread HTML::Parser / Regex by MissPerl
