Perl Newby has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file that I am trying to run a sub on to get the data out. I just want to get the following out of the file and stop after the second TOTALS line. There is more data below that, but I only need the first four chunks. I have begun writing a sub but seem to have hit a wall. Any suggestions would be appreciated.
24|St. Louis Cardinals
1|4995|Fernando Vina|2B|4|0|0|0|0|0|0|.317|.397|.425
2|5602|Edgar Renteria|SS|4|0|1|0|1|0|1|.260|.321|.470
3|5151|Jim Edmonds|CF|3|1|2|1|5|1|0|.409|.538|.864
4|3866|Mark McGwire|1B|3|1|1|1|4|1|0|.323|.494|.903
5|4547|Ray Lankford|LF|4|0|0|0|0|0|3|.193|.297|.420
6|5046|Craig Paquette|3B|3|0|1|0|2|1|1|.264|.325|.486
7|6117|J.D. Drew|RF|4|0|1|0|1|0|1|.327|.462|.596
8|5205|Mike Matheny|C|3|0|0|0|0|0|0|.289|.375|.421
8|4506|Thomas Howard|PH|1|0|1|0|1|0|0|.280|.357|.760
8|5897|Eli Marrero|PR|0|0|0|0|0|0|0|.250|.343|.643
9|4379|Andy Benes|P|2|0|0|0|0|0|0|.167|.167|.167
9|3625|Eric Davis|PH|1|0|0|0|0|0|1|.281|.388|.491
9|4625|Heathcliff Slocumb|P|0|0|0|0|0|0|0|.000|.000|.000
9|4979|Mike Mohler|P|0|0|0|0|0|0|0|1.000|1.000|1.000
9|5880|Larry Sutton|PH|1|0|0|0|0|0|0|.000|.000|.000

TOTALS|33|2|7|2|14|3|7

17|Cincinnati Reds
1|5746|Pokey Reese|2B|3|1|1|1|1|0|0|.343|.408|.429
2|5930|Sean Casey|1B|3|0|1|0|1|1|0|.200|.344|.320
3|4305|Ken Griffey Jr.|CF|4|1|2|2|5|0|1|.210|.326|.448
4|5699|Dmitri Young|LF|4|0|1|0|1|0|1|.287|.339|.446
4|5540|Alex Ochoa|LF|0|0|0|0|0|0|0|.297|.357|.595
5|4615|Eddie Taubensee|C|4|0|0|0|0|0|1|.325|.387|.542
6|4159|Dante Bichette|RF|3|0|2|0|2|0|0|.219|.265|.324
6|6174|Scott Williamson|P|0|0|0|0|0|0|0|-|-|-
7|5838|Aaron Boone|3B|3|0|0|0|0|0|0|.270|.389|.438
8|5508|Juan Castro|SS|3|1|1|0|2|0|0|.333|.333|.667
9|5346|Ron Villone|P|1|0|0|0|0|0|0|.222|.222|.222
9|5299|Michael Tucker|RF|0|0|0|0|0|0|0|.167|.375|.500

TOTALS|28|3|8|3|12|1|3
I am trying to use the following code to get this information out of the file. As I said above, there is a lot of data in the file, but I only need the first four sections.
open(INPUT,"c:/MLB_boxscore.TXT") or die "Can't open file";
print "<html><head><title>Box Score</title></head><body>";
print "<table>";
my $insection = "";
my $hr, $db, $hrd, $y, $l, $linescore;
$hr = 0;
$db = 0;
$hrd = 0;
$y = 0;
$l = 0;
$v = 0;
$h = 0;
$linescore = 0;
while (<INPUT>) {
    if ($y eq 4){$linescore = -1} else {$linescore = 0}
    if ($linescore eq 0){
        linescore($y, $l);
        $y = $x;
    }
    if ($linescore ne 0){
        sub linescore {
            if (/^\s*$/) {
                $l = 0;
                $y += 1;
                $b = 0;
            } else {
                if ($y eq 0){
                    players($y, $l);
                    $v_team = $team;
                    $b = $b;
                    $l = $l;
                    $v_line[$b] = $line;
                    return $v_line[$b], $l;
                }
                if ($y eq 1){
                    totals($y);
                    $b = $b;
                    $v_total[$b] = $total;
                    return $v_total[$b];
                }
                if ($y eq 2){
                    players($y);
                    return $h_team = $team;
                    $b = $b;
                    return $h_line[$b] = $line;
                }
                if ($y eq 3){
                    totals($y);
                    $b = $b;
                    return $h_total[$b] = $total;
                }
            }
            $x = $y;
            return $x, $l;
        }
        sub players{
            if ($l eq 0){
                chomp;
                @LS = ();
                push @LS, split('\|',$_);
                $team = $LS[0];
                $l = 1;
                return $team, $l;
            }
            if ($l ne 0){
                print "mike";
                my @linescores = (0,2,3,4,5,6,7,8,9,10,11,12,13);
                chomp;
                @LS = ();
                push @LS, split('\|',$_);
                my $line = "";
                for ($i=0;$i<$#LS;$i++){
                    $a = $linescores[$i];
                    $linescore[$i] = $LS[$a];
                    print $linescore[$i];
                }
                my $line = $line . "<tr bgcolor='#CCCCCC'>";
                for ($x=1;$x<=$#linescore;$x++){
                    $line = $line . "<td>" . $linescore[$x] . "</td>";
                    $linescore[$x] = undef;
                }
                $line = $line . "</tr>";
                $b += 1;
                return $line, $b;
            }
        }
        sub totals{
            my @totals = (0,4,5,6,7,8,9,10);
            chomp;
            @LS = ();
            push @LS, split('\|',$_);
            my $total = "";
            for ($i=0;$i<$#LS;$i++){
                $a = $totals[$i];
                $tot[$i] = $LS[$a];
            }
            my $total = $total . "<tr bgcolor='#CCCCCC'>";
            for ($x=1;$x<=$#tot;$x++){
                $total = $total . "<td>" . $tot[$x] . "</td>";
                $tot[$x] = undef;
            }
            $total = $total . "</tr>";
            $b += 1;
            return $total, $b;
        }
        print "</table>";
        print "</table></body></html>";
        close INPUT;

Replies are listed 'Best First'.
Re: Running a Sub for a Text File
by astanley (Beadle) on Apr 19, 2001 at 19:38 UTC
    Why don't you check the $_ at the top of your while loop and if it equals "TOTALS" then increment a counter by one. On the line after that you can check the value of the counter...and if it equals whatever number you want to stop at use continue to break the loop?
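
A minimal sketch of the counter idea, assuming each totals line begins with the literal text TOTALS followed by a pipe. The helper name lines_through_totals and the in-memory sample are made up for illustration. Perl's statement for leaving a loop early is last.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line up to and including the Nth TOTALS line.
sub lines_through_totals {
    my ($fh, $wanted) = @_;
    my $seen = 0;                     # TOTALS lines read so far
    my @kept;
    while (my $line = <$fh>) {
        push @kept, $line;
        $seen++ if $line =~ /^TOTALS\|/;
        last if $seen == $wanted;     # leave the loop as soon as the Nth TOTALS is in
    }
    return @kept;
}

# Small in-memory sample standing in for the real box score file.
my $sample = join "\n",
    '24|St. Louis Cardinals',
    '1|4995|Fernando Vina|2B|4|0|0|0|0|0|0|.317|.397|.425',
    '',
    'TOTALS|33|2|7|2|14|3|7',
    '',
    '17|Cincinnati Reds',
    '1|5746|Pokey Reese|2B|3|1|1|1|1|0|0|.343|.408|.429',
    '',
    'TOTALS|28|3|8|3|12|1|3',
    '',
    'data that should never be read',
    '';

open my $fh, '<', \$sample or die "Can't open in-memory sample: $!";
my @kept = lines_through_totals($fh, 2);
close $fh;

print @kept;
```

Matching with a regular expression rather than testing $_ for equality avoids having to chomp the line or strip the numbers after TOTALS first.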

    -Adam Stanley
    Nethosters, Inc.
Re: Running a Sub for a Text File
by buckaduck (Chaplain) on Apr 19, 2001 at 19:51 UTC
    Try setting $/ = ""; before you start to read in data. This will cause each <INPUT> to grab one paragraph of data, rather than one line. See perlvar.

    Note that each paragraph will include the blank line at the end.

    So you could say:

    $/ = "";
    $players1 = <INPUT>;
    $totals1 = <INPUT>;
    $players2 = <INPUT>;
    $totals2 = <INPUT>;
    ... to grab your data in chunks. You could split each chunk on "\n" characters to get the individual players.

    Update: I should mention that this presumes your blank lines are really blank, with no whitespace!
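
Here is a small sketch of the paragraph-mode approach, assuming the chunks really are separated by empty lines. The sample data is abbreviated, and the in-memory filehandle only stands in for the real file. Using local keeps the change to $/ confined to one block.

```perl
use strict;
use warnings;

# Abbreviated sample: two player chunks and two totals chunks, then extra data.
my $sample = "24|St. Louis Cardinals\n"
           . "1|4995|Fernando Vina|2B|4|0|0|0|0|0|0|.317|.397|.425\n"
           . "\n"
           . "TOTALS|33|2|7|2|14|3|7\n"
           . "\n"
           . "17|Cincinnati Reds\n"
           . "1|5746|Pokey Reese|2B|3|1|1|1|1|0|0|.343|.408|.429\n"
           . "\n"
           . "TOTALS|28|3|8|3|12|1|3\n"
           . "\n"
           . "later data\n";

open my $fh, '<', \$sample or die "Can't open in-memory sample: $!";

my @chunks;
{
    local $/ = "";                 # paragraph mode, restored when this block ends
    while (my $chunk = <$fh>) {
        push @chunks, $chunk;
        last if @chunks == 4;      # stop after the first four paragraphs
    }
}
close $fh;

for my $chunk (@chunks) {
    my @lines = split /\n/, $chunk;   # trailing empty fields are dropped by split
    print scalar(@lines), " line(s), starting with: $lines[0]\n";
}
```

If the separator lines might contain spaces or tabs, paragraph mode will not see them as blank, so the data would need cleaning first or a line-by-line loop instead.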

    buckaduck

Re: Running a Sub for a Text File
by arturo (Vicar) on Apr 19, 2001 at 19:38 UTC

    What's a "chunk"? What's not working? I'm willing to help, but I don't want to have to go over your whole program with a microscope and try to figure out what you *want* to do with it. Please be more specific in your questions. Also, it might help us to help you if we know what your *overall* goal is here (something to do with getting the data out of the format it's in and putting it into HTML tables -- but which parts of the data do you need?)

    Following are some pointers that jumped out at me that have probably tangential relations to the problem of structuring your algorithm:

    • eq is for comparing strings, == is for comparing numerical values.
    • put use strict at the top of your script, and turn on warnings with -w on the #! line. This will help you immensely in tracking down errors. In fact, put use diagnostics up there at the top to get nice long explanatory error messages, very useful while you're learning.
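
To illustrate both pointers, here is a short sketch with made-up variable names. It compares the same two values once as numbers and once as strings, under use strict and -w.

```perl
#!/usr/bin/perl -w
use strict;   # an undeclared or misspelled variable name now stops compilation

my $chunk_count = 4;        # a number
my $from_file   = '4.0';    # text as it might arrive from a file

# == converts both sides to numbers, so 4.0 and 4 match.
my $numeric_match = ($from_file == $chunk_count) ? 1 : 0;

# eq compares the text character by character, so '4.0' and '4' differ.
my $string_match  = ($from_file eq $chunk_count) ? 1 : 0;

print "==: $numeric_match, eq: $string_match\n";
```

This prints "==: 1, eq: 0". Counters such as $y and $l in the original script hold numbers, so == and != are the natural operators for them.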

    Hope these suggestions help.

Re: Running a Sub for a Text File
by ChemBoy (Priest) on Apr 19, 2001 at 20:18 UTC

    First of all, let me say how happy I am to see somebody else combining two of my favorite subjects. :-)

    More helpfully... I'm surprised nobody's mentioned this yet, but looking at the history of this quest, I don't think anybody has (my apologies if I missed it): use CGI.pm! It will simplify your life. (Better yet, use CGI.pm and use stylesheets to get those color codes out of your HTML--but that's my personal ideology speaking, and off-topic.)

    use CGI qw(:all -nodebug);
    will get you all the functions you need. It looks like you might have enough time invested in this already that it's not worth going back and changing everything now, but if you start another project like this one, give it a look.

    But on the particular problem you're having right now, I'm not sure what to say, so I'll degenerate into nitpicks a little bit: if you comment the code, and use descriptive variable names, it will be easier for poor sods like us to figure out, and we'll be more helpful. And everything that arturo said is most wise and to be heeded (as usual).

    Astanley's remarks are also very wise with the exception that he spaced which language he was in--the keyword in question is last, not continue. ;-)



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

      Hear, hear! use CGI and format with (linked) stylesheets.

      If you are working with even a small number of pages, stylesheets pay off bigtime. But getting them to work across browsers can be a bear. Check out CSS Resources, it may help.

      Back on the subject of things more Perlish, I use CGI for form handling, and parameter parsing. But I prefer to use some type of templates for the html, unless I am doing very simple HTML. If I embed code, I always make heavy use of custom classes (a very nice feature of CSS). This way I don't have to edit the script to change the appearance of the page.


      TGI says moo

Re: Running a Sub for a Text File
by dvergin (Monsignor) on Apr 19, 2001 at 22:55 UTC
    Perl Newby, you have clearly mastered several concepts and you are at the point where you are generating enough complexity in your code that some care on several points will really help you keep track of what you are doing and what is happening.

    Here's where I would start:

    First and most important: insert the line

    use strict;

    near the top of your code. You do declare some variables, but you also have a number of cases where variables spring into existence without warning.

    As others have suggested, use meaningful variable names. This will help us see what is happening and will also help you avoid slip-ups with a program of this size where you have a fair number of values floating around.

    Explicitly pass the values needed in each sub rather than depending on grabbing them from global variable space. This will go a long way toward helping manage what is happening with your various values.

    Were you aware that you have built your subs to return values but you are not doing anything with those returned values? Perhaps you are assuming that they are returned without your explicitly capturing them where the sub is called.
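
As a rough sketch of the last two points, the sub below takes its input as an argument and the caller stores what it returns. The name row_from_record is hypothetical and the HTML is simplified.

```perl
use strict;
use warnings;

# Hypothetical helper: turns one pipe-delimited record into an HTML table row.
# It reads only its argument and hands the finished row back to the caller.
sub row_from_record {
    my ($record) = @_;
    chomp $record;
    my @fields = split /\|/, $record;
    return '<tr>' . join('', map { "<td>$_</td>" } @fields) . '</tr>';
}

# The return value is lost unless the caller captures it.
my $totals_row = row_from_record("TOTALS|33|2|7|2|14|3|7\n");
print "$totals_row\n";
```

Because the sub touches no globals, it can be tested on a single line of data without opening the file at all.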

    Give these and the other suggestions in the other responses a try and I think you will be able to get more control over what is happening in your code.

    Actually encouragement and congratulations are in order. You have advanced to the point that you are grappling with a problem that is complex enough that the quick-and-dirty approach no longer works. A wee bit of discipline at this point will allow you to make very effective use of the skills you have already mastered as you attack larger, more challenging problems.

    Cheers,
    dvergin

Re: Running a Sub for a Text File
by suaveant (Parson) on Apr 19, 2001 at 21:36 UTC
    Just a word of advice.. in your code you do...
    if ($linescore eq 0){
        linescore($y, $l);
        $y = $x;
    }
    if ($linescore ne 0){
    ne and eq and gt and lt and cmp (and others) are designed for comparing strings... and though it probably won't hurt you here, you probably want to get in the habit of using == and != when comparing numbers instead of strings. Otherwise you may have problems if you have:
    $_ = '02'; print "equal\n" if $_ eq 2;
    because 02 as a string is not equal to 2 as a string...
    I know it has bit me before.
                    - Ant