Re: Printing data from a file in various ways

I gather you have a large percentage of "unoccupied" elements in your source data, because based on what you say, the big difference in response time is mainly a matter of whether or not this condition fires (and skips the rest of the loop block):

    next if( $options{"unoccupied"} eq "0" && 
             $_[6] eq "unoccupied\n" );
[download]

If that doesn't fire, you end up doing four more "if" tests on every iteration, and then deleting an element from an array (which probably doesn't do what you think it does -- check "perldoc -f delete").

So there are ways to reduce the overall amount of work that perl is doing in the loop while still getting things done just as well (or even better). I don't know how much execution time would be saved, but I think the following version is at least a bit cleaner and not any harder to maintain than your original.

As mentioned previously, you haven't shown how you initialize some of the variables in the posted snippet, so I'm going to guess that your "$planets" value was intialized to the number of elements in the "@data" array (i.e. your for loop covers every array element).

Another thing I have to guess about is that elements in @data either have 7 fields (0-6) with the last one having the string "unoccupied", or else they have 9 fields (0-8). If there are more fields than that, then the code below would be inappropriate.

my $dispmode = $options{display} * 2;
$dispmode |= $options{coordinate};   # bitwise OR

# $dispmode values: 0 == display:0 and coordinate:0
#                   1 == display:0 and coordinate:1
#                   2 == display:1 and coordinate:0
#                   3 == display:1 and coordinate:1

# create a distinct print sub for each mode:
# (in the second one, the regex places parens around 
#  the last two comma-delimited fields on the line if
#  the last field is numeric)

my @modesub = ( sub { print $_[0], "<br>\n" },
                sub { ($_=$_[0]) =~ s/([^,]*,-?[\d]+)$/($1)/; print "$
+_ <br>\n"},
                sub { print "<tr><td>" . join("</td><td>",split /,/, $
+_[0])
                          . "</td></tr>\n" },
                sub { print "<tr><td>" . join("</td><td>",split /,/, $
+_[0], 8)
                          . "</td></tr>\n" }
              );

for ( @data ) {

# the following regex will check if this element of @data has
# "unoccupied" following the 6th comma, and nothing thereafter:

    next if( $options{"unoccupied"} == 0 && /^(?:[^,]*,){6}unoccupied$
+/ );

    $modesub[$dispmode]->( $_ );
}
@data = ();  # added this to get rid of the data 
             # though maybe that doesn't really matter?

print "</table>" if($dispmode & 2);
[download]

That version does only one "if" test on each loop iteration, economizes the removal of elements from the array (and actually makes the memory available for garbage collection, unlike "delete"), and only uses split when it really has to.

Also, I think that using @data and the field values on each row as lists (rather than always addressing specific array elements by index numbers) could save some time. (It certainly makes the coding easier.)

Still, if the proportion of "unoccupied" rows is extremely large, the process will just take that much longer when all those rows are being printed. Maybe you need to allow for paging the output (e.g. 100 rows per page, or whatever)?

(updated to fix punctuation, and to change from "while (@data)" to "for (@data)" -- I think doing "shift" on each iteration in a while loop is more work than necessary.)

Another update: I noticed your links to the code and data after I posted this, and was glad to see that I had guessed right. ;-) One extra little detail: it looks like your data file might contain blank lines, but you don't seem to be checking for that -- I don't know if this might be part of the "delay" problem, but just to be sensible, you should read from the file like this:

    @data = grep /\S/, <DAT>;  # skip blank lines
    chomp @data;             # remove newline chars
[download]

Final update (sorry...): I tighted up the second "modesub" so it only adds parens if the line ends with digits. Also, I deleted a couple lines of code that had been superceded by updates, rather than just commenting them out.

Comment on Re: Printing data from a file in various ways Select or Download Code

Replies are listed 'Best First'.
Re^2: Printing data from a file in various ways by oomwrtu (Novice) on Oct 14, 2005 at 02:18 UTC
This is exactly what I am looking for; it works like a charm. `:-D` Now, the question is: How did you learn this type of effiecient coding? I have looked for Perl tutorials (I only started using it this summer when I setup my webserver) and all of them are the same old "how do Perl variables work" and such. Perhaps it is a matter of experience... I can at least understand what this code is doing now that I see it, I just have to learn how to write it. Actually, I do have three questions: When $dispmode equals 3 (or display:1 and coordinate:1), it doesn't show the coordinates as an ordered pair like I would like it to. How could I fix this?(`sub { print "<tr><td>" . join("</td><td>",split /,/, $_[0], 8) . "</td></tr>\n" }`) How would I need to change the code if I wanted to add another display option, such as filtering by name? If you have time, how would you like to review the code I use to get the data from an HTML page and parse it? (would be much appreciated, I know you might have a whole lot of better things to do `:)` ) Thank you very much for all of your help!	[reply] [d/l]
Re^3: Printing data from a file in various ways by graff (Chancellor) on Oct 14, 2005 at 12:50 UTC
How did you learn this type of effiecient coding? It's basically the principle of "informed, constructive laziness": you know what needs to be done, you figure out the minimum number of steps required to get it done, and you structure the program in such a way that you never write a given chunk of code (e.g. an "if" condition or quoted string or any multi-step procedure) more than once. For me, a key step is to document (in coherent, human language) what the code is going to do first, then write the code according to the documentation. When $dispmode equals 3 ..., it doesn't show the coordinates as an ordered pair I gather you mean it's not putting parens around the "x,y" in the last column of the table. If it's really important to get the parens in there, I guess I'd elaborate that sub a little: `sub { my @cels = split( /,/, $_[0], 8 ); $cels[7] = '('. $cels[7] .')'; print '<tr><td>'. join('</td><td>',@cels) ."</td></tr>\n" }` [download] How would I need to change the code if I wanted to add another display option, such as filtering by name? You could filter entries as you read from the file, by altering the grep condition that I suggested for skipping blank lines; e.g. if you add a cgi param like "$name_pattern" (which the user can set to a string or leave blank), you can say: `my $match = ( $name_pattern =~ /^(\w+)$/ ) ? $1 : "\\S": my @data = grep /$match/i, <FILE>;` [download] (updated to fix typo in sub snippet; also switched to using a regex capture in the second snippet, in accordance with the standard idiom for untainting data)	[reply] [d/l] [select]