jonnyfolk has asked for the wisdom of the Perl Monks concerning the following question:

I have the following script which I'm using to view text files in a browser.

    #!/usr/bin/perl -w
    use strict;
    use CGI::Carp qw(fatalsToBrowser warningsToBrowser);
    use CGI ':standard';

    my $data2="/home/users/some_path/edit/data2.txt";
    my $data1 ="/home/users/some_path/edit/data1.txt";
    my $data3 ="/home/users/some_path/edit/data3.txt";
    my $keys;
    my $values;
    my %dbs = (
        Data1 => $data1,
        Data2 => $data2,
        Data3 => $data3,
    );

    print "Content-type: text/html\n\n";

    foreach $keys (sort keys %dbs) {
        $values = $dbs{$keys};
        print "<p><font face=\"Arial, Helvetica, sans-serif\" size=\"3\" color=\"#CC0000\">$keys contents as follows:</font></p>\n";
        open FILE, "$values" or die "Can't open $values: $!";
        flock (FILE, 2) or die "Can't lock $values for reading: $!";
        foreach my $line (<FILE>) {
            chomp $line;
            $line =~ s/<br>/ /g;
            print "$line<br>\n";
        }
        close FILE;
    }

The problem is that it takes about a year and a half for it to finish loading into the browser (250k). How can I do this more efficiently?

Re: Increase efficiency of script
by davido (Cardinal) on Mar 20, 2004 at 07:52 UTC
    You may be looking in the wrong place for optimization.

    If you run the script from the command line, how long does it take to execute? The script doesn't look excessively slow. But loading 250k into your browser (over a telephone line) takes a while.

    On a style point, do you really need to create $data1, $data2, and $data3, only to load them into a hash later?


    Dave

      Thanks for looking it over Dave - I thought it might be a browser issue (IE5 for Mac is not the quickest around) but thought I'd check out the code.

      I created the hash manually to associate the file name with the file - if there's another/better way to do that I'd be very interested to know.

        I created the hash manually to associate the file name with the file - if there's another/better way to do that I'd be very interested to know.

        If you must hardwire the filenames, I don't have anything against using a hash to store them. I just might not have bothered with the intermediate scalar variables:

        my $data2="/home/users/some_path/edit/data2.txt";
        my $data1 ="/home/users/some_path/edit/data1.txt";
        my $data3 ="/home/users/some_path/edit/data3.txt";
        my %dbs = (
            Data1 => $data1,
            Data2 => $data2,
            Data3 => $data3,
        );

        ...could be written more legibly as:

        my %dbs = (
            Data2 => "/home/users/some_path/edit/data2.txt",
            Data1 => "/home/users/some_path/edit/data1.txt",
            Data3 => "/home/users/some_path/edit/data3.txt",
        );

        But that's, as I mentioned, just a style issue. I may be out of style myself. ;)


        Dave

Re: Increase efficiency of script
by Abigail-II (Bishop) on Mar 20, 2004 at 10:52 UTC
    Why do you acquire an exclusive lock on a file you have open for reading? If multiple copies of the program act at the same time, they have to wait for each other to finish.
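
A minimal sketch of the alternative, assuming a shared lock is all the reader needs. The literal 2 passed to flock in the original corresponds to LOCK_EX on typical systems; the Fcntl constants make the intent explicit. The helper name read_shared is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN

# Hypothetical helper: read a whole file under a shared (read) lock.
sub read_shared {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    # Any number of readers may hold LOCK_SH at the same time;
    # a reader only blocks while some writer holds LOCK_EX.
    flock($fh, LOCK_SH) or die "Can't lock $path for reading: $!";
    my @lines = <$fh>;
    close $fh;           # closing the handle releases the lock
    return @lines;
}
```

With this, several copies of the CGI can read the same file concurrently instead of queueing behind each other.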

    Abigail

Re: Increase efficiency of script
by runrig (Abbot) on Mar 20, 2004 at 07:56 UTC
    Can you just use <pre> tags and maybe just:
    system('cat', $file)
    Reading line by line is bound to slow you down a bit, though I don't know how much, if any, this will help.
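
For anyone who wants to stay in Perl, here is a rough sketch of the same idea: wrap the file in <pre> tags and copy it in large blocks rather than line by line. The helper name print_pre and the 64 KB block size are assumptions for illustration. Unlike the original script, it escapes HTML special characters instead of rewriting <br>.

```perl
use strict;
use warnings;

# Hypothetical helper: copy a file to an output handle inside <pre> tags,
# reading in 64 KB blocks instead of line by line.
sub print_pre {
    my ($path, $out) = @_;
    open my $in, '<', $path or die "Can't open $path: $!";
    print {$out} "<pre>\n";
    while (read($in, my $buf, 65536)) {
        # Escape the characters that are special in HTML.
        $buf =~ s/&/&amp;/g;
        $buf =~ s/</&lt;/g;
        $buf =~ s/>/&gt;/g;
        print {$out} $buf;
    }
    print {$out} "</pre>\n";
    close $in;
}
```

In a CGI this would be called as print_pre($file, \*STDOUT) after the Content-type header has been printed.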

    Update: Why even use perl here...just:

    #!/bin/sh
    echo "(html header stuff)"
    echo "<pre>"
    cat file1 file2 file3
    echo "</pre></html>"
      No, really, you guys are looking for optimization in the wrong place. How long does it take on your system for a Perl script to read a 250k text file line by line and print it line by line? Almost no time at all. I had to benchmark 100 iterations of that sort of routine just to measure a two-second execution time.

      If a script that reads 250k of data line by line and outputs it via CGI results in page loads on the order of "an eternity", there's definitely a problem.

      My guess is that your problem will be one of the following:

      • Your flock is waiting for the file to become available. This could be a while if there are lots of processes competing for it. The POD for flock states: "Two potentially non-obvious but traditional flock semantics are that it waits indefinitely until the lock is granted, and that its locks are merely advisory."
      • It takes 64 seconds to load 256k of data through a 45kbps telephone modem connection.
      • Long ping times, packet loss, and other network latency issues could contribute to slow page loads.
      • Server load could be an issue.
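
The modem item in the list above can be sanity-checked with a quick back-of-envelope calculation. This sketch assumes 1k = 1024 bytes and 1 kbps = 1000 bits per second, and it ignores latency, modem compression and protocol overhead, so it gives a lower bound at the raw line rate; real transfers take longer. The helper name transfer_seconds is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: seconds needed to move $bytes over a link running at
# $bits_per_second, ignoring latency, compression and protocol overhead.
sub transfer_seconds {
    my ($bytes, $bits_per_second) = @_;
    return $bytes * 8 / $bits_per_second;
}

# 256k over a 45 kbps modem link, at the raw line rate.
printf "%.1f seconds\n", transfer_seconds(256 * 1024, 45_000);   # prints "46.6 seconds"
```

Either way the answer is most of a minute, which is far more than the script itself needs to read and print the files.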

      But I wouldn't be too quick to blame the segment of code that reads the file line by line and prints it; that's not doing anything that would take an eternity to execute.


      Dave

        You're still forgetting one possible cause, davido. If a browser like MSIE 5.x on the Mac reads such a long file, it simply takes a long time to render. I'm almost sure that loading the 250k file from local disk could easily take 10 to 20 seconds, maybe even longer, before it shows up in the browser.

        Solution? Cut up the text in smaller pieces. People tend to prefer it that way.
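
One way to cut the text up is to show a fixed number of lines per request. The sketch below is only an illustration: the helper name page_of_lines is hypothetical, and a real CGI would take the page number from something like a page query parameter (also an assumption) and print next/previous links around the result.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: return the lines belonging to page $page (1-based)
# when showing $per_page lines at a time, plus the total page count.
sub page_of_lines {
    my ($lines, $page, $per_page) = @_;
    my $pages = ceil(@$lines / $per_page) || 1;   # at least one (possibly empty) page
    $page = 1      if $page < 1;                  # clamp out-of-range requests
    $page = $pages if $page > $pages;
    my $first = ($page - 1) * $per_page;
    my $last  = $first + $per_page - 1;
    $last = $#$lines if $last > $#$lines;
    return ([ @{$lines}[$first .. $last] ], $pages);
}
```

For example, with 25 lines and 10 lines per page, page 3 holds lines 21 through 25 and the page count is 3.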

Re: Increase efficiency of script
by techy (Scribe) on Mar 20, 2004 at 22:53 UTC
    jonnyfolk,

    As others have commented here, it sounds likely that the problem is download or rendering time. If the problem is download time, it may be possible to use a module like CGI::Compress::Gzip, or to make your own with one of the gzip compression libraries on CPAN, in addition to checking and setting the correct output headers. Compressing the content of a text file should in most cases dramatically reduce the download time.
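
A rough sketch of the roll-your-own route, using IO::Compress::Gzip (bundled with modern Perls) rather than CGI::Compress::Gzip. The helper name build_response is hypothetical, and the Accept-Encoding check is deliberately simplistic: it only looks for the word gzip and ignores q-values.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical helper: build a complete CGI response for $body,
# gzip-compressed only when the client's Accept-Encoding allows it.
sub build_response {
    my ($body, $accept_encoding) = @_;
    my $headers = "Content-type: text/html\n";
    if (defined $accept_encoding && $accept_encoding =~ /\bgzip\b/i) {
        my $packed;
        gzip(\$body => \$packed) or die "gzip failed: $GzipError";
        $headers .= "Content-Encoding: gzip\n";
        $body = $packed;
    }
    $headers .= "Content-Length: " . length($body) . "\n\n";
    return $headers . $body;
}
```

In a CGI this would be used roughly as binmode STDOUT; print build_response($html, $ENV{HTTP_ACCEPT_ENCODING}); so the compressed bytes are written without any newline translation.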

    On the other hand, if the performance problem is with rendering, you may want to have the returned HTML merely contain links to download the content, and then specify "Content-disposition: attachment" in the CGI to prompt the user with an open/save dialog box. If the user can save the text file on their computer and open it with an external program, that cuts out the time for slow browser rendering.
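
A minimal sketch of the headers such a download CGI might send. The helper name attachment_headers is hypothetical, and it assumes the filename contains no double quotes or other characters that would need escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: headers asking the browser to save the response
# under $filename instead of rendering it in the window.
sub attachment_headers {
    my ($filename, $size) = @_;
    return "Content-type: text/plain\n"
         . "Content-Disposition: attachment; filename=\"$filename\"\n"
         . "Content-Length: $size\n\n";
}
```

The script would print attachment_headers('data1.txt', -s $path) and then the raw file contents, with no HTML markup around them.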

    Thanks,
    techy