vsmurthy has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have just begun to use Perl for CGI programming. I have a situation where I run a command on the server through my perl script and try to retrieve the result of the command into a perl variable and then I try to display it on the web page:
my $var = `property <dataid>`;
print qq{<td align="middle">$var</td>};
This command (property) sometimes produces between 0.5 MB and 1 MB of output for some values of <dataid> (I checked the size of the output by executing the "property" command on the web server CLI).
I have 2 problems in capturing this output in the perl variable:
1] Only part of the output of the command is captured in the variable when the output is large (i.e., when the output size is greater than 0.1 MB). How should I take care of this? Is there any feature in Perl to capture very large output?
2] The other problem is that the output of the command has some XML in it, and when I capture it in the Perl variable, the XML part is lost (in the example below, all text between and including <List> and </List> doesn't appear in the variable). How should one capture such data, since it's not completely XML? Should I be using any Perl/XML functions?
Sample output:
    Data <dataid> is not empty
    Content has xxx entires:
    ------ similar text -------
    ------ similar text -------
    <List>
    <List1>123</List1>
    <List2>345</List2>
    ------------------
    ------------------
    <Listn>23</Listn>
    </List>
    Loc has 40 files:
    filename-1
    filename-2
    ----------
    ----------
    filename-40
Any help is greatly appreciated. Thanks, Vinay

Re: How do I capture large output in a perl variable
by jZed (Prior) on Mar 22, 2005 at 20:41 UTC
    You need to distinguish what you are capturing in the variable (which is almost certainly everything) from what you can see in a browser. For example, of course you don't see the stuff inside XML codes, the browser hides tags it doesn't recognize. I suspect that if you use view source in your browser, all of the stuff you thought you weren't capturing will be there. You need to use HTML to display things properly.
      Thanks for the suggestion. It is true that the XML tags and the data within them are present if I check "view source" in my browser. I finally got that working by replacing "<" with "&lt;" before displaying it in the browser.
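Replacing only "<" is enough for this particular output, but "&", ">" and double quotes can also be misread by a browser. A minimal sketch of a fuller escape, assuming a hypothetical helper named escape_html (the name is not from the thread). The "&" substitution has to run first so that the entities produced by the later substitutions are not escaped a second time:

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that HTML treats specially.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, or later entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Example: a fragment shaped like the sample output in the question.
print escape_html('<List1>123</List1> & more'), "\n";
```

If the CGI module is already loaded, its escapeHTML function should do much the same job.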
      But that is not the only problem. Apart from the XML part, a large amount of data is missing compared with what I get if I log onto my web server and run the same command on the command line, i.e., a large part of the output seems to be truncated when I run the command through the web page. What should I do to capture all the output into the Perl variable?
        I'm still not convinced that the variable isn't capturing all of the output. Perhaps the output is different for different users and the CGI user's output is only a subset of what you get as yourself. It's also possible that the browser times out in downloading a large amount of data though I kind of doubt that. Change your CGI to print the output directly to a file on the server and examine that file to see what the variable is actually capturing.
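A minimal sketch of that file-based check. The command and the log path are placeholders chosen for illustration (the real script would run "property" and write somewhere the CGI user can write). It also records the effective user id, since the suggestion above is that the CGI user may see different output:

```perl
use strict;
use warnings;

# Placeholder command and log path; substitute the real ones.
my $cmd     = 'echo hello';
my $logfile = '/tmp/property_debug.txt';

my $out = `$cmd`;

open my $fh, '>', $logfile or die "Cannot open $logfile: $!";
print {$fh} "effective uid: $>\n";                  # who the script runs as
print {$fh} "bytes captured: ", length($out), "\n"; # size of what Perl received
print {$fh} $out;                                   # the raw output itself
close $fh or die "Cannot close $logfile: $!";
```

Comparing the byte count in that file with the size of the output produced on the command line shows whether the variable or the browser is losing data.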
Re: How do I capture large output in a perl variable
by brian_d_foy (Abbot) on Mar 22, 2005 at 20:54 UTC

    Are you sure you're missing data in the output? When I do this sort of thing, I include some size information in some debugging output:

    my $data = ` ... `;
    print "Got data length [", length($data), "]\n";

    However, if you are expecting a lot of data, you don't need to store it all in memory (unless you need it for something else). You can open a pipe and read from it. When you get a line from the pipe, you send it to the browser.

    open PIPE, " ... | ";
    while( <PIPE> ) {
        ... do stuff ...
        print;
    }
    close PIPE or die "Got error [$!|$?]\n";
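The same pipe idea can be sketched with a lexical filehandle and the list form of open, which avoids passing the command through a shell. The command here is a stand-in (perl printing three lines) chosen only so the example runs anywhere; it is not the "property" command from the question:

```perl
use strict;
use warnings;

# Stand-in command: perl itself emits three lines. Replace with the real one.
my @cmd = ($^X, '-e', 'print "line $_\n" for 1..3');

# '-|' means: run the command and read its standard output.
open my $pipe, '-|', @cmd or die "Cannot start command: $!";

my $lines = 0;
while (my $line = <$pipe>) {
    $line =~ s/</&lt;/g;    # keep the browser from treating "<" as markup
    print $line;            # send each line on as soon as it arrives
    $lines++;
}

# close() on a pipe returns false if the command exited with a non-zero status.
close $pipe or die "Command failed: errno [$!] status [$?]\n";
```

Because only one line is held in memory at a time, the size of the output no longer matters to the Perl side.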

    While debugging these things, I tend to output everything as "text/plain" so I see exactly what's going on, or I use a perl script (perhaps using WWW::Mechanize) to grab and save the output so I can inspect it later.
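A minimal sketch of such a "text/plain" debugging response, again with a stand-in command rather than the real one. A CGI response is a header block, a blank line, and then the body:

```perl
use strict;
use warnings;

$| = 1;    # unbuffer STDOUT so the header is sent before anything else

# Stand-in for the real command (an assumption for illustration).
my $out = `echo sample`;

print "Content-Type: text/plain\n\n";           # header, then the blank line
print "captured ", length($out), " bytes\n";    # size information first
print $out;                                     # then the output, unmodified
```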

    Good luck :)

    --
    brian d foy <bdfoy@cpan.org>
      "text/plain" is great for debugging stuff like this, just be careful of things like internet explorer (and others? are there any), that don't follow the standards, and may or may not decide that it's text, and may instead try to display it as html, or whatever else it feels like.
Re: How do I capture large output in a perl variable
by PodMaster (Abbot) on Mar 23, 2005 at 04:24 UTC
    Only part of the output of the command is captured in the variable when the output is large (i.e, when output size is greater than 0.1Mb).
    This is quite trivial to test and disprove
    my $bigstuff = `$^X -e'print 1x(1024*1024)' `;
    print length $bigstuff,$/;
    print $bigstuff,$/;
    __END__


Re: How do I capture large output in a perl variable
by naChoZ (Curate) on Mar 23, 2005 at 17:02 UTC

    You've received more intelligent answers, but in the spirit of TIMTOWTDI:

    print "<pre>";
    for (split("\n", qx(cat somefile))) {    # or other output producing command
        s/</&lt;/g;
        print "$_\n";    # split removed the newlines, so put them back
    }
    print "</pre>";


Re: How do I capture large output in a perl variable
by graff (Chancellor) on Mar 23, 2005 at 14:18 UTC
    my $var = `property <dataid>`;
    print qq{<td align="middle">$var</td>};
    This command (property) sometimes produces an output of size almost 0.5Mb to 1Mb for some values of <dataid>(I checked the size of the output by executing the "property" command on the web server CLI).

    So, your plan is to put upwards of 1MB of data into a single "<td>" element in the html that you output? Okay, I guess.

    Regarding the presence of XML-tagged stuff in the middle of a plain-text stream: are you trying to show that verbatim, making all the XML tags visible?

    And regarding your testing of the "property" program on the command line: are you sure that all the data output from this program is text? (Could there be non-printing control characters or other non-text, binary content? If so, what sort of operating system are you using, and do you need to specify "binary mode" for reading the data?)

    It might be better (or at least, easier to debug) with a pipeline open statement:

    my $var;
    open( PROG, "property $dataid |" );
    # binmode PROG;    # might need to do this?
    {
        local $/;         # set input rec. separator to undef
        $var = <PROG>;    # slurp output from property
    }
    close PROG;
    $var =~ s/</&lt;/g;    # make sure browsers don't see XML tags as tags
    # if there is binary data, there's more you'll need to do
    # to make it presentable to a browser...
    If you still have problems with that, you can switch to reading the "property" output line by line, and/or add some diagnostics, and/or move the data to the output HTML stream in smaller chunks.
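One way to sketch the "smaller chunks" variant is with read() and a fixed buffer size. The command is a stand-in that prints 100,000 characters, and the 8192-byte chunk size is an arbitrary choice for illustration:

```perl
use strict;
use warnings;

# Stand-in command producing 100,000 bytes; replace with the real one.
my @cmd = ($^X, '-e', 'print "x" x 100_000');

open my $prog, '-|', @cmd or die "Cannot start command: $!";
binmode $prog;    # harmless for text, needed if the output may be binary

my ($total, $chunks) = (0, 0);
my $buf;
# read() returns the number of bytes read, 0 at end of file, undef on error.
while (my $n = read($prog, $buf, 8192)) {
    $total += $n;
    $chunks++;
    # A real script would escape $buf and print it to the browser here.
}
close $prog or warn "Command exited with status $?\n";

# Diagnostics go to STDERR, which a web server normally writes to its error log.
print STDERR "read $total bytes in $chunks chunks\n";
```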
      Hi, Thanks for the suggestion. I got the html formatting working. Regarding the incomplete output, I tried using pipe open like you suggested but I am still not able to get the entire output(just the same incomplete output). Also I tried to redirect the output from the "property" command to a file on my account on the webserver instead of directly sending it to the browser and still the file didn't contain the entire output. It just contained the incomplete output that I used to get earlier on the browser. But if I log onto my account on the webserver and run the "property" command and redirect it to a file, I get the COMPLETE output in the file. I also observed that the file was about 400Kb in size which is not too big. Any suggestions on what else I could try? Thanks, vinay
        Hi guys, Thanks for all your suggestions. I did learn some good debugging techniques in perl. I finally got the script working to retrieve the large output. The problem was that the web admin had put a cap on the apache memlimit directive and hence I wasn't able to retrieve the complete output. Once we increased that limit observing the memory requirements of the "property" command, everything started working!! :) Thanks a lot :) Regards, Vinay
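A resource limit like this will often, though not always, show up in the child's exit status, so checking $? after the backticks is a cheap diagnostic for truncated output. A minimal sketch, using a stand-in command that prints partial output and then exits with status 3:

```perl
use strict;
use warnings;

# Stand-in command: prints something, then exits with a non-zero status.
my $out    = `$^X -e "print 'partial'; exit 3"`;
my $status = $?;    # save it before anything else can overwrite it

if ($status == -1) {
    warn "could not run command: $!\n";
}
elsif ($status & 127) {
    # The low 7 bits hold the signal number if the child was killed.
    warn sprintf "killed by signal %d\n", $status & 127;
}
elsif ($status >> 8) {
    # The high byte holds the exit code.
    warn sprintf "exited with status %d after %d bytes of output\n",
        $status >> 8, length($out);
}
```

A non-zero status under the web server combined with a zero status on the command line points at the environment (limits, user, PATH) rather than at Perl's ability to hold the data.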