sondagar_nilesh has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I am implementing functionality to generate CSV files (reports) in a Perl/Apache environment. It seems to work improperly for some data. We are using the following code:
my $r = Apache->request;
$r->content_type('text/csv');
my $headers_hash = $r->headers_out;
$headers_hash->{'Content-disposition'} = "attachment; filename=" . $report_name . ".csv";
This code is expected to prompt the user to save the generated .csv file, but it does not. Any advice on where I should concentrate to get correct CSV output? Thanks, Nilesh

Re: Problem In generating the CSV file in the perl code using Apache
by Old_Gray_Bear (Bishop) on Jan 23, 2008 at 19:27 UTC
    You say "This functionality seems to work improperly for some data." This tells me that you have some data-sets where the functionality works as expected and other data-sets where it does not. To me this sounds like an issue in your data: there are characters in the string that your CSV generator is not coping with properly.

    I would start my debugging with the question: "What is the difference between data that works and data that doesn't?"

    To answer this question, split up your data-sets into two camps: those that work correctly and those that don't. Take one of the "don't work" camp and pare it down by removing lines until you get down to a line or two that fails. (Binary-splitting will be the tool of choice here -- split the data-set into two pieces and apply the test to each piece; take the piece that failed and repeat the process. This will quickly get you down to a line containing the 'bad' character or characters, but see the note later on).
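
    The binary-splitting described above might be sketched like this. The fails() predicate is a hypothetical stand-in (here it simply "fails" on any line containing a double quote); in practice it would feed the lines through the real report code and check the result.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "does the CSV generation break on these lines?".
# Replace the body with a call into the real report code.
sub fails {
    my (@lines) = @_;
    return scalar grep { /"/ } @lines;
}

# Narrow a failing data-set down to one failing line by repeated halving.
sub bisect_failure {
    my ($fails, @lines) = @_;
    return unless $fails->(@lines);    # the whole set passes: nothing to find
    while (@lines > 1) {
        my $mid    = int(@lines / 2);
        my @first  = @lines[0 .. $mid - 1];
        my @second = @lines[$mid .. $#lines];
        if    ($fails->(@first))  { @lines = @first }
        elsif ($fails->(@second)) { @lines = @second }
        else                      { last }    # failure needs lines from both halves
    }
    return @lines;
}

my @data    = ('a,b,c', 'd,e,f', 'g,"h,i', 'j,k,l', 'm,n,o');
my @culprit = bisect_failure(\&fails, @data);
print "Suspect line: $_\n" for @culprit;
```

    If the loop stops with more than one line left, the failure depends on a combination of lines rather than on a single one.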

    Look at the data in the line. Are there characters there that you did not expect? Do any of the fields embed your separator character? Are there 'control' characters embedded in one of the fields? Figure out "What Is Wrong Here"....?
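
    A small checker along these lines can help with that inspection. It is only a sketch: suspicious() is a made-up helper name, a comma separator is assumed unless another is passed in, and the list of "unexpected" characters is a guess at the usual culprits rather than a complete one.

```perl
use strict;
use warnings;

# Return a list of complaints about a single field value.
# $sep is the separator in use; a comma is assumed when none is given.
sub suspicious {
    my ($field, $sep) = @_;
    $sep = ',' unless defined $sep;
    my @problems;
    push @problems, 'embedded separator'    if index($field, $sep) >= 0;
    push @problems, 'embedded double quote' if $field =~ /"/;
    push @problems, 'embedded line break'   if $field =~ /[\r\n]/;
    # Control characters other than tab, LF and CR.
    push @problems, 'control character'     if $field =~ /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/;
    push @problems, 'non-ASCII character'   if $field =~ /[^\x00-\x7F]/;
    return @problems;
}

for my $field ('plain', 'Smith, "Bob"', "two\nlines") {
    my @problems = suspicious($field);
    print "Field [$field]: ", (@problems ? join('; ', @problems) : 'ok'), "\n";
}
```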

    Once I have an hypothesis ('When I see XXXX in my data, my CSV generation fails'), I check my other failing data-sets. If they also have the same XXXX character(s), then I feel confident that I have found a Real Bug, one that is common to all of my failing data. Fix the bug. Do your happy-dance. But...

    Note: When you do fix the Bug, that is no assurance that you don't have another bug that was masked by the bug you just fixed. You may have to do this process several times before you get a completely clean run. Even then, you may run into other input data later on that cause your CSV generator grief (Naive Users are infinitely imaginative). This is a really good argument for going to CPAN and looking at the CSV modules there -- they are more likely to have all of the 'normal' errors already fixed, and a lot of the edge-cases (ones that you haven't thought of yet) handled as well.
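
    As one example of the CPAN route, here is a minimal sketch using Text::CSV (it has to be installed from CPAN; it is not a core module). The csv_line() wrapper and the sample fields are made up for illustration, and nothing here is claimed to match the original poster's report code.

```perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 lets fields contain line breaks and non-ASCII data.
my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Turn a list of field values into one CSV record (no trailing newline).
sub csv_line {
    my (@fields) = @_;
    $csv->combine(@fields)
        or die "combine failed: " . $csv->error_diag;
    return $csv->string;
}

# Embedded separators and quotes are quoted and doubled by the module.
print csv_line('Smith, Bob', 'say "hi"', 42), "\n";
```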

    ----
    I Go Back to Sleep, Now.

    OGB

Re: Problem In generating the CSV file in the perl code using Apache
by bart (Canon) on Jan 24, 2008 at 13:05 UTC
    $headers_hash->{'Content-disposition'} = "attachment; filename=" . $report_name . ".csv";
    If your filename contains "weird" characters, you had best put double quotes around the whole filename, extension included. And Perl has powerful string interpolation, so you don't have to concatenate all the loose parts yourself. It makes the code just a little bit tidier.
    $headers_hash->{'Content-disposition'} = "attachment; filename=\"$report_name.csv\"";
    In case you wish to embed double quotes, it may be beneficial to use qq with any delimiter or delimiter pair you like, so you can drop the backslashes:
    $headers_hash->{'Content-disposition'} = qq(attachment; filename="$report_name.csv");
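
    If the report name itself may contain quotes or control characters, a small helper can clean it up before it goes into the header. This is only a sketch: content_disposition() is a hypothetical name, the whole file name (extension included) goes inside the quotes, and backslash-escaping is how HTTP quoted strings are defined, though browsers are not guaranteed to treat escaped quotes identically.

```perl
use strict;
use warnings;

# Build a Content-Disposition value with the whole file name,
# extension included, inside a single quoted string.
sub content_disposition {
    my ($report_name) = @_;
    my $filename = "$report_name.csv";
    $filename =~ s/[\x00-\x1F\x7F]//g;    # drop control characters, CR/LF included
    $filename =~ s/([\\"])/\\$1/g;        # backslash-escape backslashes and quotes
    return qq(attachment; filename="$filename");
}

print content_disposition('Q1 "final" report'), "\n";
```

    The returned string would then be assigned to $headers_hash->{'Content-disposition'} in place of the hand-built value.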

    Also, be aware that there's no way to force a browser to download a file; all you can do is make a polite request. It's the browser itself, and the user behind the keyboard, that has the final word. See the late Alan J. Flavell's view on this matter, a view shared by many other experts, in the last post in this thread:

    The original idea of the interworking protocols was that it was the *recipient's* business to decide whether they wanted to render a resource or to download it. The author/publisher's job was just to advertise its content-type honestly (as you note). The idea of a mischievous author proposing a download, to some sensitive file on the recipient's system, and the naive user acceding to the request, is just too attractive to a certain kind of pondlife on the 'net.

    Re: the subject line: generally speaking, "force" does not work, on the WWW. And it's good that it is so.