Polyglot has asked for the wisdom of the Perl Monks concerning the following question:

I want to send a file to be downloaded via a "Save as..." type dialog in the client's browser. This requires a special set of HTTP headers. However, I would also like to update the requesting webpage at the same time, AJAX-style. Unfortunately, the two sets of headers must be different, and are not mutually compatible. My attempts so far have simply broken the file download, which is otherwise working.

Can I use a fork { ... } type of command to send back the file while still updating the webpage via AJAX? If so, how is this done? Which response should go back to the browser first, or does fork { ... } end up creating a race condition?

I'm not sure where to start looking for the direction to go with this, so any coding suggestions are welcome. Please note that I am not looking for a package--I have the downloading working already, and have learned to program the AJAX myself. I'm just needing to know how, if this is possible, to do both at the same time.

Blessings,

~Polyglot~


Re: Can two separate responses be sent to the client's browser from Perl, such as via fork{}?
by afoken (Chancellor) on Oct 15, 2023 at 09:09 UTC
    I would also like to update the requesting webpage at the same time

    I don't think that will work. HTTP is a request-response protocol: one request, one response. An HTTP client (browser) can't accept two responses for one request. That cannot be changed.

    To make it look like two things happened at the same time, you need to make the browser issue two requests: one for the update, one for the download. That can be done using client-side JavaScript (a timer) or perhaps an HTTP "Refresh" header. Typically, you first request the update, and the update triggers the download request, simply because it is very hard to trigger anything from a download request.
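
    For the "Refresh" variant, something along these lines might do (an untested sketch; the download URL and job parameter are invented for illustration):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use CGI;

    # the "update" response: show the status page and tell the browser to
    # request the (hypothetical) download handler one second later
    my $q = CGI->new;
    print $q->header(
        -type    => 'text/html',
        -Refresh => '1; URL=/cgi-bin/download.cgi?job=1234',
    );
    print "<html><body><p>Your file is ready; the download will start shortly.</p></body></html>\n";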

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      I appreciate those insights. To clarify, what I am actually attempting to do is run a LaTeX job on the server which generates a PDF for download. At the same time, it produces a log of what worked or did not, as is typical for LaTeX, and I want to feed that log back to the client along with the file. So the PDF file would be created at the same time as the log data, and I'm not sure how I would get them both from separate requests.

      Blessings,

      ~Polyglot~

        There are several old postings that may help:


        So the PDF file would be created at the same time the log data is created, and I'm not sure how I would get them both from separate requests.

        I've just updated an old SANE CGI frontend wrapping scanimage to run on an embedded system. It does something roughly similar: In the scan handler, scanimage emits progress messages that are sent to the browser, while the scanned image is stored in a temp file on the server. The last action of the progress display is to emit a download link (for browsers with Javascript disabled) and a Javascript redirection to that link. The handler for the download link just sends the content of the temp file.

        In theory, there should be some code to remove old temp files (e.g. a cron job, a cleanup routine invoked for every request, or simply a call to unlink at the end of the download handler). The CGI should also use individual temp files for each scan. But there is exactly one user for the scanner (me), the CGI frontend is only available in my local network, and I don't care about having an old scan remaining on the scan server, so I use the same temp file for all scans, and I don't even bother to lock the file.
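
        Sketched out, the download-handler end of such a setup might look roughly like this (untested; the temp file path and download filename are placeholders, and the unlink is the "in theory" cleanup mentioned above):

        use strict;
        use warnings;
        use CGI;

        my $q       = CGI->new;
        my $tmpfile = '/tmp/scan-result.pdf';   # assumed location of the stored result

        # "Save as..."-style headers, then stream the temp file and clean up
        print $q->header(
            -type       => 'application/pdf',
            -attachment => 'result.pdf',
        );
        open my $fh, '<:raw', $tmpfile or die "cannot open $tmpfile: $!";
        binmode STDOUT;
        print while <$fh>;
        close $fh;
        unlink $tmpfile;   # remove the temp file once it has been delivered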

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        Unless you want to do something clunky like wrapping the response as JSON (returning an object with a key for the log text and one for the PDF contents, maybe base64-encoded), you're thinking along the wrong lines. You're going to need to keep context on the server and return things across multiple HTTP requests.

        The way I'd approach it is to assign some kind of "job id" to a set of results (the input file, the output DVI or PDF, the log from processing). When you process an input file you'd keep the context (the results) keyed by the job id in some way (perhaps by saving everything into a temporary directory named after it). You'd then come up with an API that clients can use to request a given result type for a given job id, and either link to results directly or (possibly using some JS) provide an all-in-one page (maybe fetching the log and PDF and showing those, with a separate link to the PDF for download).
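
        As a rough illustration of that idea (untested; the spool directory, parameter names and id format are all made up, not a real API), a "fetch a result by job id and type" handler could look like this:

        use strict;
        use warnings;
        use CGI;
        use File::Spec;

        my $SPOOL = '/var/spool/latexjobs';      # assumed results directory

        my $q    = CGI->new;
        my $job  = $q->param('job')  // '';      # e.g. "20231016-0001"
        my $type = $q->param('type') // 'log';   # 'log' or 'pdf'

        # never trust the client: allow only a safe job-id pattern
        $job =~ /\A[\w-]+\z/ or die "bad job id\n";

        my %file_for = (
            log => [ 'text/plain',      "$job.log" ],
            pdf => [ 'application/pdf', "$job.pdf" ],
        );
        my ($ctype, $name) = @{ $file_for{$type} // $file_for{log} };
        my $path = File::Spec->catfile($SPOOL, $job, $name);

        open my $fh, '<:raw', $path or die "no result of type '$type' for job '$job'\n";
        print $q->header( -type => $ctype, -attachment => $name );
        binmode STDOUT;
        print while <$fh>;
        close $fh;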

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

Re: Can two separate responses be sent to the client's browser from Perl, such as via fork{}?
by bliako (Abbot) on Oct 16, 2023 at 23:10 UTC

    Polyglot,

    The user asks the server via XHR. The server does the processing and sends back the various data items (logfile + pdf) in a blob (a tar/zip-like archive) or as base64-encoded data within a JSON. The user's browser runs JS which extracts the logfile and PDF and then sets up dynamic, client's-browser-based download links for both, see, e.g., https://www.alexhadik.com/writing/xhr-file-download/. The difference to the already posted answers Re^3: Can two separate responses be sent to the client's browser from Perl, such as via fork{}? and Re^3: Can two separate responses be sent to the client's browser from Perl, such as via fork{}? is that the latex file has been processed, the data has already been fetched to the browser, and the browser's JS creates download links for something that has already arrived. I.e. no temp files on the server to wait for, no watching over server-running processes, etc.

    If you have managed to work out a setup where the server robustly processes the latex file, or fails within some set timeout which your client can tolerate, and always responds back with a JSON containing logfiles, error messages and possibly a PDF, then this can be OK. Latex can take, say, 40 seconds to process a file; usually 15 seconds in my use-cases. This is the time a client must wait for data after its request. Is this OK? Can the connection be broken in that interval? If yes, that means failures which the client will click and click and click to run again, and your server is at risk of being hammered.

    Bottom line, AFAIC, think this over. 1) You perhaps need to let the server know that a second request from an impatient client is already being handled (md5 of the latex source file plus a client ID?). 2) Does latex processing take a long time, risking a broken client-server connection? In that case you need to follow what afoken and fletch suggested. 3) Is it fast processing? Then just process and send back a JSON of PDF + logfiles; JS handles it, creates download links (for the pdf) and/or displays the logfiles in a div, dynamically. 4) A cache of produced pdf files, so the same latex is not processed again? That requires storing the md5 of the latex source (see the sketch after the note below).

    note: "md5" is used generically for a hash.
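
    As a rough sketch of point 4 (and of point 1's duplicate-request check), a cache/job key could be derived from the submitted source like this (untested; the form field name, client id and cache path are invented for illustration):

    use strict;
    use warnings;
    use CGI;
    use Digest::MD5 qw(md5_hex);

    my $q            = CGI->new;
    my $latex_source = $q->param('texsource') // '';   # assumed form field name
    my $client_id    = $q->remote_addr // 'unknown';   # crude client id, illustration only
    # (encode to bytes first if the source can contain non-ASCII)
    my $job_key      = md5_hex($client_id . "\0" . $latex_source);

    # e.g. a (hypothetical) cache of already-produced PDFs, keyed by $job_key
    my $cached_pdf = "/var/cache/latex/$job_key.pdf";
    if ( -e $cached_pdf ) {
        # this exact source was already processed: serve the cached PDF/log
        # instead of running latex again (or refuse a duplicate in-flight job)
    }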

    Some edits within 10 mins

    bw, bliako (with a bitten glo(tt|ss)a)

      Bliako,

      Your response looks difficult to implement, but seems to come nearest to understanding the situation. In my case, I've been testing with an example TeX file (via copy/paste into a form field, not file upload--though I'm on an internal LAN), and the script invokes the XeLaTeX command twice in order to produce a proper TOC. The server returns the PDF in about 6-7 seconds for a 330+ page book, which is quite tolerable and in no danger of timeouts. This is a fairly straightforward dictionary-style layout, without images, etc.--just text.

      Because it returned the results so promptly, I had not really grasped why people were giving me answers for long-running processes; but perhaps I should not assume that my example is representative of all use-cases. FWIW, I have created this functionality on a dedicated VM, with a full install of LaTeX components, fonts, etc. specifically for this. Perhaps this is why it runs so quickly, though I chose a separate VM not for speed, but rather to keep the LaTeX server configuration separate from my other CGI routines.
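
      For reference, the double XeLaTeX run described above can be captured from Perl along these lines (an untested sketch; the working directory and file name are placeholders):

      use strict;
      use warnings;

      # run XeLaTeX twice (the second pass resolves the TOC) and keep the
      # combined console output as the log to send back to the client
      my $workdir = '/tmp/latexjob';   # hypothetical per-job working directory
      my $texfile = 'book.tex';        # hypothetical source name
      my $log     = '';
      for my $pass (1 .. 2) {
          $log .= qx(cd $workdir && xelatex -interaction=nonstopmode $texfile 2>&1);
          die "xelatex failed on pass $pass\n" if $? != 0;
      }
      # $workdir/book.pdf and $log are now ready to be returned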

      I'm still puzzled by the thought that it is possible to send two files in a single base64-encoded lump, to be dissected client-side with JavaScript. If I could do this, though, it would definitely solve the problem. I'll have to look into this more. The way the script presently runs, the user receives the PDF and the server no longer needs to keep it. No need for tracking the time window to keep a particular file in case of a subsequent user request, and, in fact, the files do not need to be uniquely named (though I'm adding a timestamp to the filename as a courtesy to the user for versioning purposes).

      Looking at the link you included, I'm left wondering how to indicate a separation between two or more files or segments in the blob data returned. I think I'll have to experiment a little with this, because if it works, it would do what I want. Thank you.

      Blessings,

      ~Polyglot~

        The server returns the PDF in about 6-7 seconds for a 330+ page book. This is quite tolerable, and is in no danger of timeouts.

        For requests arriving serially. Try 20 in parallel and see what happens.


        🦛

        Your server will be sending a JSON:

        my $PDF    = ...; # read PDF binary contents from file
        my $LOGSTR = ...; # read LOG file contents

        use MIME::Base64;
        my $data = {
            # this is binary, that's why we encode it to fit in a JSON string
            'pdf'    => encode_base64($PDF),
            # this is text but it does not harm:
            'logstr' => encode_base64($LOGSTR)
        };

        # send the data to the client:
        # CGI version:
        use CGI qw(:standard);
        use JSON;
        print header('application/json');
        my $json_text = to_json($data);
        print $json_text;

        # OR render JSON to go to the client with Mojolicious
        my $c = ...; # controller
        ...
        $c->render(json => $data); # $data as above

        Here is javascript for the client (more like untested hints):

        function ajaxit(
            url,     // server endpoint
            method,  // e.g. POST
            data,    // any data to send to the server for this request?
            onError  // optional callback(readyState, status) for failed requests
        ){
            var xhr = new XMLHttpRequest();
            xhr.open(method, url, true);
            // this is the function which handles the xhr data transfer
            xhr.onreadystatechange = function () {
                // xhr.readyState can be 0 to 4 and each one comes here:
                // 4 means the transaction is all done,
                // 3 means data is being received.
                // for error or success, state is 4
                if( xhr.readyState !== 4 ){
                    // data is still being transferred, wait ...
                    return;
                }
                // xhr.readyState is 4, all is done, either on error or success:
                if( xhr.status === 200 ){
                    // data was received, make it a JS object (basically a hashtable, aka JSON)
                    var data = JSON.parse(xhr.responseText);
                    // check integrity here
                    // ...
                    // handle the data; your html must have a div to put the <a>'s in,
                    // and adivID (defined elsewhere, e.g. a global) holds its id
                    json_data_handler(data, adivID);
                } else {
                    // we did not get HTTP code 200 OK;
                    // if we have a callback to handle errors, call it
                    if( onError == null ){
                        console.log("ajaxit() : error, "+method+"ing to '"+url+"':\n"
                                    +"readyState="+xhr.readyState+", status="+xhr.status);
                    } else {
                        // handle the failed request
                        onError(xhr.readyState, xhr.status);
                    }
                }
            };
            // now do the XHR request with caller-supplied data;
            // the response will be handled by the function above.
            // the call is asynchronous and returns here while the request is being made
            xhr.send(data);
        }

        // this is called when the XHR was successful and we got data from the server;
        // that data was converted to a JS object (see ajaxit() above)
        // and is passed in here as dataobj.
        // your html should contain a DIV somewhere (whose id you supply as divID)
        // to add the two <a> tags in there for the user to click
        function json_data_handler( dataobj, divID ){
            // the server's response has already been parsed into dataobj;
            // do some checks here whether this was successful
            // ...
            // atob decodes base64 strings to bytes,
            // so var pdf now holds the actual binary pdf content
            var pdf    = atob(dataobj['pdf']);
            var logstr = atob(dataobj['logstr']);
            // client-side save-as filename, whatever
            var saveAsFilename = '...';
            // do whatever with the pdf data
            var el = document.getElementById(divID);
            var a  = document.createElement("a");
            a.textContent = "Download PDF";   // give the link something to click on
            el.appendChild(a);
            // edit: set the right content type for pdf data
            //var blob = new Blob([pdf], {type: "octet/stream"});
            var blob = new Blob([pdf], {type: "application/pdf"});
            // edit: setting an intermediate variable which should be
            // revoked when not needed anymore with
            // window.URL.revokeObjectURL(objurl);
            var objurl = window.URL.createObjectURL(blob);
            a.href = objurl;
            a.download = saveAsFilename+'.pdf';
            // do the same for logstr
            // or, since it is text, display it in the div verbatim
        }

        The above is totally untested, but the idea has been tested.

        Apropos processing latex from Perl, I always use LaTeX::Driver to run latex.
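
        For instance, roughly along the lines of its synopsis (untested here; check the LaTeX::Driver documentation for the exact options, and note the file names are placeholders):

        use strict;
        use warnings;
        use LaTeX::Driver;

        # build and run the driver; it re-runs latex as many times as needed
        # to stabilise cross-references, TOC, etc.
        my $drv = LaTeX::Driver->new(
            source => 'book.tex',   # path to the TeX source (a scalar ref works too)
            output => 'book.pdf',   # where the formatted document should end up
            format => 'pdf',        # target format
        );
        eval { $drv->run };
        die "LaTeX run failed: $@" if $@;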

        bw, bliako

Re: Can two separate responses be sent to the client's browser from Perl, such as via fork{}?
by Anonymous Monk on Oct 15, 2023 at 07:25 UTC