Itatsumaki has asked for the wisdom of the Perl Monks concerning the following question:

Howdy fellow monks,

This is the basic outline of what I am trying to accomplish:

  1. Get most recent data from DB
  2. Add data to Excel Spreadsheet
  3. That data updates a chart in Excel
  4. Export chart to .gif file
  5. Display .gif file with some wrapping text in CGI

I have the first four parts working, but I am struggling with how to design the last one for good performance in a CGI environment.

There are two considerations. First, I need to ensure that the file-names don't get clobbered. Second, I would like to ensure that if a .gif has been created recently (say within 48 hours) it doesn't get recreated. This latter part is needed because the Excel-portion (especially the actually export) takes several seconds, so the server could get really bogged down on this.

The approach that comes to mind is to name the files according to a primary key from the database (well, the data is from multiple tables, but I can just concatenate the various keys together). I would then check the file-date, and if it is less than n-hours old, just display, otherwise unlink and make a new one.

Is that a sound approach? Are there other hidden gotchas in this kind of thing? Any/all advice/suggestions welcome!

-Tats
  • Comment on Strategies for temporary image files in CGI

Replies are listed 'Best First'.
Re: Strategies for temporary image files in CGI
by liz (Monsignor) on Nov 21, 2003 at 10:15 UTC
    Approach sounds ok to me.

    ...but I can just concatenate the various keys together...

    That should work, but make sure there is some delimiter between the keys so that you cannot get unintentional overlaps. For instance:

    key1 key2 concat concat. 123 45 12345 123.45 12 345 12345 12.345

    Furthermore, you would probably be better off to not expose this key to the outside world, as it contains information about the way your database is structured. You probably want to make an MD5 string out of that so that you have a unique identifier and not expose any information to the outside world.

    Liz

Re: Strategies for temporary image files in CGI
by Roger (Parson) on Nov 21, 2003 at 10:07 UTC
    Here's a sound approach - drop the .gif files. Use .png files instead. PNG files were invented as a free alternative to replace GIF files as the GIF files have proprietary LZW compression algorithm.

    Here's one algorithm I can think of -

    Step 1 - Get most recent data from DB
    Step 2 - Compute MD5 hash of data, or CRC32
    Step 3 - Compare it with previous MD5 hash stored, together with timestamp
    Step 4 - If hash values differ, new data found
    and so on...

    Here's a more efficient approach I can think of, if your database supports triggers -

    Step 1 Implement database triggers on insert, delete and updates
    Step 2 Update the modified flag in a user defined table, say, DATABASE_STATUS, that has two columns, modified_flag and last_modify_time.
    Step 3 CGI script checks for the MODIFIED_FLAG in the database
    Step 4 Generate new excel & PNG image if necessary by looking at the timestamp and modified_flag.
    Step 5 Clear the modified_flag, and set the last_modify_time.


      The proprietariness of gif has an expiry date. That expiry date has passed for those of us in the USA. It will be over in every country on June 20, 2004. See this timeline. Given the legal climate in Europe, I doubt that the European version was ever enforcible. (IANAL and all that.) Those two between them cover most of Perlmonks.

      Therefore the issue of not using .gif for intellectual property reasons is lessening in importance, and all things considered, I would not worry about it too much. (FWIW I am in the USA, where it expired already.)

      Excel made it remarkably easy to change from GIF to PNG formats (just changing a parameter). And the image quality is pretty much the same, so that worked well. Thanks!

      I like your first approach, since the chart data is coming from a query returning multiple records. If I read what you are suggesting correctly, I don't have to worry about time-stamps at all, just check the hash values. That would allow much longer time-to-refresh values for the files.

        Personally I don't think the time-stamp is necessary. But your original post said there was a requirement of not generating a new image within 48 hours. The timestamp is used for that purpose. If your data is pretty static, you can certainly drop that 48 hour checking bit.

Re: Strategies for temporary image files in CGI
by davis (Vicar) on Nov 21, 2003 at 10:29 UTC

    If any suggestions are welcome, I suggest ditching Excel. GD::Graph appears to cope with creating the bargraphs (from CB conversation) you're after, but I've not used it.

    cheers

    davis
    It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.

      Ditch Excel? No, I'd shoot Excel slowly and painfully, throw it in a meatgrinder, toss it in between the offensive and defensive lines during a game-ending goal-line stand, slap it a few times, snitch on it to the IRS, soak it in lemon juice, make it listen to modern poetry, THEN throw it in the ditch (either at Love Canal or Times Beach, MO).

      That said (and boy do I feel better) I used GD:Graph a while ago and had no problems. If you can use it instead of Excel for this (ie, you have it available or can have it installed quickly enough), I would recommend it.

      (And don't get me started on Word's XML ...)

      --
      tbone1
      Rock stars. Is there anything they don't know?
      - Homer Simpson

      I hate to say it, but I tested GD::Graph and it wasn't as flexible as Excel for charting. For instance, it doesn't allow for error-bars on bar-charts, couldn't mix chart types as easily, required manual calculation of regression lines. Excel's charting is slow, but actually isn't too bad.

Re: Strategies for temporary image files in CGI
by l3nz (Friar) on Nov 21, 2003 at 13:19 UTC
    I guess if you are asking this question you have no choice, still I wonder whether it's sane to generate online images by updating an Excel worksheet and then extracting images from it. I'm saying this because on a job I had somebody had this devious idea of composing letters by automating (via OLE) MS-Word document compositions. This idea proved awful, because:
    • It was quite slow (but we can forgive this)
    • It often broke with no apparent reason
    • It was unable to cope with moderate-to-heavy concurrency
    MS-Office programs were designed to be invocated and used by humans, so are not really fit as server-side components. If you have a choice, I believe you should investigate graphic creation modules (like GD) that are much more suit to the task.

    Please note: I'm not saying you should not automate via Excel, just that it does not seem very fit to me for serving images through a (busy) web server.

    About your question, I'd create the file and store it within a table with a key that's related to its information and a timestamp. If - to say - we are talking about stock data, I'd store a key like "XXX-60" to mean it's the stock symbol XXX and the graph is for 60 days. Then I'd store the blob with your GIF and a timestamp. On loading, I'd query the database for a "XXX-60" created within the last five minutes; if not found, I'd create it and replace the old version in the database. I would not go for "real" files; they're harder to keep track of, will fragment the file system, and you need a CGI with a database connection to serve the request in any case.

      I mainly went with Excel for convenience: with enough time I'm pretty sure I could get GD to do what I needed (manually add & size error-bars, calculate a regression line, manually overlay graphs to add it, etc.). I guess I'll try adding a simple chart and see how Excel responds.

      I like the idea of storing the file in the DB, but won't I need to write that data to disk when I want to display it? If I try to embed it in an HTML page, I'll need a static link for that, will I not?

•Re: Strategies for temporary image files in CGI
by merlyn (Sage) on Nov 21, 2003 at 15:35 UTC
Re: Strategies for temporary image files in CGI
by jZed (Prior) on Nov 22, 2003 at 02:01 UTC
    Have you looked at DBIx::Chart? It can't do everything Excel can, but it can do many kinds of things directly from a database into a graphic.