tilly has asked for the wisdom of the Perl Monks concerning the following question:

At $work, there are a lot of spreadsheets floating around that have defined Access queries as data sources. Those queries in turn use tables that are linked to a PostgreSQL database. When the user of the spreadsheet refreshes the spreadsheet, Excel re-runs the Access query and automatically pulls it in.

I don't like this architecture for a variety of reasons, not the least of which is that the people who wrote those Access queries often didn't know what they were doing, and sometimes break each other's spreadsheets by changing the queries. However they really like the ease of setting up spreadsheets that automatically refresh their data, and I completely see their point.

I'd like to be able to develop reports, preferably in Perl and preferably in Linux, that can similarly be linked into Excel spreadsheets and automatically refreshed (causing my reports to be re-run). If this involves adding a plugin to Excel, that's fine. If I need to do something like provide a URL that serves XML, that's fine as well.

Has anyone actually done anything like this, and if so can you point me at the appropriate alphabet soup of technologies (preferably as simple a stack as possible) that I need to make it work?


Re: Excel automatically pulling data from Perl?
by Corion (Patriarch) on Jun 28, 2007 at 13:55 UTC

    I've done that once (in Python, though), but it was quite simple: Excel can read an HTTP URL and will use a table returned there as the data. So glue something together with HTTP::Server::Simple and make Excel re-read that URL. Or fire up some VBA code that uses Shell() to launch a Perl program periodically.

    You add the HTTP data source as a real data source (Menu Data->External Data->HTML file->http://tilly.internal/magic_url). I believe there was a way to make Excel read a value into a single cell via HTTP, but I can't find the code I used for that anymore.
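A minimal sketch of the table-producing side in Perl, using only core Perl (the column names and rows are invented for illustration; in practice they would come from DBI queries against the PostgreSQL database, and the string would be served from a CGI script or an HTTP::Server::Simple handler at the URL Excel refreshes):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an HTML page containing a single table, which Excel's
# external-data / web-query feature can parse and refresh.
sub rows_to_html_table {
    my ($header, $rows) = @_;
    my $html = qq{<html><body><table border="1">\n};
    $html .= '<tr>' . join('', map { "<th>$_</th>" } @$header) . "</tr>\n";
    for my $row (@$rows) {
        $html .= '<tr>' . join('', map { "<td>$_</td>" } @$row) . "</tr>\n";
    }
    $html .= "</table></body></html>\n";
    return $html;
}

# Hypothetical report data; a real script would run a DBI query here.
print rows_to_html_table(
    [qw(region total)],
    [ [ 'North', 1200 ], [ 'South', 950 ] ],
);
```

Point the Excel data source at whatever URL serves this output, and hitting refresh re-runs the Perl that generates it.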

      Thank you, that works perfectly. :-)

      That's really neat. I just tried it with OO Calc and it works a treat. The menu option Insert->link_to_external_data accepts a URL and presents you with a list of entities (HTML_all, HTML_tables, etc.). Click one and you're away. Nice.


Re: Excel automatically pulling data from Perl?
by jbert (Priest) on Jun 28, 2007 at 14:57 UTC
    Can you get what you need by defining a view in a MySQL server (5.0 or higher)? That gives you a kind of 'virtual table' defined by an SQL query, in much the same way as Access.

    You can then (obviously) get to the MySQL views as tables from Access and Excel via the ODBC connector for MySQL.

    This means you're writing the 'report' in SQL rather than Perl, but it might give you most of what you need. You can always use Perl to populate some of the tables that the view references, if you need to pull in data from Perl-accessible places.

      If I were going the view route, there would be no need to use MySQL.

      But that said, views are an alternative to consider. Unfortunately I've already encountered cases where PostgreSQL chooses bad query plans, so I need to create a temporary table and then query it. That is not something I can do in a view. I've also encountered cases where it is easier to do some simple calculations outside of the database. (Example: what is the spread from one event to another, in months? The catch is that if the first event happened on Jan 31 and the second one on Feb 2, the spread is 1 month. That kind of date arithmetic is very messy in SQL.)
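That spread rule (Jan 31 to Feb 2 counting as 1 month, i.e. only the year and month parts matter) is a one-liner in Perl. A sketch, assuming the dates arrive as ISO yyyy-mm-dd strings (that input format is an assumption, not something stated in the thread):

```perl
use strict;
use warnings;

# Month spread as defined above: compare only year and month,
# so 2007-01-31 .. 2007-02-02 is 1 month.
sub month_spread {
    my ($from, $to) = @_;
    my ($y1, $m1) = $from =~ /^(\d{4})-(\d{2})/ or die "bad date: $from";
    my ($y2, $m2) = $to   =~ /^(\d{4})-(\d{2})/ or die "bad date: $to";
    return ($y2 - $y1) * 12 + ($m2 - $m1);
}

print month_spread('2007-01-31', '2007-02-02'), "\n";  # 1
```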

      In fact both of these happened on the very data set that I'd like to be able to pull into Excel. :-(

      Now there is another workaround: create stored procedures in the database and select from them. I have reasons for not doing that as well, but if I can't make an intermediate layer work, I'd seriously consider going there.

        Apologies for mentioning MySQL, I'm less familiar with PostgreSQL.

        Is the date manipulation really that much more cumbersome in SQL than in Perl?

        select period_diff(extract(year_month from date1), extract(year_month from date2)) as month_spread from foo;
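period_diff and extract(year_month ...) are MySQL-isms, though; since tilly is on PostgreSQL, a rough equivalent there (a sketch, reusing the hypothetical foo, date1, date2 from the example above and matching the years-and-months-only spread definition) might be:

```sql
-- Difference in calendar months, ignoring the day-of-month,
-- so Jan 31 .. Feb 2 comes out as 1.
select (date_part('year',  date2) - date_part('year',  date1)) * 12
     + (date_part('month', date2) - date_part('month', date1))
       as month_spread
  from foo;
```

Note that PostgreSQL's age() function would not work here, since it truncates by days and would report Jan 31 .. Feb 2 as 0 months.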

        I guess a big consideration here is whether the data is allowed to be a little stale, in which case you don't need any complex technology and can simply recreate the tables in the db periodically via a scheduled Perl job.
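The scheduled-job route can be as simple as a cron entry driving a DBI script (the path, script name, and interval below are invented for illustration):

```shell
# crontab fragment: rebuild the report table every 15 minutes.
# refresh_report.pl is a hypothetical Perl/DBI script that drops
# and repopulates the table the Excel/Access side reads from.
*/15 * * * *  /usr/local/bin/perl /opt/reports/refresh_report.pl
```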

        If it does need to be 'on demand' (i.e. someone hitting refresh in Excel needs/wants/expects the *live* data rather than something five minutes old), then this might be accomplished with a db trigger that shells out to run a Perl script to repopulate a table. Although the HTTP route might still be better here, I guess.

        It also depends on data set sizes, time to run the 'report', etc.

      FYI, there's no reason to use MySQL. tilly said he was using PostgreSQL, and views exist there, as well as in numerous other databases.

      But if views do the job, then that may be the way to go. I'd also think about how often the data really needs to be "refreshed", and then maybe just generate the data daily/weekly/monthly/whatever.