in reply to Tabular Data, Group By Functions

Can this data be made available via a database (mySql or SQLite?) instead? The tasks you request are really very easy in SQL.

You could take the tab-delimited file and convert to a CSV file and then use DBI and DBD::CSV (also see the Tutorials) ...
(Update: DBD::AnyData can be used to read your tab-delimited data as-is ; You can even construct in-memory tables from a string!)

Example SQL statements:
SELECT ... FROM your_table ORDER BY in, out; SELECT day, SUM(in + out) FROM your_table GROUP BY day;

Replies are listed 'Best First'.
Re^2: Tabular Data, Group By Functions
by husker (Chaplain) on Jun 14, 2005 at 15:29 UTC
    This is just my opinion, but implementing a relational database to process a four-field tab delimited data file, with fairly simple sorting requirements, seems like rabbit hunting with an 8-inch Howitzer. Any time I start building a solution that ends up being a one-table DB, I begin to question if I'm overengineering the whole thing.

      For truly one-off things, I'd agree. But how many of these do you want to reinvent: sorting by multiple columns, grouping, aggregate functions, file locking, deleting, inserting, updating, selecting by multiple criteria, etc. etc. A six line DBD::CSV or DBD::AnyData script can accomplish all of those. And when it comes to output or integrating the handling of that table into a larger context, do you want to reinvent all of that or leverage the many DBI extension modules for those purposes?

      I think it's foolhardy to say that either the database way or the non-database way is always the right way, it depends entirely on context. And as far as howitzers go, it is a matter of trading the howitzer of including modules (which may make very little difference in the long run) against the howitzer of coding all the details from file opening to parsing to sorting.

      i agree w/jZed's response to this, too.. For me, i basically had 3 reasons (some of the overlap w/jZed) for the DBI suggestion:
      • i naturally thought DBI beacuse of how the OP worded the requirements -- it screamed RDBMS, and i strongly suspect the OP whad SQL in his mind when envisioning the requirements.
      • The DBI solution could easily turn out very elegant, as opposed to a slew of (possibly tedious) loops/hashing up (though it may be a good excercise for OP)
      • The scope/size of OP's problem was unclear -- it seemed like OP might have just given us a very small example, and could very well have a lot more data (and "tables"), which goes towards making the DBI solution more appealing. (and faster once it's set up, cause of indexes)
Re^2: Tabular Data, Group By Functions
by Anonymous Monk on Jun 15, 2005 at 01:48 UTC
    Thank you. That helped. I get the data as output from another program. Is there any way that I need not save it the output in a temp file and directly pipe it to DBD:CSV ?
      With either DBD::CSV or DBD::AnyData (both of which use exactly the same SQL engine and can handle the kind of data you show) you can query an arrayref. With DBD::AnyData you can also use the ad_import() method to query a string. Either one can be done without writing to a temporary file.