Dear Monks,

As part of my personal journey to enlightenment, I have started a personal project that just seems to keep growing. Knowing that in good software "Architecture is not an Afterthought", I would like to take this opportunity to solicit some advice.

The project is to create a near-real-time picture of events on the earth and to display it as the screen background on my laptop (Windows XP SP2). The current stage of the project is to plot the known locations of cruise ships and other maritime vessels on the seas (and possibly their tracks: previous locations over some period of time).

We may assume that the current day's data is available as a .csv file. Recently other monks have helped me work through the whole page-scraping, table-extracting thing. I save that data as a .csv file and then query it with DB-style select statements in the same script, so I already have two I/O passes to marshal and un-marshal the data (not very efficient, I think). I also want to persist each day's data so I can recall it in a later session (for example, if asked for a ship's track), which is why I use a .csv file.

My question is: would it perform better over time to merge each day's data into one large hash (keyed on ship's name, for example)? Can I save a hash to the file system and recall it later? (I might not run this script every day.) Or would it be better to just save the daily .csv files and (re)assemble the hash in memory? A day's data is about 10KB, so this isn't going to get very large even after a month, which is about all I want to keep. It doesn't seem like I could trim old data from a hash by day when the key is the ship name, though.
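To make the hash option concrete, here is a minimal sketch. It assumes the core Storable module for saving and recalling the hash, and it nests a date key under each ship name so that old days can still be trimmed. The sub names, the file path, and the 'YYYY-MM-DD' date format are all made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Hypothetical layout: $ships->{$ship_name}{$date} = [ $lat, $long ]
# Dates are 'YYYY-MM-DD' strings, so plain string comparison orders them.

# Merge one day's rows (each row: [ship_name, lat, long]) into the hash.
sub merge_day {
    my ($ships, $date, $rows) = @_;
    for my $row (@$rows) {
        my ($name, $lat, $long) = @$row;
        $ships->{$name}{$date} = [ $lat, $long ];
    }
    return $ships;
}

# Drop every position older than $cutoff, then any ship left with no data.
sub trim_before {
    my ($ships, $cutoff) = @_;
    for my $name (keys %$ships) {
        my @old = grep { $_ lt $cutoff } keys %{ $ships->{$name} };
        delete @{ $ships->{$name} }{@old};
        delete $ships->{$name} unless %{ $ships->{$name} };
    }
    return $ships;
}

# Load the saved hash (if any), merge today's rows, trim, and save again.
sub update_store {
    my ($file, $date, $rows, $cutoff) = @_;
    my $ships = -e $file ? retrieve($file) : {};
    merge_day($ships, $date, $rows);
    trim_before($ships, $cutoff);
    store($ships, $file);
    return $ships;
}
```

A call might look like update_store('ships.sto', $today, \@rows, $one_month_ago), where @rows comes from the day's .csv file. A ship's track is then just the sorted date keys under its name.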

It also seems to me that treating this like a database (of .csv data) would make it easy to query (just build the select statement and go). DBD::CSV would work well here. I'm not quite sure how to fetch multiple .csv files and append them together into one large db, however. Can I use DBI on a hash like that, or should I use something different? I know there are many ways to do this, and the (holy) documentation tells me so, but which one would work well? Since I'm not a programmer by trade or training, my approaches to this project tend to be somewhat of a "hack". I'm learning Perl as I go. As a hobby, at least my wife knows that I'm not just surfing for "inappropriate content".

Thanks, Matt

Note: This "earth" drawing program takes input in the form of a simple marker file: <lat>, <long>, <ship_name>. I can also create an "arc file" for the ship's historical track (<lat1>, <long1>, <lat2>, <long2>). I already plot earthquakes (NRT from the USGS), volcanoes, satellites, clouds and storms, and even some airplanes in flight. So to construct these input files, I just query and print by row. Voila!
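As a rough illustration of the two input formats, the sketch below builds marker lines and arc lines from in-memory data. The sub names and data shapes are hypothetical, and the comma-and-space separator simply follows the format as written above; the drawing program's exact quoting rules are not covered here.

```perl
use strict;
use warnings;

# One marker line per ship: "<lat>, <long>, <ship_name>".
# $latest is a hypothetical hash ref: ship name => [lat, long].
sub marker_lines {
    my ($latest) = @_;
    return map { join(', ', @{ $latest->{$_} }, $_) } sort keys %$latest;
}

# One arc line per pair of consecutive positions:
# "<lat1>, <long1>, <lat2>, <long2>".
# $track is a hypothetical array ref of [lat, long] pairs, oldest first.
sub arc_lines {
    my ($track) = @_;
    return map { join(', ', @{ $track->[ $_ - 1 ] }, @{ $track->[$_] }) }
           1 .. $#$track;
}
```

Printing the returned lines, one per row, to a file would give the marker file or arc file. A track with fewer than two positions yields no arc lines.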


In reply to More than one way to skin an architecture by mcoblentz
