Hi Gtforce,

I suppose that this flat data file is coming from some fancy corporate sales/inventory DB. This may sound flippant, but buying dinner and drinks for the person who generated the file for you might yield the most efficient and effective solution! But I guess you have already considered that...

It sounds conceivable that your data processing could all be done in an Excel spreadsheet with no Perl programming at all. I haven't done any serious spreadsheet work in years, but spreadsheets can be huge now: a current Excel worksheet holds 1,048,576 rows, so a couple of million rows would have to be split across worksheets.

Let's talk about Perl:
You are inexperienced at Perl and it sounds like you have no SQL experience. However, I believe that the solution that involves learning the "least amount of new stuff" will involve learning a targeted subset of both Perl and SQL. Using an SQLite DB will simplify the data structures that the Perl code has to work with (less fancy Perl to learn), and I believe that learning basic DBI will simplify your Perl code.

SQLite is probably the most widely deployed DB in the world, largely because it is on every smartphone. SQLite doesn't require any fancy server setup and admin; it uses a simple file for its work, so huge admin hassles just disappear. You will need to learn how to create tables, insert new records, and select (i.e. get) records from the DB. Only a very, very small subset of SQL needs to be learned. For the Perl I/F with SQLite, you will need to learn a subset of Perl data structures. I recommend only one: how to handle an AoA (array of arrays), a 2D array structure, or a reference to such a thing. Don't start with learning everything, just learn the fetchall_arrayref() function well.
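To make that concrete, here is a minimal sketch of the three SQL operations plus fetchall_arrayref(). It assumes the DBI and DBD::SQLite modules are installed. The table name, column names and sample rows are made up for illustration, since I haven't seen your real data layout.

```perl
use strict;
use warnings;
use DBI;

# In-memory DB for the demo; use "dbname=sales.sqlite" to keep a file on disk.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, AutoCommit => 1 });

# Hypothetical table; these columns are assumptions, not your real layout.
$dbh->do("CREATE TABLE sales (item TEXT, region TEXT, qty INTEGER)");

my $insert = $dbh->prepare(
    "INSERT INTO sales (item, region, qty) VALUES (?, ?, ?)");

# One transaction around all the inserts is much faster than
# committing every row separately.
$dbh->begin_work;
$insert->execute(@$_) for (
    [ 'widget', 'east', 10 ],
    [ 'widget', 'west',  5 ],
    [ 'gadget', 'east',  7 ],
);
$dbh->commit;

my $select = $dbh->prepare(
    "SELECT item, SUM(qty) FROM sales GROUP BY item ORDER BY item");
$select->execute;

# $rows is a reference to an AoA: one inner array per result row.
my $rows = $select->fetchall_arrayref;

for my $row (@$rows) {
    my ($item, $total) = @$row;    # unpack one inner array
    print "$item: $total\n";
}

$dbh->disconnect;
```

The loop at the end is the whole AoA story: @$rows walks the outer array, and @$row (or $rows->[0][1] for a single cell) gets at the values inside each row.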

From what I see so far, a basic idea could be:

The idea of this running for 4 hours is insane. Something is seriously wrong if this doesn't run in <4 minutes.

In reply to Re: creating and managing many hashes by Marshall
in thread creating and managing many hashes by Gtforce
