in reply to Please provide a hint for me to continue with the rest of my program

For greatest efficiency and minimal resource usage, make MySQL do the heavy lifting. Get the domain counts with:
my @domain_counts = GUI::DB::query(
    $dbh,
    "SELECT DATE(NOW()), SUBSTR(addr, LOCATE('@', addr) + 1) AS maildomain, COUNT(*) AS mailcount
     FROM mailing GROUP BY maildomain ORDER BY mailcount DESC"
);
After that, storage and top-50 selection become easy. Here is the relevant SQL:
INSERT INTO dailymailcounts (maildate, maildomain, mailcount) VALUES (?, ?, ?);
SELECT SUM(mailcount) AS total FROM dailymailcounts WHERE maildate >= DATE(NOW()) - INTERVAL 30 DAY;
SELECT maildomain, SUM(mailcount) * 100.0 / $total AS monthlymailpct FROM dailymailcounts WHERE maildate >= DATE(NOW()) - INTERVAL 30 DAY GROUP BY maildomain ORDER BY monthlymailpct DESC LIMIT 50;
Of course, splitting the e-mail address at the first '@' is not the world's most robust implementation, but given the other artificial constraints imposed, it should suffice.
SQL not tested.
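If the splitting has to happen in Perl instead, here is a minimal sketch of a slightly more careful version. It splits at the last '@' rather than the first, since a quoted local part may itself contain an '@', and it lowercases the result so that Example.COM and example.com count as one domain. The name domain_of is made up for illustration; it is not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lowercased domain of an address,
# or nothing (undef in scalar context) if no usable domain is found.
sub domain_of {
    my ($addr) = @_;
    return unless defined $addr;
    my $at = rindex($addr, '@');          # position of the LAST '@'
    return if $at < 0;                    # no '@' at all
    return if $at == length($addr) - 1;   # nothing after the '@'
    return lc substr($addr, $at + 1);
}

print domain_of('Alice@Example.COM'), "\n";   # prints example.com
```

This is still not a full address validator; it only avoids the most obvious miscounts.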

SQL "query complexity" is a rather vague, subjective term; I do not consider the above queries to be "complex".

             "I'm fairly sure if they took porn off the Internet, there'd only be one website left, and it'd be called 'Bring Back the Porn!'"
        -- Dr. Cox, Scrubs

Re^2: Please provide a hint for me to continue with the rest of my program
by Yary (Pilgrim) on Apr 24, 2013 at 12:21 UTC
    About daily counts & NetWallah's proposal: Alas, the (rather arbitrary) problem constraints include a table that has no time/date column; it has only one column, "addr", for email addresses. I assume that it grows continually.

    So for the daily count, have the program run once a day. The same program that counts the number of domains can also count how many rows are in the table and store that number (in a file, or in the database, whatever is allowed). Then on the next run it can subtract to find the number of new records. That matches the Anonymous Monk's earlier suggestion.
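A minimal sketch of that subtract-from-yesterday idea, assuming the previous row count is kept in a small text file. The sub names and the state file are hypothetical; in the real program the current count would come from something simple like SELECT COUNT(*) FROM mailing, or from the number of rows you fetched.

```perl
use strict;
use warnings;

# Read the row count saved by the previous run (0 if no state file exists yet).
sub read_previous_count {
    my ($state_file) = @_;
    return 0 unless -e $state_file;
    open my $fh, '<', $state_file or die "Cannot read $state_file: $!";
    my $count = <$fh>;
    close $fh;
    $count = '' unless defined $count;
    chomp $count;
    return $count =~ /^\d+$/ ? $count : 0;   # ignore a corrupt or empty file
}

# Save today's row count for the next run.
sub write_count {
    my ($state_file, $count) = @_;
    open my $fh, '>', $state_file or die "Cannot write $state_file: $!";
    print {$fh} "$count\n";
    close $fh or die "Cannot close $state_file: $!";
    return;
}

# Return how many rows were added since the last run, and remember the new total.
sub new_rows_since_last_run {
    my ($state_file, $current_count) = @_;
    my $previous = read_previous_count($state_file);
    write_count($state_file, $current_count);
    return $current_count - $previous;
}
```

On the very first run there is no state file, so every row counts as new.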

    In fact, you could make the program more efficient, though more complicated, by also storing the domain counts, re-reading them on startup, and skipping over the old records on the next run. Then you only need to count the domains in the new records and add those to the old totals. That will only work if addresses are returned in the order they are created! If you go that way, document that assumption!
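Roughly, the incremental version could look like the sketch below. It assumes the state (number of rows already counted, plus per-domain totals) is saved between runs with the core Storable module, and that the addresses arrive as an array reference in insertion order. The sub names and the shape of the state hash are my own invention.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Load the saved state, or start fresh on the first run.
sub load_state {
    my ($file) = @_;
    return -e $file ? retrieve($file) : { seen => 0, counts => {} };
}

# Save the state for the next run.
sub save_state {
    my ($file, $state) = @_;
    store($state, $file);
    return;
}

# Count only the rows that were not there on the previous run.
# ASSUMPTION: @$addrs is in insertion order, so old rows come first.
sub update_counts {
    my ($state, $addrs) = @_;
    for my $i ($state->{seen} .. $#$addrs) {
        my $at = rindex($addrs->[$i], '@');
        next if $at < 0;                              # skip rows with no '@'
        my $domain = lc substr($addrs->[$i], $at + 1);
        $state->{counts}{$domain}++;
    }
    $state->{seen} = scalar @$addrs;
    return $state;
}
```

A run would then be load_state, update_counts, save_state, in that order. If the database does not guarantee row order, this gives wrong totals, which is exactly why the assumption needs documenting.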

      The problem statement says:
      New addresses will be added on a daily basis.

      I assumed that to mean that the table will be emptied daily, and only new addresses would be added.

      They said the table would be clean initially. (If the person putting forth the constraints can impose conditions, so can I).

      With this assumption, no additional tracking is necessary.

Re^2: Please provide a hint for me to continue with the rest of my program
by pooyan (Initiate) on Apr 25, 2013 at 22:10 UTC
    Hi Yary. Thanks for your input, it is definitely a great use of SQL and more learning; however, the question says all processing needs to be done in Perl and that complex SQL and subqueries are not allowed. I think we need to get @returndata and go through it with loops in Perl.
      Hmm, you may have my answer mixed up with another. What I meant when I said "It could be simpler than you have it" is that, after you write your program as you initially described it (which should work), you can simplify it by removing some unnecessary steps, and it will still work, and be easier to understand.

      And you don't even need to worry about these simplifications now. Write the code and you might even find them as you're writing it. You have a plan; make something that works first. Then post the code here and get some critiques, if you like.
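For the loop-in-Perl approach mentioned above, a first working version could be as small as the following sketch. It assumes the query result has been flattened into a plain list of address strings, which may not match what @returndata really holds, and the sub names are hypothetical.

```perl
use strict;
use warnings;

# Tally addresses per domain with a plain loop and a hash.
sub count_domains {
    my (@addrs) = @_;
    my %count;
    for my $addr (@addrs) {
        my (undef, $domain) = split /\@/, $addr, 2;   # split at the first '@'
        next unless defined $domain && length $domain;
        $count{ lc $domain }++;
    }
    return \%count;
}

# Return up to $n [domain, count] pairs, largest count first (ties sorted by name).
sub top_domains {
    my ($count, $n) = @_;
    my @sorted = sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
    splice @sorted, $n if @sorted > $n;
    return map { [ $_, $count->{$_} ] } @sorted;
}
```

Calling top_domains(count_domains(@addresses), 50) would give the top 50; percentages follow by dividing each count by the total number of addresses.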