Fellow monks, Greetings!!

I posted a wrong email data, sorry for that. The following is the actual.

I get an email everyday with an attachment that indicate the following activity:

1) Authenticated Admin logins
2) Authenticated User logins
3) Authenticated Failed logins
The email attachment indicate the activity for the previous day.

The email attachment comes as HTML (with an image and data in tables) and the following data in the attachment need to be parsed:

1) Peripheral IP Address
2) Data
3) Time
4) Source
There will be two scripts

i) First script to parse the incoming data and insert the data into DB.
ii) Second script to aggregate the data once a week and send an email to a specified list. (This will be invoked by a cron)

Once we have the data, further the following operations need to be done.

1) Insert the parsed values into Oracle/MySQL DB.
2) Generate a aggregated report on every Sunday, that includes all the authentication information (i.e, Authenticated Admin logins, user logins and failed logins). The reports should be generated based on a Sunday-Saturday weekly period
The email attachment looks like the following:
|------------------------------------------------------------| | GIF image Firewall Reports ->(also a gif) | | -------------------- | + | Scheduled report for Peripheral at IP | | address 192.168.100.1 (David - 00027u) | | | |------------------------------------------------------------| | | | Authenticated Admin Logins for 2004-11-01 | | | | Time Source | | | | 09:10:58 192.168.100.10 | | 10:12:21 192.168.100.25 | | | | | |* Report producted for timezone: Pacific Time(US & Canada) | |* Report generated based on data on 11/01/2004 15:27:54 UTC | + |-------------------------------------------------------------

The weekly generated report should look as following:
|--------------------------------------------------------------------- +-------| | GIF image | | | + | | | Authentication Report | + | | |--------------------------------------------------------------------- +-------| | Successful Admin Logins (11/15/04 - 11/21/04) + | |--------------------------------------------------------------------- +-------| |Peripheral IP Address Date Time Source + | | 192.168.100.1 (David - 00027u) 11/04/2004 09:10:54 192.168.100 +.10 | | 11/04/2004 10:19:34 192.168.100.12 | |--------------------------------------------------------------------- +-------| | Failed Logins (11/15/04 - 11/21/04) + | |--------------------------------------------------------------------- +-------| |Peripheral IP Address Date Time Source + | | 192.168.100.1 (David - 00027u) 11/04/2004 09:05:50 192.168.100 +.10 | | 11/04/2004 15:19:34 192.168.100.12 | |--------------------------------------------------------------------- +-------| | Successful User Logins (11/15/04 - 11/21/04) + | |--------------------------------------------------------------------- +-------| |Peripheral IP Address Date Time Source + | | No Data No Data No Data No Data | |--------------------------------------------------------------------- +--------

Note:
1)Peripheral IP address comes in the header of the the email attachment.
2) Date: Date format should be in the format of month/day/year. If no data occurs for an entire week for a particular section, then "No Data" should appear in this column for the section with no data.
3)Time: Time should be in the format hour:minute:second. Hour field shoule be 0 to 23, minute is 0 to 59 and second should be 0 to 50. If no data occurs for an entire week for a particular section, then "No Data" should appear in this column for the section with no data.
4) Source: This column reflects the source IP found in each report. There many be multiple IP addreses listed in the authentication report. If no data occurs for an entire week for a particular section, then "No Data" should appear in this column for the section with no data.

I've come up with the following table design:

Admin_logins
admin_id auto_increment
peripheral_ip
date
time
source

failed_logins
failed_id auto_increment
peripheral_ip
date
time
source

successful_logins
s_id auto_increment
peripheral_ip
date
time
source

aggregate_logins
a_id auto_increment
admin_id FK to admin_logins (admin_id) table
failed_id FK to failed_logins (failed_id) table
s_id FK to successful_logins (s_id) table

Please let me know if the above table design is sufficient or any changes to be made.

Please let me know your inputs about building the scripts based on the above inputs or if possible put some sample code which will help me complete this task faster

Thanks for the help in advance.

In reply to Parsing email, inserting into database and creating an aggregation report. by Alligator

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.