in reply to Problem Parsing Log Files

"I need to do it with two dimensional arrays". Why? Is it a school assignment? Otherwise a hash (of arrays) would make a lot more sense for finding duplicate lines. The error message is the key, and the timestamps are collected in the array.

If you only need the last timestamp and the count, you can drop the array and store a counter and the last timestamp, either as a concatenated string or as a two-value array or hash.

Re^2: Problem Parsing Log Files
by ajay.awachar (Acolyte) on Jun 18, 2009 at 00:27 UTC
    Hi Jethro

    It's not an assignment. This script is for production logs. I don't just need a timestamp and count: I want the timestamp, error and count, as I mentioned in the desired output. It's not necessary to do it with a 2D array. I need to store three components (timestamp, error and count), but I am not sure what structure would be best for this or how to build it.

    Thanks,

    Ajay

      If you use a hash of arrays as suggested above, the size of the array is your error count, the timestamps are stored in the array, and the error is the key of the hash.

      How to use a hash of arrays is described in the Perl 5 book, for example, or in the perldsc manpage.
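
      A minimal sketch of that hash of arrays, in case it helps. The log format is not shown in this thread, so the sample lines and the "date time message" layout matched by the regex are assumptions, as are the names %seen and @lines; adjust the pattern to the real logs.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the real log format is an assumption here.
my @lines = (
    '2009-06-17 23:10:01 Disk full',
    '2009-06-17 23:12:45 Connection refused',
    '2009-06-17 23:15:09 Disk full',
);

my %seen;    # error message => reference to an array of timestamps

for my $line (@lines) {
    # Assumed layout: "YYYY-MM-DD HH:MM:SS message"; skip lines that don't match.
    my ($stamp, $error) = $line =~ /^(\S+ \S+) (.*)$/ or next;

    # Autovivification creates the array the first time an error is seen.
    push @{ $seen{$error} }, $stamp;
}

for my $error (sort keys %seen) {
    my @stamps = @{ $seen{$error} };

    # Last element is the most recent timestamp if the input is in time order;
    # the number of elements is the error count.
    printf "%s  %s  %d\n", $stamps[-1], $error, scalar @stamps;
}
```

      Reading from a file instead only changes the loop to while (my $line = <$fh>) with a chomp; the hash handling stays the same.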

      I would go with HoA, given the description of the problem so far. The hash would use error messages as its keys, with references to two-element arrays (latest timestamp, count) as its values.

      Alternatively, you might find it easier to just push timestamps onto the arrays. The size would then be the count and (assuming a sorted input file) the last-added timestamp would be the most recent.
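
      The two-element-array variant could look roughly like this. Again, the sample lines, the regex and the names %summary and @lines are assumptions for illustration, since the actual log format isn't given. The string comparison with gt only works because the assumed timestamps sort correctly as text; with that, the input does not need to be sorted.

```perl
use strict;
use warnings;

# Hypothetical sample lines, deliberately out of time order.
my @lines = (
    '2009-06-17 23:15:09 Disk full',
    '2009-06-17 23:12:45 Connection refused',
    '2009-06-17 23:10:01 Disk full',
);

my %summary;    # error message => [ latest timestamp, count ]

for my $line (@lines) {
    # Assumed layout: "YYYY-MM-DD HH:MM:SS message"; skip lines that don't match.
    my ($stamp, $error) = $line =~ /^(\S+ \S+) (.*)$/ or next;

    # Create the [timestamp, count] pair on first sight of this error.
    my $entry = $summary{$error} ||= [ '', 0 ];

    # Keep the later timestamp (text comparison, valid for this assumed format).
    $entry->[0] = $stamp if $stamp gt $entry->[0];
    $entry->[1]++;
}

for my $error (sort keys %summary) {
    my ($latest, $count) = @{ $summary{$error} };
    printf "%s  %s  %d\n", $latest, $error, $count;
}
```

      This keeps memory use constant per distinct error, at the cost of losing the individual timestamps.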