in reply to Re: Log parsing by timestamp dilema
in thread Log parsing by timestamp dilema

I am not sure I understand how your code processes the log files in parallel. The following logic came to mind as I read your post, and it may be what your code snippet does - correct me if I am wrong.

  1. Open all of the files and read the first line of each
  2. Determine which line/file has the earliest timestamp
  3. Print that line, while keeping the other lines in the array
  4. Read the next line from the matched file back into the array
  5. Repeat from step 2

    This appears to be sound logic, even though I can't discern how it works from your code. Of course, for this to work for me, I would have to add a fair bit more code, since I need to keep the file name/path alongside each line - we know that you can't get a filename/path back from a filehandle. For that reason, I would probably use a hash. If this is not what you meant and I am completely off base, please let me know. A rough sketch of how I picture it follows.
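    Here is that sketch - the hash is keyed by filename, so each entry carries both the filehandle and its pending line. The stamp_of sub is just a stand-in; real code would parse the actual timestamp format of the logs.

      use strict;
      use warnings;

      # Stand-in: pull a sortable timestamp off the front of a log line.
      sub stamp_of { my ($line) = @_; return substr( $line, 0, 19 ) }

      my %pending;    # filename => { fh => filehandle, line => pending line }

      # Step 1: open every file and read its first line
      for my $file (@ARGV) {
          open my $fh, '<', $file or die "Cannot open $file: $!";
          my $line = <$fh>;
          $pending{$file} = { fh => $fh, line => $line } if defined $line;
      }

      # Steps 2-5: emit the earliest pending line, then refill from that file
      while (%pending) {
          my ($earliest) =
            sort { stamp_of( $pending{$a}{line} ) cmp stamp_of( $pending{$b}{line} ) }
            keys %pending;
          print $pending{$earliest}{line};
          my $next = readline $pending{$earliest}{fh};
          if ( defined $next ) {
              $pending{$earliest}{line} = $next;
          }
          else {
              close $pending{$earliest}{fh};
              delete $pending{$earliest};
          }
      }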

    I was thinking that I was going to have to resort (pun intended) to telling them to | sort.

    Cheers - L~R

    UPDATE: edited to make the logic clear

    Re: Re: Re: Log parsing by timestamp dilema
    by DaveH (Monk) on Feb 01, 2003 at 18:29 UTC

      Hi.

      Sorry, I couldn't resist rewriting your code. :-) The problem "got at me". It uses adrianh's solution; translated into your script, you end up with something like the rewrite below.

      First, I removed the whole while loop around lines 86-90, and all the code in between was cut out and saved for later. A lot of the repeated code was moved into subroutines. I have tested it as best I can, and it works for me. I took advantage of the fact that you had already done the work of finding the files, which were stored in @Logs; this was used instead of @ARGV. I tried not to impose my coding style on the script, but it has been run through PerlTidy, which may have moved things around a bit.

      The other main change was the handling of specified date ranges. Whilst the 'if' logic remains, I generalised it into a subroutine and made use of a new %Range hash to store the 'begin' and 'end' dates (which are updated if the '-t' option is specified). By defaulting appropriately, this allows the code to check whether a date is in the range the user wants in just one line of code. It also means that the complicated regexes that parse the command line args are performed only once, rather than for every line of every file.
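      To illustrate the idea (a hypothetical sketch with made-up names, not the exact code from the rewrite):

        use strict;
        use warnings;

        # Defaults make the range open-ended; timestamps are compared as
        # strings, so lexical order matches chronological order.
        my %Range = (
            begin => '0000-00-00 00:00:00',
            end   => '9999-12-31 23:59:59',    # narrowed if '-t' is given
        );

        sub in_range {
            my ($stamp) = @_;
            return $stamp ge $Range{begin} && $stamp le $Range{end};
        }

        # One cheap comparison per line, instead of re-running the '-t'
        # regexes for every line of every file:
        while ( my $line = <> ) {
            my ($stamp) = $line =~ /^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)/ or next;
            print $line if in_range($stamp);
        }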

        DaveH,
        Thanks!
        My logical interpretation of adrianh's solution was pretty much correct - I just couldn't see it in the code. This works as is, but it runs considerably slower, so I am going to test its speed against tall_man's suggestion. I know it is doing a lot more work, so this is expected, and with the $|++ the humans viewing it shouldn't really notice a difference. Nonetheless, I am going to code my own version of the logic to see if I can't speed it up, in addition to benching it against a version using File::MergeSort. If I can't do any better than your integration of adrianh's solution, the only change I will make is to have it be an option rather than the default. That way it will not affect the overall speed if someone chooses to do a -c and only look at one connector log.

        Cheers - L~R

          I'd be interested to see what you come up with. :-) That was definitely a "quick" hack of your original code, so there is certainly room for improvement.

          Glad it helped.

          Cheers,

          -- Dave :-)


          $q=[split+qr,,,q,~swmi,.$,],+s.$.Em~w^,,.,s,.,$&&$$q[pos],eg,print
    Re^3: Log parsing by timestamp dilema
    by adrianh (Chancellor) on Feb 01, 2003 at 21:19 UTC

      Your summary of the logic is spot on. Sorry my code wasn't clear enough :-)

      However, as tall_man pointed out, File::MergeSort (which I had somehow managed to miss) does exactly the same thing - and is nicely encapsulated. So I'd use that instead if it were me :-)
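      For reference, File::MergeSort usage looks roughly like this - a sketch assuming each line starts with a sortable timestamp; the file names are made up:

        use strict;
        use warnings;
        use File::MergeSort;

        # The coderef extracts the merge key from a line; here we just
        # take the leading timestamp as-is.
        my $sorter = File::MergeSort->new(
            [ 'connector1.log', 'connector2.log' ],
            sub { substr( $_[0], 0, 19 ) },
        );

        # Lines come back in merged timestamp order.
        while ( my $line = $sorter->next_line ) {
            print $line;
        }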