I have no problem with your regex approach for the timestamp.

And I have no problem with your parser approach for the timestamp. TIMTOWTDI. ☺

Did you happen to view Dave Rolsky's brief presentation of the slide titled "Don't Use a Parser"? He explains his qualified recommendation perfectly.

I agree that ISO 8601 format for the timestamps would be preferable.

Yep. Unless jasonl is required to maintain fidelity to the original representation of the timestamps in the logs for some peculiar reason, this is his best opportunity to improve the data by keeping the reformatted timestamps in ISO 8601 format.
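To illustrate why ISO 8601 is worth the trouble here, a minimal sketch follows. The timestamp values are made up for the example (they are not from jasonl's logs), and I'm assuming every timestamp is in the same time zone and uses the same fixed-width layout. Under those assumptions, a plain string sort is also a chronological sort, so no date arithmetic is needed.

```perl
use strict;
use warnings;

# Hypothetical timestamps, already reformatted to ISO 8601.
# Assumption: all values share one time zone and one fixed-width layout.
my @timestamps = (
    '2012-03-14T09:26:53',
    '2011-12-31T23:59:59',
    '2012-03-02T17:05:00',
);

# Perl's default sort compares strings; for ISO 8601 values in this
# layout, string order and chronological order coincide.
print "$_\n" for sort @timestamps;
```

If the logs mix time zones or UTC offsets, this shortcut no longer holds, and the timestamps would have to be normalized first.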

The OP didn't show real data for "<data1> <data2>". I suspect your '(\S+) (\S+)' may well be an oversimplification of what's really required; however, no more so than my splitting on whitespace. :-)

My '(\S+) (\S+)' was an intentional simplification, not an inadvertent oversimplification. Although I didn't explicitly state it, I was tacitly making the point that jasonl could parse the whole log record, including the timestamps, with a single regex. The pattern I used in my code snippet to match <data1> and <data2> was just a placeholder—one that happens to match the placeholder strings "<data1>" and "<data2>" literally. ☺ As you've pointed out, jasonl didn't include in his post any verisimilar example data besides just the timestamps, so we can't possibly know how to parse his actual log records properly.
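Here is a rough sketch of the single-regex idea. Since we haven't seen jasonl's real records, the log layout (MM/DD/YYYY HH:MM:SS followed by two whitespace-free fields), the sample lines, and the names $record_re and parse_record are all hypothetical, invented only for illustration. The (\S+) captures remain placeholders for whatever <data1> and <data2> really look like.

```perl
use strict;
use warnings;

# Hypothetical record layout (NOT jasonl's real format):
#   MM/DD/YYYY HH:MM:SS data1 data2
my $record_re = qr{
    ^
    (\d{2}) / (\d{2}) / (\d{4})    # month, day, year
    \s+
    (\d{2}) : (\d{2}) : (\d{2})    # hour, minute, second
    \s+
    (\S+)                          # placeholder for data1
    \s+
    (\S+)                          # placeholder for data2
    \s*
    $
}x;

# Parse one whole record with a single match. Returns a hash reference
# with an ISO 8601 timestamp, or nothing if the line doesn't match.
sub parse_record {
    my ($line) = @_;
    my ($mon, $day, $year, $hour, $min, $sec, $data1, $data2)
        = $line =~ $record_re
        or return;
    return {
        timestamp => "$year-$mon-${day}T$hour:$min:$sec",
        data1     => $data1,
        data2     => $data2,
    };
}

# Made-up sample lines in the hypothetical layout.
my @lines = (
    '03/14/2012 09:26:53 foo bar',
    '12/31/2011 23:59:59 baz qux',
    'this line does not match the layout',
);

# Unparseable lines drop out; the rest sort by their ISO 8601 timestamp.
my @sorted = sort { $a->{timestamp} cmp $b->{timestamp} }
             map  { parse_record($_) } @lines;

print "$_->{timestamp} $_->{data1} $_->{data2}\n" for @sorted;
```

The real pattern would need to be adapted once the actual record format is known; the point is only that one match can pull out the timestamp pieces and the data fields together.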

Jim


In reply to Re^6: sorting logfiles by timestamp by Jim
in thread sorting logfiles by timestamp by jasonl
