That is a nice problem you have. Still, it is not defined well enough to solve. You specified the input and the output format, but not the output itself, and you did not provide any parsing rules.

Taking what you have written as it stands, you could just run a simple substitution on every log line, replacing ">" with ",", and you would have a comma-separated file. That is exactly what you asked for, but probably not what you want.

I could make some assumptions. Let's assume that you want to extract the user names and the message. Where are they? Well, maybe the rules are as follows:

#!/usr/bin/perl
use v5.14;

my $input_line = 'T 1310234540 19<24SomeUserName>19 This is user chat.';

$input_line =~ m/
    \<       # after the '<' sign, and
    \d+      # any digits that follow it,
    (\w+)    # capture the word characters as $1
    \>       # then expect the '>' sign
    \d+\s+   # followed by some other digits and whitespace
    (.+)     # everything that follows is the message
/x;

my $username = $1;
my $message  = $2;

say "Username: [$username], Message: [$message]";

But are those the correct rules? What if the user name is '1983_Mike'? Will his name be parsed correctly, or will it just be "Mike"? It won't be either of those: the \d+ swallows the leading digits, so the captured name is "_Mike". Maybe the user names cannot contain anything other than letters? Maybe not. What if the numbers before the message have their own meaning? Does 19 mean that it's the 19th message sent by SomeUserName in this session? Is it important? Should it form part of your desired output?
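
If you want to see for yourself what the pattern does with such a name, here is a small sketch. The parse_line helper and the second log line are made up for illustration; I am only guessing that a name like that would show up in your log in this shape.

```perl
#!/usr/bin/perl
use v5.14;
use warnings;

# Hypothetical helper: applies the same pattern to one log line and
# returns (username, message), or an empty list if the line does not match.
sub parse_line {
    my ($line) = @_;
    return $line =~ m/
        \<       # the opening angle bracket
        \d+      # greedy: takes every digit it can
        (\w+)    # word characters: letters, digits, underscore
        \>       # the closing angle bracket
        \d+\s+   # more digits, then whitespace
        (.+)     # the rest of the line is the message
    /x;
}

# The second line is invented, following the format of the first one.
for my $line (
    'T 1310234540 19<24SomeUserName>19 This is user chat.',
    'T 1310234541 19<241983_Mike>19 Hello, everyone.',
) {
    my ($username, $message) = parse_line($line);
    say "Username: [$username], Message: [$message]";
}
```

The second line comes out with the name "_Mike", because the digits of the name are indistinguishable from the digits in front of it. Only you can tell which of them belong to the name.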

Start by asking yourself those questions and figuring out exactly what you want to do. Then you could use the code above as a starting point, and look at the relevant sections of perlintro, as was already suggested.

Good luck, and don't forget that the chat messages can probably contain commas.
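
As for the commas, one common way out is to quote every field and double any quote characters inside it. A minimal sketch, assuming that convention suits whatever will read your file; csv_field and csv_row are names I made up, and the sample message is invented:

```perl
#!/usr/bin/perl
use v5.14;
use warnings;

# Hypothetical helper: quote one field so that commas and double quotes
# inside it cannot break the row.
sub csv_field {
    my ($value) = @_;
    $value =~ s/"/""/g;      # double any embedded double quotes
    return qq{"$value"};     # always wrap the field in double quotes
}

# Hypothetical helper: join already-extracted values into one CSV row.
sub csv_row {
    return join ',', map { csv_field($_) } @_;
}

say csv_row('SomeUserName', 'Well, this message has a comma.');
```

For anything beyond a quick script, the Text::CSV module from CPAN is probably the safer choice, since it handles the corner cases for you.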

regards,
Luke Jefferson


In reply to Re: Parse Chat Log to CSV by blindluke
in thread Parse Chat Log to CSV by Redfish76
