Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

If I have a file that looks like this:
12345: 12:34:00,890,987,9876,...
12346: 12:35:02,789,987,234,...
12344: 12:35:04,345,987,1234,...
etc.
And let's say that the 12:34:00 in line 1, 12:35:02 in line 2, and 12:35:04 in line 3 represent time. I want to pull this column out of each line and compare it, keeping hours, minutes, and seconds separate. My first thought is to split each line on commas and then, for each line that matches a condition ('987', for example), take the time and manipulate it. This is what I have so far:
LINE: while (<THATFILE>) {
    next LINE if (!/^\d+: \d{2}:/ );
    foreach $field(split /,/, $line) {
        chomp $field;
        if ($field == '987') {
            foreach $subtime (split /:/, $field) {
How do I keep each field separate so that I can determine hours, minutes, and seconds?

Replies are listed 'Best First'.
Re: split into hash
by buckaduck (Chaplain) on Mar 20, 2001 at 04:02 UTC
    First, you use $line without defining it. That's just a typo, I'm assuming.

    Next, if you want to test only the 4th column for the '987' condition, you'll want to change your 'if' test to compare only the correct field.

    After that, you could easily use a regex to grab the desired time components.

    I'm also going to remove the unnecessary label and invert the logic of the 'next' conditional test, for readability only. I'm also going to remove the quotes from around '987' because I hate unnecessary quotes around numbers.

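    while (<THATFILE>) {
        next unless (/^\d+: \d{2}:/ );
        my @fields = split /,/;
        # $fields[0] is "12345: 12:34:00", so the 4th column is $fields[2];
        # note == (comparison), not = (assignment)
        if ($fields[2] == 987) {
            my ($hour,$min,$sec) = /^\d+: (\d\d):(\d\d):(\d\d)/;
            ...
        }
    }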
    buckaduck
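    For reference, a self-contained sketch of the split-then-regex approach above. The sub name time_if_987 and the sample records are made up for illustration, and the data is assumed to look exactly like the lines in the question, so treat it as a starting point rather than a finished solution.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns (hour, min, sec) when the
# line has the expected layout and its third comma-separated field is 987;
# returns an empty list otherwise.
sub time_if_987 {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /^\d+: (\d\d):(\d\d):(\d\d),/;
    my ($hour, $min, $sec) = ($1, $2, $3);
    my @fields = split /,/, $line;
    # $fields[0] is "12345: 12:34:00", so 987 sits at index 2.
    return unless defined $fields[2] && $fields[2] =~ /^\d+$/ && $fields[2] == 987;
    return ($hour, $min, $sec);
}

# Sample records modelled on the question; the trailing fields are invented.
my @sample = (
    "12345: 12:34:00,890,987,9876\n",
    "12346: 12:35:02,789,123,234\n",
);
for my $line (@sample) {
    # A list assignment in scalar context yields the number of values
    # returned, so an empty list skips to the next record.
    my ($hour, $min, $sec) = time_if_987($line) or next;
    print "hour=$hour min=$min sec=$sec\n";
}
```

    Returning an empty list on a mismatch lets the caller combine the assignment and the skip in one statement.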
Re: split into hash
by busunsl (Vicar) on Mar 20, 2001 at 04:04 UTC
    Well, first, if $field == '987' then there is nothing to split by ':', right?

    If you can be sure about your data, why not do the following:

    while (<THATFILE>) {
        # the ':' after the first number has to be matched as well
        if (/\d+: (\d+):(\d+):(\d+)((,\d+)+)/) {
            my ($hour, $min, $sec, $line) = ($1, $2, $3, $4);
            if ($line =~ /,987/) {
                # do something with the time values
            }
        }
    }
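    One thing to watch with a substring test such as /,987/ is that it also succeeds on a field like 9876. A possible way around that, sketched here under the assumption that every field is purely numeric, is to require a comma or the end of the captured string right after the value. The sub name time_if_field is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (hour, min, sec) only when one complete
# comma-separated field after the time equals $wanted.
sub time_if_field {
    my ($line, $wanted) = @_;
    return unless $line =~ /^\d+: (\d+):(\d+):(\d+)((?:,\d+)+)/;
    my ($hour, $min, $sec, $rest) = ($1, $2, $3, $4);
    # The lookahead insists on a comma or end of string after the value,
    # so ",9876" does not count as a hit for 987.
    return unless $rest =~ /,\Q$wanted\E(?=,|$)/;
    return ($hour, $min, $sec);
}

for my $line ("12345: 12:34:00,890,987,9876", "12344: 12:35:04,345,9876,1234") {
    my @hms = time_if_field($line, 987);
    print @hms ? "@hms\n" : "no 987 field\n";
}
```

    The first record has a real 987 field and the second only has 9876, so only the first one yields a time.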
Re: split into hash
by fpi (Monk) on Mar 20, 2001 at 04:04 UTC
    This might be helpful. The basic idea is:
    my @matches = ($line =~ /$regexp/);
    where whatever is matched within parentheses in the regexp is returned as the elements of @matches.

    So, if the '987' that you are matching always occurs in the same place, then within your loop you could do something like:
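    # qr// keeps \d intact; inside double quotes it would turn into a plain "d"
    my $regexp = qr/^\d+: (\d+):(\d+):(\d+),\d+,(\d+)/;
    # match the whole line ($_), not a single comma-separated piece
    my ($hr,$min,$sec,$match) = ($_ =~ /$regexp/);
    if (defined $match && $match == '987') {
        ... #do whatever you have to do with $hr,$min,$sec
    }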

    I'm sure there are better ways to do the regexp, but I do it like that because that's the way I understand it.

    My point is that I prefer to split it up all at once by fishing out your values with one regexp, as opposed to splitting and splitting.

    Hope this helps....
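    Since the thread title mentions a hash, here is a small sketch of the one-regexp idea that drops the captures straight into one. It assumes the record layout from the question; the sub name parse_record and the key names (id, hour, min, sec, first, second) are invented for the example.

```perl
use strict;
use warnings;

# qr// compiles the pattern once and keeps \d as a regex escape.
my $record_re = qr/^(\d+): (\d+):(\d+):(\d+),(\d+),(\d+)/;

# Hypothetical helper: returns a hash reference describing the record,
# or undef when the line does not match the assumed layout.
sub parse_record {
    my ($line) = @_;
    my %rec;
    # In list context the match returns one value per pair of parentheses;
    # a failed match returns an empty list, which triggers the return.
    @rec{qw(id hour min sec first second)} = $line =~ $record_re
        or return;
    return \%rec;
}

my $rec = parse_record("12345: 12:34:00,890,987,9876");
if ($rec && $rec->{second} == 987) {
    printf "%02d:%02d:%02d\n", @{$rec}{qw(hour min sec)};
}
```

    With the pieces stored under names, later comparisons can refer to $rec->{hour} and friends instead of counting list positions.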
Re: split into hash
by greenFox (Vicar) on Mar 20, 2001 at 08:06 UTC
    I would be tempted to munge the data to make the field separator consistent and then parse it much as others have suggested:
    while (<THATFILE>){
        s/^(\d+): /$1,/;    # convert first field separator from ': ' to ','
        my @fields = split /,/;
        next unless $fields[3] == 987;
        my ($hr, $min, $sec) = split /:/, $fields[1];
        # do something with time here
    }

    I think that makes the code clearer, and I doubt anyone would look at the source data and miscount the fields. The comment would be mandatory, however.

    my $chainsaw = 'Perl';
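    A runnable sketch of the normalize-then-split idea, for anyone who wants to try it without a data file. The sub name split_record and the flat return layout are assumptions made for the example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (id, hour, min, sec, remaining fields...)
# or an empty list when the line has no time column.
sub split_record {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^(\d+): /$1,/;    # "12345: 12:34:00,..." becomes "12345,12:34:00,..."
    my ($id, $time, @rest) = split /,/, $line;
    return unless defined $time;
    my ($hr, $min, $sec) = split /:/, $time;
    return ($id, $hr, $min, $sec, @rest);
}

my ($id, $hr, $min, $sec, @rest) = split_record("12344: 12:35:04,345,987,1234\n");
# After the id and time are peeled off, 987 is the second remaining field.
print "$id at $hr:$min:$sec\n" if @rest >= 2 && $rest[1] == 987;
```

    Once the first separator is a comma like the rest, a single split gives every column and the time can be split again on ':'.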