manipulating array

gavintokyo has asked for the wisdom of the Perl Monks concerning the following question:

Re: manipulating array by GrandFather (Saint) on Aug 23, 2007 at 01:45 UTC
Probably you don't want grep, but instead map and a regular expression substitution (see perlretut and perlre):

```perl
use strict;
use warnings;

my @call = (
    'A19284 hostname 07/09/07 moredata moredata',
    'A19384 hostname 06/09/07 moredata moredata',
    'A19234 hostname 07/08/07 moredata moredata',
);

# Read bottom to top: capture [date, line without first column],
# turn the dd/mm/yy date into a yymmdd sort key, sort, then unwrap.
my @callClean =
    map  {$_->[1]}
    sort {$a->[0] cmp $b->[0]}
    map  {[(join '', @{[split '/', $_->[0]]}[2, 1, 0]), $_->[1]]}
    map  {m|^\w+\s+(.*?(\d+/\d+/\d+).*)|; [$2, $1]} @call;

print "$_\n" for @call;
print "$_\n" for @callClean;
```

Prints:

```
A19284 hostname 07/09/07 moredata moredata
A19384 hostname 06/09/07 moredata moredata
A19234 hostname 07/08/07 moredata moredata
hostname 07/08/07 moredata moredata
hostname 06/09/07 moredata moredata
hostname 07/09/07 moredata moredata
```

Update: sort by the dates too.

DWIM is Perl's answer to Gödel
Re: manipulating array by ysth (Canon) on Aug 23, 2007 at 04:07 UTC
It sounds like you have one line in each array element. If so, your grep is filtering out all lines that have /A192*4/ in them. But from your results, I'm guessing you actually say /A192.*4/.

/A192*4/ means zero or more 2's between an A19 and a 4, and none of your examples match that. /A192.*4/ means match zero or more of any character (except a newline) between an A192 and a 4.

Instead, loop through your array, removing the column you don't want:

```perl
for my $line (@call) {
    # remove A192.*4 and following whitespace from the beginning of each line
    $line =~ s/^A192.*4\s+//;
}
```
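To make the difference between the two patterns concrete, here is a small sketch that counts how many lines each one matches. The sample lines are borrowed from the other replies in this thread, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample lines borrowed from the other replies in this thread.
my @call = (
    'A19284 hostname 07/09/07 moredata moredata',
    'A19384 hostname 06/09/07 moredata moredata',
    'A19234 hostname 07/08/07 moredata moredata',
);

# "A19", then zero or more "2"s, then "4".
my @star_on_two = grep { /A192*4/ } @call;

# "A192", then any run of characters, then "4".
my @dot_star = grep { /A192.*4/ } @call;

printf "%-8s matched %d line(s)\n", 'A192*4',  scalar @star_on_two;
printf "%-8s matched %d line(s)\n", 'A192.*4', scalar @dot_star;
```

Note that the second pattern still skips the A19384 line, because that line has no literal "A192" in it.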
Re^2: manipulating array by pysome (Scribe) on Aug 23, 2007 at 06:57 UTC
Try this:

```perl
use strict;
use warnings;

my @call = (
    'A19284 hostname 07/09/07 moredata moredata',
    'A19384 hostname 06/09/07 moredata moredata',
    'A19234 hostname 07/08/07 moredata moredata',
);

print "$_,$/" for map {s/^\w+\s//;$_} @call;
```
Re: manipulating array by jbert (Priest) on Aug 23, 2007 at 08:31 UTC
Given that you're going to want to look at the date anyway, you might as well pick your data apart into columns. If you're sure that you'll have no whitespace in your 'moredata', you could just do a split without a limit, but to play it safe I'll just split the first few columns (untested):

```perl
use POSIX ();   # needed for POSIX::mktime

# Each elt of splitLines will hold an array ref
my @splitLines;
foreach my $line (@calls) {
    my @bits = split(/\s+/, $line, 4);

    # Discard the first column
    shift @bits;

    # We want to sort by date, so we'll parse
    # the date column and prefix with a sortable value
    my ($mday, $month, $year) = split(m!/!, $bits[1]);
    $month -= 1;    # mktime wants month from 0
    $year  += 100;  # mktime wants year from 1900
    my $when = POSIX::mktime(0, 0, 0, $mday, $month, $year);
    unshift @bits, $when;

    push @splitLines, [ @bits ];
}

# Sort by first elt
@splitLines = sort { $a->[0] <=> $b->[0] } @splitLines;

# And put the lines back together again (discarding
# the leading 'when').
@calls = map { shift @$_; join(' ', @$_) } @splitLines;
```

This seems over-long, but unless you write a more complex sort comparator (and hide the date parsing and processing etc. in there) I'm not sure it can get much shorter. I don't like debugging complex sort comparators, so there you go.
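For comparison, the "hide the date parsing in the comparator" variant mentioned above might look roughly like this. It is only a sketch: `date_key` is a hypothetical helper name, the sample lines come from other replies in this thread, and it assumes the date is the third whitespace-separated column in zero-padded dd/mm/yy form. It leaves the first column in place.

```perl
use strict;
use warnings;

# Sample lines borrowed from the other replies in this thread.
my @calls = (
    'A19284 hostname 07/09/07 moredata moredata',
    'A19384 hostname 06/09/07 moredata moredata',
    'A19234 hostname 07/08/07 moredata moredata',
);

# Hypothetical helper: turn the dd/mm/yy field (third column) into a
# "yymmdd" string that sorts correctly as plain text.
sub date_key {
    my ($line) = @_;
    my ($mday, $month, $year) = split m{/}, (split ' ', $line)[2];
    return "$year$month$mday";
}

my @sorted = sort { date_key($a) cmp date_key($b) } @calls;
print "$_\n" for @sorted;
```

The catch is that `date_key` runs twice for every comparison, so each line gets re-parsed many times during the sort.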
Re: manipulating array by ikegami (Patriarch) on Aug 23, 2007 at 20:48 UTC
The following speeds up the `sort` the others presented, and is probably faster overall for anything but trivial data:

```perl
my @sorted =
   map substr($_, 6),
   sort
   map {
      local $_ = $_;
      s/^\S+\s*//;
      my $ymd = join '', reverse split '/', (split / /)[1];
      "$ymd$_"
   } @data;
```
Re: manipulating array (TMTOWTDI!) by Codon (Friar) on Aug 23, 2007 at 19:59 UTC
If you know that every element begins with the A19\d+ (or you only care about these lines) you can filter / clean the records with a grep. You can then pipe those (matched/cleaned) items to sort (via a Schwartzian Transform) using a quick (simple) `by_date` subroutine.

I know I could have saved some characters on the sort line, but this was more aesthetically pleasing to me.

Ivan Heffner
Sr. Software Engineer
WhitePages.com, Inc.
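The code from this post is not reproduced here, so what follows is not the poster's version. It is a rough sketch of the approach as described (grep to filter and clean, then a Schwartzian Transform with a `by_date` comparator). Only the name `by_date` comes from the post. The body, the other variable names, and the extra non-matching sample line are assumptions.

```perl
use strict;
use warnings;

# Sample lines from this thread, plus one invented non-matching line.
my @data = (
    'A19284 hostname 07/09/07 moredata moredata',
    'A19384 hostname 06/09/07 moredata moredata',
    'B00001 otherhost 01/01/07 moredata moredata',
    'A19234 hostname 07/08/07 moredata moredata',
);

# Comparator for [sort_key, line] pairs. The body is a guess.
sub by_date { $a->[0] cmp $b->[0] }

# Filter and clean in one grep: s/// is true only where the prefix was
# found and removed. Work on a copy so @data itself is left alone.
my @clean = @data;
@clean = grep { s/^A19\d+\s+// } @clean;

# Schwartzian Transform, step 1: decorate each line with a yymmdd key
# built from the dd/mm/yy field (now the second column).
my @decorated =
    map { [ join('', reverse split m{/}, (split ' ', $_)[1]), $_ ] } @clean;

# Steps 2 and 3: sort on the key, then strip the decoration again.
my @sorted = map { $_->[1] } sort by_date @decorated;

print "$_\n" for @sorted;
```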
Re^2: manipulating array (TMTOWTDI!) by ikegami (Patriarch) on Aug 23, 2007 at 20:44 UTC
I don't like how it not only clobbers `@data`, but relies on it. The `grep` could be replaced with: `map { local $_ = $_; s/^A\d+ // ? $_ : () }`
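A quick way to see the clobbering is to run both forms side by side. This is a sketch only: the array names and the non-matching sample line are invented, and the `grep` form here is a guess at the kind of code being criticised.

```perl
use strict;
use warnings;

my @data = (
    'A19284 hostname 07/09/07 moredata moredata',
    'B00001 otherhost 01/01/07 moredata moredata',   # invented non-matching line
);

# grep aliases $_ to each element, so s/// edits the source array in place.
my @clobbered = @data;
my @via_grep  = grep { s/^A\d+ // } @clobbered;

# The map form edits a localized copy of $_ and returns it only when the
# substitution succeeded, so the source array stays as it was.
my @via_map = map { local $_ = $_; s/^A\d+ // ? $_ : () } @data;

print "grep source now: $clobbered[0]\n";
print "map source now:  $data[0]\n";
print "map result:      $_\n" for @via_map;
```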
The big point, too... by chaggalag (Initiate) on Aug 24, 2007 at 07:59 UTC
I like that solution because it points out the most significant lesson: that "*" has a specific meaning the original poster did not want. The star means "zero or more repeats of the previous item" in a regex context. The specific string one wants to match is "the letter A followed by the numbers 1 then 9 followed by two digits of the 0 through 9 series followed by a 4...". It's very important to take the time to think about a regex in that plodding way or you get matches you won't want. For example, the rest of the line could match. I hope I'm not being redundant with my bandwidth. Y'all have a good day.
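Spelled out in that plodding way, the pattern could be written with the `/x` flag so each piece carries its own comment. This is a sketch: the name `$call_id` is made up, and the trailing `\b` is an added assumption that the identifier ends right after the 4.

```perl
use strict;
use warnings;

my $call_id = qr/
    ^A19    # the letter A, then the digits 1 and 9, at the start of the line
    \d\d    # any two digits
    4       # a literal 4
    \b      # assumption: the identifier ends here
/x;

for my $line ('A19284 hostname 07/09/07 moredata moredata',
              'A1924 hostname 07/09/07 moredata moredata') {
    printf "%-45s %s\n", $line, ($line =~ $call_id ? 'matches' : 'no match');
}
```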
Re: manipulating array by lyklev (Pilgrim) on Aug 24, 2007 at 22:38 UTC
The file glob '*' that works for files works differently for regular expressions. 'A192*4' means "A19, then 2 repeated zero or more times, then a 4", so nothing matches.