
Two questions, really. The first: I recently found myself with an array of hashes from which I needed to create a new array, where each hash had to have fewer elements. To solve this I eventually just did:

my @old_hashes;
my @new_hashes;
my @wanted_keys = qw/foo bar baz/; #etc
@new_hashes = map {
    my $x = $_;
    +{ map { $_ => $x->{$_} } @wanted_keys }
} @old_hashes;

which solved the problem, but nesting maps seemed a tad hackish, and it seemed to me that a much cleaner solution could be devised using hash slices. I couldn't think of one, so I ask the assorted monks here: can anyone think of a better way to do this?
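For what it's worth, here is a minimal sketch of how the hash-slice idea might look. The sample data in `@old_hashes` is made up for illustration; only `@wanted_keys` and the variable names come from the code above. As with the nested maps, a key that is missing from a source hash ends up in the new hash with an undef value.

```perl
use strict;
use warnings;

my @wanted_keys = qw/foo bar baz/;

# Made-up sample data, purely for illustration.
my @old_hashes = (
    { foo => 1, bar => 2, baz => 3, extra => 4 },
    { foo => 5, bar => 6, baz => 7, other => 8 },
);

my @new_hashes = map {
    my %h;
    # Hash slice on both sides: copy only the wanted keys from the
    # current hashref ($_) into the fresh hash %h.
    @h{@wanted_keys} = @{$_}{@wanted_keys};
    \%h;    # each iteration returns a reference to its own new hash
} @old_hashes;
```

This still uses one `map`, but the inner `map` is replaced by a single slice assignment.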

My second question: I have some strange non-fixed-width, non-delimited strings I need to parse, which look like this:

 6     2   78 testing stuff         0 69.68.119.54:28960    34756 25000
 7     4  118 [:EsU:]|BLaZE|        0 24.86.4.164:28960      7248  5000
6     2   78 tessssssstinggggggggggg REAAAAA     40 69.68.119.54:28960    34756 25000

You'll notice that most of the fields are separated by whitespace, except that the middle field can contain embedded whitespace! My solution was to devise a regex that basically looks like this:

my @cols = m/
(\d{1,3})
\ +
(-?\d+)
\ +
(\d{1,4}|CNCT)
\ 
(.+?)(?:\^7)?
\ +
(\d{1,6})
\ 
(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:-?\d{1,5})
\ +
(\d{1,5})
\ +
(\d{3,5})
/x;

Anyone see a better way to parse the above data?

After thinking about the above, it occurred to me that I could split on whitespace, pop off the last four fields, shift off the first three fields, and then join whatever's left. But that would be destructive to the middle field, as I would have no way of knowing exactly how much whitespace the split consumed between its words.
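A non-destructive variant of that idea might be to peel the fixed fields off both ends with substitutions instead of `split`, so the middle field is never touched. This is only a sketch: `parse_line` is a hypothetical helper name, and it assumes every line has exactly three leading and four trailing whitespace-free fields, which is looser than the digit checks in the regex above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the eight fields, or an empty list
# if the line does not have the expected shape.
sub parse_line {
    my ($line) = @_;

    # Remove the first three fields and the single space before the name.
    $line =~ s/^\s*(\S+)\s+(\S+)\s+(\S+)\s//
        or return;
    my @head = ($1, $2, $3);

    # Remove the last four fields, along with the whitespace before them.
    $line =~ s/\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$//
        or return;
    my @tail = ($1, $2, $3, $4);

    # Whatever remains is the middle field, internal whitespace intact.
    # Drop a trailing ^7, mirroring the optional (?:\^7)? in the regex above.
    $line =~ s/\^7\z//;

    return (@head, $line, @tail);
}

my @fields = parse_line(
    ' 6     2   78 testing stuff         0 69.68.119.54:28960    34756 25000'
);
print join(', ', map { qq{"$_"} } @fields), "\n";
```

The trade-off is that this trusts the field count rather than validating each column, so a malformed line with the right number of tokens would still parse.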

Note: the desired output from the above input lines, delimited by quotes and commas, should basically be:

"6", "2", "78", "testing stuff", "0", "69.68.119.54:28960", "34756", "25000"
"7", "4", "118", "[:EsU:]|BLaZE|", "0", "24.86.4.164:28960", "7248", "5000"
"6", "2", "78", "tessssssstinggggggggggg REAAAAA", "40", "69.68.119.54:28960", "34756", "25000"

Note 2: You'll notice that the sixth field is an IP address. I just use a simple regex that matches four sets of 1-3 digits followed by some kind of port, because I already know the IP is valid. I only need to extract it, not validate it.

In reply to Parsing bizarre non delimted data and hash slices by BUU
