help with data structure to use and how to implement it

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, I have some data I need to parse and I am not too sure of the best data structure to use to store it. I have a file several thousand lines long in the following format (hostname, location, server fault type):

Server1:london:network_interface
Server1:london:diskspace
Server1:london:diskpace
Server1:london:kernel
Server2:paris:diskspace
Server3:new_york:Kernel
Server3:new_york:diskspace
Server3:new_york:diskspace
Server3:new_york:kernel

I am not interested in the middle column (location) and need to create a report that lists server name, type of fault, and a count of each type of fault per server. Report would output like this:

Server1 network_interface  1
Server1 diskspace 2
Server1 kernel 1
Server2 diskspace 1
Server3 kernel 2
Server3 diskspace 2

My problem is this: I was thinking of reading in the file and storing it in a hash with hostname as the key and fault type as the value, but I can't do that because the file contains duplicate hostname entries, so hostname can't be the key. I have the same issue if I reverse this and use fault type as the key, since fault types are duplicated too. I would really appreciate some help with parsing this data and outputting it in the report format shown above. Kind regards


Replies are listed 'Best First'.
Re: help with data structure to use and how to implement it by 2teez (Vicar) on Sep 25, 2014 at 04:29 UTC
Hi, of course you can use a hash, like so:

```perl
use warnings;
use strict;
use Data::Dumper;

my %data;
while (<DATA>) {
    # Split on ':' or whitespace; keep the hostname (0) and fault type (2)
    my ($server_name, $fault) = (split /:|\s+/, $_)[0, 2];
    $data{$server_name}{$fault}++;
}
print Dumper \%data;

__DATA__
Server1:london:network_interface
Server1:london:diskspace
Server1:london:diskpace
Server1:london:kernel
Server2:paris:diskspace
Server3:new_york:Kernel
Server3:new_york:diskspace
Server3:new_york:diskspace
Server3:new_york:kernel
```

Output:

```
$VAR1 = {
          'Server3' => {
                         'kernel' => 1,
                         'diskspace' => 2,
                         'Kernel' => 1
                       },
          'Server1' => {
                         'kernel' => 1,
                         'diskpace' => 1,
                         'diskspace' => 1,
                         'network_interface' => 1
                       },
          'Server2' => {
                         'diskspace' => 1
                       }
        };
```

All that remains is to print it out! Over to you.

Update: Please note that 'kernel' is not the same as 'Kernel', nor is 'diskpace' the same as 'diskspace'.

If you tell me, I'll forget. If you show me, I'll remember. If you involve me, I'll understand. --- Author unknown to me
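For the remaining "print it out" step, here is a minimal sketch of one way to turn the nested hash into the report format from the question. The helper names `count_faults` and `report_lines` are made up for illustration, and folding fault names to lower case with `lc` is an assumption based on the question's expected output (which shows `Server3 kernel 2`); drop the `lc` if case matters in your data.

```perl
use strict;
use warnings;

# Sample records copied from the question (hostname:location:fault).
my @lines = qw(
    Server1:london:network_interface
    Server1:london:diskspace
    Server1:london:diskpace
    Server1:london:kernel
    Server2:paris:diskspace
    Server3:new_york:Kernel
    Server3:new_york:diskspace
    Server3:new_york:diskspace
    Server3:new_york:kernel
);

# count_faults (hypothetical helper): returns a reference to a hash of
# hashes keyed by server name, then by fault type.
sub count_faults {
    my @records = @_;
    my %count;
    for my $record (@records) {
        chomp $record;
        # The middle field (location) is discarded.
        my ($server, undef, $fault) = split /:/, $record;
        next unless defined $fault;          # skip malformed lines
        # Assumption: fault names are case-insensitive, so fold to lower case.
        $count{$server}{ lc $fault }++;
    }
    return \%count;
}

# report_lines (hypothetical helper): one "server fault count" string
# per server/fault pair, sorted by server and then by fault name.
sub report_lines {
    my ($count) = @_;
    my @out;
    for my $server (sort keys %$count) {
        for my $fault (sort keys %{ $count->{$server} }) {
            push @out, "$server $fault $count->{$server}{$fault}";
        }
    }
    return @out;
}

my $count = count_faults(@lines);
print "$_\n" for report_lines($count);
```

With the sample data, `kernel` and `Kernel` are merged into a single count of 2 for Server3, but the misspelt `diskpace` is still counted separately from `diskspace`; fixing that would need a correction at the data source or an explicit mapping of known misspellings.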
Re: help with data structure to use and how to implement it by McA (Priest) on Sep 25, 2014 at 04:31 UTC
Hi, IMHO you're heading in the right direction. Just add another hashref indirection and you're done:

```perl
my %REPORT = (
    'Server1' => {
        'network_interface' => 1,
        'diskspace'         => 2,
    },
);
```

You should get the idea. Best regards McA
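As a rough illustration of working with that two-level structure, the sketch below reuses the `%REPORT` shape from this reply. The extra entries (`Server2`, `Server9`, and the `kernel` fault) are invented purely for the example.

```perl
use strict;
use warnings;

# Same shape as the %REPORT example: server => { fault => count }.
my %REPORT = (
    'Server1' => {
        'network_interface' => 1,
        'diskspace'         => 2,
    },
);

# Incrementing a count autovivifies any missing level, so nothing needs
# to be set up before the first fault for a new server is recorded.
$REPORT{'Server2'}{'diskspace'}++;
$REPORT{'Server1'}{'kernel'}++;

# Look up one count. Checking exists at each level avoids creating an
# empty inner hash for an unknown server as a side effect of the lookup.
my $n = exists $REPORT{'Server9'} && exists $REPORT{'Server9'}{'kernel'}
      ? $REPORT{'Server9'}{'kernel'}
      : 0;
print "Server9 kernel faults: $n\n";

# Total number of faults recorded for one server.
my $total = 0;
$total += $_ for values %{ $REPORT{'Server1'} };
print "Server1 total faults: $total\n";
```

The two-step `exists` check matters mostly when the hash is later iterated for a report: a bare `$REPORT{'Server9'}{'kernel'}` lookup would quietly add an empty `Server9` entry.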
Re^2: help with data structure to use and how to implement it by Anonymous Monk on Sep 25, 2014 at 04:40 UTC
Thanks a lot, guys, that's perfect. That's exactly how to do it. I've been scratching my head for a while trying to work out the best data structure to hold this data in.