Re^2: What is the best way to handle this data?

Thanks for the reply. I am trying to think of a way to describe the data. Let's say, for instance, we are talking about car dealerships. This would be the data:

lot1,new,honda,civic,cincinnati,oh
lot1,used,chevy,impala,cincinnati,oh
lot1,new,honda,civic,cincinnati,oh
lot1,used,chevy,impala,cincinnati,oh
lot1,new,cadillac,escalade,cincinnati,oh
lot2,new,buick,sentry,houston,tx
lot2,used,ford,ranger,houston,tx
lot2,new,buick,sentry,houston,tx
lot2,used,ford,ranger,houston,tx
lot2,used,ford,ranger,houston,tx
lot3,new,ford,ranger,lexington,ky
lot3,used,cadillac,escalade,lexington,ky
lot3,used,cadillac,escalade,lexington,ky
lot4,new,ford,f150,chicago,illinois
lot4,new,ford,f150,chicago,illinois
lot4,new,ford,f150,chicago,illinois

As you can see, there are different lots. Each lot has different makes and models, but each lot has a unique city/state. I don't care whether the cars are new or used, and I don't care about the model. I need to know how many cars of each make each lot has, and the city/state of that lot. So the output might be:

lot1,cincinnati,oh has 2 hondas, 2 chevys, 1 cadillac
lot2,houston,tx has 2 buicks, 3 fords
lot3,lexington,ky has 1 ford, 2 cadillacs
etc...

That is the best I can explain it. I hope that helps. Thanks!

Re^3: What is the best way to handle this data? by toolic (Bishop) on Apr 28, 2009 at 17:09 UTC
Your input is simple enough that you could parse it using split, but for anything more complex, you should heed kennethk's advice. Here is the parsing piece; I'll leave the printout as an exercise for you:

    use strict;
    use warnings;
    use Data::Dumper;

    my %data;
    while (<DATA>) {
        chomp;
        # Fields: lot, new/used, make, model, city, state.
        # The new/used and model fields are discarded via undef.
        my ($lot, undef, $make, undef, $city, $state) = split /,/;
        $data{$lot}{$make}++;                      # count cars per make
        $data{$lot}{location} = "$city,$state";    # one location per lot
    }
    print Dumper(\%data);

    __DATA__
    lot1,new,honda,civic,cincinnati,oh
    lot1,used,chevy,impala,cincinnati,oh
    lot1,new,honda,civic,cincinnati,oh
    lot1,used,chevy,impala,cincinnati,oh
    lot1,new,cadillac,escalade,cincinnati,oh
    lot2,new,buick,sentry,houston,tx
    lot2,used,ford,ranger,houston,tx
    lot2,new,buick,sentry,houston,tx
    lot2,used,ford,ranger,houston,tx
    lot2,used,ford,ranger,houston,tx
    lot3,new,ford,ranger,lexington,ky
    lot3,used,cadillac,escalade,lexington,ky
    lot3,used,cadillac,escalade,lexington,ky
    lot4,new,ford,f150,chicago,illinois
    lot4,new,ford,f150,chicago,illinois
    lot4,new,ford,f150,chicago,illinois

which prints out:

    $VAR1 = {
      'lot3' => {
        'location' => 'lexington,ky',
        'ford' => 1,
        'cadillac' => 2
      },
      'lot1' => {
        'location' => 'cincinnati,oh',
        'cadillac' => 1,
        'chevy' => 2,
        'honda' => 2
      },
      'lot2' => {
        'location' => 'houston,tx',
        'ford' => 3,
        'buick' => 2
      },
      'lot4' => {
        'location' => 'chicago,illinois',
        'ford' => 3
      }
    };
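
For readers who want a starting point on the printout, here is one possible sketch, not the only way to do it. It assumes a %data hash of the same shape as the one built above, with two lots hard-coded so the snippet runs on its own. The report_line helper is a hypothetical name, and the pluralization (append "s" when the count is not 1) is a naive assumption.

```perl
use strict;
use warnings;

# Same shape as the %data hash built by the parsing loop
# (two lots hard-coded here so the sketch is self-contained).
my %data = (
    lot1 => { location => 'cincinnati,oh', honda => 2, chevy => 2, cadillac => 1 },
    lot2 => { location => 'houston,tx',    buick => 2, ford  => 3 },
);

# Hypothetical helper: build one report line for a lot.
# Every key except 'location' is treated as a make.
sub report_line {
    my ($lot, $info) = @_;
    my @counts;
    for my $make (sort grep { $_ ne 'location' } keys %$info) {
        my $n = $info->{$make};
        # Naive plural: add "s" unless the count is exactly 1.
        push @counts, "$n $make" . ($n == 1 ? '' : 's');
    }
    return "$lot,$info->{location} has " . join(', ', @counts);
}

print report_line($_, $data{$_}), "\n" for sort keys %data;
```

Because hashes are unordered, the makes are sorted alphabetically here rather than in the order they first appear in the input. Keeping the location under the same hash level as the makes also means a make literally named "location" would collide with it; a separate sub-hash for the counts would avoid that.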
Re^4: What is the best way to handle this data? by terminaljunkie (Initiate) on Apr 28, 2009 at 19:16 UTC
Awesome, that works great. One more quick question: what would be the best way to remove leading/trailing whitespace from the elements, as some of them have it? I know how to remove leading/trailing whitespace in general, but is there a way to do it on the hash as it currently stands?
Re^5: What is the best way to handle this data? by toolic (Bishop) on Apr 28, 2009 at 20:08 UTC
You should remove whitespace before loading data into the hash, not after it is in the hash, especially if the hash keys have whitespace.
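
A minimal sketch of trimming before the data goes into the hash, assuming the same six-field, comma-separated layout as the sample data. The trim and parse_line helpers are hypothetical names introduced for illustration, and the first input line has whitespace added on purpose to show the effect.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing whitespace from one string.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Split one comma-separated record and trim every field,
# so no stray whitespace ever reaches a hash key.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    return map { trim($_) } split /,/, $line;
}

my %data;
for my $line (" lot1 , new, honda ,civic, cincinnati ,oh \n",
              "lot1,new,honda,civic,cincinnati,oh\n") {
    my ($lot, undef, $make, undef, $city, $state) = parse_line($line);
    $data{$lot}{$make}++;
    $data{$lot}{location} = "$city,$state";
}

# Both records land under the same keys because the fields were trimmed first.
print "$_: $data{$_}{honda} hondas at $data{$_}{location}\n" for sort keys %data;
```

Without the trim, " lot1 " and "lot1" would be two different hash keys, which is why cleaning up afterwards is awkward: you would have to merge entries, not just rename them. Splitting on /\s*,\s*/ is a shorter alternative for whitespace around the commas, but it leaves whitespace at the very start and end of the line in place.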