in reply to Re: What is the best way to handle this data?
in thread What is the best way to handle this data?

Thanks for the reply. I am trying to think of a way to describe the data. Let's say, for instance, that we are talking about used car dealerships. This would be the data:

lot1,new,honda,civic,cincinnati,oh
lot1,used,chevy,impala,cincinnati,oh
lot1,new,honda,civic,cincinnati,oh
lot1,used,chevy,impala,cincinnati,oh
lot1,new,cadillac,escalade,cincinnati,oh
lot2,new,buick,sentry,houston,tx
lot2,used,ford,ranger,houston,tx
lot2,new,buick,sentry,houston,tx
lot2,used,ford,ranger,houston,tx
lot2,used,ford,ranger,houston,tx
lot3,new,ford,ranger,lexington,ky
lot3,used,cadillac,escalade,lexington,ky
lot3,used,cadillac,escalade,lexington,ky
lot4,new,ford,f150,chicago,illinois
lot4,new,ford,f150,chicago,illinois
lot4,new,ford,f150,chicago,illinois

As you can see, there are different lots. Each lot has different makes and models, but each lot has a unique city/state. I don't care whether the cars are new or used, and I don't care about the model. I need to know how many cars of each make each lot has, and the city/state of that lot. So the output might be:

lot1,cincinnati,oh has 2 hondas, 2 chevys, 1 cadillac
lot2,houston,tx has 2 buicks, 3 fords
lot3,lexington,ky has 1 ford, 2 cadillacs
etc...

That is the best I can explain it. I hope that helps. Thanks!

Re^3: What is the best way to handle this data?
by toolic (Bishop) on Apr 28, 2009 at 17:09 UTC
    Your input is simple enough that you could parse it using split, but for anything more complex, you should heed kennethk's advice. Here is the parsing piece; I'll leave the printout as an exercise for you:
    use strict;
    use warnings;
    use Data::Dumper;

    my %data;
    while (<DATA>) {
        chomp;
        my ($lot, undef, $make, undef, $city, $state) = split /,/;
        $data{$lot}{$make}++;
        $data{$lot}{location} = "$city,$state";
    }
    print Dumper(\%data);

    __DATA__
    lot1,new,honda,civic,cincinnati,oh
    lot1,used,chevy,impala,cincinnati,oh
    lot1,new,honda,civic,cincinnati,oh
    lot1,used,chevy,impala,cincinnati,oh
    lot1,new,cadillac,escalade,cincinnati,oh
    lot2,new,buick,sentry,houston,tx
    lot2,used,ford,ranger,houston,tx
    lot2,new,buick,sentry,houston,tx
    lot2,used,ford,ranger,houston,tx
    lot2,used,ford,ranger,houston,tx
    lot3,new,ford,ranger,lexington,ky
    lot3,used,cadillac,escalade,lexington,ky
    lot3,used,cadillac,escalade,lexington,ky
    lot4,new,ford,f150,chicago,illinois
    lot4,new,ford,f150,chicago,illinois
    lot4,new,ford,f150,chicago,illinois

    which prints out:

    $VAR1 = {
              'lot3' => {
                          'location' => 'lexington,ky',
                          'ford' => 1,
                          'cadillac' => 2
                        },
              'lot1' => {
                          'location' => 'cincinnati,oh',
                          'cadillac' => 1,
                          'chevy' => 2,
                          'honda' => 2
                        },
              'lot2' => {
                          'location' => 'houston,tx',
                          'ford' => 3,
                          'buick' => 2
                        },
              'lot4' => {
                          'location' => 'chicago,illinois',
                          'ford' => 3
                        }
            };
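
For anyone who gets stuck on the printout, here is a minimal sketch of one possible approach. It assumes a hash shaped like the Dumper output above (a `location` key plus one count per make). The helper name `report_lines` is hypothetical, not something from the original post, and the sample hash is typed in by hand from the output shown.

```perl
use strict;
use warnings;

# Build one report line per lot from a hash of the form
# $data{$lot}{location} plus $data{$lot}{$make} = count.
# report_lines is a hypothetical helper name used for illustration.
sub report_lines {
    my ($data) = @_;
    my @lines;
    for my $lot (sort keys %$data) {
        my %entry    = %{ $data->{$lot} };        # shallow copy of this lot
        my $location = delete $entry{location};   # removed from the copy only
        my @counts   = map { "$entry{$_} $_" } sort keys %entry;
        push @lines, "$lot,$location has " . join(', ', @counts);
    }
    return @lines;
}

# Sample data copied from the Dumper output for two of the lots.
my %data = (
    lot1 => { location => 'cincinnati,oh', honda => 2, chevy => 2, cadillac => 1 },
    lot3 => { location => 'lexington,ky',  ford  => 1, cadillac => 2 },
);

print "$_\n" for report_lines(\%data);
# prints:
# lot1,cincinnati,oh has 1 cadillac, 2 chevy, 2 honda
# lot3,lexington,ky has 2 cadillac, 1 ford
```

Lots and makes are sorted alphabetically because hash order is not stable between runs. The sketch does not pluralize the make names ("2 hondas"); that is left out to keep it short.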
      Awesome, that works great. One more quick question: what would be the best way to remove any leading/trailing whitespace from the elements, as some of them have it? I know how to remove leading/trailing whitespace, but is there a way to do it on the current hash?
        You should remove whitespace before loading data into the hash, not after it is in the hash, especially if the hash keys have whitespace.
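
A minimal sketch of trimming before loading, assuming the same comma-separated layout as the sample data. The helper name `parse_line` and the two sample lines with stray spaces are made up for illustration. Without the trim, `lot1` and ` lot1` would end up as two separate hash keys.

```perl
use strict;
use warnings;

# Split one line on commas and strip leading/trailing whitespace
# from every field. parse_line is a hypothetical helper name.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line;
    s/^\s+|\s+$//g for @fields;   # $_ aliases each element, so this trims in place
    return @fields;
}

my %data;
# Made-up sample lines with whitespace around some fields.
my @lines = (
    "lot1 , new, honda ,civic, cincinnati ,oh \n",
    " lot1,used,honda,accord,cincinnati,oh\n",
);
for my $line (@lines) {
    my ($lot, undef, $make, undef, $city, $state) = parse_line($line);
    $data{$lot}{$make}++;
    $data{$lot}{location} = "$city,$state";
}

print "$data{lot1}{honda} $data{lot1}{location}\n";
# prints: 2 cincinnati,oh
```

Doing the cleanup at parse time means the hash never contains a whitespace-polluted key, so nothing has to be renamed or merged afterwards.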