evangraj has asked for the wisdom of the Perl Monks concerning the following question:

I need to develop a piece of code, a function, or an object that builds a hash of arrays from a subset of the arrays in a larger hash of arrays, based on a list. My current application reads in a file and builds a few arrays from the elements of the file. For example, each line of the file lists a country, a client, and an identifier. As I read in the file I populate an array for each of these categories with unique values. For the country category I need to know all of the countries that events happened in. Here is the sub:
# ----Build_Country_List -----------------
sub Build_Country_List ($$) {
    my $j = shift;
    my $country = shift;
    print "build country list\n";
    $count = 0;
    if ($j == 0) {
        print "initial loop\n";
        $CountryList[$j] = $country;
        foreach $tradingcountry (@CountryList) {
            print "array $j $tradingcountry\n";
        }
        $LengthofCountryList = 1;
    } else {
        foreach $tradingcountry (@CountryList) {
            if ($tradingcountry eq $country) {
                $condition = "FALSE";
                $count++;
            } else {
                $condition = "TRUE";
            }
        }
    }
    if (($condition eq "TRUE") && ($count == 0)) {
        $LengthofCountryList = scalar(@CountryList);
        $CountryList[$LengthofCountryList] = $country;
    }
    foreach $tradingcountry (@CountryList) {
        print "country elements $tradingcountry\n";
    }
    $LengthofCountryList = scalar(@CountryList);
} # end sub Build_Country_List
I do the same thing for client and identifier. What I would like to do is build a function which takes as input a hash of arrays and uses this country array to extract array elements out of the larger array based on the elements of the country array. Basically, I want to use an array as a filter to build a reduced hash of arrays from a bigger one. Another bit of code builds a hash of arrays (my first!) and I would like to use the structure produced to filter a larger array with the same idea as above. Any suggestions, or condensations, would be appreciated.
sub Build_totals_shares_for_sedol_Sell_side($$$) {
    #$j = shift;
    my $sedol = shift;
    my $shares = shift;
    $condition = "";
    my $mySellfirstflag = shift;
    if ($mySellfirstflag eq "YES") { my $j = 0; } else { my $j = 1; }
    print "$sedol $shares\n";
    $count = 0;
    if ($j == 0) {
        %Sellhash = ($sedol => $shares);
        #print "$Sellhash{key} \n";
        foreach $sedolelement (keys %Sellhash) {
            print "Sell: The $sedolelement has traded\n";
            foreach (@{$Sellhash{$sedolelement}}) {
                print "$_\n";
            }
        }
        #$NumTradingSedols = 1;
    } else {
        #print "this is j $j\n";
        foreach $sedolelement (keys %Sellhash) {
            #print "the key is: $sedolelement te sedol is $sedol\n";
            if ($sedolelement eq $sedol) {
                $condition = "TRUE";
                print "the sedol is in the list so add the shares.\n";
                $usekey = $sedolelement;
                $oldshares = $Sellhash{$sedolelement}[1];
            } else {
                $condition = "FALSE";
                $count++;
            }
        }
        if (($condition eq "FLASE") && ($count != 0)) {
            $Sellhash{"$sedol"} .= $shares;
            print "the new entry is $sedol $shares\n";
        } else {
            $Sellhash{"$sedol"} = $Sellhash{"$sedol"} + $shares;
            $newtotal = $Sellhash{"$sedol"};
            print "the new total shares for sedol: $sedol is $newtotal\n";
        }
        foreach $sedolelement (keys %Sellhash) {
            $total = $Sellhash{"$sedolelement"};
        }
    }
}
Has anyone done this? Where should I start? I apologize in advance for being a novice! Evan

Replies are listed 'Best First'.
Re: Building a Hash of arrays from a hash of arrays based on a list?
by goldclaw (Scribe) on Jan 26, 2002 at 03:10 UTC
    OK, I haven't really tried condensing someone else's code before, but there has to be a first time, right? As for the question you asked, I will need some more info on the general structure of your hash of arrays and filter array to give specific advice, but you might want to take a look at map and grep.

    OK, let's have a look at your subs, starting with Build_Country_List. First, do yourself a favor and use strict and warnings. You have one variable, $condition, whose purpose I can't really see. Anyway, the sub seems to take two arguments: a boolean which is supposed to say whether this is the first time the function is called, and a string that is to be inserted into the list if the country is not already there.

    The list seems to be some sort of global variable, so I'll keep it like that. That is usually not a good idea, but for a throwaway script....

    There are also lots of prints there, which I guess are debug statements. If so, I always do something like this at the start of the module:

    sub debug{ print @_; }

    That way, it is easy to see which statements are debug statements, and redirecting them all to a file, or removing them all, can easily be done from one place.
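
    As a minimal sketch of that idea (the $DEBUG flag and the choice of STDERR are my own assumptions, not something taken from the original script), a single switch can turn all the debug output on or off:

```perl
use strict;
use warnings;

# Hypothetical switch: set to 0 to silence every debug message at once.
our $DEBUG = 1;

sub debug {
    return unless $DEBUG;
    # STDERR keeps debug output separate from the real report on STDOUT.
    print STDERR @_;
    return;
}

debug("build country list\n");
```

    Every print in the sub that is only there for debugging would then become a call to debug(...), and sending the messages to a file instead only means changing the one print inside debug.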

    There are actually 2 different ways I would have implemented your first sub, depending on requirements(speed vs memory usage):

    This one uses some extra memory, but should be faster:

    my %__privateIndexHash;   # key = country read, value is always 1
    my @CountryList;

    sub Build_Country_List($) {
        # I do not really care about the boolean, I'll figure this out for myself.
        my $country = shift;
        if (!exists $__privateIndexHash{$country}) {
            # Haven't seen this country before, insert into array and hash index
            $__privateIndexHash{$country} = 1;
            push @CountryList, $country;
        }
    }

    The second method does not use the hash index, but instead looks through the list each time.

    my @CountryList;

    sub Build_Country_List($) {
        my $country = shift;
        # Only add the country if no existing element matches it
        push @CountryList, $country unless grep { $_ eq $country } @CountryList;
    }

    Did you get what I did here? Did I get what you tried to do? See if you understand what I did, and let me know if I misunderstood you. Feel free to ask questions while I study the next sub of yours.

    Update: OK, here is a stab at the second function. I have moved the if/else's around a little and removed the foreach. Does this still do what you mean? I assume that $total is some sort of global variable.

    my $total;

    sub Build_totals_shares_for_sedol_Sell_side($$$) {
        my $sedol = shift;
        my $shares = shift;
        my $mySellfirstflag = shift;
        if ($mySellfirstflag eq "YES") {
            %Sellhash = ($sedol => $shares);
            #Do you mean that with mySellfirst that we should replace the shares,
            #If so, the above line should be $Sellhash{$sedol}=$shares.
            #As it stands now, we replace the entire hash!!
            print "Sell: The $sedol has traded\n$shares\n";
        } else {
            if (exists $Sellhash{$sedol}) {
                print "the sedol is in the list so add the shares.\n";
                $Sellhash{"$sedol"} += $shares;
                my $newtotal = $Sellhash{"$sedol"};
                print "the new total shares for sedol: $sedol is $newtotal\n";
            } else {
                $Sellhash{$sedol} = $shares; #You did a .= here was that a typo?
                print "the new entry is $sedol $shares\n";
            }
            $total=0;
            $total += $_ foreach values %Sellhash;
        }
    }

    It can probably still be condensed some, but I think I need your feedback on some of the assumptions I have made first.

    GoldClaw

      GoldClaw:
      Thank you for your help.

      I will clarify a bit, but it looks like grep and map will help me.

      What I do is this:
      I have a list of trades in a tab-separated file.
      The first and second lines of the file look like:
      Date Sedol A/C # Country Currency B/S Shares Price (local) Cust Comm Rate Gross Comm (local) PNL (local) fx rate Gross PNL (USD) PNL (USD) tkts
      07/04/01 6777007 1022578221 ZA ZAR S 10000 69.900000000 30.00 2097.00 1258.20 8.00 262.09 157.26 1.00

      I need to calculate the profit and loss (PnL) from these trades.
      The output needs to be things like:
      1. A line for each client with the totals of each of the columns in the input file and some others which need to be calculated. Client totals.
      2. Total PnL for every client for a given country: a client vs country table with PnL totals as entries. This table is for a given date.
      Process:
      1. Based on the date range (user input) I collect all of the input files and cat them together into one.
      2. Open the file and, line by line (trade by trade), build the lists up: client list, country list, total shares for a given sedol list. (This is as far as I have gotten.) (First Perl application.) :)
      3. Now I want to put all of these trades in a big hash of arrays and say: for all the countries in the traded country list, build me a hash of arrays that contains the trades done in those countries. (This is why there cannot be two of the same countries listed.)
      I want to pass this hash of arrays to the same function with the client list and have it return a hash of arrays for each client.
      Then I want to pass this hash of arrays to my PnL function and have it return the same hash of arrays with some new elements, which are the things I need to print out to a file: PnL, costs, charges, etc.


      This was the idea.
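
      A minimal sketch of step 3, assuming each trade is stored as an array reference under a made-up trade id, and that the country sits at index 3 and the account number at index 2, as in the sample line above. The names %trades, filter_by_column and the ids t1..t3 are hypothetical, and the second and third trades are invented data for illustration:

```perl
use strict;
use warnings;

# Hypothetical data: trade id => [Date, Sedol, A/C #, Country, ...].
my %trades = (
    t1 => [ '07/04/01', '6777007', '1022578221', 'ZA' ],
    t2 => [ '07/04/01', '1234567', '1022578221', 'GB' ],
    t3 => [ '07/05/01', '6777007', '9999999999', 'ZA' ],
);

# Return a reduced hash of arrays: only the entries whose value in
# column $col appears in the filter list @$wanted.
sub filter_by_column {
    my ($hoa, $col, $wanted) = @_;
    my %is_wanted = map { $_ => 1 } @$wanted;
    return map  { $_ => $hoa->{$_} }
           grep { $is_wanted{ $hoa->{$_}[$col] } }
           keys %$hoa;
}

my @CountryList = ('ZA');
my %za_trades = filter_by_column(\%trades, 3, \@CountryList);

# Group the surviving trades by country: country => [ trade, trade, ... ].
my %by_country;
push @{ $by_country{ $_->[3] } }, $_ for values %za_trades;

print join(',', sort keys %za_trades), "\n";
```

      The same function could then be called again with the client list and column 2 to get the per-client subset.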

      I wanted the country list, client list and total shares by sedol to be global since I will need them for a lot of things. (The sedol is the number form of a ticker. I actually still need to add the date to each sedol's total share array in the sedol total share function.)

      The first sub, build country list, I think you understand, and I see what you are doing. I thought that I would have to initialize the array by putting in one element the first time it was called. You are building a hash table with the country as the key, then just asking if the country is already in there and, if not, pushing it onto an array.
      Thanks!

      The second sub is trying to do the same basic thing. As I am reading in lines from the input file I want to build a structure, a hash of arrays, that contains the sedol, the date of that trade (which I can pass in as $date), and the total number of shares traded for that sedol. I need this later to calculate PnL for a given day. This was my first attempt at a hash.
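
      One possible shape for that structure, sketched under the assumption that the totals are kept per sedol and per date. Note that this is a hash of hashes rather than the hash of arrays described above, and %SellTotals and add_sell are hypothetical names; only the first share count comes from the sample line:

```perl
use strict;
use warnings;

# Hypothetical structure: $SellTotals{$sedol}{$date} = total shares sold.
my %SellTotals;

sub add_sell {
    my ($sedol, $date, $shares) = @_;
    # Autovivification creates the inner hash on first use, so no
    # "first call" flag is needed; += treats the missing entry as 0.
    $SellTotals{$sedol}{$date} += $shares;
    return $SellTotals{$sedol}{$date};
}

add_sell('6777007', '07/04/01', 10000);
add_sell('6777007', '07/04/01', 2500);
add_sell('6777007', '07/05/01', 400);

print $SellTotals{'6777007'}{'07/04/01'}, "\n";
```

      Looking up the total for a sedol on a given day is then a plain two-key lookup, and keys %{ $SellTotals{$sedol} } lists the dates on which that sedol traded.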

      In reference to the code below:
      The condition $mySellfirstflag is again trying to initialize the hash for the first time the sub is called. Same basic structure as the build client list.
      I am trying to use the sedol as the key so that later I can find the total number of shares traded for a given date based on that sedol.
      if ($mySellfirstflag eq "YES") {
          %Sellhash = ($sedol => $shares);
          #Do you mean that with mySellfirst that we should replace the shares,
          #If so, the above line should be $Sellhash{$sedol}=$shares.
          #As it stands now, we replace the entire hash!!
      Again in reference to the code below:
      I thought that is how you assign the value to a given key???
      $Sellhash{$sedol} = $shares; #You did a .= here was that a typo?
      Any thoughts?


      Chris:
      Is your @really_big_array an array of hashes, or what is this syntax? Does @smaller_array get filled with a list of fruits which point to apples?
Re: Building a Hash of arrays from a hash of arrays based on a list?
by cfreak (Chaplain) on Jan 26, 2002 at 03:18 UTC
    Hmmm, well, there's a lot of code and I'm not sure what you are trying to do exactly, but let's see if I can show you something that tackles your problem on a smaller scale.
    my @really_big_array = (
        {fruit=>'apple'},
        {vegitable=>'lettuce'},
        {fruit=>'oranges'},
        {vegitable=>'carrot'},
    );

    # put all entries with a 'fruit' key in a new array
    my @smaller_array = ();
    foreach (@really_big_array) {
        push(@smaller_array,$_) if($_->{fruit} ne "");
    }

    If you didn't want to do it by key, you could test the value of the key against something. For example, if you wanted all the apples (say you have another key for type), you could do:

    push(@smaller_array,$_) if($_->{fruit} eq "apple");
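
    The same filter can also be written with grep. This is a small sketch reusing the example data; exists is my substitution for the ne "" test, so that entries without a fruit key are skipped without comparing an undefined value:

```perl
use strict;
use warnings;

my @really_big_array = (
    { fruit     => 'apple'   },
    { vegitable => 'lettuce' },
    { fruit     => 'oranges' },
    { vegitable => 'carrot'  },
);

# Keep only the entries that have a 'fruit' key at all.
my @fruits = grep { exists $_->{fruit} } @really_big_array;

# Keep only the entries whose fruit is exactly 'apple'.
my @apples = grep { exists $_->{fruit} && $_->{fruit} eq 'apple' }
             @really_big_array;

print scalar(@fruits), " fruits, ", scalar(@apples), " apple\n";
```

    Each element of @fruits and @apples is still a reference to the original hash, not a copy.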

    Hope that helps, Reply if it doesn't make sense
    Chris