Soko has asked for the wisdom of the Perl Monks concerning the following question:

Hello, keepers of Perl Wisdom.

I have the following file I'm trying to read in and then use for configuration info:

0,corkscrews
0,openers
0,toppers
1,glassware
1,stemware
1,tableware
2,bags
2,baskets,
2,boxes
3,books
3,magazines
3,postcards
3,clothing
4,canles
4,candlesticks
5,artwork

These elements need to be kept together, as they are ProductTypeNumber and ProductTypeName. As you can see, keeping this info in a hash won't work, since I'd have duplicate keys. An array of arrays would seem better. This isn't much of a problem, except for the fact that the user might be able to edit the text file and screw up the ordering. I'd like to be able to read this all in and sort the data by element 1, then element 2. Being a relative n00b to Perl, I'm not sure how to apply array references and/or the <=> operator in sort to get the data straight. Google turned up nil, as did looking through the code snippets (well, nothing seemed to fit my problem very well). Any suggestions would be greatly appreciated.

Soko

Replies are listed 'Best First'.
Re: 2 dimensional array sorting...
by Zaxo (Archbishop) on Mar 27, 2003 at 08:22 UTC

    Better referred to as elements 0 and 1, you can sort as you want with the or operator. <=> is for numeric comparison, cmp for strings. If @data contains your AoA,

    my @sorted_data = sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] } @data;
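
A minimal sketch of how that sort might sit in a complete script, assuming the records have already been read into a list of chomped lines. The sample lines and the name @lines are illustrative only; the lines are taken from the question's data and shuffled.

```perl
use strict;
use warnings;

# Sample lines from the question, deliberately out of order.
my @lines = ( "3,books", "0,openers", "1,glassware", "0,corkscrews", "2,bags" );

# Build the array of arrays: each element is [ number, name ].
my @data = map { [ split /\s*,\s*/, $_, 2 ] } @lines;

# Numeric comparison on element 0; string comparison on element 1 breaks ties.
my @sorted_data = sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] } @data;

print join( ',', @$_ ), "\n" for @sorted_data;
```

The `or` only falls through to the `cmp` when `<=>` returns 0, that is, when the two numbers are equal.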

    After Compline,
    Zaxo

      Thanks, Zaxo. That snippet did it for me.

      I'm using the number portion to do two things. First, it builds a longer product number. Second, there are actually several configuration files in a sort of tree structure. These files are keyed by the first number. If the user selects "toppers" from the list, then toppers is inserted into my BrowseEntry widget and the program goes on to read "style0.txt" in, which is another file of the same structure which contains various styles of toppers, corkscrews and openers. Using a 2d array seemed simpler.

      I might migrate this to my PostgreSQL database, but I wanted it easy for people to edit the product build info even on a Windows machine. If anyone is interested in the code snippets, I'll post the relevant parts. Thanks to all that replied!

      Soko

Re: 2 dimensional array sorting...
by Cody Pendant (Prior) on Mar 27, 2003 at 08:27 UTC
    > keeping this info in a hash won't work, since I'd have duplicate keys

    I'm looking at your info now, and, call me crazy, but you wouldn't have duplicate keys if you used the second item, the baskets and box and whatever, as the key, only if you used the numbers.

    And you can sort a hash by its keys or its values, so ... what's the problem exactly?
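
A rough sketch of that idea, assuming the names really are unique. The hash name %type_of and the handful of entries are illustrative, picked from the question's data.

```perl
use strict;
use warnings;

# Product type names are unique, so they can serve as hash keys;
# the (repeated) type number becomes the value.
my %type_of = (
    toppers    => 0,
    stemware   => 1,
    corkscrews => 0,
    bags       => 2,
    glassware  => 1,
);

# Sort names by their number first, then alphabetically within a number.
my @names = sort { $type_of{$a} <=> $type_of{$b} or $a cmp $b } keys %type_of;

print "$type_of{$_},$_\n" for @names;
```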
    --

    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
    M-J D
Re: 2 dimensional array sorting...
by Lhamo Latso (Scribe) on Mar 27, 2003 at 09:09 UTC
    Personally, I would think of this as an Item Catalog or Item Categories. Category 1 has a few items in it, Category 2 also. With this one-to-many relationship in mind, you could create a hash of arrays.
    use strict;

    my %HoA;
    $HoA{0} = [ qw( corkscrews openers toppers ) ];
    $HoA{1} = [ qw( glassware stemware tableware ) ];

    # From there, sorting the keys is easy, the values is more complex.
    foreach my $category (sort keys %HoA) {
        print "Category $category contains:\n";
        foreach my $houseware (sort { $a cmp $b } @{ $HoA{$category} }) {
            print ">>$houseware\n";
        }
    }

      ...this isn't necessarily one-to-many; it could be many-to-many, as it is a typical line-item table, i.e. it's a composite primary key made up of 2 foreign keys. If this database uses or will use any other tables besides this one, then I'd recommend switching to a database system now. This will simplify and standardize all your data handling (through SQL) and keep the database consistent according to relational rules, so you don't end up with widowed records etc. Chris
Re: 2 dimensional array sorting... (no sort)
by tye (Sage) on Mar 27, 2003 at 16:16 UTC

    It sounds to me like this is not a job for Perl's sort function at all (despite the impressions of the original poster and most of those who have replied so far).

    It looks to me like you just want to "sort" the items into bins, not really "sort" them into a specific order.

    my @data;
    while( <DATA> ) {
        chomp;
        # A limit of 2 splits each line into number and name only.
        my( $num, $name )= split /\s*,\s*/, $_, 2;
        push @{$data[$num]}, $name;
    }
    my $num= 0;
    for my $byNum ( @data ) {
        for my $name ( @$byNum ) {
            print "$num,$name\n";
        }
        $num++;
    }

    I might well switch from @data to %data since I suspect it would be difficult to keep type numbers strictly sequential.

    If you do want to sort within each numbered group, then simply add:

    for my $byNum ( @data ) { @$byNum= sort @$byNum; }
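
A sketch of the %data variant mentioned above, assuming the type numbers may have gaps. The sample lines come from the question's data; the names @lines and @out are illustrative.

```perl
use strict;
use warnings;

# Sample lines with non-sequential type numbers (0, 2, 5).
my @lines = ( "5,artwork", "0,toppers", "2,boxes", "0,corkscrews", "2,bags" );

# Drop each name into the bin for its number; a hash tolerates gaps.
my %data;
for my $line (@lines) {
    my ( $num, $name ) = split /\s*,\s*/, $line, 2;
    push @{ $data{$num} }, $name;
}

# Walk the bins in numeric order, sorting the names within each bin.
my @out;
for my $num ( sort { $a <=> $b } keys %data ) {
    push @out, "$num,$_" for sort @{ $data{$num} };
}

print "$_\n" for @out;
```

With a hash the bins no longer come out in order for free, so the keys do need one numeric sort at output time.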

                    - tye
Re: 2 dimensional array sorting...
by Anonymous Monk on Mar 27, 2003 at 10:30 UTC
    try an array of hashes:
    my @config = (
        { number => 0, name => 'corkscrews' },
        { number => 0, name => 'openers' },
    );

    sub sort_products {
        sort { $a->{number} <=> $b->{number} or $a->{name} cmp $b->{name} } @_;
    }

    the reason for avoiding the array of array solutions is that an array of arrays doesn't say anything about the data, and in fact implies some kind of grid structure, which isn't what you have.
Re: 2 dimensional array sorting...
by jonadab (Parson) on Mar 27, 2003 at 14:13 UTC

    Yet another solution: composite key...

    my %product;
    while (<INFILE>) {
        my $key = $_;
        my ($num, $name) = ($_ =~ /(\d+)\s*[,]\s*(.*?)\s*$/);
        # A hash is initialised from a list in parentheses, not from a { } reference.
        my %rec = (
            productnumber => $num,
            productname   => $name,
        );
        $product{$key} = \%rec;
    }

    Now you can sort asciibetically keys %product, or if need be do something fancier by sorting on the fields in the records, without any changes to your basic data structure. You can also add in additional fields to the records (price, whatnot), again without changing the fundamental way that you structure your data.
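
A small sketch of both kinds of sort on such a structure, assuming the whole (chomped) line is used as the composite key. The sample lines and the names @by_key and @by_name are illustrative.

```perl
use strict;
use warnings;

# Build the records, keyed by the whole line.
my %product;
for my $line ( "2,boxes", "0,toppers", "2,bags", "1,glassware" ) {
    my ( $num, $name ) = $line =~ /(\d+)\s*,\s*(.*?)\s*$/ or next;
    $product{$line} = { productnumber => $num, productname => $name };
}

# Plain string sort on the composite keys.
my @by_key = sort keys %product;

# Something fancier: order the same keys by the name field alone.
my @by_name = sort {
    $product{$a}{productname} cmp $product{$b}{productname}
} keys %product;

print "by key:  @by_key\n";
print "by name: @by_name\n";
```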


    for(unpack("C*",'GGGG?GGGG?O__\?WccW?{GCw?Wcc{?Wcc~?Wcc{?~cc' .'W?')){$j=$_-63;++$a;for$p(0..7){$h[$p][$a]=$j%2;$j/=2}}for$ p(0..7){for$a(1..45){$_=($h[$p-1][$a])?'#':' ';print}print$/}
Re: 2 dimensional array sorting...
by zenn (Sexton) on Mar 27, 2003 at 13:13 UTC
    Why don't you use the name as the key? That way you won't have duplicate keys in your hash. Then you have two options to keep the sort order: create an array of ProductTypeName in the right order, or create an index hash with the order number as the key and ProductTypeName as the other element.
Re: 2 dimensional array sorting...
by leriksen (Curate) on Mar 28, 2003 at 02:16 UTC
    You could do what you want with a one-dimensional array: just treat the line as a whole. Don't split at the comma until you are ready to output a group or an item, e.g.

    open(FILE, $name) or die "orrible death $!";
    my @data = sort (<FILE>);

    The lines are now in the array in alpha order, and you can operate on the array with split etc without needing to be worried about the extra dimension.

    For output you could write some subs such as output_group, output_item, output_group_number, etc.
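
One caveat worth sketching: a plain sort compares the lines as strings, so a two-digit type number would sort before a one-digit one. The line "10,lamps" below is hypothetical (it is not in the original data) and is there only to show the effect; the comparison block is one possible workaround, not part of the suggestion above.

```perl
use strict;
use warnings;

# Chomped sample lines; "10,lamps" is a made-up entry for illustration.
my @lines = ( "2,boxes", "10,lamps", "0,toppers", "2,bags" );

# Default sort: string order, so "10,..." lands before "2,...".
my @alpha = sort @lines;

# Split inside the comparison to order by number, then by name.
my @numeric = sort {
    my ( $na, $sa ) = split /,/, $a, 2;
    my ( $nb, $sb ) = split /,/, $b, 2;
    $na <=> $nb or $sa cmp $sb;
} @lines;

print "alpha:   @alpha\n";
print "numeric: @numeric\n";
```

As long as every type number is a single digit, as in the sample file, the two orders agree.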

Re: 2 dimensional array sorting...
by Anonymous Monk on Mar 29, 2003 at 00:53 UTC
    Hello fellow monks and novitiates. What Soko is asking for is a hash that allows for duplicate keys. If you have had any experience with the SGI version of the STL C++ library, you will know exactly what Soko wants. I have looked at the same problem and walked away disappointed. Perl does not give you this capability for free. Here is a small program that illustrates Soko's problem:
    use English;
    use strict;

    my $key;
    my $value;

    print "HASH WITH DUPLICATES...\n";
    my %mapdup = (
        1 => "one",  2 => "two",  5 => "five",   # English
        1 => "un",   2 => "deux", 5 => "cinq",   # French
        1 => "ein",  2 => "zwei", 5 => "funf",   # German
        1 => "bir",  2 => "iki",  5 => "bes",    # Turkish
        1 => "yi",   2 => "er",   5 => "wu",     # Mandarin
    );

    while (($key, $value) = each %mapdup) {
        print "($key, $value)\n" if defined $value;
        print "($key, <undef>)\n" if not defined $value;
    }

    What Soko wants is that when he looks up a hash entry using a specific key, 5 for example, he gets multiple values. Perl does not do this. Instead, he gets the most recently defined value for key 5, "wu" in this instance. The others are gone.

    Allowing hashes with duplicate keys would be a great way to improve Perl. For now, we have to work around its limitations to achieve the same result. Here are some things that you could try. Adding to and removing values from these hash entries is left as an exercise for the reader. :-) You may want to look at the perllol and perldsc manpages.

    use English;
    use strict;

    my $key;
    my $value;
    my $i;

    print "\nHASH OF LISTS\n";
    my %maplist = (
        1 => [ ("one", "un", "ein") ],
        2 => [ ("deux", "er", "two") ],
        5 => [ ("five", "cinq", "bes", "funf") ],
    );

    while (($key, $value) = each %maplist) {
        print "($key, @{$value})\n" if defined $value;
        print "($key, <undef>)\n" if not defined $value;
    }

    print "\nHASH OF ARRAYS\n";
    my %maparray = (
        1 => ["one", "yi"],
        2 => ["zwei", "iki", "two"],
        5 => ["cinq", "five"],
    );

    print "\n";
    foreach my $elem (sort(keys %maparray)) {
        print "$elem: @{$maparray{$elem}}\n";
    }

    print "\nHASH OF HASHES\n";
    my %mapmap = (
        1 => { english => "one", french => "un", turkish => "bir" },
        2 => { french => "deux", english => "two", mandarin => "er" },
        5 => { german => "funf", french => "cinq", english => "five" },
    );

    print "\n";
    foreach my $elem ( sort {$a <=> $b} (keys %mapmap) ) {
        print "$elem: { ";
        foreach my $item (sort keys %{$mapmap{$elem}}) {
            print "$item=$mapmap{$elem}{$item} ";
        }
        print "}\n";
    }

      Allowing hashes with duplicate keys would be a great way to improve Perl.

      Sorry Anonymonk, but this doesn't make much sense (to me)?

      If the hash you show at the top of your post retained duplicate keys, and I printed print $mapdup{1}; what would be printed?

      What you are showing is not a hash with duplicate keys, it's a hash with multiple values for a given key.

      The answer to the question above is either you want all the values for the specified key to be printed, in which case, use a HoA as you show, or even just concatenating the values into a single string if you always wanted to use them as a single entity.

      However, given the values and structure you show in your %mapdup hypothetical duplicate-keys hash, what you actually want is a 2d matrix of data, with the individual values selectable by two keys.

      1) the numeric value. 2) the language.

      You show several ways that this could be done, though your "HASH OF LISTS" and "HASH OF ARRAYS" are--barring a little unnecessary punctuation on the former--identical.

      However, the need is presumably to map all numeric values to their textual equivalents in each of the languages. That isn't what you have shown, but it would seem a strange optimisation to omit some numbers, as you may need them later, and an even stranger thing to store different subsets of numbers in each of the languages.

      You did miss what would seem to me the most obvious solution, given that one key is numerical and the other textual: a HoA where the numeric value is a direct index to the required value.

      my %map = (
          English => [ qw[ one two three four five ] ],
          French  => [ qw[ une deux trois quatre cinq ] ],
          # ...
      );
      print $map{French}[1];   # Gives deux (array indices start at 0)

      Which seems to encapsulate the requirements about as concisely as it is possible to do. If you really are concerned about space and will never use 'three' or 'four', you could always set these values to 'undef' which would reduce the storage requirements.

      I wouldn't consider this "working around a limitation", just using the tools provided in the right way.

      If you have a better example of where "duplicate keys" would be useful, I'd like to see it. Off the top of my head I find it difficult to envisage an application for them or even how they might work.


      Examine what is said, not who speaks.
      1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
      2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
      3) Any sufficiently advanced technology is indistinguishable from magic.
      Arthur C. Clarke.