
Hi Particle,

Thanks! I tried out your code and got it working perfectly. Thanks for pointing '$_' out - I got confused with 'for' and 'foreach' and thought it was an index.

I don't understand the following line in your code and would appreciate it very much if you could explain how it works:

my %hash = map { split } @names_marks;
Another question I have is "What if I have two people with the same name?". When I added another 'john' to the data, only one of them shows up. How do I avoid this problem?

Re: Re: Re: sort hash elements...
by particle (Vicar) on Mar 27, 2002 at 16:26 UTC
    first, i'll explain my code:
    my %hash = map { split } @names_marks;
    okay, let's break it down.

     @names_marks is a list of items, containing strings with two items separated by a single space, such as 'keith 90'. you want to separate these items.

    split, by default, will split on spaces, and will split $_. it's in the documentation: online at split, or command line at perldoc -f split. so split could be replaced with split ' ', $_
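    to see those defaults in action, here's a minimal sketch. the variable names ($entry, @implicit, @explicit) are made up for illustration; only the 'keith 90' format comes from the thread.

```perl
use strict;
use warnings;

# sample entry in the same 'name mark' format as @names_marks
my $entry = 'keith 90';

# bare split: operates on $_ and splits on whitespace
my @implicit;
for ($entry) { @implicit = split }

# the same call with the defaults spelled out
my @explicit = split ' ', $entry;

print "@implicit\n";   # keith 90
print "@explicit\n";   # keith 90
```

    both calls return the same two-item list, which is why the shorter form is safe here.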

    i can be sure that split will return a list: map evaluates its block in list context, and the doc (for perl 5.6.1) says using split in scalar context is deprecated anyway. since i know each string contains only one space, with an item on either side, i can use these two items to assign to a hash.

    it's perfectly legal to say

    my %hash = ( 'bob', 1 ); # or my %hash = ( bob => 1 );
    where i'm assigning the first list element as a key in %hash, and the second list element as the key's value.
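    putting the pieces together, here's a rough sketch of what happens inside the one-liner. the sample data and the intermediate @flat array are assumptions added for illustration; the original code never stores the flat list.

```perl
use strict;
use warnings;

# hypothetical sample data in the thread's 'name mark' format
my @names_marks = ( 'keith 90', 'rob 52', 'eric 86' );

# each split returns ( name, mark ); map flattens them into one list:
# ( 'keith', 90, 'rob', 52, 'eric', 86 )
my @flat = map { split } @names_marks;

# assigning that flat list to a hash pairs the items up as key, value
my %hash = @flat;

print "$_ => $hash{$_}\n" for sort keys %hash;
```

    if a name appears twice in the flat list, the later pair overwrites the earlier one, which is why a second 'john' seems to disappear.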

    on a sidenote--you can use Data::Dumper to see into data structures. add use Data::Dumper; to the top of the script, and print Dumper \%hash; after the hash is created. you'll see the keys and values. this is particularly helpful for debugging complex data structures.
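    a small sketch of that, assuming a two-entry hash made up for the example. setting $Data::Dumper::Sortkeys is optional; it's only there so the keys come out in a stable order.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order in the output

# hypothetical sample hash
my %hash = ( keith => 90, rob => 52 );

# pass a reference so the dump shows the hash as key => value pairs
my $dump = Dumper( \%hash );
print $dump;
```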

    map is somewhat similar to for or foreach, when used to iterate over a list, see the doc: map. (by the way, for and foreach are the same, docs: for, foreach.)

    i could turn my map into a for:

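    my %hash;
    # add one key/value pair to %hash per element
    for( @names_marks ) { my( $name, $mark ) = split; $hash{$name} = $mark }
    # or, expanding split
    for( @names_marks ) { my( $name, $mark ) = split ' ', $_; $hash{$name} = $mark }
    note there's no semi-colon after the last statement inside the braces--this is okay because the final statement in a block does not require a trailing semi-colon.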
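    if you want to convince yourself that map and for build the same hash, here's a quick check. the sample data and the as_string helper are hypothetical, added only for this comparison.

```perl
use strict;
use warnings;

# hypothetical sample data in the thread's 'name mark' format
my @names_marks = ( 'keith 90', 'rob 52', 'eric 86' );

# map version: one flat list, assigned to the hash once
my %from_map = map { split } @names_marks;

# for version: add one key/value pair per iteration
my %from_for;
for ( @names_marks ) {
    my ( $name, $mark ) = split;
    $from_for{$name} = $mark;
}

# helper (made up for this sketch): render a hash as sorted 'key=value' text
sub as_string {
    my %h = @_;
    return join ',', map { "$_=$h{$_}" } sort keys %h;
}

my $map_view = as_string(%from_map);
my $for_view = as_string(%from_for);
print $map_view eq $for_view ? "same\n" : "different\n";
```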

    okay, i hope that helps you a bit. now, on to the problem of two persons with the same name. my questions to you are: how do *you* know who's who? if you're reporting who got what marks, and there are two johns, wouldn't each want to know which mark he received?

    in this case, you'll need unique names, like 'john1', or 'john q. public'. the former will work with my current implementation. the latter will require some coding changes. for instance, space is no longer a good separator for @names_marks, because the names might have spaces in them. instead, use a pipe, or colon, or caret, or question mark, or exclamation point. make sure it's some character that won't show up in either data field.

    then, split the data on the separator instead of space. hash keys can have spaces in them, so the rest of my example should still work. although, if this were important code, i'd recommend a more robust implementation than the one i've provided.
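    as a rough sketch of that change, assuming a pipe as the separator and some made-up names. the limit of 2 is my own addition, so that each string yields at most two fields.

```perl
use strict;
use warnings;

# hypothetical data: names may contain spaces, '|' separates name from mark
my @names_marks = ( 'john q. public|75', 'john smith|88', 'keith|90' );

# '|' is special in a regex, so escape it; the limit of 2 caps the field count
my %hash = map { split /\|/, $_, 2 } @names_marks;

print "$_ => $hash{$_}\n" for sort keys %hash;
```

    the names still have to be unique; the separator only fixes the problem of spaces inside a name.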

    ~Particle ;Þ

      Hi Particle,

      Many thanks for taking time to explain:)

      I now understand what "my %hash = map { split } @names_marks;" does - splits the array of elements into keys and values - each array element contains a name and a score delimited by space. The 'map' can be replaced by a 'for' loop to achieve the same result.

      With regard to name uniqueness, the program that I'm testing will need to parse a large number of entries (names of pupils for a particular schooling level and their scores for a certain subject). I want to be able to deal with situations where there is more than one person with the same name. My approach was to append a person's name to her score to make the name unique. But as you've pointed out, if the two persons have the same score, the name (key) will still not be unique. I'm still thinking how I could get around that problem. Any advice will be appreciated :)

      cheers,

      kiat
        perhaps you should consider giving each student a unique id number, and use that as the key for all your data. for instance:
        my %id_to_name = (
            1 => 'keith',
            2 => 'rob',
            3 => 'eric',
            4 => 'rob',
            # etc.
        );
        my %id_to_mark = (
            1 => 90,
            2 => 52,
            3 => 86,
            4 => 71,
            # etc.
        );
        as long as you can match a unique id to a name, you can address the proper student's marks. if it's not possible to organize your data this way (for instance names are not given in the same order each time,) then you won't be able to implement it this way, using arrays or hashes.
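        here's a minimal sketch of using the two hashes together. the @report array and the output format are made up for the example.

```perl
use strict;
use warnings;

my %id_to_name = ( 1 => 'keith', 2 => 'rob', 3 => 'eric', 4 => 'rob' );
my %id_to_mark = ( 1 => 90,      2 => 52,    3 => 86,     4 => 71 );

# both hashes share the same ids, so one loop can report on both;
# the two 'rob's stay separate because their ids differ
my @report;
for my $id ( sort { $a <=> $b } keys %id_to_name ) {
    push @report, "$id: $id_to_name{$id} scored $id_to_mark{$id}";
}
print "$_\n" for @report;
```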

        if you're tracking any particular student, you'll need something unique to identify her. if that information is not available in the data, there's no way to do it. if the information is there, find it, and use it for the key in your hash.

        if you're not tracking any particular student, and are trending the group as a whole, you can key the hash on the score, keep either a count of students with that score, or an array of student names as hash values. here's an example:

        my %scores_to_names = (
            90 => [ 'keith' ],
            52 => [ 'rob', 'dave' ],
            # etc.
        );
        if you find this might be a way to go, look in perldsc (perl data structures cookbook) for examples. and use Data::Dumper to see what your structure looks like, it's really handy.
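        one way to build that structure from the raw strings, sketched with made-up sample data in the same 'name mark' format:

```perl
use strict;
use warnings;

# hypothetical sample data in the thread's 'name mark' format
my @names_marks = ( 'keith 90', 'rob 52', 'dave 52' );

my %scores_to_names;
for ( @names_marks ) {
    my ( $name, $mark ) = split;
    # autovivification creates the array reference on first use
    push @{ $scores_to_names{$mark} }, $name;
}

# highest mark first, with the names and a count for each mark
for my $mark ( sort { $b <=> $a } keys %scores_to_names ) {
    my @names = @{ $scores_to_names{$mark} };
    print "$mark: @names (", scalar @names, ")\n";
}
```

        duplicate names are harmless here, since names are values rather than keys.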

        ~Particle ;Þ

        p.s. if your data is given in the same order every time, perhaps you could implement this using arrays. the unique id would be the position in the array, and the values could be in different arrays, like @names, @marks, etc. or, the data could live in an array of arrays. again, find examples in perldsc. good luck!
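        a rough sketch of both array layouts, assuming the names and marks always arrive in the same order. the sample values and the @students name are made up for illustration.

```perl
use strict;
use warnings;

# parallel arrays: the position in each array acts as the unique id
my @names = ( 'keith', 'rob', 'eric', 'rob' );
my @marks = ( 90,      52,    86,     71 );

# the same data as an array of arrays, one [ name, mark ] row per student
my @students = map { [ $names[$_], $marks[$_] ] } 0 .. $#names;

for my $id ( 0 .. $#students ) {
    my ( $name, $mark ) = @{ $students[$id] };
    print "$id: $name scored $mark\n";
}
```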