redss has asked for the wisdom of the Perl Monks concerning the following question:

What is the easiest way to accomplish this:

I want a script to read in names from a csv file like:

1 Charlie
1 Edward
2 Bob
2 Alfred
3 Barry

(Note that the numeric index is not unique)

And alphabetize by name and print output like:

2 Alfred
3 Barry
2 Bob
1 Charlie
1 Edward

What is the best type of perl structure to store this data in?

Replies are listed 'Best First'.
Re: How to sort?
by moritz (Cardinal) on Oct 14, 2007 at 17:45 UTC
    With sort, of course ;-)

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @names;
    while (<DATA>){
        chomp;
        m/^(\d+)\s+(.*)/;
        push @names, [$1, $2];
    }
    @names = sort { $a->[1] cmp $b->[1] } @names;
    for (@names){
        print "$_->[0] $_->[1]\n";
    }
    __DATA__
    1 Charlie
    1 Edward
    2 Bob
    2 Alfred
    3 Barry
      m/^(\d+)\s+(.*)/; push @names, [$1, $2];
      Don't use $1 without testing whether the match succeeded or not. If you expect the match to always succeed, at least add an "or die". Failure to do so will give you a "stale" $1 when you least expect it.
      works great, Thanks!
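The advice above about not using $1 unless the match succeeded could be sketched roughly like this. The sub name parse_names and the malformed sample line are made up for illustration; this is one possible way to guard the match, not necessarily what either poster had in mind.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [number, name] pairs,
# skipping (with a warning) any line that does not match.
sub parse_names {
    my @lines = @_;    # copy, so chomp never touches the caller's data
    my @names;
    for my $line (@lines) {
        chomp $line;
        # Capture into lexicals; the list assignment is false when the match fails,
        # so a stale $1/$2 from an earlier match can never leak in.
        if ( my ($num, $name) = $line =~ m/^(\d+)\s+(.*)/ ) {
            push @names, [$num, $name];
        }
        else {
            warn "Skipping malformed line: '$line'\n";
        }
    }
    return @names;
}

# "garbage" is a deliberately malformed line, added only for this example.
my @names = parse_names("1 Charlie\n", "2 Bob\n", "garbage\n", "3 Barry\n");
print "$_->[0] $_->[1]\n" for sort { $a->[1] cmp $b->[1] } @names;
```

If a bad line should be fatal rather than skipped, replacing the warn with die gives the "or die" behaviour suggested above.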
Re: How to sort?
by mwah (Hermit) on Oct 14, 2007 at 17:45 UTC
    This is something where very idiomatic "perlish" solutions exist, like:
    # open the file
    open my $fh, '<', 'myfile.dat' or die "$!";

    # translate the data into a 2D array
    my @rows = map [ split /\s+/ ], <$fh>;

    # sort on any column
    my @sorted = sort { $a->[1] cmp $b->[1] } @rows;

    # print the result
    print "@$_\n" for @sorted;

    I think this will do it here. As noted in Re: How to sort?, this doesn't look like a real "csv" data file.

    Regards

    mwa

Re: How to sort?
by FunkyMonk (Bishop) on Oct 14, 2007 at 17:35 UTC
Re: How to sort?
by salva (Canon) on Oct 14, 2007 at 19:51 UTC
    use Sort::Key qw(keysort);

    my @data = <>;
    my @sorted = keysort { (split /\s+/, $_, 2)[1] } @data;
    print @sorted;
Re: How to sort?
by CountZero (Bishop) on Oct 14, 2007 at 20:46 UTC
    Assuming it is indeed a CSV file (and you just forgot to add the commas):
    use strict;
    use DBI;

    my $dbh = DBI->connect('dbi:AnyData:');
    $dbh->func(
        'test', 'CSV', 'test.csv',
        {
            sep_char  => ',',
            eol       => "\n",
            col_names => 'number,name',
        },
        'ad_catalog',
    );
    my $sth = $dbh->prepare("SELECT number, name FROM test ORDER BY name");
    $sth->execute();
    while ( my $row = $sth->fetch ) {
        print "@$row\n";
    }
    Which prints:
    2 Alfred
    3 Barry
    2 Bob
    1 Charlie
    1 Edward

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: How to sort?
by perlfan (Parson) on Oct 14, 2007 at 19:05 UTC
    What is the best type of perl structure to store this data in?

    A hash, using each name as a key

    1. read line, split on the space, then put in hash with name as the key pointing to the number as the value
    2. sort on array returned by keys(%namehash) lexicographically
    3. using the sorted list of keys, reconstruct the original data, only sorted
      That assumes names are unique.
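If names can repeat, one way to keep the hash idea is to store a list of numbers per name instead of a single value. The following is only a sketch: the sub name sort_by_name, the variable %numbers_for, and the extra "4 Bob" record are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: hash of arrays, name => [numbers...],
# so a repeated name appends instead of overwriting.
sub sort_by_name {
    my @lines = @_;
    my %numbers_for;
    for my $line (@lines) {
        # split ' ' skips leading whitespace; the limit of 2 keeps multi-word names whole
        my ($num, $name) = split ' ', $line, 2;
        next unless defined $name;
        $name =~ s/\s+\z//;    # drop the trailing newline, if any
        push @{ $numbers_for{$name} }, $num;
    }

    # Rebuild the records in name order; numbers for one name keep input order.
    my @out;
    for my $name (sort keys %numbers_for) {
        push @out, "$_ $name" for @{ $numbers_for{$name} };
    }
    return @out;
}

# "4 Bob" is added to the original sample to show a duplicate name.
print "$_\n" for sort_by_name(
    "1 Charlie", "1 Edward", "2 Bob", "2 Alfred", "3 Barry", "4 Bob",
);
```

For this particular task the array-of-arrays plus sort shown in the other replies is probably simpler; the hash only pays off if you also need lookups by name.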
Re: How to sort?
by ercparker (Hermit) on Oct 15, 2007 at 02:15 UTC
    Here is yet another way to accomplish the task. Hopefully I understood your question and this gets you your expected results.

    update: this just accomplishes the sorting
    use strict;
    use warnings;

    print
        map  { $_->[0] . "\n"; }
        sort { $a->[1] cmp $b->[1] }
        map  { chomp; [$_, /\d+\s+([a-zA-Z]+)/]; } <DATA>;
    __DATA__
    1 Charlie
    1 Edward
    2 Bob
    2 Alfred
    3 Barry
Re: How to sort?
by jdporter (Paladin) on Oct 15, 2007 at 01:02 UTC
    exec qq( sort -k 2 "$ARGV[0]" )

    (Doesn't work on all platforms. ;-)

    A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: How to sort?
by Cop (Initiate) on Oct 14, 2007 at 19:02 UTC

    If those numbers are always one digit, then you can simply sort by the substring starting at position 3 (1-based), and not even bother with split or a regexp.
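A minimal sketch of that substring idea, assuming every line really is one digit, one space, then the name (the sub name sort_by_substr is made up here). Position 3 counted from 1 is offset 2 for Perl's substr, which counts from 0.

```perl
use strict;
use warnings;

# Hypothetical helper: compares everything from offset 2 onward,
# i.e. the name, under the one-digit-plus-one-space assumption.
sub sort_by_substr {
    return sort { substr($a, 2) cmp substr($b, 2) } @_;
}

print "$_\n" for sort_by_substr(
    "1 Charlie", "1 Edward", "2 Bob", "2 Alfred", "3 Barry",
);
```

This breaks as soon as an index reaches 10 or the separator is wider than one space, so it only suits data whose layout is known to be fixed.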