Sorting functions are not that hard - don't be afraid, read the docs, try a simple one or two.

However, the logic on something like this can get involved. I often like to transform the original value into a string that sorts easily with a very simple sort function.
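To show the idea on its own, here is a minimal sketch that handles only the four main-chain names. The helper name sort_key is made up for this illustration, and the '0' default for unknown names is my assumption: each name is mapped to a key string, and the keys are compared with plain cmp.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Weights for the main-chain atoms only (a subset, for illustration).
my %main = ( N => '1', CA => '2', C => '3', O => '4' );

# Hypothetical helper: turn a name into a string that sorts correctly
# with plain "cmp". Unknown names get '0' so they sort first (assumption).
sub sort_key {
    my $name = shift;
    return exists $main{$name} ? $main{$name} : '0';
}

my @sorted = sort { sort_key($a) cmp sort_key($b) } qw( O C N CA );

print join( ' ', @sorted ), "\n";    # prints "N CA C O"
```

The sort block stays trivial; all the real logic lives in the key function, which is the part that grows when the rules get more involved.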

In this case, I transform each atom name into a string that sorts asciibetically (that is, in plain string order). A regex matches the different components of the name. Then we look up the sort values in a couple of handy hashes and fill in default values (no value for a component means sort early).

Caveat: I know nothing about atoms, so I'm working purely from your description. I have two weighting hashes in the script. The one called 'main' doesn't correspond exactly to the 'main chain' bits described; rather, it corresponds to the part of the name that isn't a Greek letter or a number.

Well, the explanatory paragraph may not make sense, but hopefully the code will be clearer:

    #!/usr/bin/perl -w
    use strict;

    my @unordered = qw( 2HB 3HB C CA CB CG CD1 CD2 CE1 CZ CE2 HE2 HE1 HH HD1 HD2 N O OH );

    # weighting hashes
    my %main = (
        N  => '1',
        CA => '2',
        C  => '3',
        O  => '4',
        S  => '5',
        P  => '6',
        H  => '7',
    );

    my %greek = (
        B => '1',
        G => '2',
        D => '3',
        E => '4',
        Z => '5',
        H => '6',
    );

    sub sortable {
        my $atom = shift;

        # get components
        $atom =~ /(\d)?(N|CA|C|O|S|P|H)(B|G|D|E|Z|H)?(\d)?/;
        my ($left,$main,$greek,$right) = ( $1, $2, $3, $4 );

        # get weights or use default early sorting values
        $main = $main ? $main{ $main } : '0';

        my $main_alone; # need to flag if main element is alone
        if ( $greek ) {
            $main_alone = '1';
            $greek = $greek{ $greek };
        }
        else {
            $main_alone = '0';
            $greek = '0';
        }

        $left  ||= '0';
        $right ||= '0';

        # I assume that left numbers take precedence over right
        "$main_alone$main$greek$left$right";
    }

    # use the string "cmp" operator - we are comparing strings,
    # not numbers. We could have populated
    # the weighting hashes with letters instead of numbers
    my @ordered = sort { sortable( $a ) cmp sortable( $b ) } @unordered;

    # if you have to get the value out of an object it would be:
    # @ordered = sort { sortable( $a->AtomName ) cmp sortable( $b->AtomName ) } @unordered;

    print join "\n", map { $_ ."\t". sortable( $_ ) } @ordered;

Which gives:

    N       01000
    CA      02000
    C       03000
    O       04000
    CB      13100
    CG      13200
    CD1     13301
    CD2     13302
    CE1     13401
    CE2     13402
    CZ      13500
    OH      14600
    2HB     17120
    3HB     17130
    HD1     17301
    HD2     17302
    HE1     17401
    HE2     17402
    HH      17600
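One possible refinement, if the list gets long: the sort block above calls sortable() twice per comparison. A Schwartzian transform computes each key once instead. The sketch below is not part of the script above. It uses a condensed sortable() with the same key layout, and the anchored regex and the '00000' fallback for names that don't parse are my assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %main  = ( N => 1, CA => 2, C => 3, O => 4, S => 5, P => 6, H => 7 );
my %greek = ( B => 1, G => 2, D => 3, E => 4, Z => 5, H => 6 );

# Condensed sortable(): same five-character key as the script above.
sub sortable {
    my $atom = shift;
    my ( $left, $main, $greek, $right ) =
        $atom =~ /^(\d)?(N|CA|C|O|S|P|H)(B|G|D|E|Z|H)?(\d)?$/
        or return '00000';    # assumption: unparseable names sort first
    my $has_greek = defined $greek ? 1 : 0;
    return join '', $has_greek, $main{$main},
        ( $has_greek ? $greek{$greek} : 0 ),
        $left || 0, $right || 0;
}

my @unordered = qw( 2HB 3HB C CA CB CG CD1 CD2 CE1 CZ CE2 HE2 HE1 HH HD1 HD2 N O OH );

# Schwartzian transform: pair each name with its key, sort on the key,
# then throw the key away again.
my @ordered =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { [ sortable($_), $_ ] }
    @unordered;

print join( "\n", @ordered ), "\n";
```

For a handful of atoms the difference is negligible; it only starts to matter when the key function is expensive or the list is large.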

In reply to Re: sorting according to greek alphabet in roman letters by qq
in thread sorting according to greek alphabet in roman letters by seaver
