Re: complex sort

by grinder (Bishop)
on Dec 25, 2001 at 03:09 UTC ( [id://134245] )


in reply to complex sort

I haven't taken the time to actually download and run this code, but it looks pretty good to me. A few suggestions:

  • You can chomp <DATA> straight away before passing it to the sort routine. The fact that you do so afterwards, and don't match whitespace before the $ anchor, makes me wonder whether the regexp really matches anything.
    sort {...} chomp <DATA>;

    Juerd correctly points out that you can't chomp DATA. In actual fact, you don't even need to chomp at all. You're not actually doing anything to the records. Get rid of the chomp and the -l switch and be done with it.

  • The sort function looks sufficiently complicated to merit using a Schwartz Transform to perform the split only once (split to a list and sort on the different elements).
    The data you want to sort lend themselves beautifully to the Guttman-Rosler Transform, which is faster than the Schwartz Transform. (Aside: the previous sentence shows clearly why the correct terminology is 'Schwartz Transform', not 'Schwartzian Transform'.) Here is the code to do just that:
    print
        map  { substr( $_, 16 ) }
        sort
        map  {
            /^(\d+)        # digits of type
              (\D+)        # type character
              (\d+)        # part
              (\D*)        # detail (optional)
              R(\d+)       # revision count
              B?(\d*)      # alternate count (optional)
              \.(vec|ras)  # file extension
              \s+$         # trailing whitespace
            /x
                ? sprintf( '%3s%03d%03d%2s%03d%1d%1s', # numbers add up to 16
                    $7, 999 - ($6 || 0), $1, $2, $3, $5, $4,
                ) . $_
                : ('x' x 16) . $_
        } <DATA>;

    The idea is that you add a prefix to the data you want to sort, in order to be able to employ a bare sort. Once the array hits the sort code, you are running at C speed until the sort is done. No more perl op-codes for this baby. At the other end of the sort, you throw away the prefix.

    Note how I create the inverse of the alt count so that the normal compare still works. (The sprintf may have to be tailored to suit). Also note how I create a dummy prefix in case the regexp fails. For debugging, comment out the map that strips off the prefix.

  • On the question of readability/maintainability, I would use the extended regexp syntax in order to comment what you're looking for.

  • If you can use the // idiom to represent a regexp, then do so. Using m!! is unsettling.

  • For a sort subroutine that big, name it. I.e.,
    sub part_sort { ... }
    sort part_sort <DATA>;
    At least that way you can then set a breakpoint easily with b part_sort to see why the silly thing isn't working.

  • You don't need the for block at all
    print sort part_sort <DATA>;
    will do the job just as well.

  • I don't presume to understand your job, but looking at the results, is the type subordinate to the layer or is it the other way around? I guess it's one of the weirder naming schemes I have come across.

  • As for your documentation, simply refer to http://www.perlmonks.org/index.pl?node_id=134235 in the comments. :)
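
For anyone who wants to try the Schwartz Transform and the named sort sub together, here is a minimal sketch. The file names are made up to fit the regexp (the real data isn't shown in this node), by_part_keys is a hypothetical name, and the key order simply copies the sprintf argument list above, so tailor it to suit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file names, invented to fit the regexp in this thread.
my @files = qw(
    10A2R1.vec
    2A10R3B2.ras
    2A10R3.ras
    2A9xR1.ras
    10A2R1B1.vec
);

# Same pattern as above; trailing whitespace is made optional here
# because the sample names carry no newline.
my $re = qr/^(\d+)        # field 1: digits of type
             (\D+)        # field 2: type character
             (\d+)        # field 3: part
             (\D*)        # field 4: detail (optional)
             R(\d+)       # field 5: revision count
             B?(\d*)      # field 6: alternate count (optional)
             \.(vec|ras)  # field 7: file extension
             \s* $        # optional trailing whitespace
           /x;

# Named comparison: each record is [ original, field 1 .. field 7 ].
sub by_part_keys {
    return $a->[7] cmp $b->[7]                      # file extension
        || ( $b->[6] || 0 ) <=> ( $a->[6] || 0 )    # alternate count, highest first
        || $a->[1] <=> $b->[1]                      # digits of type
        || $a->[2] cmp $b->[2]                      # type character
        || $a->[3] <=> $b->[3]                      # part
        || $a->[5] <=> $b->[5]                      # revision count
        || $a->[4] cmp $b->[4];                     # detail
}

my @sorted =
    map  { $_->[0] }          # undecorate: keep only the original name
    sort by_part_keys
    grep { @$_ > 1 }          # drop names the regexp rejected
    map  { [ $_, /$re/ ] }    # decorate: run the regexp once per name
    @files;

print "$_\n" for @sorted;
```

Since the comparison has a name, running the script under perl -d and typing b by_part_keys stops at the first comparison. Unlike the prefix trick above, this still calls Perl code for every comparison, so expect it to be the slower of the two.
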
--
g r i n d e r
just another bofh

print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u';

Replies are listed 'Best First'.
Re: Re: complex sort
by Juerd (Abbot) on Dec 25, 2001 at 03:15 UTC
    You can chomp <DATA> straight away before passing it to the sort routine. [...]
    [...]
    sort part_sort chomp <DATA>;
    <DATA> is not something chomp can modify, so you cannot chomp it (chomp changes its argument in place and returns the number of removed characters, not a list of chomped strings).

    #!/usr/bin/perl
    print chomp <DATA>;
    __DATA__

    This piece of code will trigger the following compilation error:

    Can't modify <HANDLE> in chomp at - line 2, near "<DATA>;"
    Execution of test aborted due to compilation errors.
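
A form that does compile is to read the lines into an array and chomp that. A minimal sketch, assuming an in-memory filehandle as a stand-in for DATA (the three sample lines and the variable names are invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the DATA handle: an in-memory file with three lines.
my $text = "pear\napple\nfig\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

# chomp needs something it can modify, so assign to an array first.
# The return value is the number of characters removed, not the lines.
my $removed = chomp( my @lines = <$fh> );
close $fh;

print "removed $removed newline(s)\n";
print "$_\n" for sort @lines;
```

With the real handle the same line would read chomp( my @lines = <DATA> ); the sorted list then comes from @lines, not from the value chomp returns.
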

    2;0 juerd@ouranos:~$ perl -e'undef christmas'
    Segmentation fault
    2;139 juerd@ouranos:~$
