PerlMonks  

Re^2: Rosetta Code: Long List is Long :awk(1)+sort(1)

by marioroy (Prior)
on Dec 12, 2022 at 13:47 UTC ( [id://11148782] )


in reply to Re: Rosetta Code: Long List is Long :awk(1)+sort(1)
in thread Rosetta Code: Long List is Long

I tried LANG=C and sorting individually (two sorts).

Results from a Linux box:

54 seconds  LANG=en_US.UTF-8
33 seconds  LANG=C sort -k2,2rn -k1,1
23 seconds  LANG=C sort -k1,1 | sort -k2,2rn

Testing:

#!/bin/bash
# https://www.perlmonks.org/?node_id=11148773
# bash rather than plain sh, because $SECONDS is not a POSIX sh variable

if [ $# -eq 0 ]; then
    printf "Give a list of files to sort.\n" >&2
    exit 1
fi

# Export so the C locale applies to awk and to both sorts
export LANG=C

awk '
{ cat_count[ $1 ] += $2 }
END { for ( cat in cat_count ) printf "%s\t%s\n", cat, cat_count[ cat ] }
' "$@" \
| sort -k1,1 | sort -k2,2rn

printf "total time: %d s\n" "$SECONDS" >&2
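To see that the two sort variants can agree on ordering, here is a minimal sketch on made-up data. The function names (sample, aggregate, one_sort, two_sorts) and the sample words are hypothetical, not from the test above. One assumption to flag: the second stage uses -s (stable sort), an extension found in GNU and BSD sort but not required by POSIX, so that the name order from the first stage survives count ties.

```shell
#!/bin/sh
# Sketch only: tiny hypothetical input, not the benchmark files.

# Emit "word<TAB>count" lines with repeated words.
sample() {
    printf 'pear\t2\napple\t3\nfig\t3\napple\t2\npear\t1\n'
}

# Sum the counts per word, as in the awk step of the script.
aggregate() {
    LC_ALL=C awk '
        { count[ $1 ] += $2 }
        END { for ( w in count ) printf "%s\t%s\n", w, count[ w ] }
    '
}

# Variant 1: one sort with two keys (count descending, then word).
one_sort() {
    LC_ALL=C sort -k2,2rn -k1,1
}

# Variant 2: sort by word first, then a stable sort by count descending.
# -s is a GNU/BSD extension (assumption: it is available here).
two_sorts() {
    LC_ALL=C sort -k1,1 | LC_ALL=C sort -s -k2,2rn
}

a=$(sample | aggregate | one_sort)
b=$(sample | aggregate | two_sorts)

if [ "$a" = "$b" ]; then
    printf 'both variants agree:\n%s\n' "$a"
else
    printf 'variants differ\n' >&2
fi
```

LC_ALL is used in the sketch instead of LANG because LC_ALL overrides any other locale variables that may already be set in the environment.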

Replies are listed 'Best First'.
Re^3: Rosetta Code: Long List is Long :awk(1)+sort(1)
by parv (Parson) on Dec 12, 2022 at 18:43 UTC

    Good point about using LANG=C. It makes for a fairer comparison, since in my Python version I had set ASCII encoding for parsing the input (but not for sorting 🤔).

    With that change, the run takes ~99 s; with that change plus the two sorts, it takes ~60 s.
