All you need is a basic cmp sort (for domain name based URLs):
print for sort <DATA>;
__DATA__
http://google.com
http://google.com/groups
http://google.com/groups/deeper
http://msn.com
http://msn.com/groups
http://msn.com/groups/deeper
http://apache.org
http://apache.org/docs
http://apache.org/docs/mod_perl
Which gives:
http://apache.org
http://apache.org/docs
http://apache.org/docs/mod_perl
http://google.com
http://google.com/groups
http://google.com/groups/deeper
http://msn.com
http://msn.com/groups
http://msn.com/groups/deeper
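One caveat with the plain cmp sort: '.' sorts before '/', so a host like google.com.au would land between http://google.com and http://google.com/groups. If that matters for your data, a minimal sketch along these lines compares the host first and the path second. The names host_path_key and sort_by_host are made up for illustration, and the regex assumes a simple scheme://host/path layout rather than being a full URL parser.

```perl
use strict;
use warnings;

# Split a URL into [ original, lowercased host, path ].
# Assumes a simple scheme://host/path layout (illustrative only).
sub host_path_key {
    my ($url) = @_;
    my ($host, $path) = $url =~ m!^(?:\w+://)?([^/]*)(.*)$!;
    return [ $url, lc $host, $path ];
}

# Schwartzian transform: compare hosts first, then paths.
sub sort_by_host {
    return map  { $_->[0] }
           sort { $a->[1] cmp $b->[1] || $a->[2] cmp $b->[2] }
           map  { host_path_key($_) } @_;
}

my @urls = qw(
    http://google.com/groups
    http://google.com.au
    http://google.com
);

print "$_\n" for sort_by_host(@urls);
```

With these three URLs both google.com entries stay together and google.com.au comes last, whereas a bare sort would put google.com.au in the middle.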
For numerical addresses you need to sort on 1) the integer representation of the 4-byte value that corresponds to the IP address, then 2) the rest of the URL (if any). This is a little more complex and uses a Schwartzian transform for efficiency. I have assumed dot quads. If you have to deal with other forms like "127.1" and all the other types of valid IPs, use Socket's inet_aton instead:
use Socket;
my ($ip) = unpack "N", inet_aton($1);
This will probably be a little slower than the raw unpack/pack/split presented.
my @data = qw(
    http://3.3.3.3/docs/mod_perl
    http://3.3.3.3/docs
    http://3.3.3.3
    http://2.2.2.2
    http://10.1.1.1
    http://11.1.1.1
    http://2.2.2.2/groups
    http://2.2.2.2/groups/deeper
    http://1.1.1.1/groups/deeper
    http://1.1.1.1/groups
    http://1.1.1.1
    http://1.1.1.2
    http://1.1.2.1
);

#use Socket;

# Schwartzian transform: build [ url, ip, rest ] once per URL, sort, unwrap
my @sorted = map  { $_->[0] }
             sort { $a->[1] <=> $b->[1] || $a->[2] cmp $b->[2] }
             map  { munge_url($_) } @data;

print "$_\n" for @sorted;

sub munge_url {
    my $addr = $_[0];
    $addr =~ m!^(?:\w+://)?([^/]+)/?(.*)$!;
    # convert dot quad to a sortable integer
    my ($ip) = unpack 'N', pack 'C4', split '\.', $1;
    # or unpack 'N', +inet_aton($1);
    my $rest = $2 || '';
    # print "$ip $rest\n";   # uncomment to see the generated sort keys
    return [ $addr, $ip, $rest ];
}

__DATA__
http://1.1.1.1
http://1.1.1.1/groups
http://1.1.1.1/groups/deeper
http://1.1.1.2
http://1.1.2.1
http://2.2.2.2
http://2.2.2.2/groups
http://2.2.2.2/groups/deeper
http://3.3.3.3
http://3.3.3.3/docs
http://3.3.3.3/docs/mod_perl
http://10.1.1.1
http://11.1.1.1
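To make the inet_aton alternative a bit more concrete, here is a rough sketch that puts the two conversions side by side. The sub names ip_to_int and quad_to_int are mine, not part of any module. Note that inet_aton will also try to resolve anything that looks like a hostname and returns undef on failure, so the sketch checks for that.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Convert a host string to a sortable integer via inet_aton.
# Returns undef when inet_aton cannot make sense of the host.
sub ip_to_int {
    my ($host) = @_;
    my $packed = inet_aton($host);
    return defined $packed ? unpack('N', $packed) : undef;
}

# The raw pack/split conversion for comparison; dot quads only.
sub quad_to_int {
    my ($quad) = @_;
    return unpack 'N', pack 'C4', split /\./, $quad;
}

# Print both results next to each other for a few dot quads.
for my $host (qw(1.1.1.1 10.1.1.1 11.1.1.1)) {
    printf "%-10s %d %d\n", $host, ip_to_int($host), quad_to_int($host);
}
```

For plain dot quads both subs should give the same integer, so either one can be dropped into munge_url as the numeric sort key.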
There is no logical relation (sort-wise) between FQDNs and dot quad IPs until you resolve the IPs to FQDNs.
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
In reply to Re: Sorting URLs on domain/host: sortkeys generation
by tachyon
in thread Sorting URLs on domain/host: sortkeys generation
by parv