Re: sort != sort
by perreal (Monk) on Oct 25, 2010 at 18:36 UTC
You can also use a bucket sort for this; it can be faster. For example:
my @sorted;
push(@{$sorted[$_->{'num'}-1]}, $_) foreach @Objs;
$sorted[x] will then contain the items with num = x+1.
|
|
Wow, I like that -- thanks much!
- Guaranteed O(n) -- right now, the comparisons are about 120% of n with normative mergesort and 102% of n with quicksort.
- I can pop those into the new array meaning the memory use will remain flat (rather than temporarily doubling).
my @bucket = ();
while($#Notes >= 0) {
my $n = shift @Notes;
push @{$bucket[$n->{terms}-1]},$n;
}
#Need 1D array back:
while($#bucket >= 0) {
my $group = pop @bucket;
push @Notes, shift @$group while ($#{$group} >= 0);
}
Works perfectly.
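For reference, the bucket approach discussed above can be put together as a small self-contained sketch. The sample data is made up for illustration, and it assumes 'terms' is always a positive integer. The check for undefined buckets is an addition that is not in the posts above; it only matters when some 'terms' values never occur.

```perl
use strict;
use warnings;

# Hypothetical sample records; note that no record has terms == 3.
my @Notes = map { +{ terms => $_ } } (2, 1, 4, 1, 2);

my @bucket;

# Distribute: drain @Notes so records are moved, not copied.
while (my $n = shift @Notes) {
    push @{ $bucket[ $n->{terms} - 1 ] }, $n;
}

# Flatten back into one array, highest 'terms' first.
while (@bucket) {
    my $group = pop @bucket;
    next unless defined $group;    # skip gaps (no record with that value)
    push @Notes, @$group;
}

print join(' ', map { $_->{terms} } @Notes), "\n";    # prints "4 2 2 1 1"
```

Records with equal 'terms' keep their original relative order, because each bucket is filled and emptied front to back.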
Re: sort != sort
by Corion (Patriarch) on Oct 25, 2010 at 16:43 UTC
I can't reproduce that, at least on Perl 5.10.0.
#!perl -w
use strict;
use Data::Dumper;
my @Objs = map { +{num => int rand(500)} } 1..1000;
sub is_sorted {
my $last;
for (@_) {
if (defined $last and $last->{num} > $_->{num}) {
#print Dumper $last;
#print Dumper $_;
return 0
};
$last = $_;
};
return 1
};
print "Unsorted\t",is_sorted(@Objs),"\n";
print "Sorted\t\t",is_sorted(sort { $a->{num} <=> $b->{num} } @Objs),"\n";
@Objs = sort { $a->{num} <=> $b->{num} } @Objs;
print "Sorted in-place\t",is_sorted(@Objs),"\n";
Most likely, your data is not what you believe it is. So, please reduce your code and data to a short, self-contained example that still exhibits the behaviour.
Update: Changed cmp to <=>, as that's what the code should use to mirror the OP.
Re: sort != sort
by salva (Canon) on Oct 25, 2010 at 16:42 UTC
Re: sort != sort
by halfcountplus (Hermit) on Oct 25, 2010 at 17:20 UTC
WRT replicating it: I also tried before posting and could not. I was hoping there might be an obscure reason someone is aware of, since I am positive about the data, etc. I.e., this is not a coding error. Here's exactly how I am verifying the issue right now:
if ($#Terms > 0) {
print "HERE\n"; # to confirm this takes place;
@Notes = sort { $b->{terms} <=> $a->{terms} } @Notes;
}
my @Sorted = sort { $b->{terms} <=> $a->{terms} } @Notes;
print "$#Notes $#Sorted\n";
for (my $i = 0; $i<=$#Notes; $i++) {
print $Notes[$i]->{terms}.$Sorted[$i]->{terms}."\n";
}
The output begins:
HERE
1008 1008
12
12
12
12
12
[..etc: nb, there is never anything other than 1 or 2]
As if "@Notes" were scoped locally in the if block. Again, this is not normal behaviour, nor can I replicate it in a simpler script. However, I do not see any possibility of bad data affecting the above logic (please tell me specifically how you think it could, if you do). These are two versions of the same sort operation on the same array.
Interestingly, if I make a single change, by moving "my @Sorted = sort" to before the if block instead of after it (making it the first line in the above code) both arrays are the same.
I've done this and other tests numerous times here purely for this purpose, there is no possibility of me having "forgotten a change in the middle". But I'm a sceptic too: if you think this is impossible and I'm lying, I don't blame you ;)
|
|
You could start by telling us the version(s) of Perl you've tested this against, by showing us what objects live in @Objs, and maybe dumping 5 of these objects. Reduce your dataset and code until the problem goes away, then put back in the last thing you took out.
It is highly unlikely that this is a bug in Perl, and far more likely that it is a bug in your code (or data), but it's very hard for us to guess about your code, as you don't show it.
|
|
v5.10.0 built for x86_64-linux-thread-multi
I would do that, except this is a fairly elaborate CGI which uses a database. The problem with removing code in this case is: what do I start removing? Random chunks? And if that worked, what would it mean? E.g., moving that sort line "fixes" the issue, but it does not give me a clue as to why. It also very strongly implies to me that this is not my error and I will be wasting my time looking for a mistake that does not exist.
Anyway, there are a zillion (other) ways the integrity of the code is tested; I was just curious about the bizarre behaviour. As demonstrated, all the data relevant to the sort operation is in order. I simply do not see anything that could be affecting this. If I were working in C, I could see some chance of an unrelated data structure causing an overwrite of something (a code segment, whatever) that would explain this, but AFAIK that is not possible in Perl.
The objects contain a lot of text, here's a dump with that edited out:
$VAR1 = bless( {
'body' => 'SNIP',
'href' => '<a class="ntitle" href="SNIP">',
'date' => '29 February 2004 ',
'terms' => '1',
'title' => 'Michael Shrimpton on Dr. Kelly'
}, 'PNSearch' );
$VAR1 = bless( {
'body' => 'SNIP',
'href' => '<a class="ntitle" href="SNIP">',
'date' => '9 September 2003 ',
'terms' => '2',
'title' => 'protester hit'
}, 'PNSearch' );
That's the first pair from the mismatched arrays (notice: 'terms' = 1 and 2). But I don't think there is anything to see with this.
I'm not expecting anyone to make a big effort here -- if no one is aware of any possibilities, don't worry about it.
Re: sort != sort
by SuicideJunkie (Vicar) on Oct 26, 2010 at 13:43 UTC
Another thing to consider is that you may have gotten some odd invisible characters typed in there. Try turning on the "view all characters" option in your editor and/or retype the section fresh.
Now, this has been asked elsewhere in the thread, but not yet clearly provided as far as I can see, so:
What minimalistic block of code do we need to run to reproduce your problem?
Cut out the database, cut out everything except:
- use strict; use warnings;
- An array declaration and assignment with canned data (as few as three elements if that is sufficient to demonstrate the issue)
- the sort operation
- a print of the results to compare against (1)
Be sure to copy and paste to ensure that the exact same code is on both sides of the internet. It should be only about 6 lines of code, with extra vertical whitespace for readability and Data::Dumper for results.
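A skeleton along the lines of that checklist might look like the following. The three records are invented stand-ins (only the 'terms' and 'title' keys are borrowed from the dump earlier in the thread), so treat this as a starting point to be replaced with real data, not as a reproduction of the problem.

```perl
use strict;
use warnings;
use Data::Dumper;

# Canned stand-in data (hypothetical values); swap in real records.
my @Notes = (
    { terms => '1', title => 'first'  },
    { terms => '2', title => 'second' },
    { terms => '1', title => 'third'  },
);

# The two sort operations under test.
@Notes = sort { $b->{terms} <=> $a->{terms} } @Notes;
my @Sorted = sort { $b->{terms} <=> $a->{terms} } @Notes;

# Dump both arrays so they can be compared side by side.
$Data::Dumper::Sortkeys = 1;
print Dumper(\@Notes, \@Sorted);
```

If the two dumps agree here but differ in the full program, the difference lies in whatever was cut out, which narrows the search.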
Re: sort != sort
by furry_marmot (Pilgrim) on Oct 27, 2010 at 01:02 UTC
Another thought, which I know doesn't address the sort issue. Since there are only ever two values, you're really just making two groups, with a lot of processing going on in-between. So have you considered just grepping the values? It might run a whole lot faster.
my @Grouped = grep { $_->{'num'} == 1 } @Objs;
push @Grouped, grep { $_->{'num'} == 2 } @Objs;
--marmot
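The same grouping can also be done in a single pass over the array by collecting records into a hash of arrays keyed on the value. This is only a sketch with made-up records (the 'id' key is hypothetical and exists just to make the order visible); whether it is actually faster than two greps has not been measured here.

```perl
use strict;
use warnings;

# Hypothetical records; 'id' is only there to show the resulting order.
my @Objs = (
    { num => 2, id => 'a' },
    { num => 1, id => 'b' },
    { num => 2, id => 'c' },
    { num => 1, id => 'd' },
);

# One pass: append each record to the list for its 'num' value.
my %by_num;
push @{ $by_num{ $_->{num} } }, $_ for @Objs;

# Concatenate the groups in the wanted key order (1 first, then 2).
my @Grouped = map { @{ $by_num{$_} || [] } } 1, 2;

print join(' ', map { $_->{id} } @Grouped), "\n";    # prints "b d a c"
```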