
Thanks for helping to solve this problem. To finish my Perl implementation of R's p.adjust function, I need to give you the entire algorithm. You have already solved most of it. I have no problem with section (d), but the order I get at (c), both from what I had in mind and from what you implemented, is different from what it should be.

(a) sort these values in decreasing order

0.9802138 0.9585124 0.8167950 0.6491767 0.5453980 0.4902384 0.4693030 0.4069490 0.2821822 0.1155778

(b) assign a rank to each of the above observations, in decreasing order, i.e.,

10 9 8 7 6 5 4 3 2 1

(c) then calculate

0.9802138*10/10 0.9585124*10/9 0.8167950*10/8 ..... 0.1155778*10/1 (note: the 10 in the numerator is the total number of tests conducted in this toy example; replace it with the actual number of tests when you analyse real data. The rank in the denominator likewise starts at the total number of tests and decreases to 1.)

You will obtain the following

0.9802138 1.0650138 1.0209938 0.9273953 0.9089967 0.9804768 1.1732575 1.3564967 1.4109110 1.1557780
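
Steps (a) to (c) can be sketched in a few lines of Perl. This is only an illustration on the toy data from this post; the variable names (@p, @sorted, @scaled) are my own and not taken from any existing code.

```perl
use strict;
use warnings;

# The ten raw p-values from the example, in their original order.
my @p = ( 0.5453980, 0.4902384, 0.8167950, 0.2821822, 0.4693030,
          0.6491767, 0.9802138, 0.1155778, 0.9585124, 0.4069490 );

my $n = @p;    # total number of tests

# (a) sort in decreasing order
my @sorted = sort { $b <=> $a } @p;

# (b) and (c): the value at 0-based position $i has rank $n - $i,
# so it is multiplied by $n / ( $n - $i )
my @scaled = map { $sorted[$_] * $n / ( $n - $_ ) } 0 .. $#sorted;

printf "%.7f\n", $_ for @scaled;
```

The printed values should agree with the list above to seven decimal places.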

(d) then do the cumulative minimum adjustment, i.e.,

since 0.9802138<1.0650138, replace 1.0650138 with 0.9802138,

then among the first three values 0.9802138, 0.9802138, 1.0209938, 0.9802138 is smallest, replace 1.0209938 with 0.9802138

then among the first four values 0.9802138, 0.9802138, 0.9802138, 0.9273953, 0.9273953 is the smallest, and the fourth value will not be changed

similarly, the fifth value will not be changed and all the rest of values will be changed to 0.9089967

Finally you should have 0.9802138, 0.9802138, 0.9802138, 0.9273953, 0.9089967, 0.9089967, 0.9089967, 0.9089967, 0.9089967, 0.9089967
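
The cumulative minimum in (d) amounts to carrying the smallest value seen so far along the list. A minimal sketch, assuming the scaled values from (c) are already in an array (@scaled and @cummin are illustrative names):

```perl
use strict;
use warnings;

# Scaled values from step (c), in decreasing order of the raw p-value.
my @scaled = ( 0.9802138, 1.0650138, 1.0209938, 0.9273953, 0.9089967,
               0.9804768, 1.1732575, 1.3564967, 1.4109110, 1.1557780 );

my @cummin;
my $min = $scaled[0];
for my $v ( @scaled ) {
    $min = $v if $v < $min;    # smallest value seen so far
    push @cummin, $min;
}

print "@cummin\n";
```

Each value is visited once, so the cost grows linearly with the number of tests.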

rearrange these values according to the original order of the p-values,

0.5453980 0.4902384 0.8167950 0.2821822 0.4693030 0.6491767 0.9802138 0.1155778 0.9585124 0.4069490

[6] [5] [8] [2] [4] [7] [10] [1] [9] [3]

then you will obtain the following values

0.9089967 0.9089967 0.9802138 0.9089967 0.9089967 0.9273953 0.9802138 0.9089967 0.9802138 0.9089967

then compare each of these values (in red) with 1; if a value is greater than 1, replace it with 1. Since in this particular example all of the above values (in red) are less than 1, your final BH adjusted q-values are

0.9089967 0.9089967 0.9802138 0.9089967 0.9089967 0.9273953 0.9802138 0.9089967 0.9802138 0.9089967
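
Putting the adjusted values back into the original order, and capping them at 1, can be sketched by sorting array indices rather than the values themselves. The names @order, @adjusted_sorted and @q are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Raw p-values in their original order.
my @p = ( 0.5453980, 0.4902384, 0.8167950, 0.2821822, 0.4693030,
          0.6491767, 0.9802138, 0.1155778, 0.9585124, 0.4069490 );

# Values after step (d), in decreasing order of the raw p-value.
my @adjusted_sorted = ( 0.9802138, 0.9802138, 0.9802138, 0.9273953,
                        ( 0.9089967 ) x 6 );

# Indices of @p ordered by decreasing p-value:
# $order[0] is the original position of the largest p-value.
my @order = sort { $p[$b] <=> $p[$a] } 0 .. $#p;

# Slice assignment puts each adjusted value back at its original position.
my @q;
@q[ @order ] = @adjusted_sorted;

# Final step: anything above 1 is replaced by 1.
@q = map { $_ > 1 ? 1 : $_ } @q;

print "@q\n";
```

Because the sort only produces a permutation of indices, the original array is never reordered and no separate "undo" step is needed.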

Re^5: Restore the original order of an array after sort and performing some funchtion on array values
by BrowserUk (Patriarch) on Mar 03, 2010 at 23:35 UTC

    Hmm. That reads like an exam question. Nonetheless, since it was interesting to code, here's how to do section (d):

    #! perl -sw
    use strict;
    use List::Util qw[ min ];
    use Data::Dump qw[ pp ];

    my %pvalues = (
        1  => 0.5453980, 2 => 0.4902384, 3 => 0.8167950, 4 => 0.2821822,
        5  => 0.4693030, 6 => 0.6491767, 7 => 0.9802138, 8 => 0.1155778,
        9  => 0.9585124, 10 => 0.4069490
    );

    my @orderedKeys = sort {
        $pvalues{ $b } <=> $pvalues{ $a }
    } keys %pvalues;

    my $d = my $n = values %pvalues;

    $pvalues{ $_ } *= $n / $d-- for @orderedKeys;

    $pvalues{ $orderedKeys[ $_ ] } =
        min( @pvalues{ @orderedKeys[ 0 .. $_ ] } ) for 1 .. $n-1;

    pp \%pvalues;

    __END__
    c:\test>junk68
    {
      1  => "0.908996666666667",
      2  => "0.908996666666667",
      3  => "0.9802138",
      4  => "0.908996666666667",
      5  => "0.908996666666667",
      6  => "0.927395285714286",
      7  => "0.9802138",
      8  => "0.908996666666667",
      9  => "0.9802138",
      10 => "0.908996666666667",
    }

    I haven't done the last step, (I can't see the stuff in red), so you'll have to work out how to do that yourself. And to do that, you'll first need to understand how the new line above works. And if you can do that, you'll stand some chance of explaining it to whomever is going to check your work.

    That'll be $25 :)

    BTW: If this is going to be used for real on large volumes of data (which R code often is), then you'll want to replace the use of List::Util::min() with a custom min() that doesn't take a list as input. Throwing large lists around is a sure-fire way to kill performance. That said, if the lists are large, then all the nested slicing is going to kill you anyway.
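
One way to avoid both the list-based min() and the repeated slicing is to carry a running minimum through a single loop over the ordered keys. The sketch below is one possible variant, not the code from the post; $running is a name introduced here, and no timing claims are made for it.

```perl
use strict;
use warnings;

my %pvalues = (
    1 => 0.5453980, 2 => 0.4902384, 3 => 0.8167950, 4 => 0.2821822,
    5 => 0.4693030, 6 => 0.6491767, 7 => 0.9802138, 8 => 0.1155778,
    9 => 0.9585124, 10 => 0.4069490,
);

# Keys ordered by decreasing p-value, as in the code above.
my @orderedKeys = sort { $pvalues{$b} <=> $pvalues{$a} } keys %pvalues;

my $d = my $n = keys %pvalues;

# Scale each value by n / rank (rank runs from n down to 1).
$pvalues{$_} *= $n / $d-- for @orderedKeys;

# Cumulative minimum in one pass: no slices, no list passed to min().
my $running = $pvalues{ $orderedKeys[0] };
for my $key ( @orderedKeys ) {
    $running = $pvalues{$key} if $pvalues{$key} < $running;
    $pvalues{$key} = $running;
}

printf "%2d => %.7f\n", $_, $pvalues{$_} for sort { $a <=> $b } keys %pvalues;
```

The slice-based version builds a list of up to n values for each of the n keys, whereas this loop touches each key once.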

    For real performance you might consider recoding this for PDL, but that's definitely left as an exercise.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thanks a lot. Honestly, it is not an exam. I have a larger program that needs to do this calculation at one point. The algorithm was given to me by a statistician who was doing it in R, and I had to convert it to Perl. (Actually, you did.)

      The numbers in red are the ones you printed at the end. If any of them is greater than 1, it has to be replaced by 1.

      As for run time, the size of the actual hash varies between 200 and 4000 key/value pairs, so I do not think it will take long to run. I appreciate it again.

        Thanks a lot. Honestly it is not an exam,

        Then all you need to finish the task is a last line of:

        $pvalues{ $_ } = min( $pvalues{ $_ }, 1 ) for keys %pvalues;

        It'd still be a good idea to make sure you understand how the whole code works.

