in reply to Re: Print Uniqe elements of arry of hash
in thread Print Uniqe elements of arry of hash

Thanks grandfather,

I think I confused everyone. As you indicated, there are 10 columns, and I am working with the fourth and the ninth (indices 3 and 8). This code does not work unless I remove the comment from "# push(@genes_number, $current_line[ 1 ]);" and comment out "push(@{$genes_number{$k}}, $current_line [ 1 ]);".

Here is the complete code, which reads the first file and stores the data in %gc_ag. In the data section I have included the first few rows of the first file and of the second file.

#!/usr/bin/perl -w
use strict;
use warnings;
use File::Path;
use vars qw(@genes_number %count $AG_value $GC_bin @arr %gc_ag);

#####################################################################################
#                          Open Input and Output Files                              #
#####################################################################################
if( @ARGV < 2){
    print "NSD analysis file needs the following arguments\n";
    print "Name of merged Static and experimental file, Name of experiment (first col of first file, Output file name\n";
    exit 0;
}

opendir (DIR, "/nsd/data") || die "Cannot open directroy data $!";
open(NSD, "firstfile.txt") || die "Cannot open file NSD file"; # NSD data
open(INPUT1,$ARGV[0]) || die "Cannot open file \"$ARGV[0]\""; # Analysis file
my $exp_name=$ARGV[1] || die "Enter the name of Experiment \"$ARGV[1]\"";
open(RESULTS, ">/nsd/data/$ARGV[2]")|| die "Cannot open the results file";
#my $GC_bin=$ARGV[3] || die "Enter the GC content that you want to anlyze \"$ARGV[2]\"";
#open(RESULTS,">result.txt")|| die "Cannot open the results file";
#my $AG_value=0;

# read all AG 90 values for all gc bins into a hash
# then for the keys of this hash use the values for making
# comparisons

my %gc_ag =();
while(<NSD>){
    chomp;
    my @current_line_nsd = split /\t/;
    if ($exp_name eq $current_line_nsd[0]){
        $gc_ag{$current_line_nsd[2]} = $current_line_nsd[17];
    }
}
foreach my $v (sort {$a<=>$b} keys %gc_ag) {
    print "$v\t$gc_ag{$v}\n";
}
close (NSD);

#reading the second file entered by user as an argument
my %genes_number =();
while(<INPUT1>){
    chomp;
    my @current_line = split /\t/;
    foreach my $k (sort {$a<=>$b} keys %gc_ag) {
        if ($current_line[3] == $k && $current_line[8] > $gc_ag{$k} ) {
            # push(@genes_number, $current_line[1]);
            push(@{$genes_number{$k}}, $current_line[1]);
        }
    }
}
&count_unique (@genes_number);

######################################################
sub count_unique {
    @genes_number = @_;
    my %count;
    map { $count{$_}++ } @genes_number;
    #print them out:
    map {push our @arr, ${count{$_}}} sort keys(%count);
    for (my $j=1; $j<10; $j++){
        my $counter =0;
        foreach my $element(@arr) {
            if ($element >=$j) {
                $counter++;
            }
        }
        print "$j\t$counter\n";
    }
    #print "Number of genes with more than 5 probes is: $counter\n";
    #print scalar(@arr)."\n" ;
    #map {print RESULTS "$_\n"} sort keys(%count);
    my $i =0;
    $i += keys %count;
    # print $i;
    return %count;
}
close (INPUT1);
close (RESULTS);
close (DIR);

THE FIRST FILE
PEPG_GT1001_01_HYB1_N 2008-06-26 5 1406 1163 2.27483 2.66979 2.77429 2.91807 3.10535 3.24583 3.24583 3.43355 3.67937 3.90971 4.51388 4.88935 5.33778 6.30833 38.3232 2.27483 2.77429 2.91807 3.10535 3.24583 3.43355 3.67937 4.22568 4.51388 5.33778 7.52172 9.75374 14.9343 27.3547 922.32
PEPG_GT1001_01_HYB1_N 2008-06-26 6 1746 65825 2.35261 2.77429 2.91807 2.91807 3.10535 3.24583 3.43355 3.67937 3.67937 4.22568 4.51388 4.88935 5.67276 6.81457 72.5727 2.20176 2.91807 3.10535 3.24583 3.43355 3.90971 4.51388 5.33778 6.81457 9.75374 15.8048 22.2225 34.325 60.0727 2805.0
PEPG_GT1001_01_HYB1_N 2008-06-26 7 1828 356967 2.27483 2.66979 2.91807 2.91807 3.10535 3.24583 3.43355 3.67937 3.90971 4.22568 4.88935 5.33778 6.30833 8.14622 58.5727 2.14818 3.10535 3.43355 3.67937 4.22568 5.33778 6.81457 9.75374 14.9343 23.4265 38.3232 50.3227 69.5727 114.323 4083.8
PEPG_GT1001_01_HYB1_N 2008-06-26 8 1880 747766 2.27483 2.77429 2.91807 3.10535 3.10535 3.24583 3.67937 3.90971 4.22568 4.51388 5.33778 6.30833 7.52172 10.767 176.073 2.14818 3.43355 3.90971 4.88935 5.67276 9.75374 15.8048 24.8939 36.8235 53.3227 74.8227 92.5727 121.073 185.323 13073.
PEPG_GT1001_01_HYB1_N 2008-06-26 9 1918 1068262 2.27483 2.77429 2.91807 3.10535 3.24583 3.43355 3.67937 3.90971 4.51388 4.88935 6.30833 6.81457 9.11957 12.5488 131.823 2.14818 3.90971 5.33778 7.52172 11.6325 22.2225 35.574 50.3227 68.3227 91.0727 123.823 147.823 185.323 274.073 7170.9
PEPG_GT1001_01_HYB1_N 2008-06-26 10 1904 1212972 2.35261 2.91807 3.10535 3.24583 3.43355 3.67937 4.22568 4.51388 5.33778 6.81457 9.11957 11.6325 15.8048 24.8939 1374.32 2.20176 4.88935 8.14622 14.9343 23.4265 40.8228 61.5727 81.8227 105.073 133.073 174.573 204.323 252.573 365.823 20092.
PEPG_GT1001_01_HYB1_N 2008-06-26 11 1920 1128925 2.43556 2.91807 3.24583 3.43355 3.67937 4.22568 4.88935 5.33778 6.81457 9.11957 13.9124 18.4569 26.1208 40.8228 356.073 2.20176 6.30833 12.5488 24.8939 36.8235 61.5727 85.8227 111.573 141.073 176.073 224.323 259.573 316.073 449.073 8523.3
PEPG_GT1001_01_HYB1_N 2008-06-26 12 1946 864327 2.35261 3.10535 3.43355 3.67937 4.22568 4.88935 5.67276 7.52172 9.75374 14.9343 22.2225 30.3334 40.8228 62.8227 681.073 2.27483 8.14622 19.6077 34.325 49.0727 79.0727 107.573 138.573 172.323 212.323 267.323 307.823 371.073 521.323 8956.5
PEPG_GT1001_01_HYB1_N 2008-06-26 13 1936 550147 2.43556 3.43355 3.67937 4.22568 4.88935 6.30833 8.14622 10.767 14.9343 22.2225 34.325 43.8227 64.3227 96.8227 696.323 2.27483 10.767 26.1208 42.0728 58.5727 91.0727 123.823 157.323 193.573 238.073 298.573 342.823 411.323 580.073 9509.3
PEPG_GT1001_01_HYB1_N 2008-06-26 14 1920 291085 2.52418 3.67937 4.22568 4.88935 5.67276 8.14622 11.6325 15.8048 23.4265 32.5776 50.3227 61.5727 79.0727 116.823 797.823 2.35261 13.9124 30.3334 46.3227 62.8227 96.8227 130.323 166.573 205.823 252.573 318.573 367.073 445.073 635.823 9819.0
PEPG_GT1001_01_HYB1_N 2008-06-26 15 1898 126329 2.35261 3.67937 4.51388 5.33778 6.81457 9.75374 15.8048 22.2225 32.5776 45.0727 64.3227 81.8227 111.573 174.573 1178.57 2.52418 15.8048 31.3308 48.0727 62.8227 95.3227 129.073 166.573 207.323 258.323 328.073 383.073 470.573 708.323 8939.0
PEPG_GT1001_01_HYB1_N 2008-06-26 16 1926 45400 2.66979 4.22568 5.67276 7.52172 9.11957 13.9124 21.0305 30.3334 40.8228 58.5727 83.0727 100.823 130.323 191.073 1184.07 2.52418 15.8048 32.5776 48.0727 62.8227 92.5727 126.323 162.573 204.323 259.573 341.323 406.073 517.323 833.823 9437.3
PEPG_GT1001_01_HYB1_N 2008-06-26 17 1884 12459 2.77429 4.88935 6.30833 8.14622 10.767 17.1358 26.1208 36.8235 49.0727 66.8227 98.0727 119.823 157.323 245.823 1871.82 2.66979 17.1358 31.3308 45.0727 60.0727 90.0727 121.073 159.823 205.823 267.323 368.323 454.323 615.573 1013.32 9156.5
PEPG_GT1001_01_HYB1_N 2008-06-26 18 1824 1925 2.77429 5.33778 7.52172 9.75374 12.5488 19.6077 29.0895 40.8228 53.3227 76.3227 106.323 134.323 181.573 302.573 2503.57 3.43355 15.8048 29.0895 42.0728 56.0727 80.5727 114.323 157.323 205.823 279.823 407.323 537.573 760.073 1567.82 8010.4
PEPG_GT1001_01_HYB2_N 2008-06-26 5 1406 1163 2.27483 2.66979 2.77429 2.8588 2.97936 3.10535 3.20743 3.39102 3.5525 3.90971 4.16394 4.63902 5.33778 6.95136 43.8227 2.35261 2.8588 2.97936 3.10535 3.20743 3.5525 3.72086 4.445 4.97054 6.11439 8.71517 10.767 16.8886 31.3308 684.57
PEPG_GT1001_01_HYB2_N 2008-06-26 6 1746 65825 2.27483 2.66979 2.8588 2.97936 2.97936 3.10535 3.39102 3.5525 3.72086 4.16394 4.63902 4.97054 5.67276 6.95136 70.8227 2.21903 2.8588 3.10535 3.20743 3.5525 3.90971 4.445 5.33778 6.95136 10.0787 16.8886 22.9434 34.325 59.5727 3348.5
PEPG_GT1001_01_HYB2_N 2008-06-26 7 1828 356967 2.27483 2.77429 2.8588 2.97936 3.10535 3.20743 3.39102 3.5525 3.90971 4.445 4.97054 5.33778 6.5127 8.14622 191.073 2.14818 3.10535 3.39102 3.72086 4.16394 4.97054 6.5127 9.30877 14.9343 22.9434 37.8233 49.3227 67.8227 109.323 4133.0
PEPG_GT1001_01_HYB2_N 2008-06-26 8 1880 747766 2.35261 2.77429 2.8588 2.97936 3.10535 3.39102 3.5525 3.90971 4.16394 4.63902 5.67276 6.5127 8.14622 11.4116 202.823 2.21903 3.39102 3.90971 4.63902 5.67276 9.30877 14.9343 23.9123 35.574 50.3227 72.5727 90.0727 117.323 178.573 5673.5

THE SECOND FILE
4750739 A209.EF064282 53 11 0.474968 -33.2 S 4750739 165.834 44.7
3383536 A209.EF064282 55 11 0.500083 -32.4 A 3383536 323.299 49
2634649 A209.EF064282 57 10 0.394855 -32 S 2634649 335.989 70.8
2923929 A209.EF064282 59 10 0.440602 -32.6 A 2923929 182.191 56.2
4872947 A209.EF064284 55 9 0.320984 -32.8 A 4872947 532.385 103.7
1427589 A209.EF064284 57 9 0.402661 -32.2 S 1427589 677.757 199.3
3642671 A209.EF064284 87 8 0.376187 -30.4 A 3642671 180.485 65.7
1042210 A209.EF064284 89 7 0.320542 -29.3 S 1042210 363.54 61.7
4959298 A209.EF064284 91 9 0.462034 -30.5 A 4959298 549.485 105.1
2609177 A209.EF064284 93 8 0.287848 -29.9 S 2609177 223.687 48
3121652 A209.EF064284 95 9 0.447002 -30.8 A 3121652 491.059 104.1
4911506 A209.EF064284 97 9 0.371932 -30.2 S 4911506 99.0901 24.5
4389484 A209.EF064284 99 10 0.48621 -32.2 A 4389484 123.912 47.2

Re^3: Print Uniqe elements of arry of hash
by gone2015 (Deacon) on Oct 15, 2008 at 09:54 UTC

    This dog won't hunt.

       Global symbol "$k" requires explicit package name at sesemin.pl line 72.
       syntax error at sesemin.pl line 72, near "})"
    
    where, by the time I sorted out the files and such, line 72 is:

       &count_unique (@{$genes_number{$k});

    Even fixing the trivial syntax error, one's left with the undefined $k at this point. I could guess at what $k is supposed to be here... but I think it's time for you to do some work !

    Happy to try to help, but you need to put more effort in at your end, so that the code you offer:

    1. actually runs -- with strict and warnings.
    2. does not require other files to be downloaded -- let alone placed in special directories. (see below)
    3. does not require command line arguments -- whatever you need to demonstrate should be inside the code.
    4. illustrates the issue you have, with the minimum of code.
    5. supports the question you've posed... "I am trying to achieve blah to which end I have written blather which is supposed to mangle stuff so, but what I get is a crick in my neck..., as demonstrated in the very wonderful code here..."
    6. and all the other good advice in How do I post a question effectively?


    If your code only has one input file, then __DATA__ is the obvious replacement.
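
For the single-file case, a minimal sketch might look like the following. The three-column layout (id, gene, signal) and the sub name read_records are invented for illustration and are not taken from the posted files; the only fixed parts are the __DATA__ token and the DATA filehandle that Perl opens on whatever follows it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse whitespace-separated records from any filehandle.
# The three-column layout (id, gene, signal) is hypothetical.
sub read_records {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split ' ', $line;
        next unless @fields == 3;    # skip blank or malformed lines
        push @records, \@fields;
    }
    return \@records;
}

# DATA is the filehandle Perl opens on everything after __DATA__.
my $records = read_records(\*DATA);
print join("\t", @$_), "\n" for @$records;

__DATA__
101 geneA 165.8
102 geneA 323.3
103 geneB 49.0
```

Because the parsing lives in a sub that takes a filehandle, the same code should work unchanged once a real file is opened in place of DATA.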

    If your code has several inputs, then this will do the trick. First, comment out (or remove) the original open commands and replace as illustrated:

    #open FOO, "my_favourite.yum" or die "horribly $!" ;
    open FOO, '<', &my_favourite_yum or die "horribly $!" ;

    then at the end of your example code place:

    #______________________________________
    # ... description of the data ...
    sub my_favourite_yum { \(<<'~~FILE') }
    ... contents of my_favourite.yum go here ...
    ~~FILE
    #______________________________________
    # ... description....
    sub my_other_favourite { \(<<'~~FILE') }
    ... contents of my_other_favourite go here ...
    ~~FILE
    #______________________________________
    where:
    1. obviously, the name of the sub must match the name in the relevant open, but may be anything you like that helps the reader.
    2. the end of file marker '~~FILE' can be anything you like, but obviously it must not appear in the data ! (For the avoidance of doubt, it may be different for each file, but must be the same in the two places it appears for each one.)
    3. the marker at the end of each "file" must have a newline at the end of it. (No other trailing whitespace. End-of-file will not do -- hence the suggested #____ line at the end.)
    4. NB: tabs may be translated to one or more spaces in the upload process. But ...
    5. before posting, you must ensure that your example code works (and illustrates the issue you have) with the files embedded.
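
Putting the pieces above together, a complete example with two embedded "files" could be sketched as below. The file contents, the sub names bins_file and probes_file, and the threshold comparison are all made up for illustration; only the open-on-a-string-reference and here-document mechanics come from the description above. Note that in real code the ~~FILE markers must start in the first column.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Instead of: open my $bins_fh, '<', 'bins.txt' or die ...
# Each sub returns a reference to a string, which open treats as an in-memory file.
open my $bins_fh,   '<', bins_file()   or die "Cannot open embedded bins: $!";
open my $probes_fh, '<', probes_file() or die "Cannot open embedded probes: $!";

# First "file": bin number and threshold on each line (hypothetical layout).
my %threshold;
while (my $line = <$bins_fh>) {
    my ($bin, $limit) = split ' ', $line;
    $threshold{$bin} = $limit;
}
close $bins_fh;

# Second "file": gene name, bin number and signal on each line (hypothetical layout).
my @hits;
while (my $line = <$probes_fh>) {
    my ($gene, $bin, $signal) = split ' ', $line;
    push @hits, "$gene (bin $bin)"
        if exists $threshold{$bin} && $signal > $threshold{$bin};
}
close $probes_fh;

print "$_\n" for @hits;

#______________________________________
# bins.txt: bin number, threshold
sub bins_file { \(<<'~~FILE') }
9 100
10 150
~~FILE
#______________________________________
# probes.txt: gene, bin, signal
sub probes_file { \(<<'~~FILE') }
geneA 9 120.5
geneA 10 99.0
geneB 10 200.0
~~FILE
#______________________________________
```

Anyone can paste this into a single file and run it with no arguments and no extra downloads, which is the whole point of the exercise.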

    Update: removed spurious and erroneous & from sub &my_favourite_yum and sub &my_other_favourite. Thanks tinita. (Blushes deep crimson and wonders whether going back to bed and starting again is an option.)

      Good hint. I tried to use __DATA__ but for whatever reason my data was not showing properly in preview so I decided to go with readmore tag. Sorry if it caused any inconvenience.

      As for the sub: if you replace

      &count_unique (@{$genes_number{$k});

      with

      &count_unique (@genes_number);

      my code will work. But the simple question I wanted to ask is: how can we print the unique elements of the innermost arrays of an array of arrays? Perhaps putting it that way makes it easier for you to respond. Just a simple example would help me a lot.

      Thank you very much.

      Pedro