in reply to Unique Variable names...

Without already being able to give you a solution, I have the following comments:

  1. Just a style point: why do you put the sub definitions in the middle of your code? It makes the structure a lot harder to read.
  2. Splicing off the first two items of your array of filenames/directories is a nice trick if you can be sure that the first two items are always the dot and dot-dot entries. That is not guaranteed, and it may not be portable across all operating systems.
  3. Your program assumes (as is your good right) a very specific directory and file structure: the top level holds only directories, and each such directory contains a "data" file. That makes it difficult to test your script if one doesn't have the same structure.
  4. get_targets and get_signal seem to go through the same "data" file, just extracting different items (the first and the fourth, respectively) and saving the rest in variables which are never used. If you had used warnings, you would have received some warnings in this respect. The same goes for the variables $scratch, $excess and $spliced_data, which are essentially just garbage bins in your script.
  5. Rather than using global variables, you could pass an argument list to your subs. If you did that, you would see that you are using the same arguments in both subroutines. Now you use @genius and @filename, which are just copies of each other, but that is not readily apparent.
  6. What you are trying to do with %hash={"$file"=>@final_data} beats me. Could you explain it?
  7. Why do you return the value of @targets to @columns? You never use the @columns array.
  8. Why did you think @$i=@next_columns[$i] would work? Can you explain your reasoning behind it?
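To illustrate items 2 and 5, here is a minimal sketch of filtering the directory listing by name instead of by position, with the directory passed in as an argument rather than read from a global. The sub name list_entries is made up for this example and is not from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: return the entries of a directory without '.' and '..'.
sub list_entries {
    my ($dir) = @_;    # the directory comes in as an argument, not a global
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    # Filter by name; readdir makes no promise about the order of entries.
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    @entries = sort @entries;
    return @entries;
}

print "$_\n" for list_entries('.');
```

This does not depend on where the dot entries happen to appear in the list, so it should behave the same on any OS.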
About the "unique variable name" thing: why would you need that? I'm not convinced that it is necessary for your purpose. May I suggest that you give us an example of your inputs and your expected output? That would make it a lot easier to help you.
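If items 6 and 8 were meant to store a whole array under one name, then I am only guessing at your intent, but the usual way is a reference, not a new variable name per column. A small sketch with made-up values (the names %signals and the sample numbers are assumptions for illustration):

```perl
use strict;
use warnings;

# Made-up sample values, for illustration only.
my $file       = 'sample1.txt';
my @final_data = (200.2, 400.4, 200.5);

# A hash value must be a scalar, so store a reference to a copy of the array.
# Note the parentheses: %hash = ( ... ), not %hash = { ... }.
my %signals = ( $file => [@final_data] );

# Dereference to get the values back.
print "$file: @{ $signals{$file} }\n";

# Instead of one variable per column (@0, @1, ...), keep the columns
# in a single array of arrays and index into it.
my @next_columns = ( [1, 2], [3, 4] );
print "column 1, row 0: $next_columns[1][0]\n";
```

With a structure like this there is no need to generate variable names at run time.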

CountZero

"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Re: Unique Variable names...
by bioinformatics (Friar) on Jul 29, 2003 at 19:42 UTC
    My input data file would look something like:
    AFFX-BioB-5_at 20 20 200.2 P 0.001
    AFFX-BioB-M_at 20 20 400.4 P 0.002
    AFFX-BioB-3_at 20 20 200.5 P 0.003
    I want the 4th column, with the signal data. Actually, that other subroutine gets only the first column from the first file. I didn't know how to do that otherwise, since everything else was in a loop. This way, I have everything set to look like (if it worked anyway):
    AFFX-BioB-5_at 200.0 300.0 400.0
    AFFX-BioB-M_at 200.0 300.0 400.0
    AFFX-BioB-3_at 200.0 300.0 400.0
    Thanks for your suggestions!!
    Bioinformatics

      OK, now we're talking.

      Assuming you want a result which is first the AFFX-BioB-xxx_at-identifier, followed by all signal-data which are connected to this identifier, I suggest:

      • you drop the get_targets sub and all references to it.
      • Then you change your sub get_signal to:
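        while (@filename) {
            my $i;    # line counter, reset for each file
            $file = shift @filename;
            use Cwd 'chdir';
            chdir "./data";
            open(FILE, '<', $file) or die "Cannot open $file: $!";
            while (<FILE>) {
                next if $i++ <= 14;    # skip the first 15 lines
                # adjust the split pattern to the delimiter of your files
                (my $id, undef, undef, my $signal, undef, undef) = split(/\t +/);
                push @{$outputdata{$id}}, $signal;
            }
            close(FILE);
        }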
        After having run this sub over all your files you will find in %outputdata a nicely ordered (per identifier) structure of your signal-data.
      • Printing this data structure goes as follows:
        for my $id (keys %outputdata) {
            print "$id:\t", join("\t", @{$outputdata{$id}}), "\n";
        }
        Of course you can print it to a filehandle. This is a format which is suitable to be imported in a database or a spreadsheet.
      The "magic" of using references to anonymous arrays may perhaps be a bit too deep for someone who is just starting to program, but if you read Chapters 8 and 9 of the Camel book a few times and study the examples given, much will become clearer.
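      To make the idea concrete without needing your directory layout, here is a self-contained sketch that reads three made-up "files" from in-memory strings. The sample data, the sub name collect_signals and the whitespace split are all assumptions for this example; the header-skipping step is left out because the sample data has no header lines:

```perl
use strict;
use warnings;

# Made-up contents of three data files, one string per file.
my @files = (
    "AFFX-BioB-5_at 20 20 200.0 P 0.001\nAFFX-BioB-M_at 20 20 200.0 P 0.002\nAFFX-BioB-3_at 20 20 200.0 P 0.003\n",
    "AFFX-BioB-5_at 20 20 300.0 P 0.001\nAFFX-BioB-M_at 20 20 300.0 P 0.002\nAFFX-BioB-3_at 20 20 300.0 P 0.003\n",
    "AFFX-BioB-5_at 20 20 400.0 P 0.001\nAFFX-BioB-M_at 20 20 400.0 P 0.002\nAFFX-BioB-3_at 20 20 400.0 P 0.003\n",
);

# Hypothetical helper: collect the 4th column of every file, grouped by identifier.
sub collect_signals {
    my (@contents) = @_;
    my %outputdata;
    for my $content (@contents) {
        # Open the string as an in-memory file so the loop reads like a real file.
        open(my $fh, '<', \$content) or die "Cannot open in-memory file: $!";
        while (my $line = <$fh>) {
            # split ' ' splits on any run of whitespace and ignores leading blanks
            my ($id, undef, undef, $signal) = split ' ', $line;
            # Autovivification creates the anonymous array on the first push.
            push @{ $outputdata{$id} }, $signal;
        }
        close($fh);
    }
    return \%outputdata;
}

my $outputdata = collect_signals(@files);
for my $id (sort keys %$outputdata) {
    print join("\t", "$id:", @{ $outputdata->{$id} }), "\n";
}
```

      Each identifier ends up with one signal per file, in the order the files were read, which is the layout asked for above.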

      CountZero