Hi Perlmonks,

I have three text files (1.txt, 2.txt and 3.txt) on my desktop. Each file contains five strings separated by commas. I would like to read these files without using <STDIN> and then join the corresponding strings across the files based on length (the two longest and the two shortest). I have written a script (test.pl) that finds the two longest and two shortest strings in each file, but I am at my wit's end as to how to join the corresponding short and long strings across files. I would welcome any suggestions for solving this problem. The text files, the script, its output and the desired strings are given below.

Here are the three text files:

1.txt is given below:

A
,AA
,AAA
,AAAA
,AAAAA
,

2.txt

T
,TT
,TTT
,TTTT
,TTTTT
,

3.txt

G
,GG
,GGG
,GGGG
,GGGGG
,

Here is the script (test.pl):

 
#!/usr/bin/perl  
use warnings;  
 
 @apple=(1 .. 3); # File numbers: 1.txt, 2.txt, 3.txt
     $nm=0; 
foreach my $num (@apple) {$nm++; 
  $output_fle="$nm.txt";
 
  if (-e $output_fle) { 
  open my $fh, '<', $output_fle or die "Couldn't open file $output_fle: $!"; 
    $fle=join("",<$fh>);  # Read the whole file into one string
    close $fh;  
  $fle=~ s/\s//g;         # Remove all whitespace, including newlines
  push  @all_file,$fle if length $fle; 

  } # End of if block for files that exist

}  # End of foreach loop that reads all files

#########################
# Code for each file:  ##
#########################

    $file_no=0;
foreach my $each_fle ( @all_file) { $file_no++;
 
  @a=split(',',$each_fle);  
  $seq_no=0; 

foreach my $seq (@a) { $seq_no++; # For each sequence of the file
    $seq=~ s/,//g;  
    $seq= uc$seq;   
    $seq_len=length($seq); # For testing  

print"\n Element $seq_no of File $file_no: $seq
   Length: $seq_len\n";     

# push length & each seq to an array:     
     push  @names,$seq;  
     push  @values,$seq_len;  
    
    } # End of foreach loop over the sequences of one file

#######################################################
# Find two lowest & two highest values of each file with sequences: 
#######################################################
use 5.010; 
use Data::Dumper; 
use constant IWANT => 2; 
my @data; 
my $pos=1; 
  
 for my $val (@values) { 
   my $name=shift @names; 
   my $rec=sprintf"\n Length %0.1f; Seq: %s",$val,$name; 
   push @data,$rec;} 

print"\n\nLength (Small to big) with sequences for File $file_no:\n";
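 @data= sort { ($a =~ /Length ([\d.]+)/)[0] <=> ($b =~ /Length ([\d.]+)/)[0] or $a cmp $b } @data; # Sort numerically by length, not as strings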
 for(0 .. IWANT-1) {say $data[$_];} 

print"\n";  
 print"\nLength (Big to small) with sequences for File $file_no:\n"; 
 for (1 .. IWANT) {say $data[-$_];}        

############################
# End Max & Min codes here: 
#############################

 @values=(); # To empty the array
 @names=();  # To empty the array
print"\n######## File $file_no ends ##############\n\n"; 

} # End of foreach LOOP for all files
 
exit; 
########################################

The output at the command prompt is as follows:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\x>cd desktop

C:\Users\x\Desktop>test.pl

 Element 1 of File 1: A
   Length: 1

 Element 2 of File 1: AA
   Length: 2

 Element 3 of File 1: AAA
   Length: 3

 Element 4 of File 1: AAAA
   Length: 4

 Element 5 of File 1: AAAAA
   Length: 5


Length (Small to big) with sequences for File 1:

 Length 1.0; Seq: A

 Length 2.0; Seq: AA


Length (Big to small) with sequences for File 1:

 Length 5.0; Seq: AAAAA

 Length 4.0; Seq: AAAA

######## File 1 ends ##############


 Element 1 of File 2: T
   Length: 1

 Element 2 of File 2: TT
   Length: 2

 Element 3 of File 2: TTT
   Length: 3

 Element 4 of File 2: TTTT
   Length: 4

 Element 5 of File 2: TTTTT
   Length: 5


Length (Small to big) with sequences for File 2:

 Length 1.0; Seq: T

 Length 2.0; Seq: TT


Length (Big to small) with sequences for File 2:

 Length 5.0; Seq: TTTTT

 Length 4.0; Seq: TTTT

######## File 2 ends ##############


 Element 1 of File 3: G
   Length: 1

 Element 2 of File 3: GG
   Length: 2

 Element 3 of File 3: GGG
   Length: 3

 Element 4 of File 3: GGGG
   Length: 4

 Element 5 of File 3: GGGGG
   Length: 5


Length (Small to big) with sequences for File 3:

 Length 1.0; Seq: G

 Length 2.0; Seq: GG


Length (Big to small) with sequences for File 3:

 Length 5.0; Seq: GGGGG

 Length 4.0; Seq: GGGG

######## File 3 ends ##############

In addition to the above results, I would like to obtain the following joined strings, based on length:

Two shortest strings (small to big):
 Seq 1: ATG
 Seq 2: AATTGG

Two longest strings (big to small):
 Seq 1: AAAAATTTTTGGGGG
 Seq 2: AAAATTTTGGGG
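
One possible approach, offered as a minimal sketch rather than a definitive solution: sort each file's strings by length, then join the strings that hold the same rank (shortest with shortest, longest with longest) across the files. The helper names slurp, parse_strings and join_by_rank are hypothetical. The sketch assumes the file names are given on the command line, falls back to built-in sample data when none are given, and assumes every file holds at least two strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string (no <STDIN> needed).
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Couldn't open $path: $!";
    local $/;                 # Undefine the record separator: read everything
    my $text = <$fh>;
    close $fh;
    return $text // '';
}

# Hypothetical helper: turn one file's text into its comma-separated strings,
# ignoring whitespace and empty fields (such as the one after a trailing comma).
sub parse_strings {
    my ($text) = @_;
    $text =~ s/\s+//g;
    return grep { length } split /,/, $text;
}

# Hypothetical helper: sort each list by length (numerically), then join the
# strings of equal rank across all lists. Returns two array references:
# the $want shortest joins (small to big) and the $want longest (big to small).
sub join_by_rank {
    my ($want, @lists) = @_;
    my @sorted = map {
        [ sort { length($a) <=> length($b) or $a cmp $b } @$_ ]
    } @lists;
    my (@short, @long);
    for my $rank (0 .. $want - 1) {
        push @short, join '', map { $_->[$rank] } @sorted;
        push @long,  join '', map { $_->[ -1 - $rank ] } @sorted;
    }
    return (\@short, \@long);
}

# Use files named on the command line, or sample data (an assumption that
# mirrors 1.txt, 2.txt and 3.txt) when no file names are given.
my @texts = @ARGV
    ? (map { slurp($_) } @ARGV)
    : ("A\n,AA\n,AAA\n,AAAA\n,AAAAA\n,\n",
       "T\n,TT\n,TTT\n,TTTT\n,TTTTT\n,\n",
       "G\n,GG\n,GGG\n,GGGG\n,GGGGG\n,\n");

my @lists = map { [ parse_strings($_) ] } @texts;
my ($short, $long) = join_by_rank(2, @lists);

print "Two shortest strings (small to big):\n";
printf " Seq %d: %s\n", $_ + 1, $short->[$_] for 0 .. $#$short;

print "\nTwo longest strings (big to small):\n";
printf " Seq %d: %s\n", $_ + 1, $long->[$_] for 0 .. $#$long;
```

Saved under a name of your choice and run from the desktop folder with 1.txt 2.txt 3.txt as arguments, this should join the strings in the order the files are listed. Comparing length($a) <=> length($b) keeps the ordering numeric, so strings of ten or more characters still sort correctly.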

In reply to How can one join the shortest and longest strings of different text files? by supriyoch_2008
