lomSpace has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I need to parse a file and print to two files. The records go to different files
based on whether the value in field[0] has a dash and a number at the end of the string
or not. Example: 10002TU or 10002TU-2.

#!/usr/bin/perl -w
use strict;
use warnings;

open(my $reg_out, ">C:/Documents and Settings/mydir/Desktop/RegularOligos.txt");
open(my $irreg_out, ">C:/Documents and Settings/mydir/Desktop/IrregularOligos.txt");
open(my $out, ">C:/Documents and Settings/mydir/Desktop/mfrp.txt");

my $first_line = <$in>;
chomp $first_line;

while(<DATA>){
    chomp;
    my @fields = split /\t/;
    my $maid = $fields[0];
    my $forward = $fields[1];
    my $reverse = $fields[2];
    my $probe = $fields[3];
    if ($probe =~ /^\d{5}TU|TD$/){
        print $reg_out "$maid\t$forward\t$reverse\t$probe\n";
    }
    else{
        print $irreg_out "$maid\t$forward\t$reverse\t$probe\n";
    }
}
#close $in;
close $irreg_out;
close $reg_out;
__DATA__
10002TU AGACATTACCTGTGAGACACCTTTC GCCTCCACCTCAGAGTCAG TCCATGGGAAGGATCTCCGTGAAATCA
10002TU-2 GCTCCAGCTAGAAGAGAATCC CCCACCAGGGCTGTGTAAG CCTGTGAGATAGTACAGCTGAAGAGTTGGC
10002TD TGTGTTGATTCTCAGCCTCTTG GACGGAGCACATAGGCAAAG TCTGTTCTTCTCAGCTGTCTTTGTTGCTGC
10003TU CAGCAAGCCCTGAGGTGTG CAGTGAACTGAGAAAGACGAGAGG TGCAAGTCCAGATGGAGGCCACC
10174TU-2 ACCTGAACAGCCTGACATGAAC TGGGATGGAGGGCAAAGTC CCACCTAGTATGACCCAGCACACCTCC

Any direction will be appreciated.
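
A note on the pattern in the script above: in a Perl regex, alternation binds more loosely than anything else, so /^\d{5}TU|TD$/ reads as "starts with five digits and TU" or "ends with TD". It therefore also matches IDs such as 10002TU-2. The script also tests $probe (field 3) rather than $maid (field 0). The thread does not say whether either of these is the mistake the poster later found, so treat the following as a minimal sketch only; the helper name is_regular_id is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Ungrouped: parsed as (^\d{5}TU) or (TD$).
my $ungrouped = qr/^\d{5}TU|TD$/;

# Grouped: five digits, then TU or TD, then end of string.
my $grouped = qr/^\d{5}(?:TU|TD)$/;

# Hypothetical helper (not from the original post):
# returns 1 when the ID has no "-N" suffix, 0 otherwise.
sub is_regular_id {
    my ($id) = @_;
    return $id =~ $grouped ? 1 : 0;
}

# Compare both patterns on IDs taken from the sample data.
for my $id (qw(10002TU 10002TU-2 10002TD 10174TU-2)) {
    printf "%-10s ungrouped=%d grouped=%d\n",
        $id,
        ($id =~ $ungrouped ? 1 : 0),
        is_regular_id($id);
}
```

With the sample IDs, the ungrouped pattern matches all four, while the grouped one matches only 10002TU and 10002TD.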

Replies are listed 'Best First'.
Re: separate data and print to two files
by zwon (Abbot) on Jul 08, 2009 at 18:42 UTC
      Yeah, I just figured out the SIMPLE mistake! Thanks anyway! Lom Space
Re: separate data and print to two files
by toolic (Bishop) on Jul 08, 2009 at 18:55 UTC
    Since you did not specify, I will assume your Regular file will contain the dashes. If this does not do what you want, then you must show the desired output:
    use strict;
    use warnings;

    open my $reg_out , '>', 'RegularOligos.txt'
        or die "can not open file RegularOligos.txt:$!";
    open my $irreg_out, '>', 'IrregularOligos.txt'
        or die "can not open file IrregularOligos.txt:$!";

    while(<DATA>){
        chomp;
        my @fields = split;
        my $line = join "\t", @fields;
        if ($fields[0] =~ /-/){
            print $reg_out "$line\n";
        }
        else{
            print $irreg_out "$line\n";
        }
    }
    close $irreg_out;
    close $reg_out;
    __DATA__
    10002TU AGACATTACCTGTGAGACACCTTTC GCCTCCACCTCAGAGTCAG TCCATGGGAAGGATCTCCGTGAAATCA
    10002TU-2 GCTCCAGCTAGAAGAGAATCC CCCACCAGGGCTGTGTAAG CCTGTGAGATAGTACAGCTGAAGAGTTGGC
    10002TD TGTGTTGATTCTCAGCCTCTTG GACGGAGCACATAGGCAAAG TCTGTTCTTCTCAGCTGTCTTTGTTGCTGC
    10003TU CAGCAAGCCCTGAGGTGTG CAGTGAACTGAGAAAGACGAGAGG TGCAAGTCCAGATGGAGGCCACC
    10174TU-2 ACCTGAACAGCCTGACATGAAC TGGGATGGAGGGCAAAGTC CCACCTAGTATGACCCAGCACACCTCC
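
The test /-/ above accepts a dash anywhere in the first field, whereas the question describes "a dash and a number at the end of the string". If that stricter reading is what is wanted, the check might look like the sketch below. The helper name has_numeric_suffix and the third sample ID (with an interior dash) are made up for illustration and do not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the first whitespace-separated
# field of the line ends in "-<digits>", 0 otherwise.
sub has_numeric_suffix {
    my ($line) = @_;
    my ($id) = split ' ', $line;          # first field, or undef for an empty line
    my $ok = defined $id && $id =~ /-\d+\z/;
    return $ok ? 1 : 0;
}

my @sample = (
    "10002TU AGACATTACCTGTGAGACACCTTTC",
    "10002TU-2 GCTCCAGCTAGAAGAGAATCC",
    "10-02TU TGTGTTGATTCTCAGCCTCTTG",     # made-up ID with an interior dash
);

for my $line (@sample) {
    my $kind = has_numeric_suffix($line) ? 'suffix' : 'no suffix';
    print "$kind: $line\n";
}
```

Only the second sample line is reported as having a suffix; a bare /-/ would also accept the third.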
      ... I will assume your Regular file will contain the dashes.

      Actually, it appears to be the other way around. And here's an even simpler way (no need to split):

      #!/usr/bin/perl
      use strict;
      use warnings;

      my @filenames = qw/Reg Irreg/;
      my %ofh = map { open( my $fh, ">", $_.'ularOligos.txt' ) or die $!;
                      $_ => $fh } @filenames;

      while (<DATA>) {
          my $o = ( /^\w+-/ ) ? $ofh{Irreg} : $ofh{Reg};
          print $o $_;
      }
      close $_ for ( values %ofh );
      __DATA__
      10002TU AGACATTACCTGTGAGACACCTTTC GCCTCCACCTCAGAGTCAG TCCATGGGAAGGATCTCCGTGAAATCA
      10002TU-2 GCTCCAGCTAGAAGAGAATCC CCCACCAGGGCTGTGTAAG CCTGTGAGATAGTACAGCTGAAGAGTTGGC
      10002TD TGTGTTGATTCTCAGCCTCTTG GACGGAGCACATAGGCAAAG TCTGTTCTTCTCAGCTGTCTTTGTTGCTGC
      10003TU CAGCAAGCCCTGAGGTGTG CAGTGAACTGAGAAAGACGAGAGG TGCAAGTCCAGATGGAGGCCACC
      10174TU-2 ACCTGAACAGCCTGACATGAAC TGGGATGGAGGGCAAAGTC CCACCTAGTATGACCCAGCACACCTCC
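
To try the hash-of-filehandles idea without creating files on disk, the handles can be opened on in-memory scalars instead. This is only a sketch under that assumption; the function name split_lines and the shortened sample lines are hypothetical and not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: route each line into one of two in-memory
# buffers through a hash of filehandles keyed by category name.
sub split_lines {
    my (@lines) = @_;
    my (%buf, %fh);
    for my $name (qw(Reg Irreg)) {
        $buf{$name} = '';
        # Opening on a scalar reference gives an in-memory filehandle.
        open( my $h, '>', \$buf{$name} ) or die "cannot open buffer: $!";
        $fh{$name} = $h;
    }
    for my $line (@lines) {
        # Same test as in the reply: word characters, then a dash.
        my $key = $line =~ /^\w+-/ ? 'Irreg' : 'Reg';
        print { $fh{$key} } $line;
    }
    close $_ for values %fh;
    return \%buf;
}

my $out = split_lines( "10002TU AGAC\n", "10002TU-2 GCTC\n", "10002TD TGTG\n" );
print "Reg:\n$out->{Reg}Irreg:\n$out->{Irreg}";
```

Swapping the scalar references for real filenames gives back the file-writing version.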