in reply to Re: Comparing strings with special characters
in thread Comparing strings with special characters

Below are the sample data and the code.

Sample data:

candseq data

SKA_HWI-EAS418:5:1:163:742#0/1
SKA_HWI-EAS418:5:1:30:1357#0/1
SKA_HWI-EAS418:5:1:53:520#0/1
SKA_HWI-EAS418:5:1:99:1255#0/1
SKA_HWI-EAS418:5:1:99:904#0/1

rawseq data

@SKA_HWI-EAS418:5:1:163:742#0/1
TGCTTCACAATGATAGGAAGAGCCGACATCGAAGGATTAAAAAGCGACGTCGCTATGAACGCTTGG
+SKA_HWI-EAS418:5:1:163:742#0/1
BBBBB@B?AAABA@A<?@<@?@<<?A?A@5>=<@=7??A>A;??;@@;;34?:<>=>7<??6>6:-
@SKA_HWI-EAS418:5:1:164:195#0/1
CGCAATACTGTATTGCCCTTAATGGGGTCACTGTAACATTTTAAAACAAATGAGCAGTGACTGACT
+SKA_HWI-EAS418:5:1:164:195#0/1
C+CCBCB@C?B>C?BABCCBBBCC30==:;CC<?+7C?BC>-3(=5.?:64=08@-AC;#######

use 5.010.0;
use strict;
use warnings;

#Read in files
if ( @ARGV < 2 ){
    say "Please, type in 'perl task2.pl candseqfilename rawseqfilename' and hit enter";
    exit;
}

open(CANDSEQ, "<", $ARGV[0]) or die "candseqfilename can't be read: please check it: $!";
my ( @candseq, @rawseq, $rawseq );
while(<CANDSEQ>){
    chomp;
    push(@candseq, $_);
}
#print $_ for @candseq;

open(RAWSEQ, "<", $ARGV[1]) or die "rawseqfilename can't be read: please check it: $!";
while(<RAWSEQ>){
    chomp;
    $_ =~ s/^@// if/^@/;
    push(@rawseq, $_);
}
#say $_ for @rawseq;

for( my $count = 0; $count < @rawseq; $count += 4 ){
    for my $candseq( @candseq ){
        if( $candseq ~~ @rawseq ){
            say "found match";
        }
    }
}

Replies are listed 'Best First'.
Re^3: Comparing strings with special characters
by umasuresh (Hermit) on Mar 17, 2011 at 13:35 UTC
    If I understand your query correctly, a quick non-Perl solution is
    grep -A2 -f candseq_data_file rawseq_data_file
    A Perlish way of checking for the presence of a string in an array is
    @is_member = grep { $_ =~ /$candseq/ } @rawseq;
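
As a rough illustration of the grep idea above, here is a minimal sketch. It assumes the data is already in memory, shortens the sequence and quality strings for readability, and wraps the ID in \Q...\E so that any regex metacharacters in it are matched literally. The helper name lines_containing is hypothetical and not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that contain $needle as a literal substring.
sub lines_containing {
    my ($needle, @lines) = @_;
    # \Q...\E quotes regex metacharacters, so the ID is treated as plain text.
    return grep { /\Q$needle\E/ } @lines;
}

# In-memory stand-ins for the sample data (sequence and quality shortened).
my @candseq = (
    'SKA_HWI-EAS418:5:1:163:742#0/1',
    'SKA_HWI-EAS418:5:1:30:1357#0/1',
);
my @rawseq = (
    'SKA_HWI-EAS418:5:1:163:742#0/1',    # header, leading "@" already stripped
    'TGCTTCACAATGATAGG',                 # sequence
    '+SKA_HWI-EAS418:5:1:163:742#0/1',   # "+" line repeating the ID
    'BBBBB@B?AAABA@A<?',                 # quality string
);

for my $candseq (@candseq) {
    my @is_member = lines_containing($candseq, @rawseq);
    printf "%s: %d matching line(s)\n", $candseq, scalar @is_member;
}
```

With this data the first ID is found on two lines (the header and the "+" line) and the second ID on none.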
Re^3: Comparing strings with special characters
by jaimon (Sexton) on Mar 18, 2011 at 14:18 UTC

    Your code worked for me though. It identified the first line of "rawseq data" as a match.

    Note that $candseq ~~ @rawseq checks string equality. If you want SKA_HWI-EAS418:5:1:163:742#0/1 to also match the line of "rawseq data" that begins with "+", try qr($candseq) ~~ @rawseq.

    Reference: Section "Smart matching in detail" in perlsyn
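
The difference between the two checks can also be sketched without the smartmatch operator, which was marked experimental in Perl 5.18 and deprecated in later releases. The snippet below is only an illustration on shortened in-memory data, not the original poster's program: eq stands in for the string-equality case, and a \Q...\E-quoted pattern stands in for the qr() case.

```perl
use strict;
use warnings;

my $candseq = 'SKA_HWI-EAS418:5:1:163:742#0/1';

# Shortened stand-in for one record of "rawseq data" (leading "@" stripped).
my @rawseq = (
    'SKA_HWI-EAS418:5:1:163:742#0/1',
    'TGCTTCACAATGATAGG',
    '+SKA_HWI-EAS418:5:1:163:742#0/1',
);

# Whole-string equality: only the header line qualifies.
my @equal = grep { $_ eq $candseq } @rawseq;

# Literal substring match: the "+" line qualifies as well.
my @contains = grep { /\Q$candseq\E/ } @rawseq;

print "equal: ",    scalar(@equal),    "\n";
print "contains: ", scalar(@contains), "\n";
```

Here @equal holds one line and @contains holds two, which mirrors the behaviour described in the reply above.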