in reply to Comparing strings with special characters

Welcome to the Monastery. You've given one sample string, but what are you comparing it against? Please show the Perl code that doesn't match. It may be worth reading How do I post a question effectively?.

Re^2: Comparing strings with special characters
by skolobs (Initiate) on Mar 17, 2011 at 12:57 UTC

    Below is the code

    Sample data:

    candseq data

        SKA_HWI-EAS418:5:1:163:742#0/1
        SKA_HWI-EAS418:5:1:30:1357#0/1
        SKA_HWI-EAS418:5:1:53:520#0/1
        SKA_HWI-EAS418:5:1:99:1255#0/1
        SKA_HWI-EAS418:5:1:99:904#0/1

    rawseq data

        @SKA_HWI-EAS418:5:1:163:742#0/1
        TGCTTCACAATGATAGGAAGAGCCGACATCGAAGGATTAAAAAGCGACGTCGCTATGAACGCTTGG
        +SKA_HWI-EAS418:5:1:163:742#0/1
        BBBBB@B?AAABA@A<?@<@?@<<?A?A@5>=<@=7??A>A;??;@@;;34?:<>=>7<??6>6:-
        @SKA_HWI-EAS418:5:1:164:195#0/1
        CGCAATACTGTATTGCCCTTAATGGGGTCACTGTAACATTTTAAAACAAATGAGCAGTGACTGACT
        +SKA_HWI-EAS418:5:1:164:195#0/1
        C+CCBCB@C?B>C?BABCCBBBCC30==:;CC<?+7C?BC>-3(=5.?:64=08@-AC;#######
        use 5.010.0;
        use strict;
        use warnings;

        #Read in files
        if ( @ARGV < 2 ){
            say "Please, type in 'perl task2.pl candseqfilename rawseqfilename' and hit enter";
            exit;
        }

        open(CANDSEQ, "<", $ARGV[0]) or die "candseqfilename can't be read: please check it: $!";

        my ( @candseq, @rawseq, $rawseq );

        while(<CANDSEQ>){
            chomp;
            push(@candseq, $_);
        }
        #print $_ for @candseq;

        open(RAWSEQ, "<", $ARGV[1]) or die "rawseqfilename can't be read: please check it: $!";

        while(<RAWSEQ>){
            chomp;
            $_ =~ s/^@// if/^@/;
            push(@rawseq, $_);
        }
        #say $_ for @rawseq;

        for( my $count = 0; $count < @rawseq; $count += 4 ){
            for my $candseq( @candseq ){
                if( $candseq ~~ @rawseq ){
                    say "found match";
                }
            }
        }

      If I understand your query correctly, a quick non-Perl solution is
      grep -A2 -f candseq_data_file rawseq_data_file
      A Perlish way to check whether a string is present in an array is
      my @is_member = grep { $_ =~ /$candseq/ } @rawseq;
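
Since the thread is about special characters: the grep above interpolates $candseq as a regular expression, so an ID containing characters such as "+", "." or "|" would be read as a pattern rather than as literal text. Below is a minimal sketch, assuming the IDs should be matched literally. The helper name find_members is made up for illustration, and the @rawseq lines are the first three lines of the posted sample with the leading "@" already stripped, as the posted code does.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return every element of
# @$lines that contains $needle as a literal substring.
# \Q...\E quotes any regex metacharacters in $needle.
sub find_members {
    my ($needle, $lines) = @_;
    return grep { /\Q$needle\E/ } @$lines;
}

my @rawseq = (
    'SKA_HWI-EAS418:5:1:163:742#0/1',
    'TGCTTCACAATGATAGGAAGAGCCGACATCGAAGGATTAAAAAGCGACGTCGCTATGAACGCTTGG',
    '+SKA_HWI-EAS418:5:1:163:742#0/1',
);

# Both the header line and the "+" line contain the ID.
my @hits = find_members('SKA_HWI-EAS418:5:1:163:742#0/1', \@rawseq);
print scalar(@hits), " matching line(s)\n";
```

Because this is a substring match, it also picks up the "+" line of the record; use eq instead of a pattern if only exact equality should count.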

      Your code worked for me though. It identified the first line of "rawseq data" as a match.

      Note that $candseq ~~ @rawseq checks for string equality. If you also want SKA_HWI-EAS418:5:1:163:742#0/1 to match the second line of "rawseq data" that contains it (the one starting with "+"), try qr($candseq) ~~ @rawseq.
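
If exact equality on the header lines is all that is needed, a hash lookup is one alternative that avoids smart matching altogether (the ~~ operator has been flagged experimental since Perl 5.18). This is only a sketch under assumptions: it swaps in a different technique from the one above, the helper name matching_headers is made up, the input is assumed to have exactly four lines per record with the header first, and the sequence and quality lines in the usage data are shortened placeholders, not the posted data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the header IDs
# that are exactly equal to one of the candidate IDs.
# Assumes chomped lines, four per record, header line first.
sub matching_headers {
    my ($candidates, $lines) = @_;
    my %wanted = map { $_ => 1 } @$candidates;   # exact-match lookup table
    my @found;
    for (my $i = 0; $i < @$lines; $i += 4) {
        (my $id = $lines->[$i]) =~ s/^\@//;      # drop the leading "@"
        push @found, $id if $wanted{$id};
    }
    return @found;
}

my @candseq = (
    'SKA_HWI-EAS418:5:1:163:742#0/1',
    'SKA_HWI-EAS418:5:1:30:1357#0/1',
);
my @rawseq = (
    '@SKA_HWI-EAS418:5:1:163:742#0/1', 'TGCTTCAC', '+SKA_HWI-EAS418:5:1:163:742#0/1', 'BBBBBBBB',
    '@SKA_HWI-EAS418:5:1:164:195#0/1', 'CGCAATAC', '+SKA_HWI-EAS418:5:1:164:195#0/1', 'CCCCCCCC',
);

print "found match: $_\n" for matching_headers(\@candseq, \@rawseq);
```

Stepping through the array four lines at a time means only header lines are compared, which is what the $count loop in the posted code appears to be aiming for.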

      Reference: Section "Smart matching in detail" in perlsyn