in reply to Re: Trouble skipping lines using Perl
in thread Trouble skipping lines using Perl

Hi haukex, Thanks very much, works perfectly now. I went with

 next if ($chromosome =~ "chrM");

I was using both strict and warnings, and using the quotes instead of the regex doesn't produce any warnings :D

My apologies about the input data format

Cheers

Re^3: Trouble skipping lines using Perl
by haukex (Archbishop) on Nov 21, 2017 at 16:19 UTC

    I went with

    next if ($chromosome =~ "chrM");

    That works, but personally I wouldn't write it that way, because writing the regex as, for example, /chrM/ or m{chrM} makes it visually clearer what you want to do (and also allows you to add modifiers).
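
As a rough illustration of the point about modifiers, here is a minimal sketch. The helper name is_mito and the sample names are made up for this example and do not come from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the chromosome name contains "chrM".
# The /i modifier makes the match case-insensitive; a quoted-string
# pattern such as "chrM" has no place for a trailing modifier like this.
sub is_mito {
    my ($chromosome) = @_;
    return $chromosome =~ m{chrM}i ? 1 : 0;
}

# Made-up sample values, only for demonstration.
for my $name (qw(chrM CHRM chr1)) {
    printf "%-5s %s\n", $name, is_mito($name) ? 'skip' : 'keep';
}
```

With these sample values, the first two names are reported as "skip" and the last as "keep".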

    I was using both strict and warnings, and using the quotes instead of the regex doesn't produce any warnings

    Are you sure? next if ($chromosome = "chrM"); should have given you the warning "Found = in conditional, should be ==". Perhaps you're not enabling warnings correctly?

    Update:

    My apologies about the input data format

    You can edit your posts (please mark updates as such), see How do I change/delete my post?

      Sorry,

      I meant that the final version produced no warnings. I did get the warning you mentioned initially, before I added the ~.

      Cheers

Re^3: Trouble skipping lines using Perl
by roboticus (Chancellor) on Nov 21, 2017 at 16:30 UTC

    LeBran:

    It didn't give you any warnings because the expression $chromosome = /^chrM/ is perfectly fine. It just doesn't do what you want it to. Instead of checking whether $chromosome starts with "chrM", it checks whether $_ starts with "chrM", then sets $chromosome to a true value if it does and a false value otherwise. Since you're not using $_ while parsing your lines, $_ never starts with "chrM" and the match always returns a false value.

    It's a common enough mistake that I could see a case being made for "if ($var = /rex/)" generating a warning, as I expect that "if ($var = ($_ =~ /rex/))" is pretty uncommon (at least, when looking at my code).
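
The difference between the two operators can be sketched as follows. This is only an illustration: the helper compare_ops and the sample strings are hypothetical, and $topic simply stands in for whatever happens to be in $_ at the time.

```perl
use strict;
use warnings;

# Hypothetical helper contrasting "=" with "=~".
sub compare_ops {
    my ($topic, $value) = @_;
    local $_ = $topic;                # pretend $_ holds some unrelated text

    my $assigned = $value;
    $assigned = /^chrM/;              # matches $_, then overwrites $assigned
    my $bound = $value =~ /^chrM/;    # matches $value and leaves it untouched

    return ($assigned ? 1 : 0, $bound ? 1 : 0);
}

# $_ holds unrelated text, while $value really does start with "chrM".
my ($assigned, $bound) = compare_ops('unrelated line', 'chrM 100 200');
print "assignment: $assigned, binding: $bound\n";
```

Under these assumptions the assignment form reports 0 even though the value starts with "chrM", while the binding form reports 1.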

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Ah, thank you,

      So presumably  next if ($_ = /^chrM/); would also be correct?

      Cheers

        So presumably next if ($_ = /^chrM/); would also be correct?

        No! The $_ = /^chrM/ expression matches the regex against $_ (update: by default, since the // match is not explicitly bound to any other scalar by a =~ binding operator) and then assigns the result of the match to $_ (which at least gives $_ some defined value, so I guess it's not all bad :). Matching against $_ (or any other scalar variable) is only semantically correct if that variable has first been given some meaningful value, as in a while (<FILE>) { ... } loop. Stick to
            next if ($line =~ /^chrM/);
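
A small sketch of what the assignment form leaves behind in $_ may help. The helper clobbered_topic and the sample lines are invented for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the assignment form and reports both the
# outcome of the test and what is left in $_ afterwards.
sub clobbered_topic {
    my ($text) = @_;
    local $_ = $text;
    my $skip = ($_ = /^chrM/) ? 1 : 0;   # the match result is stored back into $_
    return ($skip, $_);
}

my ($skip, $left) = clobbered_topic('chrM 100 200');
print "skip=$skip, \$_ is now '$left'\n";
```

Even when the test happens to give the right answer, the original line is gone from $_ afterwards, so any later code that still expects the line there will see only the match result.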


        Give a man a fish:  <%-{-{-{-<

Re^3: Trouble skipping lines using Perl
by AnomalousMonk (Archbishop) on Nov 21, 2017 at 17:49 UTC
    I went with

    next if ($chromosome =~ "chrM");

    Also note that $chromosome =~ "chrM" matches if "chrM" is found anywhere in the $chromosome string:

    c:\@Work\Perl\monks>perl -wMstrict -le "my $chromosome = 'foo xx chrM yy bar'; print 'found a chrM' if $chromosome =~ 'chrM';"
    found a chrM

    This is because you no longer anchor the match to the start of the string (as you do in the code in the OP) with the ^ match anchor regex operator. The match /^chrM/ would IMHO be better for what you seem to want.

    Another point is that the data posted in the OP has a leading space or spaces in some cases. Leading whitespace will cause the /^chrM/ match to fail. If leading whitespace may be present in real data, I would recommend something along the lines of /^\s*chrM/ instead.
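
To compare the three patterns side by side, here is a minimal sketch. The sample lines and the helper count_matches are made up for this example and are not the OP's data.

```perl
use strict;
use warnings;

# Made-up sample lines: the second has leading whitespace and the
# third only mentions chrM in the middle of the line.
my @lines = ('chrM 1 2', '  chrM 3 4', 'chr1 near chrM 5 6');

# Hypothetical helper: counts how many of the given lines match a pattern.
sub count_matches {
    my ($pattern, @input) = @_;
    return scalar grep { $_ =~ $pattern } @input;
}

printf "%-12s %d\n", 'unanchored',  count_matches(qr/chrM/,      @lines);
printf "%-12s %d\n", 'anchored',    count_matches(qr/^chrM/,     @lines);
printf "%-12s %d\n", 'anchored+ws', count_matches(qr/^\s*chrM/,  @lines);
```

For these three lines the unanchored pattern matches all of them, the anchored one matches only the first, and the version allowing leading whitespace matches the first two.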


    Give a man a fish:  <%-{-{-{-<

      Ah ok,

      I'm following you, thanks very much :)