
Hi, thank you so much for the help. Can you just explain to me what the double dollar sign in front of rNee means? Thank you.

Re^3: Comparing 2 different-sized strings
by BrowserUk (Patriarch) on Aug 09, 2013 at 09:40 UTC
    Can you just explain to me what the double dollar sign in front of rNee means?

    It means dereference the reference.

    Genomic work often involves very large strings, and the usual way of receiving arguments in a subroutine copies them:

    sub something {
        my( $string ) = @_;   ## $string is a copy of the argument
    }

    my $hugeString = ........;
    something( $hugeString );

    Instead of passing the arguments directly, I pass references (kind of pointers) to them:

    fuzzyMatch( \$hay, \$nee, 3 ); ## pass references to needle and haystack

    Within fuzzyMatch(), it receives references to the two strings:

    sub fuzzyMatch {
        my( $rHay, $rNee, $misses ) = @_; ## the 'r's are to remind that these are references

    So to get to the actual strings, I use a second $:

    my $lNee = length $$rNee; ## read as: $lengthNeedle = length of the data ($) referenced by $rNee

    So, $$rNee is shorthand for ${ $rNee }, if that clarifies things for you.
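
    A minimal sketch of the two spellings side by side. The needle value here is made up for illustration; only the names $nee and $rNee come from the code above.

```perl
use strict;
use warnings;

my $nee  = 'GATTACA';   # hypothetical needle, for illustration only
my $rNee = \$nee;       # take a reference to the string

# Both spellings dereference the same scalar.
my $short = length $$rNee;
my $long  = length ${ $rNee };

print "short form: $short, long form: $long\n";
print "ref type: ", ref( $rNee ), "\n";   # ref() reports what kind of thing is referenced
```

    Both lengths come out the same because $$rNee and ${ $rNee } reach the same underlying string.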


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Hi, thank you so much. That makes sense. The only question I have is this: you said the r's are there to remind you that they are references, but where do you actually declare them as references, using the slash operator? Thank you!
        but where do you actually declare them as references, using the slash operator?

        You don't "declare" references -- they are just scalars with 'special content' -- you 'take references' when you need them.

        In the case of the code, the references are taken when the subroutine is called:

        ... for fuzzyMatch( \$hay, \$nee, 3 );
        #...................^......^

        I.e., $hay and $nee are normal strings in the main program.

        When I call fuzzyMatch( \ $hay, \ $nee, 3 ), I am taking references to those two strings (using '\') and passing them into the subroutine.

        In the subroutine those references get assigned to the local variables: $rHay & $rNee respectively:

        sub fuzzyMatch {
            my( $rHay, $rNee, $misses ) = @_;
            ...
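
        To see that a reference taken at the call site really does point at the caller's string rather than at a copy, here is a small sketch. The two helper subs (appendViaRef and appendToCopy) and the data are hypothetical, invented only for this illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: appends to the caller's string through a reference.
sub appendViaRef {
    my( $rStr, $suffix ) = @_;   # $rStr is a reference; the string itself is not copied
    $$rStr .= $suffix;           # dereference to reach the caller's string
    return;
}

# Hypothetical helper: works on its own private copy.
sub appendToCopy {
    my( $str, $suffix ) = @_;    # $str is a copy of the caller's string
    $str .= $suffix;
    return $str;
}

my $hay = 'ACGT';

appendToCopy( $hay, 'TT' );      # only the copy inside the sub is changed
my $afterCopy = $hay;

appendViaRef( \$hay, 'TT' );     # the reference is taken here, with '\'

print "after copy: $afterCopy, after ref: $hay\n";
```

        The first call leaves $hay alone; the second changes it, because the sub was handed a reference to the original.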

      Hi, thank you so much, and I'm so sorry to bother you one last time, but could you just explain what's going on inside the map function please? I'm new to Perl and I'm trying to google all of the components of the script that I don't understand, so that I understand what's going on at every line.
        could you just explain what's going on inside the map function please?

        Sure.

        sub fuzzyMatch {
            my( $rHay, $rNee, $misses ) = @_;
            my $lNee = length $$rNee;
            my $min  = $lNee - $misses;
            map {
                ( ( substr( $$rHay, $_, $lNee ) ^ $$rNee ) =~ tr[\0][] ) >= $min ? $_ : ()
            } 0 .. length( $$rHay ) - $lNee;
        }
        • We need to compare the needle against the haystack at each position.

          Hence the map counter runs from 0 to length( haystack) - length( needle ).

        • We need to compare the same number of characters from the haystack as there are in the needle.

          Hence, the substr presents a needle length substring of haystack at each of those counter positions.

        • We don't just want a yes/no comparison; we need a count of the differences.

          So we bit-wise xor (^) the substring and the needle.

          The result is a string that has a 0 (null) byte wherever the two strings match; and some other byte value where they do not.

        • We need to count the zero bytes.

          tr[\0][] does that efficiently.

        • If the count of matched bytes is at least the minimum required (length( needle ) - misses),

          return the position where the match occurred ($_); otherwise return nothing (()).
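
        The steps above can be tried out on a small example. This sketch repeats fuzzyMatch() as posted and first runs the xor-and-count step on its own; the haystack, needle and miss count are made-up values chosen only to keep the output easy to check by hand.

```perl
use strict;
use warnings;

# fuzzyMatch() as posted in this thread.
sub fuzzyMatch {
    my( $rHay, $rNee, $misses ) = @_;
    my $lNee = length $$rNee;
    my $min  = $lNee - $misses;
    map {
        ( ( substr( $$rHay, $_, $lNee ) ^ $$rNee ) =~ tr[\0][] ) >= $min ? $_ : ()
    } 0 .. length( $$rHay ) - $lNee;
}

# One step in isolation: xor two equal-length strings and count the null bytes.
my $xored   = 'ACGT' ^ 'ACCT';       # the strings differ only at offset 2
my $matched = $xored =~ tr[\0][];    # number of positions where they agree
print "matching positions: $matched\n";

# Hypothetical data: look for 'ACGT' in the haystack, allowing 1 mismatch.
my $hay  = 'TTACGTTTACCTTT';
my $nee  = 'ACGT';
my @hits = fuzzyMatch( \$hay, \$nee, 1 );
print "hits at offsets: @hits\n";
```

        With this data the exact match at offset 2 and the one-mismatch match at offset 8 are both reported. Note that the sketch relies on ^ acting on strings; if the 'bitwise' feature is enabled (for example via use v5.28), the string form of xor is spelled ^. instead.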

        Hope that clarifies things a little. Continue to ask about anything that isn't clear.

