
This works nicely. I don't have a header to work with, so this is good. There are a few things in that regex I don't quite understand, though. Would you mind commenting each line so I can get my head around it? Thanks a lot!

Re^3: Reliable way to detect base64 encoded strings
by ikegami (Patriarch) on Jun 29, 2009 at 22:14 UTC
    It's actually really straightforward.
    • Start of input
    • Followed by any number of groups of 4 characters from [A-Za-z0-9+/],
    • Followed by one of the following:
      • Nothing at all (the empty alternative, which always matches)
      • Four characters where
        • The first and second match /[A-Za-z0-9+/]/
        • The third matches /[AEIMQUYcgkosw048]/
        • The fourth is a "="
      • Four characters where
        • The first matches /[A-Za-z0-9+/]/
        • The second matches /[AQgw]/
        • The third and fourth are both a "="
    • Followed by the end of input
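The two restricted sets, [AEIMQUYcgkosw048] and [AQgw], are not explained in the list above. A minimal sketch of where they come from, assuming the standard base64 alphabet and an encoder that zeroes the unused bits before the padding (the variable names are made up for illustration):

```perl
use strict;
use warnings;

# Standard base64 alphabet: index 0..63 maps to one character.
my @alphabet = ('A'..'Z', 'a'..'z', '0'..'9', '+', '/');

# Two leftover bytes (16 bits) are spread over three characters, so the
# third character carries only 4 data bits and its low 2 bits are zero:
# its index is a multiple of 4.
my $before_one_pad = join '', map { $alphabet[$_] } grep { $_ % 4 == 0 } 0 .. 63;

# One leftover byte (8 bits) is spread over two characters, so the second
# character carries only 2 data bits and its low 4 bits are zero:
# its index is a multiple of 16.
my $before_two_pads = join '', map { $alphabet[$_] } grep { $_ % 16 == 0 } 0 .. 63;

print "$before_one_pad\n";    # AEIMQUYcgkosw048
print "$before_two_pads\n";   # AQgw
```

A decoder may still accept input whose unused bits are not zero; the pattern only describes what a canonical encoder emits.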

    It's probably a bit simpler after the update I just did for you:

    • Start of input
    • Followed by any number of groups of 4 characters from [A-Za-z0-9+/],
    • Followed by zero or one of the following:
      • Four characters where
        • The first and second match /[A-Za-z0-9+/]/
        • The third matches /[AEIMQUYcgkosw048]/
        • The fourth is a "="
      • Four characters where
        • The first matches /[A-Za-z0-9+/]/
        • The second matches /[AQgw]/
        • The third and fourth are both a "="
    • Followed by the end of input
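Since the request was for a line-by-line commentary, here is one way the pattern could be laid out with the /x modifier, one comment per step of the list above. This is a sketch, not necessarily the layout of the original post: the qr{...}x delimiters and the name $base64_re are assumptions made here so that the "/" inside the character classes needs no escaping.

```perl
use strict;
use warnings;

# The pattern from the thread, one step per line.
my $base64_re = qr{
    ^                           # start of input
    (?: [A-Za-z0-9+/]{4} )*     # any number of groups of 4 alphabet characters
    (?:                         # followed by zero or one of the following:
        [A-Za-z0-9+/]{2}        #   first and second: any alphabet character
        [AEIMQUYcgkosw048]      #   third: restricted set
        =                       #   fourth: a single "="
    |                           # or
        [A-Za-z0-9+/]           #   first: any alphabet character
        [AQgw]                  #   second: restricted set
        ==                      #   third and fourth: both "="
    )?
    \z                          # end of input
}x;

# A few sample strings to try against the pattern.
for my $candidate ('TWFu', 'TWE=', 'TQ==', 'TWF=', 'TWFu!') {
    my $verdict = $candidate =~ $base64_re ? 'match' : 'no match';
    print "$candidate: $verdict\n";
}
```

Note that the empty string also matches, because every part after the start anchor is optional.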
      Your explanation makes it very clear... thanks again!

      Regexes are not my favorite part of Perl, even if they are powerful. I'm trying to use your approach to detect whether a string is base64, so I took what you had and put an if around it. I'm sure I'm just not getting it. Could you give me some pointers?

      if($string_whole =~ m / ^ (?: [A-Za-z0-9+/]{4} )* (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )? \z /x ) $&Log ("it found base64-$i");
      Thanks in advance for your expertise.

        You have the right idea, but

        if (...) $&Log ("it found base64-$i");
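        is not valid Perl: the body of an if must be a block in braces, and the sub is called as Log(...), not $&Log(...). The / inside the character classes would also end an m/.../ pattern early, so the pattern below uses m{...}x instead. You want
        if ($string_whole =~ m{ ^ (?: [A-Za-z0-9+/]{4} )* (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )? \z }x) { Log("$string_whole is valid base64"); }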
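For anyone who wants to try this end to end, a self-contained sketch follows. The Log sub is a hypothetical stand-in for the poster's own logging routine, looks_like_base64 is a name invented here, and the sample inputs are generated with the core MIME::Base64 module purely for illustration.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical stand-in for the poster's Log routine.
sub Log {
    my ($message) = @_;
    print "$message\n";
    return;
}

# Returns 1 if the string has the shape of canonical base64, else 0.
# m{...}x is used so the "/" in the character classes needs no escaping.
sub looks_like_base64 {
    my ($string) = @_;
    return $string =~ m{
        ^
        (?: [A-Za-z0-9+/]{4} )*
        (?: [A-Za-z0-9+/]{2} [AEIMQUYcgkosw048] = | [A-Za-z0-9+/] [AQgw] == )?
        \z
    }x ? 1 : 0;
}

for my $bytes ('M', 'Ma', 'Man', 'hello world') {
    # The second argument '' suppresses the trailing newline.
    my $string_whole = encode_base64($bytes, '');
    if (looks_like_base64($string_whole)) {
        Log("$string_whole is valid base64");
    }
}
```

Keep in mind that this checks shape only: any run of four alphabet characters, such as "Perl", also matches, so a match does not prove the string was produced by a base64 encoder.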