Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a field that holds a URL. I want to check that it starts with https and does not have a / at the end. How can I do that? I tried $sec =~ /\^https/ but it does not seem to work. Thanks

Replies are listed 'Best First'.
Re: string matching
by Your Mother (Archbishop) on Feb 25, 2010 at 23:28 UTC

    Use URI. URIs are more complicated than they seem and this makes handling them easier. Here's an example with a couple of surprise cases that show why/how a regular expression can be much more difficult.

    use strict;
    use warnings;
    use URI;

    URI:
    for my $raw ( <DATA> ) {
        my $uri = URI->new($raw);
        if ( $uri->scheme ne "https" ) {
            warn "$uri is not secure, skipping\n";
            next URI;
        }
        if ( $uri->path =~ m,/\z, ) {
            warn "$uri has a trailing slash, skipping\n";
            next URI;
        }
        print "GOOD: $uri\n";
    }

    __DATA__
    http://perlmonks.org/?node_id=825405
    https://gmail.com
    https://gmail.com/
    https://perlmonks.org/?
    https://mail.google.com/mail/#inbox

      We can check this requirement in a single line.
      Example:

      use strict;
      use warnings;

      open(FH,"data");
      foreach ( <FH> ) {
          if ( $_ =~ m/^https.*[^\/]\n$/ ) {
              print $_;
          }
      }

        I think you missed the point and an i modifier.

        while ( <DATA> ) {
            print if /\Ahttps.*[^\/]\n\z/;
        }

        __DATA__
        http://perlmonks.org/?node_id=825405
        HTTPS://gmail.com
        https://gmail.com/
        httpsux
        https://perlmonks.org/?
        https://mail.google.com/mail/#inbox

        This prints the following lines, which are either completely invalid or "end" with a trailing slash, since the fragment and the empty query string are not part of the URI path.

        httpsux
        https://perlmonks.org/?
        https://mail.google.com/mail/#inbox

        If you know for a fact that your data set is simple/normalized enough, you could use a straightforward regular expression. URI is simple and robust, however, so not using it is just sloth, and it will eventually bite you or the dev who inherits your code. Trusting input data to be well-formed is risky and only appropriate in one-offs.

Re: string matching
by kennethk (Abbot) on Feb 25, 2010 at 23:24 UTC
    The issue is that you are escaping the caret: /\^https/ matches a literal ^ followed by https, rather than https at the start of the string. See perlre and/or perlretut. The correct code might look like:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = 'https://somewebsite.com/';

    if ($string =~ /^https/) {
        print "https success\n";
    }
    if ($string =~ /\/$/) {
        print "trailing slash success\n";
    }

    Note that I have used an unescaped caret as an anchor, which requires the match to start at the beginning of the string. I assume this is where your confusion originated.
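
To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample strings and the helper names anchored and escaped are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample values, chosen only to show the difference.
my $url     = 'https://example.com';
my $literal = 'see ^https here';

# Each helper returns 1 or 0 so the results print cleanly.
sub anchored { return $_[0] =~ /^https/  ? 1 : 0 }   # caret as start-of-string anchor
sub escaped  { return $_[0] =~ /\^https/ ? 1 : 0 }   # caret as a literal character

printf "%-20s anchored=%d escaped=%d\n", $_, anchored($_), escaped($_)
    for $url, $literal;
```

The escaped form only matches when the string actually contains a ^ character in front of https, which is why it fails on an ordinary URL.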

    Depending on what you are trying to do, you may want to check out some of the useful modules on CPAN, such as Regexp::Common::URI.

      For the "does not have a slash at the end" check, I think this is better:

      if ( $str !~ m[/$] ) {
          # do something
      }

      which doesn't need to escape the "/".
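
Putting the two regex checks from this sub-thread together might look like the sketch below. The helper name looks_ok and the sample URLs are hypothetical. It only inspects the raw string, so it does not handle the query-string and fragment cases raised elsewhere in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the string starts with https://
# and does not end with a slash, 0 otherwise.
sub looks_ok {
    my ($str) = @_;
    return 0 unless defined $str;
    # \A anchors at the very start; /i accepts "HTTPS" as well as "https".
    return 0 unless $str =~ m{\Ahttps://}i;
    # \z matches only at the very end of the string, so chomp the input
    # first if it may carry a trailing newline.
    return 0 if $str =~ m{/\z};
    return 1;
}

for my $candidate ( 'https://example.com', 'https://example.com/',
                    'http://example.com',  'httpsux' ) {
    printf "%-24s %s\n", $candidate, looks_ok($candidate) ? 'ok' : 'rejected';
}
```

Requiring the :// after https is an extra assumption beyond the original question. It is there so that strings such as httpsux are rejected.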