Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have to loop over my array and drop every item that contains characters other than a-zA-Z, digits, periods or colons. (Actually, I'm trying to ensure that the URL doesn't have any 'funky' chars in it that it shouldn't have.)

Anyone know how to write this regex?

Replies are listed 'Best First'.
Re: regex to detect any non digit and number
by brian_d_foy (Abbot) on Oct 16, 2006 at 23:54 UTC

    It sounds as if you want URI::Escape (and maybe even Encode). Those should take care of the characters in the URL for you.

    Remember that URIs will also need [@/+?&;=%$,] (the reserved chars, plus % for escapes) and [-_.!~*'()] (the unreserved marks), as well as a few odds-and-ends chars. For the full specification, see RFC 2396.
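
As a rough sketch of what widening the allowed set might look like, the class below combines alphanumerics with the two sets listed above. The colon is included because RFC 2396 also counts it as reserved; the sample URLs and the variable names ($allowed, @list, @ok) are invented for illustration, and characters such as '#' are deliberately left out.

```perl
use strict;
use warnings;

# Allowed set: alphanumerics, the reserved characters (with ':'),
# '%' for escapes, and the unreserved marks. This is an assumption
# about what "valid" should mean, not a full RFC 2396 parser.
my $allowed = qr{[a-zA-Z0-9;/?:\@&=+\$,%\-_.!~*'()]};

# Hypothetical sample URLs, invented for illustration.
my @list = (
    'http://example.com/path?x=1&y=2',
    'http://example.com/a b',         # contains a space
    'http://example.com/<script>',    # contains < and >
);

# Keep only the items made up entirely of allowed characters.
my @ok = grep { /\A$allowed*\z/ } @list;
print "$_\n" for @ok;
```

This only checks which characters appear; it says nothing about whether the URL is well formed.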

    --
    brian d foy <brian@stonehenge.com>
    Subscribe to The Perl Review
Re: regex to detect any non digit and number
by chargrill (Parson) on Oct 16, 2006 at 22:16 UTC

    Sure, lots of people do. Though we'd be more inclined to help if we see what you've tried so far, what works, what doesn't work, some sample input data, etc. Take a look at How (Not) To Ask A Question for more details.



    --chargrill
    s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)
Re: regex to detect any non digit and number
by GrandFather (Saint) on Oct 16, 2006 at 22:18 UTC

    First thing to do is to read about regexen. They are really important for most things you do with Perl. I'd recommend that you start with perlretut, perlrequick, perlre and perlreref.

    When you have done that you will know that a negated character class like [^a-zA-Z\d.:] is what you are after. But read the documentation first.
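
For anyone following along, here is a minimal sketch of that negated class in use. A capture group is added so the first offending character can be reported; the test strings and the %offender hash are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical test strings, invented for illustration.
my @items = ('example.com:8080', 'bad item!', 'under_score');
my %offender;    # item => first disallowed character found

for my $item (@items) {
    # The class matches any single character that is NOT a letter,
    # digit, period or colon; the parentheses capture it into $1.
    if ($item =~ /([^a-zA-Z\d.:])/) {
        $offender{$item} = $1;
        print "reject '$item' (first bad char: '$1')\n";
    }
    else {
        print "keep   '$item'\n";
    }
}
```

Note that on Unicode strings \d can match digits beyond 0-9, so spelling out 0-9 is the stricter choice if only ASCII digits should pass.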


    DWIM is Perl's answer to Gödel
Re: regex to detect any non digit and number
by ikegami (Patriarch) on Oct 16, 2006 at 22:21 UTC

    You're short many valid characters, but here goes:

    @ok = grep !/[^a-zA-Z0-9.:]/, @list;

    or

    @ok = grep /^[a-zA-Z0-9.:]*\z/, @list;
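
A small sketch comparing the two forms on made-up data may help show why \z is used rather than $. The sample list and the three result arrays are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample data, invented for illustration.
my @list = ('example.com:80', 'has space', "trailing.newline\n", '');

# Form 1: reject anything containing a disallowed character.
my @ok_negated  = grep { !/[^a-zA-Z0-9.:]/ } @list;

# Form 2: accept only strings made up entirely of allowed characters.
my @ok_anchored = grep { /^[a-zA-Z0-9.:]*\z/ } @list;

# Variant with $ instead of \z: $ also matches before a trailing
# newline, so the newline-terminated item slips through.
my @ok_dollar   = grep { /^[a-zA-Z0-9.:]*$/ } @list;

printf "negated: %d, anchored: %d, dollar: %d\n",
    scalar @ok_negated, scalar @ok_anchored, scalar @ok_dollar;
```

Both original forms keep the empty string, since it contains no disallowed characters; changing * to + in the anchored form would drop it if that is not wanted.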