kyte has asked for the wisdom of the Perl Monks concerning the following question:

Hi: I have a string which has CamelCase words. I want to parse them. Any ideas?

Replies are listed 'Best First'.
Re: Parsing CamelCase words
by JavaFan (Canon) on Jun 12, 2009 at 22:01 UTC
    Typically we say a word is camel cased if it starts with a capital letter, then has one or more lower case letters, and at least one other capital letter.
    /[A-Z][a-z]+[A-Z][A-Za-z]*/
    would match that. But you may have another definition of camel case. You may require that no two capital letters follow each other. Or you may allow it to start with a lower case letter. Or, in a very loose definition, anything consisting of letters is camel cased.

    Please define exactly what you mean by CamelCase. (As an added benefit, if you can define what CamelCase looks like, you've almost written the regexp. Most regexp questions here aren't asked because people lack regexp knowledge; they're asked because people lack the ability (or will) to define what they want to match.)
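
To make the difference between those definitions concrete, here is a minimal sketch that checks a word against three of them. The hash keys and the `matching_definitions` helper are hypothetical names chosen for illustration, and the second and third patterns are only one possible reading of "no two capitals in a row" and "may start with a lower case letter".

```perl
use strict;
use warnings;

# Three candidate definitions of "camel case". Each pattern is anchored
# with \A and \z so that the whole word has to match.
my %definition = (
    # Capital, one or more lower case letters, at least one more capital.
    strict         => qr/\A[A-Z][a-z]+[A-Z][A-Za-z]*\z/,
    # Same as strict, but reject any two adjacent capital letters.
    no_double_caps => qr/\A(?!.*[A-Z]{2})[A-Z][a-z]+[A-Z][A-Za-z]*\z/,
    # Same as strict, but the first letter may be lower case.
    lower_start    => qr/\A[A-Za-z][a-z]+[A-Z][A-Za-z]*\z/,
);

# Return the (sorted) names of all definitions a word satisfies.
sub matching_definitions {
    my ($word) = @_;
    return grep { $word =~ $definition{$_} } sort keys %definition;
}

for my $word (qw(CamelCase camelCase DBHandle CamelCASE Camel)) {
    my @names = matching_definitions($word);
    printf "%-10s %s\n", $word, @names ? "@names" : "(none)";
}
```

Under these patterns "DBHandle" satisfies none of the definitions, because each one requires a lower case letter in the second position. That is exactly the kind of case the definition has to settle first.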

Re: Parsing CamelCase words
by ikegami (Patriarch) on Jun 12, 2009 at 21:20 UTC
    my @words = /([A-Z][a-z]*)/g;

    You didn't specify how to handle identifiers like "DBHandle".
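
A short sketch of why "DBHandle" matters: the pattern above splits at every capital letter, so a run of capitals comes out as single letters. The second helper is an assumption about one way you might want it handled (keep a run of capitals together unless its last capital starts a lower case word); it is not part of the original answer, and both sub names are hypothetical.

```perl
use strict;
use warnings;

# Split an identifier at every capital letter (the pattern from above).
sub split_simple {
    my ($identifier) = @_;
    return $identifier =~ /([A-Z][a-z]*)/g;
}

# Alternative: treat a run of capitals as one word, stopping before a
# capital that is followed by a lower case letter.
sub split_keep_acronyms {
    my ($identifier) = @_;
    return $identifier =~ /([A-Z]+(?![a-z])|[A-Z][a-z]*)/g;
}

for my $identifier (qw(CamelCase DBHandle)) {
    print "$identifier: simple   = ", join(" ", split_simple($identifier)), "\n";
    print "$identifier: acronyms = ", join(" ", split_keep_acronyms($identifier)), "\n";
}
```

With the simple pattern "DBHandle" becomes "D", "B", "Handle"; with the alternative it becomes "DB", "Handle". Which one is right depends on what you want.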

      I just want words like CamelKyte from the string that I got from a file.

        Ah, I thought you wanted to extract the words from the identifier, not extract the identifiers from the surrounding text.

        Take my existing code to detect CamelCased words, and find repetitions of it:

        @cc_words = /((?:[A-Z][a-z]*)+)/g;

        If you don't want words that are all uppercase, it's easiest to just filter them out:

        @cc_words = grep /[a-z]/, /((?:[A-Z][a-z]*)+)/g;

        If you don't want CamelCase words with only one upper case letter, change the "+" to "{2,}".
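
Putting those pieces together, a minimal sketch might look like the following. The sub name `camel_case_words` and the sample sentence are made up for illustration; the pattern is the one above with "+" changed to "{2,}", plus the grep that drops all-uppercase words.

```perl
use strict;
use warnings;

# Extract words that contain at least two capital letters and at least
# one lower case letter from a piece of text.
sub camel_case_words {
    my ($text) = @_;
    return grep { /[a-z]/ } $text =~ /((?:[A-Z][a-z]*){2,})/g;
}

my $text  = "The CamelKyte parser reads DBHandle and NASA data from FooBarBaz.";
my @words = camel_case_words($text);
print "$_\n" for @words;
```

For this sample, "The" is skipped because it has only one capital letter and "NASA" is filtered out by the grep, leaving CamelKyte, DBHandle and FooBarBaz. Note that the pattern is not anchored to word boundaries, so it can also pick up the capitalised tail of a word that starts in lower case.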

Re: Parsing CamelCase words
by planetscape (Chancellor) on Jun 13, 2009 at 12:24 UTC