in reply to Regular Expression - split string by lower/upper case

It's because of the greedy .+ you have after [A-Z] ... just tweaking that to be [^A-Z] instead of . will work, and gives what you want with that test data. An alternative is to do a substitution, then split on whitespace.
use strict;
use warnings;

while (<DATA>) {
    # split regex
    print join ":", grep length, split /(?:(.+?[a-z])([A-Z][^A-Z]+))/g, +$_;

    # substitution then split method
    s/(?<=[a-z])(?=[A-Z])/ /g;
    print join(":", split ' ', $_), "\n";
}
__DATA__
Genetics
Genomics
phylogeny
allele
ChromosomeLocusLink
geneExpression
RasSignalTransductionPathway
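
For anyone who wants the second approach on its own, here is a minimal sketch that wraps the lookaround substitution in a small helper. The name split_camel is hypothetical (it is not from the post above), and the sketch assumes plain ASCII letters.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): split a string at each
# lower-to-upper case boundary, and on any whitespace already present.
sub split_camel {
    my ($str) = @_;

    # Insert a space at every position preceded by a-z and followed by A-Z.
    # Both assertions are zero-width, so no characters are consumed.
    $str =~ s/(?<=[a-z])(?=[A-Z])/ /g;

    # split ' ' splits on runs of whitespace and ignores leading whitespace.
    return split ' ', $str;
}

print join(":", split_camel($_)), "\n"
    for qw(Genetics geneExpression RasSignalTransductionPathway);
# prints:
# Genetics
# gene:Expression
# Ras:Signal:Transduction:Pathway
```

Because split ' ' also discards the trailing newline, the helper can be fed lines straight from a filehandle without chomp.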

Re^2: Regular Expression - split string by lower/upper case
by johngg (Canon) on Apr 14, 2006 at 20:08 UTC
    I was going to do a substitution along the lines of s/([a-z])([A-Z])/$1 $2/g; but your look-behind/look-ahead is much neater and probably quicker. Something else new I have learned today.

    Thank you,

    JohnGG
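
    The two substitutions mentioned in this reply can be compared side by side. The following is only a sketch: the sub names with_captures and with_lookaround are hypothetical, and it checks that the results agree on a few of the test words, not which form is faster (the core Benchmark module could be used to measure that).

```perl
use strict;
use warnings;

# Capture-group form: consumes both letters, then writes them back
# with a space in between. (Sub name is an assumption.)
sub with_captures {
    my ($s) = @_;
    $s =~ s/([a-z])([A-Z])/$1 $2/g;
    return $s;
}

# Lookaround form: matches the empty position between the two letters,
# so nothing needs to be captured or written back. (Sub name is an assumption.)
sub with_lookaround {
    my ($s) = @_;
    $s =~ s/(?<=[a-z])(?=[A-Z])/ /g;
    return $s;
}

for my $word (qw(ChromosomeLocusLink geneExpression phylogeny)) {
    my $verdict = with_captures($word) eq with_lookaround($word) ? "same" : "differ";
    printf "%-20s %s\n", $word, $verdict;
}
```

    For these inputs both forms give the same string, so the choice between them comes down to readability and, possibly, speed.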

Re: Regular Expression - split string by lower/upper case
by MiamiGenome (Sexton) on Apr 14, 2006 at 18:51 UTC
    THANK YOU!! I used the substitution with positive lookahead and positive lookbehind. Worked like a charm -- not surprising, this is the Perl Monks! Cheers!