fionbarr has asked for the wisdom of the Perl Monks concerning the following question:

I feel silly asking this question but I'm wasting time going nowhere.
my $str = 'xyzxyzxyz - Prod Cache Broker - 16100';
print "$str\n$1\n" if ($str =~ m/^(.+)\s+?.+$/);
I only want the xyzxyzxyz stuff but I keep getting
xyzxyzxyz - Prod Cache Broker -

Replies are listed 'Best First'.
Re: minimal matching regex
by davido (Cardinal) on Feb 10, 2015 at 21:37 UTC

    my $str = 'xyzxyzxyz - Prod Cache Broker - 16100';
    print "$1\n" if ($str =~ m/^(\S+)\s+?.+$/);

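    The problem you were having is that (.+) is greedy, so it first consumes the whole string, "xyzxyzxyz - Prod Cache Broker - 16100". The rest of the pattern still has to match, so the engine backtracks: (.+) gives characters back until \s+? can match the space immediately preceding "16100", and finally the .+ grabs "16100". That leaves "xyzxyzxyz - Prod Cache Broker -" in $1.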

    If you wanted the parens to capture only up to (but not including) the first space, just tell it to match non-space characters.
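
    To see where each part of the original pattern ends up, here is a small sketch that wraps \s+? and the trailing .+ in capture groups of their own. The extra groups and the variable names ($greedy, $gap, $tail, $first) are additions for illustration only and are not part of the original code.

```perl
use strict;
use warnings;

my $str = 'xyzxyzxyz - Prod Cache Broker - 16100';

# Original pattern, with two extra capture groups added so that the text
# matched by \s+? and by the trailing .+ becomes visible.
my ($greedy, $gap, $tail) = $str =~ m/^(.+)(\s+?)(.+)$/;
print "greedy: '$greedy'\n";
print "gap:    '$gap'\n";
print "tail:   '$tail'\n";

# Corrected pattern: \S+ cannot cross a space, so the capture stops
# at the end of the first run of non-space characters.
my ($first) = $str =~ m/^(\S+)\s+?.+$/;
print "first:  '$first'\n";
```

    With the sample string, the greedy group holds everything up to the last space, and the \S+ version holds only the leading token.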


    Dave

Re: minimal matching regex
by Corion (Patriarch) on Feb 10, 2015 at 21:36 UTC

    See perlre for the non-greedy match operators, or maybe better, only capture the things you want to keep instead of capturing everything and later on figuring out what you don't want to keep.

    How about using \w+ instead of .+ in your first capture?
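
    Both suggestions can be sketched side by side. This is only an illustration against the sample string from the question; the variable names $lazy and $word are made up here, and \w+ assumes the part you want contains nothing but word characters (letters, digits, underscore).

```perl
use strict;
use warnings;

my $str = 'xyzxyzxyz - Prod Cache Broker - 16100';

# Non-greedy quantifier: .+? starts with one character and only grows
# until the rest of the pattern (\s+.+$) is able to match.
my ($lazy) = $str =~ m/^(.+?)\s+.+$/;

# Narrower character class: \w+ stops at the first non-word character,
# so nothing after the leading token is ever captured.
my ($word) = $str =~ m/^(\w+)/;

print "lazy: '$lazy'\n";
print "word: '$word'\n";
```

    Note that the two are not interchangeable in general: if the leading token contained a hyphen or other punctuation, \w+ would stop there, while .+? would continue up to the first whitespace.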

Re: minimal matching regex
by Laurent_R (Canon) on Feb 10, 2015 at 22:32 UTC
    Perhaps simply this:
    $ perl -e 'my $str = "xyzxyzxyz - Prod Cache Broker - 16100"; print "$str\n$1\n" if $str =~ /(\w+)/;'
    xyzxyzxyz - Prod Cache Broker - 16100
    xyzxyzxyz

    Je suis Charlie.
Re: minimal matching regex
by AnomalousMonk (Archbishop) on Feb 10, 2015 at 22:11 UTC

    Update: Disregard. Misread the OP. You say you do want only 'xyzxyzxyz'. Duh.

    You say what you don't want, 'xyzxyzxyz', but don't say what you do want. I'm going to assume it's the phrase 'Prod Cache Broker'. Here's one approach. The definition of the $word regex can be tuned to match the data you're handling.

    c:\@Work\Perl\monks>perl -wMstrict -le "my $str = 'xyzxyzxyz - Prod Cache Broker - 16100'; ;; my $word = qr{ [[:alpha:]]+ }xms; ;; my ($phrase) = $str =~ m{ - \s+ ($word (?: \s+ $word)*) }xms; print qq{'$phrase'}; "
    'Prod Cache Broker'


    Give a man a fish:  <%-(-(-(-<