zebroz has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, what is the regexp to split a whole word with a period in it? I tried the expression: (split(~([a-zA-Z]+(.[a-zA-Z]*)*), @$contentString)) but I get the error: Bareword "a" not allowed while "strict subs" in use at guster.pl line 219. Regards, and thanks! Zak PS. Sorry for the double post, I just want a timely answer.

Replies are listed 'Best First'.
Re: RegExp quick question
by btrott (Parson) on Apr 04, 2000 at 00:41 UTC
    I'm sorry, but I don't get it. What are you trying to do? If you're trying to match a literal period, you need to escape it:
    my $str = "foo.bar"; if ($str =~ /\./) { print "Contains a period.\n"; }
    But as for that regular expression that you're passing to split... well, I'm not sure where you're going with that, or what you're trying to do. I'm getting compilation errors when I try to run that, because your regexp doesn't look right. This would be a valid regular expression:
    (split(/[a-zA-Z]+(.[a-zA-Z]*)*/, @$contentString))
    However, I don't know if it actually does what you want it to do. Are you trying to split a sentence up into words, where a word can contain a period? If so, try something like this:
    my @words = map { /([\w.]+)/g } @$contentString;
    If not, try to clarify what you're asking for.
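A minimal sketch of the word-extraction idea above, assuming $contentString is a reference to an array of plain strings (the thread does not confirm this, and the sample data is made up). It applies the match to each element with map, because binding =~ directly to an array would act on the array's element count rather than its contents.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the thread's $contentString,
# assumed here to be a reference to an array of plain strings.
my $contentString = [ "see foo.bar today", "and www.example.com too" ];

# Match against each element in turn; with /g in list context the match
# returns every captured "word", where a word may contain periods.
my @words = map { /([\w.]+)/g } @$contentString;

print join("|", @words), "\n";   # see|foo.bar|today|and|www.example.com|too
```

Note that \w also accepts digits and underscores, so this is looser than [a-zA-Z].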
Re: RegExp quick question
by plaid (Chaplain) on Apr 04, 2000 at 00:57 UTC
    I'm also not quite sure what you're going for here. If you have a word such as 'foo.bar', you'd probably just want to do something like
    my($before, $after) = split(/\./, $word);
    Or, if the word might contain multiple periods
    my @tokens = split(/\./, $word);
    If you need to check that there is a period in the word, you should use the if statement posted above before splitting. If you want to verify that a word contains only letters, you'd want to do a matching regular expression prior to the split
    $word =~ /^[a-zA-Z]+\.[a-zA-Z]*$/;
    And potentially throw in some \s* if you want to check for whitespace in it.
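The check-then-split approach above might be sketched as follows. The helper name split_dotted_word is hypothetical, and the \s* at both ends is one assumed way of tolerating surrounding whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the pieces of a word shaped like
# letters.letters, or an empty list when the word has another shape.
sub split_dotted_word {
    my ($word) = @_;

    # Validate first: letters, one literal period, optional letters,
    # with optional whitespace on either side.
    return () unless $word =~ /^\s*[a-zA-Z]+\.[a-zA-Z]*\s*$/;

    # Trim the surrounding whitespace, then split on the literal period.
    $word =~ s/^\s+|\s+$//g;
    return split /\./, $word;
}

my @tokens = split_dotted_word(" foo.bar ");
print "@tokens\n";                      # foo bar

my @rejected = split_dotted_word("foo.b4r");
print scalar(@rejected), "\n";          # 0
```

Because the pattern allows only one period, a word such as foo.bar.baz is rejected here. Loosening that is a design choice the thread leaves open.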
Re: RegExp quick question
by lhoward (Vicar) on Apr 04, 2000 at 02:57 UTC
    First off, unless I'm mistaken, split returns the number of split elements when called in scalar context ($foo = split ...) instead of in list context (@a = split ...).
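A short sketch of the context difference, assuming a reasonably modern Perl (5.12 or later, where split in scalar context simply returns the count). The last part shows one possible reason, not confirmed in the thread, for split handing back "1": an array passed as the second argument is evaluated in scalar context. The @content array is made-up sample data.

```perl
use strict;
use warnings;

my $word = "Foo.Bar.Baz";

my @parts = split /\./, $word;   # list context: the pieces themselves
my $count = split /\./, $word;   # scalar context: how many pieces there are

print "parts: @parts\n";         # parts: Foo Bar Baz
print "count: $count\n";         # count: 3

# Hypothetical one-element array standing in for @$contentString.
my @content = ("Foo. Bar.");

# split takes its second argument in scalar context, so @content turns
# into its element count (1), and that number is what gets split.
my @pieces = split /\./, @content;
print "pieces: @pieces\n";       # pieces: 1
```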
    If I understand your new explanation properly, what you want to do is remove trailing periods from words without affecting "internal" periods (periods that are preceded and followed by a letter). Is that correct?
    Les Howard
    www.lesandchris.com
    Author of Net::Syslog and Number::Spell
Re: RegExp quick question
by Anonymous Monk on Apr 04, 2000 at 02:33 UTC
    Me again. I'm having more trouble, so I'll explain my question again more clearly. I'm trying to remove periods from words like this: "Foo." -> "Foo" and "Foo.Bar." -> "Foo.Bar". Another question: why would split return "1" as a result?
      Are you only trying to remove periods from the ends of words, then? Try this:
      $word =~ s/\.$//;
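A small sketch applying that substitution to the examples from the question. The helper name strip_trailing_period is hypothetical. Only a period at the very end of the word is removed, so internal periods survive.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the word without a final period.
sub strip_trailing_period {
    my ($word) = @_;
    $word =~ s/\.$//;    # \. is a literal period, $ anchors it to the end
    return $word;
}

for my $word ("Foo.", "Foo.Bar.", "Foo.Bar") {
    print "$word -> ", strip_trailing_period($word), "\n";
}
# Foo. -> Foo
# Foo.Bar. -> Foo.Bar
# Foo.Bar -> Foo.Bar
```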
RE: RegExp quick question
by Anonymous Monk on Apr 04, 2000 at 01:23 UTC
    Don't you need a slash between a-z and A-Z, so it looks like a-z/A-Z?
      Nope, a slash in there would effectively match everything in the range a-z, anything in the range A-Z, and the / character.
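To illustrate the point about the slash, here is a minimal sketch comparing the two character classes. The variable names are made up, and qr{} is used only so the slash does not need escaping.

```perl
use strict;
use warnings;

my $letters_only = qr{^[a-zA-Z]+$};    # letters in either case
my $with_slash   = qr{^[a-z/A-Z]+$};   # the same letters, plus a literal /

for my $s ("FooBar", "Foo/Bar") {
    printf "%-8s letters_only=%s with_slash=%s\n",
        $s,
        ($s =~ $letters_only ? "yes" : "no"),
        ($s =~ $with_slash   ? "yes" : "no");
}
# FooBar   letters_only=yes with_slash=yes
# Foo/Bar  letters_only=no with_slash=yes
```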
Re: RegExp quick question
by Anonymous Monk on Apr 04, 2000 at 03:14 UTC
    This is what I was looking for... thanks for everything.
    $contentString = $node->content();
    my @foo = @$contentString;
    for (my $i = 0; $i < @foo; $i++) {
        # Split each element on ". " or on any of the characters ! @ # * $
        foreach my $workgoddarnit (split(/\.\s|\!|\@|\#|\*|\$/, $foo[$i])) {
            print OutFile $workgoddarnit;
        }
    }