Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

How can I remove any whitespace more than two chars long?
i.e.
this is a sample sentence with more than enough white space

Re: Silly Regex question
by Enlil (Parson) on Feb 28, 2003 at 22:01 UTC
    try: s/\s{2,}/ /g;

    -enlil

      That did it,
      Thanks!
Re: Silly Regex question
by Cody Pendant (Prior) on Feb 28, 2003 at 22:51 UTC
    Just want to say that s/\s\s+/ /g is Another Way To Do It, and to note that both of us are answering the question we think you were really asking, which is "how do I turn a run of two or more whitespace characters into one space?". Right?
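
    A quick sketch of the two forms side by side, in case it helps. The sample string and the variable names are made up for illustration; the original example's spacing did not survive posting, so the runs of spaces here are an assumption.

    ```perl
    use strict;
    use warnings;

    # Assumed stand-in for the original sample, with runs of 2+ spaces.
    my $sample = "this  is   a sample    sentence";

    # Copy the string, then collapse each run of 2+ whitespace chars.
    (my $braces = $sample) =~ s/\s{2,}/ /g;   # quantifier form
    (my $plus   = $sample) =~ s/\s\s+/ /g;    # "one, then one or more" form

    print "$braces\n";                         # this is a sample sentence
    print "both forms agree\n" if $braces eq $plus;
    ```

    Both patterns match exactly the same runs, so the choice between them is a matter of taste.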
    --
    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
    M-J D
Re: Silly Regex question
by mowgli (Friar) on Feb 28, 2003 at 23:07 UTC

    $text =~ s/\s+/ /g;
    should do the trick.

    --
    mowgli

      Despite this having been voted down more than once, this may be a better answer than the others provided so far. I suspect it was voted down because the original question said "more than 2" while this answer "modifies" cases of single spaces.

      I find it interesting that everyone so far seems to have interpreted "more than 2" to mean "2 or more", since all of the answers reduce a double space to a single space. Even the original question didn't include an example of two spaces being left unmodified, so it may be that the original poster actually meant that.

      But the above answer leaves single spaces alone (it replaces them with a single space, which is what we started with) so it isn't any worse of an answer than the others. And it may be better because it will replace a tab with a space (and a tab will often be displayed as multiple spaces).

      But all of the answers will also turn newlines into spaces (at least if they are adjacent to any other whitespace), so I probably wouldn't use any of them.
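
      To make the tab and newline points concrete, here is a small sketch. The input string and variable names are invented for the demonstration.

      ```perl
      use strict;
      use warnings;

      # Hypothetical input: a lone tab, a double space, and a space before a newline.
      my $input = "one\ttwo  three \nfour";

      (my $two_or_more = $input) =~ s/\s{2,}/ /g;
      (my $one_or_more = $input) =~ s/\s+/ /g;

      # $two_or_more is "one\ttwo three four": the lone tab survives,
      # but the " \n" pair is a run of two, so the newline is gone.
      # $one_or_more is "one two three four": the tab became a space too.
      print "newline lost by both\n"
          if $two_or_more !~ /\n/ && $one_or_more !~ /\n/;
      ```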

      If you want to modify whitespace but not \n (nor \r), you can use

      s/[^\S\n\r]{3,}/ /g
      for example.
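
      A minimal sketch of how that pattern behaves, on a made-up string. The class [^\S\n\r] reads as "not non-whitespace, and not \n or \r", i.e. whitespace other than line endings, and {3,} takes "more than 2" literally.

      ```perl
      use strict;
      use warnings;

      # Hypothetical input: a double space, a triple space, three tabs,
      # and single spaces around a blank line.
      my $text = "a  b   c\t\t\td \n\n e";

      (my $out = $text) =~ s/[^\S\n\r]{3,}/ /g;

      # $out is "a  b c d \n\n e": the double space is left alone,
      # the runs of three collapse, and the newlines are untouched.
      print $out, "\n";
      ```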

                      - tye

        If the desire is to squash whitespace--with a definition for 'whitespace' of spaces or tabs--then tr/// might be a better choice

        $string =~ tr[\t ][ ]s
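
        For illustration, a short sketch of that tr/// on an invented string. Tabs and spaces are both translated to a space, and the /s modifier squeezes each resulting run down to a single character.

        ```perl
        use strict;
        use warnings;

        # Hypothetical input mixing spaces, tabs, and a blank line.
        my $string = "a \t  b\t\tc\n\nd";

        (my $squashed = $string) =~ tr[\t ][ ]s;

        # $squashed is "a b c\n\nd": newlines are not in the search list,
        # so they pass through unchanged.
        print $squashed, "\n";
        ```

        Note that this squashes runs of two as well as longer ones, and turns a lone tab into a space.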

        BTW, I can see your logic for assuming the exclusion of \n, but I wonder about \r? If you're going to exclude that, shouldn't you also exclude \f? I also wonder about \cK--does anything respond to vertical tab any more?


        Examine what is said, not who speaks.
        1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
        2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
        3) Any sufficiently advanced technology is indistinguishable from magic.
        Arthur C. Clarke.