PerlMonks  

trimming space from both sides of a string

by szabgab (Priest)
on Oct 12, 2010 at 09:49 UTC ( [id://864785]=perlmeditation )

The other day there was a lengthy bikeshedding thread on LinkedIn about how to trim spaces from both ends of a string (I think you need to be logged in to see it).

Today I saw another interesting one:

$mechanism =~ s/^\s*\b(.*)\b\s*$/$1/g;
here.

What other interesting ones can we find in open source code? (Share them with links, please.)

Replies are listed 'Best First'.
Re: trimming space from both sides of a string
by moritz (Cardinal) on Oct 12, 2010 at 10:36 UTC
    That doesn't work if the string's non-space characters start with something that's not a word character:
    $ perl -wE '$_ = " - - "; s/^\s*\b(.*)\b\s*$/$1/g; say qq["$_"]'
    " - - "
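A minimal sketch of why this happens; the helper name trim_wb and the sample inputs are made up for illustration. \b only matches next to a word character (\w), so when the first or last non-space character is punctuation the whole regex fails to match and the string is left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the regex quoted above; returns a copy.
sub trim_wb {
    my ($s) = @_;
    $s =~ s/^\s*\b(.*)\b\s*$/$1/g;
    return $s;
}

# Sample inputs are assumptions: one that trims, two that do not.
for my $in ('  foo bar  ', ' - - ', '  (foo)  ') {
    print '"', $in, '" => "', trim_wb($in), '"', "\n";
}
```

Only the first input comes back trimmed; the other two are returned unchanged because no word boundary sits right after the leading whitespace.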

    In general I prefer working over interesting solutions :-). Mine:

    use v6; say ' - - '.trim.perl;
    Perl 6 - links to (nearly) everything that is Perl 6.

      Why not use something like:

      $string =~ s/^\s+|\s+$//g;

      That should cut spaces from start and end and leave the rest.
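As a rough sketch of that one-liner wrapped in a sub (the name trim is a hypothetical helper, not a builtin assumed by this thread), returning a trimmed copy and leaving interior whitespace alone:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a trimmed copy of its argument.
sub trim {
    my ($s) = @_;
    # /g is required so that both alternatives (leading and trailing)
    # get a chance to match.
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

print '"', trim("  - - \t"), '"', "\n";
print '"', trim("  foo  bar  "), '"', "\n";
```

Unlike the \b variant, this does not care whether the string starts or ends with word characters.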

        That's what I always use too. It does what you want (trim leading and trailing spaces), does it well, and does it without being too obscure about what it is doing. I see no reason to complicate this.

      Oh, so not only is that interesting it is also wrong :).

      Or at least I did not understand what it was doing.

Re: trimming space from both sides of a string
by phaylon (Curate) on Oct 12, 2010 at 16:35 UTC
    Does Text-Trim count? :)

    Ordinary morality is for ordinary people. -- Aleister Crowley
Re: trimming space from both sides of a string
by sundialsvc4 (Abbot) on Oct 13, 2010 at 01:24 UTC

    I guess that I’m just a lazy sot.   :-)

    I like to use the Text::Trim module.

    This package gives you the ltrim, rtrim, trim functions that you have come to expect ... and that later Perls actually provide.   Obviously, it is an uncomplicated module and therefore, “not the only way to do it,” but I like it because it helps me to SWIM = Say What I Mean.

Re: trimming space from both sides of a string
by OverlordQ (Hermit) on Oct 12, 2010 at 21:45 UTC
Re: trimming space from both sides of a string
by JavaFan (Canon) on Oct 14, 2010 at 15:35 UTC
    I find trimming space quite boring. There's nothing interesting about it. What I do is:
    $str =~ s/^\s+//; $str =~ s/\s+$//;
    rtrim/ltrim following trivially from it.
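One way the "follows trivially" part might look, as a sketch; the sub names ltrim, rtrim and trim are hypothetical helpers written here for illustration, each returning a modified copy:

```perl
use strict;
use warnings;

# Hypothetical helpers built from the two substitutions above.
sub ltrim { my ($s) = @_; $s =~ s/^\s+//; return $s }   # leading only
sub rtrim { my ($s) = @_; $s =~ s/\s+$//; return $s }   # trailing only
sub trim  { return rtrim(ltrim($_[0])) }                # both ends

print '[', ltrim("  a b  "), ']', "\n";
print '[', rtrim("  a b  "), ']', "\n";
print '[', trim("  a b  "),  ']', "\n";
```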
Re: trimming space from both sides of a string
by Kanji (Parson) on Oct 13, 2010 at 00:30 UTC
      What's wrong with:
      $string =~ s/^\s*(.*?)\s*$/$1/;
      I can't vouch for best speed in benchmarks, but works in any case I've needed such a construct for.

        It certainly works, but it's interesting to see what happens under the covers.

        If you use a regex that doesn't use captures:

        $a = ' fred and bill ';;
        Dump $a;;
        SV = PV(0x15be50) at 0x1b7ad8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x1d3558 " fred and bill "\0
          CUR = 32
          LEN = 40

        $a =~ s[^\s*|\s*$][]g;;
        Dump $a;;
        SV = PV(0x15be50) at 0x1b7ad8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x1d3558 "fred and bill"\0
          CUR = 13
          LEN = 40

        You'll notice that the PV (the pointer to the memory holding the actual string) hasn't changed. Although the leading and trailing whitespace has been "removed", this has been done by juggling a few offsets into the original string.

        Now doing it your way:

        $b = ' fred and bill ';;
        Dump $b;;
        SV = PV(0x13be50) at 0x13e0d8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          PV = 0x1b3558 " fred and bill "\0
          CUR = 32
          LEN = 40

        $b =~ s[^\s*(.*?)\s*$][$1];;
        Dump $b;;
        SV = PVMG(0x207928) at 0x13e0d8
          REFCNT = 1
          FLAGS = (SMG,POK,pPOK)
          IV = 0
          NV = 0
          PV = 0x38b4388 "fred and bill"\0
          CUR = 13
          LEN = 16
          MAGIC = 0x391bda8
            MG_VIRTUAL = &PL_vtbl_mglob
            MG_TYPE = PERL_MAGIC_regex_global(g)
            MG_LEN = -1

        Notice that the PV changed, meaning that it had to calculate the offset and length of the "remainder", allocate a new lump of space to hold it, copy it from the original string to the new one, and then free the old space.

        And in the process, it upgraded the original PV to a PVMG and attached some magic to it, meaning several more allocations and frees. I'm not quite sure what that magic does in this case.

        So, whilst the end result is semantically the same, the route getting there is a lot further around.
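For anyone wanting to repeat the comparison as a script rather than in a REPL, here is a sketch assuming the core Devel::Peek module. The padding in the sample string is an arbitrary assumption, and addresses, flags and buffer reuse depend on the perl version and build (newer perls use copy-on-write string buffers, for instance), so the output may well differ from the dumps above.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

# Sample strings; the amount of padding is an assumption.
my $x = '     fred and bill     ';
my $y = '     fred and bill     ';

Dump($x);                      # before: note the PV address (goes to STDERR)
$x =~ s/^\s*|\s*$//g;          # alternation, no captures
Dump($x);                      # after: compare PV, CUR and LEN

Dump($y);
$y =~ s/^\s*(.*?)\s*$/$1/;     # capture-and-replace variant
Dump($y);                      # after: compare SV type, PV and any MAGIC

print "both give the same string\n" if $x eq $y;
```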

        Sometimes, on a nice day, a slow, meandering route to the shops is a pleasant diversion, but few of us would deliberately take a circuitous route habitually if we knew a better one.

        Of course, if you really want to "do it the right way", you'd replace these simple one liners with a CPAN module like Text::Trim which uses the more efficient two regex method. Of course, by doing so you'd be throwing away the performance gain of the two regex method by a) calling a subroutine; b) having that subroutine do inanities like assigning @_ to itself! (But only if it actually contains something!)

        But hey. It's on CPAN, so it's got to be good right!


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: trimming space from both sides of a string
by mr_mischief (Monsignor) on Oct 14, 2010 at 20:10 UTC

    Updated. Fixed based on the error GrandFather was kind enough to point out.

    This reminds me of a discussion of the very thing from one of the newsgroups. It was almost certainly in comp.lang.perl.misc, about a decade ago, and it included silly benchmarks. Yes, we're talking premature optimization of a simple line or two not likely to be a critical point and all that. I seem to remember that s/^\s*//; s/\s*$//; (which I originally mistyped here as s/^\s.*//; s/\s.*$//;) did very well in the benchmarks and was accepted pretty well as clear over clever. It beat out things involving extra interpolations, substr, etc.

    Now, let me leverage Google Groups to refresh my memory and see if the above conclusion I recall was right. It might be interesting (although still somewhat silly) to see how the benchmarks compare on more modern perls compared to what we had then.

    Here's one such: NEED: Fast, Fast string trim(). Here's another: Removing spaces in a string. It turns out there are several such threads in the newsgroup on this topic between 1996 and 2002, with some later than that. Moreover, more than one went so far as to include benchmarks.

    This obviously doesn't prove the strange ones ever made it into larger projects, but given there were proponents for strange constructions there's a chance.

      It seems likely (without benchmarking) that those two substitutions (s/^\s.*//; s/\s.*$//;) are very efficient at removing leading and trailing white space. However the side effect of removing everything if there is leading or trailing white space may be too much of a price to pay for "efficiency"!

      True laziness is hard work
        oh! Right! No dot in there. s/^\s*//; s/\s*$//; instead of course. s/^\s+//; s/\s+$//; should work as well.
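A small sketch of the difference the stray dot makes; the sample string and variable names are chosen purely for illustration:

```perl
use strict;
use warnings;

my $s = '  fred and bill  ';    # sample input (an assumption)

# With the dot: \s matches one space, then .* swallows the rest of the line.
(my $wrong = $s) =~ s/^\s.*//;

# Without the dot: only the whitespace at each end is removed.
(my $right = $s) =~ s/^\s+//;
$right =~ s/\s+$//;

print "wrong: \"$wrong\"\n";
print "right: \"$right\"\n";
```

The first substitution leaves an empty string, which is the side effect GrandFather describes.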
