SmokeyB has asked for the wisdom of the Perl Monks concerning the following question:

Hey All! I've searched and searched (and maybe I didn't search hard enough) but I came up empty-handed. I was wondering if there is a way to remove the tabs and white space from a line, except when they are inside quotes. For example, I would like:
my $thisRocks = " That's what I said!";
to look like:
my$thisRocks=" That's what I said!";
I've found solutions for splitting on spaces except in quotes, but I was wondering if there is a way to do so with a substitution, or any other command. Cheers!

Re: Remove Tabs and white space from a line except in Quotes
by Zaxo (Archbishop) on Jun 26, 2003 at 15:18 UTC
    I've found solutions for splitting on spaces except in quotes,...

    Almost there, just do:

      my $result = join '', @the_so_split_array;

    Be careful if you apply this to perl code, as your example suggests. Some whitespace is necessary: $foo or die is not the same as $fooordie.
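A minimal sketch of the split-then-join idea, assuming only double quotes matter and no backslash escapes. The helper name split_outside_quotes is hypothetical, standing in for whichever quote-aware split was found earlier:

```perl
use strict;
use warnings;

# Hypothetical helper: split on spaces and tabs, except inside double quotes.
# Call it in list context; it returns the pieces in order.
sub split_outside_quotes {
    my ($line) = @_;
    # Each piece is a quoted string, a run of non-blank non-quote
    # characters, or a stray (unbalanced) quote character.
    return $line =~ /("[^"]*"|[^ \t"]+|")/g;
}

my @parts = split_outside_quotes(q{my $thisRocks = " That's what I said!";});
print join('', @parts), "\n";    # my$thisRocks=" That's what I said!";

# The caveat in action: whitespace that Perl needs is lost as well.
print join('', split_outside_quotes('$foo or die')), "\n";    # $fooordie
```

The second call shows why this is only safe on text where unquoted whitespace carries no meaning.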

    After Compline,
    Zaxo

Re: Remove Tabs and white space from a line except in Quotes
by chip (Curate) on Jun 26, 2003 at 15:38 UTC
    If you're not interested in backslash processing, this works, albeit not particularly efficiently:

      s{ [ \t]* ( (?: " .*? " )? ) }{$1}xg;

        -- Chip Salzenberg, Free-Floating Agent of Chaos

      Thanks bro! This is a great start! Everyone else too, those suggestions will put me on the right track! Cheers!
      EDITED: Do you know what, I made a mistake. There are still some problems with what I posted, but I didn't realize it in time. Please ignore this!
      Hey again, I took what you started with, modified it to account for backslash-escaped quotes, and came up with this:
      s/[ \t]*((?:"(?:(?>[^\\"]*)).*")?)/$1/xg;
      It works just the way I want it to! I don't know how efficient it is, but it does the job. My next stumper is how to incorporate single quotes in as well? Any thoughts?
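For the follow-up about backslash-escaped quotes and single quotes, one possible approach is to match either a run of whitespace or a whole quoted string, and put back only the quoted string. This is a sketch, not what chip or SmokeyB posted; the helper name strip_ws_keep_quotes is hypothetical, and it assumes quotes are balanced and that a backslash escapes the next character inside either kind of quote:

```perl
use strict;
use warnings;

# Hypothetical helper: remove spaces and tabs that fall outside
# "double" or 'single' quoted strings.
sub strip_ws_keep_quotes {
    my ($line) = @_;
    $line =~ s{
        [ \t]+                        # unquoted whitespace run, dropped
      |
        (                             # group 1: a whole quoted string
            " (?: [^"\\] | \\. )* "   #   double-quoted, escapes allowed
          |
            ' (?: [^'\\] | \\. )* '   #   single-quoted, escapes allowed
        )
    }{ defined $1 ? $1 : '' }gex;     # keep quoted strings, drop the rest
    return $line;
}

print strip_ws_keep_quotes(q{my $thisRocks = " That's what I said!";}), "\n";
print strip_ws_keep_quotes(q{print "a \"b\" c" , 'd e' ;}), "\n";
```

Characters that are neither whitespace nor the start of a quoted string are never matched, so they pass through untouched. An apostrophe in unquoted text (as in don't) would be read as the start of a single-quoted string, so this suits code-like input better than free prose.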
Re: Remove Tabs and white space from a line except in Quotes
by BrowserUk (Patriarch) on Jun 26, 2003 at 17:28 UTC

    Here's a slightly different approach. I don't know how it works out for efficiency, but it doesn't require any backtracking.

    $_ ='the quick "brown fox" jumps over the "lazy dog"';
    $in = 0;
    s[ (?>(.)) (?{ $in = ~ $in if $1 eq '"' }) ([\t ]) ] [ $in ? $1.$2 : $1 ]gex;
    print;

    thequick"brown fox"jumpsoverthe"lazy dog"

    You didn't specify whether you wanted the quotes removed, so I left them in.
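The same toggle-a-flag idea can also be written as a plain character loop, without embedded regex code. This is a sketch under the same assumptions (double quotes only, no escaped quotes); the helper name strip_outside_quotes is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: walk the string one character at a time, flipping
# a flag at every double quote and dropping blanks while the flag is off.
sub strip_outside_quotes {
    my ($line) = @_;
    my $in_quotes = 0;
    my $out       = '';
    for my $ch (split //, $line) {
        $in_quotes = !$in_quotes if $ch eq '"';
        next if !$in_quotes && ($ch eq ' ' || $ch eq "\t");
        $out .= $ch;
    }
    return $out;
}

print strip_outside_quotes('the quick "brown fox" jumps over the "lazy dog"'), "\n";
# thequick"brown fox"jumpsoverthe"lazy dog"
```

Because every character is inspected exactly once, runs of several spaces and leading whitespace are handled the same way as single spaces.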


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Re: Remove Tabs and white space from a line except in Quotes
by hardburn (Abbot) on Jun 26, 2003 at 15:20 UTC

    I haven't done much with real parsing beyond simple regexen, but I believe what you really want to do is break the line into tokens:

    my $to_parse = q(my $thisRocks = " That's what I said!";);
    my @tokens = get_tokens($to_parse);

    # The elements in @tokens will look something like:
    #
    # 0: my
    # 1: $thisRocks
    # 2: =
    # 3: " That's what I said!"
    # 4: ;
    #
    # Now you can just print them out
    print @tokens;

    __OUTPUT__
    my$thisRocks=" That's what I said!";

    Of course, I'm leaving out the implementation of get_tokens() (as you can see above, it takes in a string to parse and returns a list with each token in a single element). This is the part that goes beyond my experience, so I'm leaving the implementation alone. Hopefully, this will be enough for you to dig up the rest of the information. I do know that the relevant code for how perl breaks it up is in the perl source code, in toke.c.
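One way get_tokens() could be filled in for simple lines like the example is a single global match. This is only a rough sketch and nowhere near what toke.c does: it assumes double-quoted strings with backslash escapes, treats a sigil plus word as one token, and makes every other non-blank character its own token (so an operator such as == would come back as two tokens):

```perl
use strict;
use warnings;

# Rough, assumed implementation of the get_tokens() named above.
# Call it in list context; it returns the tokens in order.
sub get_tokens {
    my ($src) = @_;
    return $src =~ m{
        " (?: [^"\\] | \\. )* "     # a double-quoted string, escapes allowed
      | [\$\@%]? \w+                # a word, optionally with a sigil
      | \S                          # any other single non-blank character
    }gx;
}

my $to_parse = q(my $thisRocks = " That's what I said!";);
my @tokens   = get_tokens($to_parse);
print "$_\n" for @tokens;       # one token per line
print @tokens, "\n";            # my$thisRocks=" That's what I said!";
```

Whitespace between tokens is simply never matched, which is what makes printing the list back out produce the squeezed line.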

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated