eyepopslikeamosquito has asked for the wisdom of the Perl Monks concerning the following question:

If all lines of a block of text contain leading spaces, I want to left justify the block by removing as many leading spaces as possible.

my $s = <<'EOS';
    hello
        there
            test
EOS
my $m = 32000;  # regex /^ {$m}/ blows up if $m > 32766
while ($s =~ /^( *)/mg) { $m = length($1) if length($1) < $m }
$s =~ s/^ {$m}//mg if $m;

Improvements welcome.
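For comparison, here is a minimal sketch of the same idea (find the smallest run of leading spaces, then drop that many characters from every line) that uses substr instead of a {n} quantifier, so the quantifier limit mentioned in the comment above never comes into play. The sub name dedent and the sample text are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: strip the common run of leading spaces from every line.
sub dedent {
    my ($text) = @_;
    # Length of the leading-space run on each line (0 for unindented or empty lines).
    my @widths = map { length } $text =~ /^( *)/mg;
    my $min = @widths ? min(@widths) : 0;
    return $text unless $min;
    # Every line has at least $min leading spaces, so substr is safe here.
    return join '', map { substr $_, $min } split /^/m, $text;
}

# Sample input with assumed indentation of 4, 8 and 12 spaces.
print dedent("    hello\n        there\n            test\n");
```

Only spaces are counted here, as in the original code; tabs are left alone.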

Re: Remove leading spaces (left justify)
by theorbtwo (Prior) on May 17, 2003 at 05:48 UTC
    my $lim = 32700; $s=~s/^ {1,$lim}//mg while ($s=~/^ /);
    Let the RE engine do the work. (You can make this simpler if you don't worry about lines with more than 32k blanks at the start: $s=~s/^ *//mg;.)


    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

Re: Remove leading spaces (left justify)
by aquarium (Curate) on May 17, 2003 at 12:37 UTC
    This is hardly a regex problem, and the regex approach is in fact quite slow and resource hungry, as you're looking lines ahead when there's no need. It's also quite difficult to work any additional formatting of the text into such a hard-working regex. Here is more extensible code, even though it's more long-winded:
    $par_indent = 0;   # or +ve for first line indenting
    $line_indent = 5;  # char pos baseline. +ve for hang indent
    $line = <>;
    ($spaces, $first_line) = $line =~ m/^( *)([^ ].+)/;
    $chop_chars = length $spaces;
    print ' ' x $par_indent . $first_line . "\n";
    while ($line = <>) {
        print ' ' x $line_indent . substr $line,$chop_chars
    }

Re: Remove leading spaces (left justify)
by BrowserUk (Patriarch) on May 17, 2003 at 05:19 UTC

    $s =~ s[^ +][]mg;

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

      That unilaterally removes all leading spaces. I want to maintain any relative indentation. For example:

        hello
          one
        two

      should be transformed to:

      hello
        one
      two

      i.e. remove two spaces from each line in this case because the minimum number of leading spaces in any line is two. Sorry if I was unclear.

        Sorry. I misunderstood you. This would do it.

        $s =~ s[^ ][]mg while not $s =~ m[^[^ ]]mg;
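One caveat with the one-liner above: if $s is empty, or becomes empty because it held nothing but spaces and no newline, the condition never turns false and the loop does not terminate. A hedged sketch of the same one-column-per-pass idea with a length guard added follows; the sub name is hypothetical and the delimiters are changed only for readability.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-column-per-pass loop.
sub strip_common_indent {
    my ($s) = @_;
    # Remove one leading space from every line per pass, and stop as soon as
    # some line starts with a non-space (or the string has become empty).
    $s =~ s/^ //mg while length $s and $s !~ /^[^ ]/m;
    return $s;
}

# Sample input with assumed indentation; the minimum is two spaces.
print strip_common_indent("  hello\n    one\n  two\n");
```

Because [^ ] also matches a newline, an empty line in the middle of the text stops the loop straight away, which agrees with treating that line as having zero leading spaces.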

Re: Remove leading spaces (left justify)
by TVSET (Chaplain) on May 17, 2003 at 07:17 UTC
    For some strange reason, I've always preferred /\s+/. Probably, because I rarely have to deal with spaces only. :)

    Leonid Mamtchenkov aka TVSET

Re: Remove leading spaces (left justify)
by TheDamian (Vicar) on May 19, 2003 at 01:23 UTC
    It's of no help yet, but it might interest folks to know that in Perl 6 you'd just write that as:
    my $s = <<'EOS';
        hello
            there
                test
        EOS
    and the interpreter will strip as many whitespaces off each line as there are before the terminator.
      I read about that, and my instant worry was about the whole world of tabs-vs-spaces. So-called "smart" editors might think that the indent in your example is best served by a tab and some spaces to 'hello', two tabs to 'there', and two tabs and some spaces to 'test'. What is the right amount to remove?

      Since indentation is a visual thing but perl and emacs and vim and msdev.exe will likely not discuss their various rendering strategies, these indented here-docs are likely to get out of whack.
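To make the tabs-vs-spaces worry concrete, here is a small sketch that measures the visual indent of a line by expanding tabs with the core Text::Tabs module. The 8-column tab stop and the helper name indent_width are assumptions for illustration; an editor configured differently would render other widths.

```perl
use strict;
use warnings;
use Text::Tabs;   # core module; exports expand() and $tabstop

# Hypothetical helper: visual width of a line's leading whitespace,
# assuming tab stops every 8 columns.
sub indent_width {
    my ($line) = @_;
    local $tabstop = 8;
    my ($lead) = $line =~ /^([ \t]*)/;
    return length expand($lead);
}

# A tab plus four spaces, two tabs, and two tabs plus four spaces.
printf "%d %d %d\n",
    indent_width("\t    hello"),
    indent_width("\t\tthere"),
    indent_width("\t\t    test");
```

The three lines are 5, 2 and 6 characters of whitespace respectively, so counting characters and counting columns give different answers to "how much indent is common".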

      --
      [ e d @ h a l l e y . c c ]

      Presumably, if the terminator is prefixed with 1 or more tabs, then that same number of tabs will be removed from the other lines (if they are present), and if it has 2 spaces and a tab, the 2 spaces and a tab will be removed from the other lines if present?

      This won't do the right thing if the whitespace is mixed and variable, but it might do 'the right thing' more often than not.

      From what I've read around the Monastery, you'd be forgiven for thinking that HEREDOCS were dangerous, useless animals that should have been deprecated long ago.

      It's nice to see an old favorite of mine reinventing itself. I wonder if this change will satisfy its detractors?

        Yep, the leading whitespace will, by default, be matched exactly. If perl6 detects heterogeneous whitespace, it will detab from the first heterogeneous character, assuming 8-spaced hard tabs extending from the margin.

        And, yes, there will be a pragma to change that default behaviour in various ways (e.g. to 4-space tabs, full-detabbing, no-detabbing, etc. etc.).
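As a rough Perl 5 illustration of the "matched exactly" default, the sketch below removes the terminator's leading whitespace from each body line only where it matches character for character. This is a hypothetical emulation, not Perl 6 itself: the sub name and arguments are invented, and the detabbing of heterogeneous whitespace described above is deliberately not implemented.

```perl
use strict;
use warnings;

# Hypothetical emulation: $indent is the whitespace found before the terminator.
sub strip_terminator_indent {
    my ($body, $indent) = @_;
    # \Q...\E makes the tabs and spaces match literally; lines that do not
    # start with exactly this prefix are left untouched.
    $body =~ s/^\Q$indent\E//mg;
    return $body;
}

# Terminator indent assumed to be one tab followed by two spaces.
print strip_terminator_indent("\t  hello\n\t    there\n  odd\n", "\t  ");
```

The third sample line starts with two spaces rather than a tab, so nothing is removed from it.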

Re: Remove leading spaces (left justify)
by Anonymous Monk on May 18, 2003 at 02:04 UTC
    $s=~s/\n\t\s+/\n/gs;