karthik92 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I need to delete a line if it starts with a blank space. My file looks like this:

***************************************
1234 pass 25 30 1
 pass 25 30 2
1965 pass 35 45 1
 pass 35 45 2

use strict;
use warnings;
set LINES=40;
set COLUMNS=80;
open(my $in , "<" , "UR59_rev162_1KLoops_LastRun_SITEC1.txt") or die "can't open input.txt $!";
open(my $out , ">" , "UR59_rev162_1KLoops_LastRun_SITEC1_deleted.txt") or die "can't open out.txt $!";
while(<$in>) {
    if( $_ != /^\s/) {
        print $out $_;
    }
}

But this code gives a warning like "********** is not numeric".

2017-12-22 Athanasius added code and paragraph tags

Re: Delete a line
by 1nickt (Canon) on Dec 22, 2017 at 04:01 UTC

    Hi, please edit your post and enclose your code in code tags as shown in the instructions. Reading instructions and following them, and attention to detail, are important skills in programming.

    Your code:

    if( $_ != /^\s/)
    ... uses the numeric "not equal" comparison operator != and hence you get the helpful warning, because you are using a string where a number is expected. See Equality Operators.

    You want:

    if( $_ !~ /^\s/)
    ... which uses the "not a match" regular expression operator !~. See Simple word matching.
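
    Putting the fix together, a minimal sketch of the filtering step might look like the following. The helper name keep_unindented and the in-memory filehandle are assumptions made for illustration; they stand in for the real input and output files in the question.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical helper: return only the lines that do NOT start with whitespace.
    sub keep_unindented {
        my @lines = @_;
        return grep { $_ !~ /^\s/ } @lines;
    }

    # Sample data modelled on the question. An in-memory filehandle
    # stands in for the real input file.
    my $input = join '',
        "***************************************\n",
        "1234 pass 25 30 1\n",
        " pass 25 30 2\n",
        "1965 pass 35 45 1\n",
        " pass 35 45 2\n";

    open(my $in, '<', \$input) or die "can't open in-memory input: $!";
    my @kept = keep_unindented(<$in>);
    close($in);

    # Prints the row of asterisks and the two lines starting with a digit.
    print @kept;
    ```

    Note that \s also matches tabs and newlines, so tab-indented lines and completely empty lines would be dropped as well.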

    Hope this helps!


    The way forward always starts with a minimal test.
Re: Delete a line
by thanos1983 (Parson) on Dec 22, 2017 at 09:15 UTC

    Hello karthik92,

    Welcome to the Monastery. Well, it looks like fellow Monk 1nickt has provided an answer to your question. Just to include two more alternative solutions:

    #!/usr/bin/perl
    use strict;
    use warnings;
    # use Benchmark qw(:all) ; # WindowsOS
    use Benchmark::Forking qw( timethese cmpthese ); # UnixOS

    my $str = " test of white space";

    my $results = timethese(100000000, {
        'regex'  => sub { $str =~ /^\s/ },
        'substr' => sub { substr($str, 0, 1) eq ' ' },
        'ord'    => sub { ord($str) == 32 },
    }, 'none');
    cmpthese( $results );

    __END__

    $ perl test.pl
                 Rate  regex substr    ord
    regex   9225092/s     --   -44%   -87%
    substr 16556291/s    79%     --   -77%
    ord    73529412/s   697%   344%     --

    I used Benchmark to compare the alternative solutions, and the fastest option in this case seems to be ord.

    Update: Sample of the proposed solution:

    use feature 'say';

    my @strs = ("1234 pass 25 30 1",
                " pass 25 30 2",
                "1965 pass 35 45 1",
                " pass 35 45 2");

    foreach my $sample (@strs) {
        if (ord($sample) == 32) {
            say "Matched: " . $sample;
        }
    }

    __END__

    $ perl test.pl
    Matched:  pass 25 30 2
    Matched:  pass 35 45 2

    Update2: In case you are wondering about ord, it simply reads the first character of the string and returns its numeric value. In the ASCII table, decimal 32 is SPACE; the character "1", for example, would return 49.
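
    As a small illustration of what ord returns, and of one way it differs from the regex test, consider the sketch below. The variable names are made up for this example, and the tab-indented line is an assumption that does not appear in the original data.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # ord returns the numeric value of the first character of its argument.
    my $space_code = ord(" pass 25 30 2");      # 32, a space
    my $digit_code = ord("1234 pass 25 30 1");  # 49, the character "1"
    my $tab_code   = ord("\tpass 25 30 2");     # 9, a horizontal tab

    printf "space=%d digit=%d tab=%d\n", $space_code, $digit_code, $tab_code;

    # A tab-indented line counts as whitespace for the regex,
    # but not for the ord comparison against 32.
    my $tabbed    = "\tpass 25 30 2";
    my $regex_hit = ($tabbed =~ /^\s/)     ? 1 : 0;
    my $ord_hit   = (ord($tabbed) == 32)   ? 1 : 0;

    print "regex=$regex_hit ord=$ord_hit\n";
    ```

    So the ord test is only equivalent to the regex when the leading whitespace is known to be a plain space.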

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
      Thank you very much for this really interesting benchmark, thanos1983, I knew that substr was likely to be faster than a regex for that type of case, but I never thought about using ord for that and did not imagine it would be so significantly faster. I still have to figure out whether it makes a significant difference in my use cases (typically reading a very large file and discarding a sizable fraction of its lines), but I might make good use of that to enhance the performance of some of my programs.

        Hello Laurent_R,

        I love timing things and playing with details. To me it does not make any difference; any solution I choose would work just fine, since I am not handling huge files.

        I am glad that someone else can take advantage of this maybe in the future.

        BR / Thanos


      If you pass $str as an argument to the benchmarked subs, the difference will be markedly less pronounced. Run the bench with time limit instead of loop count, and you'll see the test takes a suspiciously long time.

      The subs in question are too trivial to measure via Benchmark. Yet another example of benchmarking pitfalls...
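
      One possible reading of that suggestion, as a sketch only: the named subs by_regex, by_substr and by_ord are hypothetical, and the core Benchmark module is used here instead of Benchmark::Forking. A negative count tells Benchmark to run each test for at least that many CPU seconds rather than for a fixed number of iterations.

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;
      use Benchmark qw(timethese cmpthese);

      # Hypothetical named subs that receive the string as an argument
      # instead of closing over a file-level variable.
      sub by_regex  { my ($s) = @_; return $s =~ /^\s/ }
      sub by_substr { my ($s) = @_; return substr($s, 0, 1) eq ' ' }
      sub by_ord    { my ($s) = @_; return ord($s) == 32 }

      my $str = " test of white space";

      # -1 means "at least 1 CPU second per test"; 'none' suppresses
      # the per-test timing lines so only the comparison table prints.
      my $results = timethese(-1, {
          'regex'  => sub { by_regex($str) },
          'substr' => sub { by_substr($str) },
          'ord'    => sub { by_ord($str) },
      }, 'none');

      cmpthese($results);
      ```

      The argument passing and sub call add a fixed cost to every variant, which is part of why such tiny operations are hard to compare meaningfully this way.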

Re: Delete a line
by Anonymous Monk on Dec 22, 2017 at 15:43 UTC
    Also, the Linux/Unix command egrep -v can do this directly, without programming. The -v modifier tells it to produce the lines which do not match the specified regex. Simply direct the output to another file, check that it has what you want, and rename it to replace the existing file.

    egrep '^\s+' -v myfile > myfile_edited

    Windows powershell has similarly elegant solutions that avoid the need for programming.