
substitute space to 0

by yueli711 (Sexton)
on Aug 17, 2018 at 15:31 UTC ( [id://1220510]=perlquestion )

yueli711 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I just want to replace each empty field (space) with 0. The fields are separated by tabs. Thanks in advance! For example:

1 2 2 5 4
4 4 4 4 3
4 4 1
1 5 6 4

change to

1 2 0 0 2 0 5 0 0 0 4
4 4 4 0 0 4 0 0 0 3 0
0 0 4 4 0 0 1 0 0 0 0
0 1 5 6 0 4 0 0 0 0 0
#use strict
#use warning
open(IN1, "tmp01") || die "Cannot open this file";
@lines = <IN1>;
$i = 0;
for (@lines) {
    $lines[$i]=~s/\s\s+/\t0\t/g;
    $thislines[$i]=$lines[$i];
    print $thislines[$i];
    $i++;
}
open(OUT, ">tmp02") || die "Cannot open this file";
for $thisline(@thislines){
    print OUT $thisline;
}
close(OUT);
close(IN1);

Replies are listed 'Best First'.
Re: substitute space to 0
by tybalt89 (Monsignor) on Aug 17, 2018 at 15:59 UTC
    #!/usr/bin/perl
    # https://perlmonks.org/?node_id=1220510
    use strict;
    use warnings;

    while( <DATA> )
    {
      s/(^| \K)(?!\d)/0/g;
      print;
    }

    __DATA__
    1 2 2 5 4
    4 4 4 4 3
    4 4 1
    1 5 6 4

      Hello, tybalt89, Thank you so much for your great help.

      The data file is

      1 2 2 5 4
      4 4 4 4 3
      4 4 1
      1 5 6 4

      It comes out

      1 2 2 5 4
      4 4 4 4 3
      0 4 4 1
      0 1 5 6 4

      #!/usr/bin/perl
      # https://perlmonks.org/?node_id=1220510
      use strict;
      use warnings;
      open(DATA,"DATA")||die"cannot open the file: $!\n";
      while( <DATA> )
      {
        s/(^| \K)(?!\d)/0/g;
        print;
      }

        Your perl is too old for \K (it needs 5.10+); try

        #!/usr/bin/perl
        # https://perlmonks.org/?node_id=1220510
        use strict;
        use warnings;

        while( <DATA> )
        {
          s/(^| )(?!\d)/$&0/g;
          print;
        }

        __DATA__
        1 2 2 5 4
        4 4 4 4 3
        4 4 1
        1 5 6 4
Re: substitute space to 0
by AnomalousMonk (Archbishop) on Aug 17, 2018 at 15:49 UTC

    Update: Unfortunately, neither of these approaches handles leading four-blank groups properly. Oh, well... (Update: Also, it seems to me that at some point the OPed example data changed from having groups of ASCII 0x20 blanks to having tabs. Is this really so? Only yueli711 can know. :)

    One way (assuming a "space" is a blank, i.e., ASCII 0x20):

    c:\@Work\Perl>perl -wMstrict -le
    "my $s = '1    2            2        5                4';
     $s =~ s{ ([ ]{4}) (?! \d) }{${1}0}xmsg;
     print qq{'$s'};
     ;;
     my $t = '1    2    0    0    2    0    5    0    0    0    4';
     die 'bad transformation' unless $s eq $t;
    "
    '1    2    0    0    2    0    5    0    0    0    4'

    Update: A slight variation if you have Perl version 5.10+ and the  \K regex operator:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $s = '1    2            2        5                4';
     $s =~ s{ [ ]{4} \K (?! \d) }{0}xmsg;
     print qq{'$s'};
     ;;
     my $t = '1    2    0    0    2    0    5    0    0    0    4';
     die 'bad transformation' unless $s eq $t;
    "
    '1    2    0    0    2    0    5    0    0    0    4'


    Give a man a fish:  <%-{-{-{-<

Re: substitute space to 0
by LanX (Saint) on Aug 17, 2018 at 15:38 UTC
    Is your input tab-delimited, or are you dealing with fixed-width columns?

    I recommend first splitting the columns and then replacing space/undef with 0.
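
    A minimal sketch of that split-then-replace idea, assuming the input is tab-delimited and that a field holding nothing (or only blanks) counts as empty. The helper name fill_empty_fields is made up for illustration and does not come from the thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): replace every empty
# tab-separated field in one line with '0'.
sub fill_empty_fields {
    my ($line) = @_;
    chomp $line;
    # The -1 limit keeps trailing empty fields, which split drops by default.
    my @fields = split /\t/, $line, -1;
    # Assumption: a field with no non-blank character is "empty".
    return join "\t", map { /\S/ ? $_ : '0' } @fields;
}

# Sample lines written with explicit \t escapes so the tabs are visible.
for my $line ("1\t2\t\t\t2\t\t5\t\t\t\t4", "\t\t4\t4\t\t\t1\t\t\t\t") {
    print fill_empty_fields($line), "\n";
}
```

    The -1 limit matters for lines that end in tabs: without it, split silently discards the trailing empty fields and they never get their 0.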

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice

Re: substitute space to 0
by AnomalousMonk (Archbishop) on Aug 18, 2018 at 05:11 UTC

    A hex dump of the download of the OPed example data seems to show that these are tab-separated CSV records in which you need to expand null fields to a single  '0' character. If that's the case, the best advice is to save yourself a headache | many headaches and to use the flexible and reliable Text::CSV module (which will automatically install its cool, fast big brother Text::CSV_XS if it possibly can).
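
    As a rough sketch of the Text::CSV route, assuming the module is installed, the fields are tab-separated, and every empty field should become '0'. The helper name zero_fill_csv is hypothetical; only the constructor options sep_char and binary and the parse, fields, combine, string and error_diag methods are taken from the module itself:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# Assumption: tab is the field separator.
my $csv = Text::CSV->new({ sep_char => "\t", binary => 1 })
    or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

# Hypothetical helper (not from the thread): parse one line,
# turn empty fields into '0', and build the line again.
sub zero_fill_csv {
    my ($line) = @_;
    chomp $line;
    $csv->parse($line) or die "Cannot parse line: " . $csv->error_diag;
    my @fields = map { length($_) ? $_ : '0' } $csv->fields;
    $csv->combine(@fields) or die "Cannot combine fields";
    return $csv->string;
}

print zero_fill_csv("4\t4\t4\t\t\t4\t\t\t\t3\t"), "\n";
```

    For purely numeric data like this the module is arguably more than is needed, but it keeps working if quoted fields or embedded separators ever show up in the file.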

    However, if a "pure" regex solution is needed, here's my approach. (A split/fixup/join approach might be easier to understand, but I'm too tired right now to take that on. :) Tested under Perl versions 5.8.9 and 5.10.1.4, but some alternate versions of the critical substitution regex need version 5.10+ for the  \K operator.

    Output:


    Give a man a fish:  <%-{-{-{-<

Re: substitute space to 0
by kcott (Archbishop) on Aug 18, 2018 at 09:40 UTC

    G'day yueli711,

    Update: This approach has issues with some of the other data. See ++AnomalousMonk's feedback re these problems.

    I believe applying this regex to each line will provide what you want.

    s/\t(?=\t)/\t0/g

    My test (just using the data from the first line you posted):

    $ perl -le 'my $x = "1\t2\t\t\t2\t\t5\t\t\t\t4"; print $x; $x =~ s/\t(?=\t)/\t0/g; print $x'
    1       2                       2               5                               4
    1       2       0       0       2       0       5       0       0       0       4

    Note: As you can see, my test used just tabs. The output lined up as shown above, both from my command line and when pasted into the textarea; however, the browser made a mess of it when rendered. I've actually changed all the output tabs to a series of spaces: what I've now posted is just a true representation of what the output looked like, not a verbatim copy of the actual output itself.

    — Ken

      s/\t(?=\t)/\t0/g

      Unfortunately, that regex doesn't work with some of the other OPed example data strings, namely those that have leading and/or trailing tabs, e.g., "4\t4\t4\t\t\t4\t\t\t\t3\t" and "\t\t4\t4\t\t\t1\t\t\t\t". (It took me a while, but I finally figured out that all that whitespace was due to tab characters. I suspect that yueli711 changed the posted example data, too. Oh, well... :)
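
      To make the difference concrete, here is a small sketch that runs the lookahead-only regex and a two-sided variant over those two strings. The two-sided regex and the helper name zero_fill are illustrative assumptions, not code posted by anyone in this thread. It inserts a 0 at every position that has no field character on either side, so it also covers an empty first or last field:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): put a '0' at every position
# that is both the start and the end of a tab-separated field.
sub zero_fill {
    my ($line) = @_;
    # (?<![^\t])  : nothing but a tab (or the start of the string) before
    # (?![^\t\n]) : nothing but a tab, a newline, or the end of the string after
    $line =~ s/(?<![^\t])(?![^\t\n])/0/g;
    return $line;
}

for my $line ("4\t4\t4\t\t\t4\t\t\t\t3\t", "\t\t4\t4\t\t\t1\t\t\t\t") {
    # The lookahead-only regex quoted above, for comparison.
    (my $lookahead_only = $line) =~ s/\t(?=\t)/\t0/g;
    print "lookahead only: $lookahead_only\n";
    print "both sides:     ", zero_fill($line), "\n";
}
```

      The lookahead-only version needs a tab on the left of every empty field and another on the right, which is exactly what the first and last fields of a line lack. Both lookarounds here are single-character, so the sketch should not need \K or Perl 5.10.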


      Give a man a fish:  <%-{-{-{-<

        G'day AnomalousMonk,

        ++ Thanks for advising of that problem. I've updated my post.

        — Ken
