salmonix has asked for the wisdom of the Perl Monks concerning the following question:

Dear all,

I have a text file that contains page numbers. Those should be all digits (\d+), but there are typos like \d+o+, where o stands for 0 (e.g. 2oo). I am looking for a way to s/// those o's to 0s.

The rule is if an o follows a \d, make it 0 - for each o, in one step. Perhaps the solution is trivial. I do not know. But I am lost somewhere.

I have this: s/(?<=\d|o)o/0/g, but it also changes an o that follows another o in an ordinary word, like foo -> fo0. And that's not good.
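A minimal sketch of why that attempt misfires, assuming nothing beyond the regex quoted above. The sub name `naive_fix` and the sample strings are made up for illustration. The lookbehind accepts either a digit or an o, so the second o of any "oo" qualifies even when no digit is nearby.

```perl
use strict;
use warnings;

# The attempt from the question, wrapped in a (hypothetical) sub:
# replace an "o" that is preceded by a digit or by another "o".
sub naive_fix {
    my ($text) = @_;
    $text =~ s/(?<=\d|o)o/0/g;
    return $text;
}

# "2oo" comes out as intended, but "foo" is changed as well,
# because its second "o" is preceded by an "o".
for my $sample ('2oo', 'foo', 'page 1oo') {
    printf "%-10s -> %s\n", $sample, naive_fix($sample);
}
```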

(A bit later)

Thank you all, the ideas really pushed me in the right direction. The /e flag simply escaped my attention, and so did /i.

Pacem fortunamque, Domini!

Re: s/// for \dOO for typos
by JavaFan (Canon) on Oct 28, 2010 at 09:41 UTC
    Does this do what you want?
    use 5.010;
    use strict;
    use warnings;

    while (<DATA>) {
        chomp;
        my $copy = $_;
        s/\d\K(o+)/0 x length $1/eg;
        say "$copy -> $_";
    }

    __DATA__
    o
    1o
    123ooo
    12oo123ooo56
    abcooo1o

    Output:

    o -> o
    1o -> 10
    123ooo -> 123000
    12oo123ooo56 -> 120012300056
    abcooo1o -> abcooo10
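For readers on a Perl older than 5.10, where \K is not available, the same substitution can be sketched with an extra capture group. The sub names `fix_with_keep` and `fix_with_capture` are hypothetical, chosen only for this comparison. Like the code above, neither handles a capital O.

```perl
use strict;
use warnings;

# The substitution from the reply above, wrapped in a sub.
# \K keeps the digit out of the replaced text; it needs Perl 5.10+.
sub fix_with_keep {
    my ($text) = @_;
    $text =~ s/\d\K(o+)/'0' x length($1)/eg;
    return $text;
}

# Same effect without \K: capture the digit and put it back.
sub fix_with_capture {
    my ($text) = @_;
    $text =~ s/(\d)(o+)/$1 . ('0' x length($2))/eg;
    return $text;
}

for my $sample ('1o', '12oo123ooo56', 'abcooo1o') {
    print "$sample -> ", fix_with_keep($sample),
          " / ", fix_with_capture($sample), "\n";
}
```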
Re: s/// for \dOO for typos
by Marshall (Canon) on Oct 28, 2010 at 11:51 UTC
    Another version that may do what you want.
    #!/usr/bin/perl -w
    use strict;

    my @test = qw (20o3oo something 02oo-s 2o0o0o-s 2OoOO);

    foreach my $x (@test)
    {
        print "$x becomes ";
        $x =~ s|(\d[\doO]+)|my $a = $1; $a=~tr/oO/00/; $a|e;
        print "$x\n";
    }
    __END__
    20o3oo becomes 200300
    something becomes something
    02oo-s becomes 0200-s
    2o0o0o-s becomes 200000-s
    2OoOO becomes 20000
    Update: I had neglected the capital O.

    A few notes that might help OP on how I arrived at this...
    Doing a direct substitution of one char for another is a natural job for tr. However, we don't want to run amok changing every o or O to a zero (something shouldn't become s0mething). The substitution operator lets a regex pick out the character sequence that tr is applied to. You might need to tighten the match with, say, "page" as an additional qualifier, but maybe not: you could just run this s/// on every line and that would probably be fine. It is completely allowed to have Perl code calculate the thing that is going to be substituted! Very cool. And that is what the e modifier at the end of s|...|...|e is all about.
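As an illustration of the "page" qualifier mentioned in the notes, here is a sketch that only touches a digit run following the word "page". That the page numbers are introduced by that word is purely an assumption, since the thread does not show the real file format, and `fix_page_numbers` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: fix o/O typos only in a number that follows "page".
# Assumes the text marks page numbers as "page <number>".
sub fix_page_numbers {
    my ($line) = @_;
    # The replacement side is Perl code (the e modifier): copy the
    # captures, run tr on the number part, and return the rebuilt text.
    $line =~ s{(\bpage\s+)(\d[\doO]*)}{
        my ($prefix, $number) = ($1, $2);
        $number =~ tr/oO/00/;
        $prefix . $number;
    }gie;
    return $line;
}

print fix_page_numbers("see page 2oo, not room 1o1"), "\n";
```

In this sketch "room 1o1" is left alone because it is not preceded by "page"; whether that is desirable depends on the actual data.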

Re: s/// for \dOO for typos
by ig (Vicar) on Oct 28, 2010 at 12:14 UTC

    I wondered about cases like 2o2 and 2O2. It was harder than I thought it would be to change these, but the following does:

    while(<DATA>) {
        print;
        s/\b([\do]+)\b/(my $x = $1) =~ s!o!0!gi; $x;/gie;
        print "--> $_";
    }
    __DATA__
    lo123
    asdf lone fs
    2oo
    3O7
    o123
    o123b
    321o
    asf o987o 123 23o8
    123 o4o4ooo

    which produces

    lo123
    --> lo123
    asdf lone fs
    --> asdf lone fs
    2oo
    --> 200
    3O7
    --> 307
    o123
    --> 0123
    o123b
    --> o123b
    321o
    --> 3210
    asf o987o 123 23o8
    --> asf 09870 123 2308
    123 o4o4ooo
    --> 123 0404000
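One thing to watch with a pattern like \b([\do]+)\b under /i: it does not require a digit, so a standalone token made only of o's (for example "Oo") would also be turned into zeros. A possible variant, sketched here with a lookahead that insists on at least one digit in the token, avoids that. The sub name `fix_tokens` is hypothetical, and this is a variation on the reply above, not the code it posted.

```perl
use strict;
use warnings;

# Variant sketch: same token-based idea, but (?=[\do]*\d) demands
# at least one digit somewhere in the token before anything is changed.
sub fix_tokens {
    my ($line) = @_;
    $line =~ s{\b(?=[\do]*\d)([\do]+)\b}{ (my $fixed = $1) =~ tr/oO/00/; $fixed }gie;
    return $line;
}

print fix_tokens("Oo 2oo 3O7 o123b o4o4ooo"), "\n";
```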
      Try this:

      #!/usr/bin/perl
      use strict;

      my $new = "100o798 boonanas woot!";
      print "Original line: $new\n";
      $new =~s/([0-9]+)o/$1\Q0\E/gi;
      print "With ~s: $new\n";
      exit;


      Thanks,
      Dawn
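The \Q0\E in that replacement appears to be there so the literal 0 is not read as part of the variable name ($10 would mean capture group ten). A common alternative spelling is ${1}0, sketched below with the hypothetical sub name `fix_single_o`. Note that this pattern replaces only one o per digit run, so "2oo" becomes "20o" rather than "200".

```perl
use strict;
use warnings;

# ${1}0 is capture group one followed by a literal zero;
# the braces stop Perl from parsing it as $10.
sub fix_single_o {
    my ($text) = @_;
    $text =~ s/([0-9]+)o/${1}0/gi;
    return $text;
}

print fix_single_o("100o798 boonanas woot!"), "\n";
print fix_single_o("2oo"), "\n";
```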
Re: s/// for \dOO for typos
by use perl::always (Initiate) on Oct 28, 2010 at 09:29 UTC
    If I understand you correctly,
    You want to change
    d-0 into d-o
    Anyway, if this is what you want, try
    s/d\-0/d\-o/g
    If that's _not_ what you're after, maybe I misunderstood you.

    HTH

    --chris

    Shameless self promotion follows PerlWatch