gisrob has asked for the wisdom of the Perl Monks concerning the following question:

Monks - I have a text file that I'm trying to open and reformat line by line. The file's contents look like this:
1996.40637
1996.41064
1996.41199
1996.41467
1996.41882
I want to write a regex that takes one of these dates, removes the decimal point, and removes the first two digits, i.e. turns 1996.41064 into 9641064. I can get the pattern right, but not the replacement. Can you help fix the statement
$new = s/\d+.\d/replacement/g;
so that the replacement string works? I tried using substr on the value, but this didn't work, i.e.
$new = s/\.//g; $new = substr($new, 2);
Thanks

Replies are listed 'Best First'.
Re: substr on $_
by japhy (Canon) on Sep 28, 2001 at 00:31 UTC
    After seeing the suggested regexes, I'm going to offer a slight change:
    s/\d+(\d\d)\./$1/; # no need for the (\d+) at the end
                       # being replaced by $2 (itself!)

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      Alternatively, no need to capture anything. Although alternation is generally an efficiency pig, for these small strings it tests a bit quicker.

      s/^\d\d|\.//g;
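
      For anyone who wants to check the "tests a bit quicker" claim on their own machine, here is a rough sketch using the core Benchmark module. The sub names with_capture and with_alternation are made up for illustration, and the timings will vary by Perl version and hardware, so treat the result as a hint, not a verdict.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample values taken from the original question.
my @dates = qw(1996.40637 1996.41064 1996.41199 1996.41467 1996.41882);

# Hypothetical helper: the capturing version (keep the last two year digits).
sub with_capture {
    my ($date) = @_;
    $date =~ s/\d+(\d\d)\./$1/;
    return $date;
}

# Hypothetical helper: the alternation version (delete leading two digits and the dot).
sub with_alternation {
    my ($date) = @_;
    $date =~ s/^\d\d|\.//g;
    return $date;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    capture     => sub { with_capture($_)     for @dates },
    alternation => sub { with_alternation($_) for @dates },
});
```

      Both subs work on a copy of the value, so the sample list is not modified between runs.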
Re: substr on $_
by Ovid (Cardinal) on Sep 28, 2001 at 00:15 UTC
(crazyinsomniac: shock) Re: substr on $_
by crazyinsomniac (Prior) on Sep 28, 2001 at 12:02 UTC
    I am shocked and amazed to hear such blasphemy.

    The title of the node is "substr on $_ by gisrob" yet all I see is variations of $new = s/\d+.\d/replacement/g;.

    Talk about a shared mentality, didn't anyone notice the word substr?? No, it's not just gibberish, it is an actual perl function.

    Observe:

    #!/usr/bin/perl -w
    use strict;

    while(<DATA>) {
        my $dot_ix = index $_ ,'.';
        printf "%40s: %s\n", "the string before trimming", $_;
        printf "%40s: %s\n", "dot+", substr $_, $dot_ix;
        printf "%40s: %s\n", "dot-2", substr($_, $dot_ix - 2);
        printf "%40s: %s\n", "trimmed fat",
            substr($_, $dot_ix - 2, $dot_ix -1, '');
        # $dot_ix -1 since it's the num of chars, not index
        printf "%40s: %s\n", "the string after trimming", $_;
    }
    __END__
    1996.40637
    1996.41064
    1996.41199
    1996.41467
    1996.41882
    And the output:
    F:\dev>perl sub.pl
                  the string before trimming: 1996.40637
    
                                        dot+: .40637
    
                                       dot-2: 96.40637
    
                                 trimmed fat: 96.
                   the string after trimming: 1940637
    
                  the string before trimming: 1996.41064
    
                                        dot+: .41064
    
                                       dot-2: 96.41064
    
                                 trimmed fat: 96.
                   the string after trimming: 1941064
    
                  the string before trimming: 1996.41199
    
                                        dot+: .41199
    
                                       dot-2: 96.41199
    
                                 trimmed fat: 96.
                   the string after trimming: 1941199
    
                  the string before trimming: 1996.41467
    
                                        dot+: .41467
    
                                       dot-2: 96.41467
    
                                 trimmed fat: 96.
                   the string after trimming: 1941467
    
                  the string before trimming: 1996.41882
                                        dot+: .41882
                                       dot-2: 96.41882
                                 trimmed fat: 96.
                   the string after trimming: 1941882
    
    UPDATE: OOps, I think funny ;D
    #!/usr/bin/perl -w
    use strict;

    while(<DATA>) {
        my $dot_ix = index $_ ,'.';
        printf "%40s: %s\n", "the string before trimming", $_;
        printf "%40s: %s\n", "dot+", substr $_, $dot_ix;
    # oops, i think funny
    #    printf "%40s: %s\n",
    #           "dot-2",
    #           substr($_, $dot_ix - 2);
        printf "%40s: %s\n", "string - 2", substr($_, 2);
    #    printf "%40s: %s\n",
    #           "trimmed fat",
    #           substr($_, $dot_ix - 2, $dot_ix -1, '');
    #    # $dot_ix -1 since it's the num of chars, not index
        printf "%40s: %s\n", "trimmed fat", substr($_, $dot_ix, 1, '');
        printf "%40s: %s\n", "trimmed fat", substr($_, 0, 2, '');
        printf "%40s: %s\n", "the string after trimming", $_;
    }
    __END__
    1996.40637
    1996.41064
    1996.41199
    1996.41467
    1996.41882
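
    Stripped of the diagnostic printf calls, the same two substr steps can be wrapped in a small sub. This is only a sketch: the name trim_date is invented here, and returning the value unchanged when there is no dot is an assumption about what the caller would want.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the '.' and the first two characters using substr only.
sub trim_date {
    my ($date) = @_;
    my $dot_ix = index $date, '.';
    return $date if $dot_ix < 0;      # assumption: leave values without a dot alone
    substr($date, $dot_ix, 1, '');    # 4-arg substr: replace the dot with nothing
    substr($date, 0, 2, '');          # then remove the two leading digits
    return $date;
}

print trim_date('1996.41064'), "\n";  # prints 9641064
```

    The dot is removed first so that its index is still valid; removing the leading digits first would shift it by two.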

     
    ___crazyinsomniac_______________________________________
    Disclaimer: Don't blame. It came from inside the void

    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

      Well, in order to appease crazyinsomniac's shock at the lack of substr() solutions, here's a fun little version:

      #!/usr/bin/perl -w
      use strict;

      $_ = 'x' x 11;
      my($x,$y) = \(substr($_,2,2), substr($_,5));
      while(<DATA>){
          print $$x,$$y;
      }
      __END__
      1996.40637
      1996.41064
      1996.41199
      1996.41467
      1996.41882
(jeffa) Re: substr on $_
by jeffa (Bishop) on Sep 28, 2001 at 00:16 UTC
    Here is one way:
    use strict;

    # look ma! no $_! ;)
    while (<DATA>) {
        s/\d\d(\d\d)\.(\d+)/$1$2/;
        print;
    }
    __DATA__
    1996.40637
    1996.41064
    1996.41199
    1996.41467
    1996.41882

    UPDATE: ack! i see it now: you want =~, not = for your substitution lines.

    Here is my while loop from above with the implicit $_ spelled out:

    while ($_ = <DATA>) {
        $_ =~ s/\d\d(\d\d)\.(\d+)/$1$2/;
        print $_;
    }

    UPDATE 2: re japhy
    DOH! ;)

    jeffa

      Cool - thanks.
Re: substr on $_
by virtualsue (Vicar) on Sep 28, 2001 at 17:39 UTC

    I'm pretty big on checking the validity of the data that I get as well as doing the correct kind of actions with it. Perl is naturally magnificent at doing both in one fell swoop:

    #!/usr/local/bin/perl -w
    use strict;

    while (my $val = <DATA>) {
        chomp $val;
        print "Read: $val - ";
        # Only accept input values which consist of NNNN.NNNNN, one per line
        if ($val =~ s/^\d{2}(\d{2})\.(\d{5})$/$1$2/) {
            print "format OK, changed to $val.\n";
        }
        else {
            print "item $val not in YYYY.NNNNN format.\n";
        }
    }
    __DATA__
    1996.40637
    1996.41064
    1996.41199
    1996.41467
    1996.41882
    11996.41882
    1997.418828
    1998.4128
Re: substr on $_
by dragonchild (Archbishop) on Sep 28, 2001 at 00:14 UTC
    If you assign the result of a substitution regex, you don't get the new string. Regex substitutions are done in place; the return value is the number of substitutions made, not the new string.

    It sounds like you want to do too many operations on one line. You're not charged by the character, my friend! Just do your assignment, then your substitution.

    my $new = $_;
    $new =~ s/foo/bar/g;
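
    To make the return-value point concrete, here is a small sketch. The sub names are invented for illustration and the regex is borrowed from jeffa's reply. The second sub relies on the /r modifier, which only exists in Perl 5.14 and later, so it is not something that was available when this thread was written.

```perl
use strict;
use warnings;

# Hypothetical helper: copy first, then substitute on the copy.
sub count_and_copy {
    my ($date) = @_;
    my $copy = $date;
    # s/// edits $copy in place and returns the number of substitutions made
    my $count = ($copy =~ s/\d\d(\d\d)\.(\d+)/$1$2/);
    return ($count, $copy);
}

# Hypothetical helper: with /r (Perl 5.14+) the modified copy is returned instead.
sub reformat_r {
    my ($date) = @_;
    return $date =~ s/\d\d(\d\d)\.(\d+)/$1$2/r;
}

my ($count, $copy) = count_and_copy('1996.41064');
print "count=$count copy=$copy\n";     # prints count=1 copy=9641064
print reformat_r('1996.41064'), "\n";  # prints 9641064
```

    In the original statement, $new = s/.../.../g has no binding operator, so the substitution runs against $_ and $new ends up holding only that count.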

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.