Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys. I have a text file that looks like
January 1, 1915 Franck Pourcel
January 3, 1926 George Martin
January 3, 1945 Stephen Stills
January 3, 1946 John Paul Jones
January 4, 1942 John McLaughlin
January 5, 1950 Chris Stein
And I want to replace it with
January 1, 1915 (Birthday) Franck Pourcel - singer
January 3, 1926 (Birthday) George Martin - singer
January 3, 1945 (Birthday) Stephen Stills - singer
January 3, 1946 (Birthday) John Paul Jones - singer
January 4, 1942 (Birthday) John McLaughlin - singer
January 5, 1950 (Birthday) Chris Stein - singer
The number of spaces after the year varies from line to line.

This is what I got so far.

#!/usr/bin/perl
use warnings;
use strict;

open(LOG, "source.txt") or die "error";
my @lines = <LOG>;
close(LOG);

my $cnt = 0;
foreach my $line (@lines) {
    $cnt++;
    $line =~ s/(\d+\s+)(.+)/$1
}
I think I need some help.

Replies are listed 'Best First'.
Re: small regex help
by Zaxo (Archbishop) on Jun 17, 2006 at 00:37 UTC

    You're pretty close,

    for (@lines) {
        s/(\d+\s+)(.+)$/$1(Birthday) $2 - singer/;
        print;
    }
    I don't see why you're counting the lines. You can make that print to a new output file if you want.
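
Writing to a new output file could look something like the sketch below. It is only one way to do it: the sub names `annotate` and `annotate_file` and the script name `annotate.pl` are made up for illustration, and the regex is a variant that moves `\s+` outside the capture so that a varying number of spaces after the year ends up as a single space.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Insert "(Birthday)" after the year and append "- singer".
# \s+ is outside the capture, so any run of spaces after the year becomes one.
sub annotate {
    my ($line) = @_;
    chomp $line;
    $line =~ s/(\d+)\s+(.+?)\s*$/$1 (Birthday) $2 - singer/;
    return $line;
}

# Read $in_name line by line and write the annotated lines to $out_name.
sub annotate_file {
    my ($in_name, $out_name) = @_;
    open(my $in,  '<', $in_name)  or die "Cannot read $in_name: $!";
    open(my $out, '>', $out_name) or die "Cannot write $out_name: $!";
    while (my $line = <$in>) {
        print {$out} annotate($line), "\n";
    }
    close($in);
    close($out) or die "Cannot close $out_name: $!";
}

# Only runs when two file names are given, e.g.:
#   perl annotate.pl source.txt source2.txt
annotate_file(@ARGV) if @ARGV == 2;
```

Reading line by line avoids slurping the whole file into `@lines`, and lexical filehandles with three-argument `open` are generally preferred over bareword handles such as `LOG`.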

    After Compline,
    Zaxo

Re: small regex help
by GrandFather (Saint) on Jun 17, 2006 at 00:44 UTC

    The following seems to be what you are after:

    use strict;
    use warnings;

    my @lines = <DATA>;

    foreach my $line (@lines) {
        # $1 is everything up to the year plus one space, $2 is the name
        next if $line !~ m/^(.*?\d+\s)\s*(.+?)\s*$/;
        print "$1(Birthday) $2 - singer\n";
    }

    __DATA__
    January 1, 1915 Franck Pourcel
    January 3, 1926 George Martin
    January 3, 1945 Stephen Stills
    January 3, 1946 John Paul Jones
    January 4, 1942 John McLaughlin
    January 5, 1950 Chris Stein

    Prints:

    January 1, 1915 (Birthday) Franck Pourcel - singer
    January 3, 1926 (Birthday) George Martin - singer
    January 3, 1945 (Birthday) Stephen Stills - singer
    January 3, 1946 (Birthday) John Paul Jones - singer
    January 4, 1942 (Birthday) John McLaughlin - singer
    January 5, 1950 (Birthday) Chris Stein - singer

    However you probably want to read up a little about sprintf (to clean up the formatting). You might also want to read the Tutorials sections here for an introduction to regexen. Also read the Perl man pages perlretut, perlrequick and perlre.
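
As a rough illustration of the sprintf suggestion, the sketch below pads the date to a fixed width so the "(Birthday)" column lines up. The sub name `format_entry` and the column width of 18 are assumptions for the example (18 is wide enough for a date like "September 30, 1915"), not something taken from the original post.

```perl
use strict;
use warnings;

# Split a line into date and name, then lay them out in fixed-width columns.
sub format_entry {
    my ($line) = @_;

    # The date is everything up to and including the four-digit year.
    my ($date, $name) = $line =~ /^\s*(.*?\d{4})\s+(.+?)\s*$/
        or return;    # skip lines that do not look like "date name"

    # %-18s left-justifies the date in an 18-character column.
    return sprintf '%-18s (Birthday) %s - singer', $date, $name;
}

print format_entry($_), "\n" for (
    "January 1, 1915 Franck Pourcel",
    "January 3, 1946      John Paul Jones",
);
```

Because the date is captured without its trailing whitespace, the spacing in the output depends only on the sprintf width, not on how many spaces the input happened to contain.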


    DWIM is Perl's answer to Gödel
Re: small regex help
by swampyankee (Parson) on Jun 17, 2006 at 01:05 UTC

    Presuming your code was copied correctly, it's not syntactically correct: you're missing the closing slash after the $1. Once that's corrected, if I read it correctly (a very big if!), you'll just replace the year and everything to the right of it with the year and the whitespace that follows it, so the name is lost.
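
To make that concrete, here is a small sketch that runs the original substitution with only the closing slash added, next to a version that puts `$2` back into the replacement. The variable names `$truncated` and `$annotated` are just for the demonstration.

```perl
use strict;
use warnings;

my $line = "January 1, 1915 Franck Pourcel";

# The original substitution with only the missing closing slash added:
# the replacement is just $1, so everything captured by $2 is discarded.
(my $truncated = $line) =~ s/(\d+\s+)(.+)/$1/;

# Putting $2 back into the replacement keeps the name.
(my $annotated = $line) =~ s/(\d+\s+)(.+)/$1(Birthday) $2 - singer/;

# Brackets make the trailing space in the first result visible.
print "[$truncated]\n";
print "[$annotated]\n";
```

Note that `\d+\s+` cannot match at the day of the month, because the digit there is followed by a comma rather than whitespace, so the match lands on the year.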

    Try this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (<DATA>) {
        chomp();
        print "before: $_\n";
        # \s+ sits outside the capture, so any run of spaces after the year collapses to one
        s/(\d+)\s+(.+)/$1 (Birthday) $2 - singer/;
        print "after: $_\n";
    }

    __DATA__
    January 1, 1915 Franck Pourcel
    January 3, 1926 George Martin
    January 3, 1945 Stephen Stills
    January 3, 1946 John Paul Jones
    January 4, 1942 John McLaughlin
    January 5, 1950 Chris Stein

    I think this will make the change you want; the regex is the important bit.

    emc

    e^(π√−1) = −1
Re: small regex help
by sulfericacid (Deacon) on Jun 17, 2006 at 00:35 UTC
    The following code is untested, but probably has a 90-95% chance of working.
    #!/usr/bin/perl
    use warnings;
    use strict;

    open(LOG, "source.txt") or die "Error: $!";
    my @lines = <LOG>;
    close(LOG);

    foreach my $line (@lines) {
        chomp $line;
        # $line aliases the array element, so the substitution updates @lines in place
        $line =~ s/(\d+)\s+(.+)/$1 (Birthday) $2 - singer/;
    }

    open(LOG, "> source2.txt") or die "Error: $!";
    print LOG join("\n", @lines), "\n";
    close(LOG);


    "Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

    sulfericacid
Re: small regex help
by sh1tn (Priest) on Jun 17, 2006 at 14:17 UTC
    # inside the loop, after chomp: ... s/(\d{4})\s+(.+?)\s*$/$1 (Birthday) $2 - singer/ ...