Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello again, I have the following text file (portion below)
[get4]
$uri=/LRA/RiskApplicationPage.aspx
$method=GET

[get5]
$uri=/LRA/CSS/RiskManagement.css
$method=GET
I need to format it to look like this
[RiskApplicationPage.aspx get4]
$uri=/LRA/RiskApplicationPage.aspx
$method=GET

[RiskManagement.css get5]
$uri=/LRA/CSS/RiskManagement.css
$method=GET
I want to take the text following the last / and copy it to the inside of the get before it. This is how I'm currently attempting it...
#!C:\Perl\bin\perl -w
use Getopt::Std;

$USAGE = <<USAGE;
Usage: Absolute_Path\\file
A script to adjust the request ID and replace it with something a little more descriptive
USAGE
;

# process command line
getopts('x') || die $USAGE;

# get parameters
$script = shift || die $USAGE;

# rename .tst file to a .txt file
chomp ($script);
my $temp = $script;
$temp =~ s/\.tsv$//;
$new_script = join("", $temp, ".txt");
rename ($script, $new_script);

open (DATA, "$new_script") || die "Error: Couldn't open $new_script : $!\n";
while(<DATA>) {
    push @Array, $_;
}
foreach $_ (@Array) {
    if ($_ =~ /^\[get/) {
        $_ = join("", $_, "TEST");
        print $_;
    }
}
Any suggestions? Thanks

Replies are listed 'Best First'.
Re: Need RegEx assistance
by dws (Chancellor) on Jul 12, 2002 at 19:38 UTC
    Two approaches suggest themselves: One is to slurp the entire file into memory, and then use /m or /s to build a regexp that spans lines, doing a substitution on the entire block in one swell s///gs.

    If you can't fit the file into memory, or fear doing regexps that span lines, you could do a two-pass substitution. On the first pass, build up a hash that maps each [get] ID to the trailing component of its URI, then rewind the file and make the substitution on the second pass. This lets you use simple regexps.
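
    A minimal sketch of the two-pass idea, assuming each stanza has a [getN] header followed by a $uri= line as in the question. The helper name relabel_two_pass and the in-memory sample are illustrative assumptions, not code from the thread; a real script would open the actual file and write the result back out.

```perl
use strict;
use warnings;

# Sample input copied from the question; a real script would open the file itself.
my $sample = <<'END';
[get4]
$uri=/LRA/RiskApplicationPage.aspx
$method=GET

[get5]
$uri=/LRA/CSS/RiskManagement.css
$method=GET
END

# relabel_two_pass is a hypothetical helper name, not something from the thread.
# It expects a seekable filehandle and returns the rewritten text.
sub relabel_two_pass {
    my ($fh) = @_;
    my (%name_for, $current);

    # Pass 1: map each [getN] ID to the last component of the URI below it.
    while (my $line = <$fh>) {
        if ($line =~ /^\[(get\d+)\]/) {
            $current = $1;
        }
        elsif (defined $current && $line =~ m{^\$uri=.*/([^/\s]+)\s*$}) {
            $name_for{$current} = $1;
        }
    }

    # Pass 2: rewind, then rewrite every header we found a name for.
    seek($fh, 0, 0) or die "Cannot rewind: $!";
    my $out = '';
    while (my $line = <$fh>) {
        if ($line =~ /^\[(get\d+)\]/ && exists $name_for{$1}) {
            my $id = $1;    # save the ID before the next match resets $1
            $line =~ s/^\[/[$name_for{$id} /;
        }
        $out .= $line;
    }
    return $out;
}

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";
print relabel_two_pass($fh);
close $fh;
```

    Both regexps only ever look at one line, which is the point of this variant: the hash carries the information from the $uri line back up to the header.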

    If you haven't done regexps that match multiple lines, now's your chance. Go with method 1, and add to your bag of tricks.
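
    For method 1, something along these lines might work. It is a sketch under the assumption that the $uri= line sits directly under its [getN] header; relabel_slurped is a made-up helper name, and the sample string stands in for a slurped file.

```perl
use strict;
use warnings;

# Sample input copied from the question, standing in for a slurped file.
my $sample = <<'END';
[get4]
$uri=/LRA/RiskApplicationPage.aspx
$method=GET

[get5]
$uri=/LRA/CSS/RiskManagement.css
$method=GET
END

# relabel_slurped is a hypothetical helper: whole file in, rewritten text out.
sub relabel_slurped {
    my ($text) = @_;
    # /m lets ^ match at every line start, /g repeats for every stanza,
    # /x allows the layout and comments inside the pattern.
    $text =~ s{
        ^\[ (get\d+) \]     # group 1: the stanza ID on the header line
        ( \s* \n            # group 2: the line break after the header
          \$uri= \S* / )    #          plus the URI up to its last slash
        ( [^/\s]+ )         # group 3: the trailing path component
    }{[$3 $1]$2$3}gmx;
    return $text;
}

print relabel_slurped($sample);
```

    With a real file, the usual slurp idiom is my $text = do { local $/; <$fh> }; which reads everything in one go before the substitution runs.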

Re: Need RegEx assistance
by jsegal (Friar) on Jul 12, 2002 at 20:43 UTC
    A third option -- slurp each stanza into memory at once, process that stanza, and then print.
    If you are sure you won't have square brackets inside your stanza, you could even let perl do the slurping for you by setting the input record separator. E.G. something like (warning, untested code ahead which may need tweaking...):
    $/="[";
    while(<>) {
        # capture what follows the last "/" on the uri line
        ($newprefix) = m{uri=.*/([^/\n]*)$}m;
        if ($newprefix) {
            # we won't match on the first line so it will print just "["
            print $newprefix," ";
        }
        print;
    }
    Note the use of the "m" modifier of the matching to allow $ to match the end of line in the middle of a string.
    You could also be more prosaic, and read in line at a time, concatenating to a temporary variable, until you hit your new stanza leader, and do a similar process. (That would be cleaner, and more extensible/less hacky, too).
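
    That line-at-a-time variant could look roughly like this. It is only a sketch: the helper names flush_stanza and relabel_by_stanza are invented for illustration, and the sample text from the question is read through an in-memory filehandle instead of a real file.

```perl
use strict;
use warnings;

# Sample input copied from the question.
my $sample = <<'END';
[get4]
$uri=/LRA/RiskApplicationPage.aspx
$method=GET

[get5]
$uri=/LRA/CSS/RiskManagement.css
$method=GET
END

# Hypothetical helper: rewrite one buffered stanza (array ref of lines)
# and return it as a single string.
sub flush_stanza {
    my ($lines) = @_;
    return '' unless @$lines;
    # Take the last path component from the first $uri= line, if there is one.
    my ($name) = map { m{^\$uri=.*/([^/\s]+)\s*$} ? $1 : () } @$lines;
    $lines->[0] =~ s/^\[/[$name / if defined $name && $lines->[0] =~ /^\[get/;
    return join '', @$lines;
}

# Hypothetical helper: read line by line, buffering until the next leader.
sub relabel_by_stanza {
    my ($fh) = @_;
    my $out = '';
    my @stanza;
    while (my $line = <$fh>) {
        if ($line =~ /^\[get/) {            # new stanza leader: emit the previous one
            $out .= flush_stanza(\@stanza);
            @stanza = ();
        }
        push @stanza, $line;
    }
    return $out . flush_stanza(\@stanza);   # the final stanza has no leader after it
}

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";
print relabel_by_stanza($fh);
close $fh;
```

    Only one stanza is held in memory at a time, and the input record separator stays at its default, so square brackets inside a stanza do no harm unless a line starts with [get.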
    Good luck...

    --JAS
Re: Need RegEx assistance
by jsprat (Curate) on Jul 12, 2002 at 21:01 UTC
    Here's a one-liner - I'm sure it could be golfed...

    perl -ibak -a -ln00e "$,=qq/\n/;($t)=$F[1]=~'/([^/]*)$';$F[0]=~s/\[/[$t /;print @F,qq/\n/" filename

    Here's another way - I moved your data into __DATA__ to simplify.

    #!C:\Perl\bin\perl -w
    use strict;

    #paragraph mode - see perlvar, input record separator
    $/ = "";

    while(<DATA>) {
        my @lines = split /\n/, $_;
        my $file;
        $lines[0] =~ s/\[/\[$file / if ($file) = $lines[1] =~ '/([^/]*)$';
        print join ("\n", @lines), "\n\n";
    }

    __DATA__
    [get4]
    $uri=/LRA/RiskApplicationPage.aspx
    $method=GET

    [get5]
    $uri=/LRA/CSS/RiskManagement.css
    $method=GET

    The key is to change the input record separator, so each time you read you get a single complete record to manipulate.

    Update: added "'" (single quote) in re line 10

      Thank you very much, but I'm having a rough time executing the code, I get the following error:
      Unmatched [ before HERE mark in regex m/<[ << HERE ^ at line 10
        Unmatched [ before HERE mark in regex m/<[ << HERE ^ at line 10
        Please cut and paste or 'd/l code', don't try to retype it. If you don't see the 'd/l code' link, reparent by clicking on the node title. You will see it.
        #m/<[
        #  ^ should be '/'
        m/\[
        # ^
        I somehow deleted a single quote in line 10 when I posted it - take a look at the update. Not a very clear error message, huh?

        BTW, I'd have /msg'ed you, but you aren't logged in ;-) Maybe you should create an account?

Re: Need RegEx assistance
by amoura (Initiate) on Jul 12, 2002 at 20:31 UTC
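    # assumes @_ holds the chomped lines of one stanza: ($header, $uri_line, $method_line)
    my ($myGet) = $_[0] =~ m/\[(get\w*)\]/;   # the ID inside the brackets, e.g. "get5"
    my @path = split(/\//, $_[1]);            # split the $uri line on "/"
    my $myCSS = $path[-1];                    # last path component
    my $newLine = "[$myCSS $myGet]";
    #this should get you the line as [RiskManagement.css get5]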

    Edit by dws to clean up <code> tags