PerlMonks  

replacing code with regex

by raflach (Pilgrim)
on May 11, 2005 at 15:46 UTC ( [id://456046]=perlquestion )

raflach has asked for the wisdom of the Perl Monks concerning the following question:

Ok, I have a text file with the following in it:

\t\tmy($page)=&create_page(isitvalid("templatefile.html"),\%replace,\%lists,@unhide);

NOTE: the \t's represent actual tabs in the file

I also have a script with the following regex

s/^\W*             # we may have some white space at start of line
my\W*?             # all variable will use my
\(?                # variable may or may not be list
\W*?               # some people space out the parens and some don't
\$page             # assuming for now that page is always the variable
\W*?               # more possible space around parenthesis
\)?                # variable may or may not be a list
\W*?               # whitespce is optional
=                  # we have to assign the variable
\W*?               # more optional whitespace
\&?                # function may or may not be explicitly contexted
create_page\W*?    # the name of the funciton followed by optional space
\(                 # parenthesis begins the parameterlist
\W*?               # we might have whitespace before first parameter
([^,]*?)           # BROKEN: this should collect everything up to the next comma in variable number 1
\W*?               # this shouldn't be necessary since whitespace should have been slurped on previous line but this shouldn't hurt either
,                  # Got to have the separator
\W*?               # more optional whitespace
([^,]*?)           # NOT BROKEN: this grabs everything up to the second comma (of course we don't have parens and dblquotes in second param
\W*?               # this shouldn't be necessary since whitespace should have been slurped on previous line
,                  # Got to have the separator
\W*?               # more optional whitespace
([^,]*?)           # NOT BROKEN: third one gets gotten fine as well
\W*?               # again the unnecessary whitespace collector
,                  # Again the seperator
\W*?               # Again the optional whitespace
([^)]*?)           # NOT BROKEN: heres our final collection point
\W*?               # Again with the unnecessary whitespace
\)                 # our parameter list has come to a close
\W*?               # whitespace might seperate from closing punctionation though it seems unlikely
;                  # closing punctuation ends the statement the replace doesn't matter for the question
/my \$renderer = new HRsmart::Lightning::Render ( template => $1,\n\t\tloops => $3,\n\t\treplace => $2,\n\t\tfinals => $4 );\n\$renderer->customize_header\( \$company_id \);\nmy \$page = \$renderer->render;\n\$page =~ s\/~%.{0,20}%~\/\/g;\n/gx;

As you can see I'm trying to replace the old functional way of doing things with a new semi-oo way.

The problem is that instead of getting 'isitvalid("templatefile.html")' for $1, which is what I was expecting, the match cuts off before the second double quote, so I get 'isitvalid("templatefile.html'.

Can anyone point me in the right direction?

BTW there's no chance of spaces between the doublequotes, so that's not an issue.

UPDATED: Under-commented code replaced with Over-commented code

Replies are listed 'Best First'.
Re: replacing code with regex
by reasonablekeith (Deacon) on May 11, 2005 at 16:08 UTC
    I would suggest you make use of the x modifier, break your regex out on multiple lines, and put some comments in. If you've not figured it out for yourself by then (and I bet you will have) you'll get a much better response from people here.

    HTH, Rob

    ---
    my name's not Keith, and I'm not reasonable.
      You overestimate me! Highly commented code still leaves me stumped. Anyone want to jump in here?
        As you followed my advice I could hardly not try and help!

        Anyway changing

        (\W*?) # this shouldn't be necessary since whitespace should have been slurped on previous line but this shouldn't hurt either
        ... to ...
        (\s*?) # this shouldn't be necessary since whitespace should have been slurped on previous line but this shouldn't hurt either
        fixes it. Hurrah. $1 printed 'isitvalid("templatefile.html")' when I tested it.

        Anyway, \s is the more standard way of matching whitespace, so I guess you could get some mileage out of changing all your instances of \W*?, which are probably matching more than you expect. Case in point:

        my $test = '{}[]£$%';
        my $match = ($test =~ m/(\W*)/)[0];
        print $match;
        __OUTPUT__
        {}[]£$%
        ---
        my name's not Keith, and I'm not reasonable.
Re: replacing code with regex
by Joost (Canon) on May 11, 2005 at 16:09 UTC
Re: replacing code with regex
by Animator (Hermit) on May 11, 2005 at 18:43 UTC

    Based on my first look at it I can already give these hints: (I'm still looking at it though)

    (This first point follows on from reasonablekeith's point.) You use \W*? to match optional whitespace, but \W matches a non-word character, whereas \s matches whitespace. Also, do you really want it to be non-greedy? I would suggest replacing \W*? with \s*, and I would not give it the comment "optional whitespace" (but that's just me).

    What I suspect is wrong is this part of the regex: \W*?([^,]*?)\W*?,
    You are making the [^,] optional and non-greedy. Maybe \W*? eats all the non-comma symbols? Something like \s*([^,]+),\s* might be better. Verifying by capturing \W*? shows that this is the problem ($2 then holds '")').
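
    The check described above can be sketched roughly as follows. This is only an illustration: it uses a cut-down fragment of the original pattern (not the full substitution), the sample line from the question, and variable names ($lazy, $eaten, $param) invented for the example.

```perl
use strict;
use warnings;

# Sample line from the question; the two leading tabs are added explicitly.
my $line = "\t\t" . q{my($page)=&create_page(isitvalid("templatefile.html"),\%replace,\%lists,@unhide);};

# Fragment of the original pattern with the "whitespace" piece captured,
# so we can see what \W*? actually consumed.
my ($lazy, $eaten) = $line =~ /create_page\(([^,]*?)(\W*?),/;
print "first param: [$lazy]\n";
print "eaten:       [$eaten]\n";

# Same fragment using \s for whitespace and requiring at least one character.
my ($param) = $line =~ /create_page\(\s*([^,]+?)\s*,/;
print "with \\s:     [$param]\n";
```

    The lazy [^,]*? stops as soon as the rest of the pattern can match, and since \W also matches the quote and the parenthesis, that happens before the closing '")'.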

    Also, if you post a message with a regex, you might want to give both the input and the output you expect. You described it, but giving the actual string makes it easier for everyone to verify.

    Update: I did some further investigating, and here is a regex that works (or at least as far as I can tell):

    s/^ \s* my \s* \(?\s*
      \$page            # page is the variable
      \s*\)? \s* = \s*
      \&?               # function may or may not be explicitly contexted
      create_page       # function name
      \s* \(\s*         # Begin param
      ([^,]+?),         # 1st param
      \s* ([^,]+?),     # 2nd param
      \s* ([^,]+?),     # 3rd param
      \s* ([^,]+?)      # 4th param
      \s*\)             # End param
      ;
    /your_replace_string/x;
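
    As a quick sanity check, here is a sketch that applies the same pattern as a plain match (qr// rather than s///) to the sample line from the question and prints the four captures. The variable names are taken from the keys in the original replacement string (template => $1, replace => $2, loops => $3, finals => $4); everything else is an assumption made for the example.

```perl
use strict;
use warnings;

# Sample line from the question; the two leading tabs are added explicitly.
my $line = "\t\t" . q{my($page)=&create_page(isitvalid("templatefile.html"),\%replace,\%lists,@unhide);};

my $re = qr/
    ^ \s* my \s* \(? \s* \$page \s* \)? \s* = \s*
    \&? create_page \s* \( \s*
    ([^,]+?) , \s*      # 1st param
    ([^,]+?) , \s*      # 2nd param
    ([^,]+?) , \s*      # 3rd param
    ([^,]+?)   \s*      # 4th param
    \) \s* ;
/x;

# In list context the match returns the four captured groups.
my ($template, $replace, $loops, $finals) = $line =~ $re;

print "template: $template\n";
print "replace:  $replace\n";
print "loops:    $loops\n";
print "finals:   $finals\n";
```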

    Also note that the /g in your regex is useless, since you have the ^, which will only match at the start of the string (unless, of course, you have /m too, but that's not the case in your example).
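
    A small sketch of that last point, using a made-up three-line string rather than anything from the question: it counts how many times an anchored pattern matches under /g, first without and then with /m.

```perl
use strict;
use warnings;

# Hypothetical multi-line input, only for illustration.
my $text = "foo 1\nfoo 2\nfoo 3\n";

# Without /m, ^ matches only at the very start of the string,
# so /g can find at most one match.
my $without_m = () = $text =~ /^foo/g;

# With /m, ^ also matches just after each embedded newline,
# so /g finds a match on every line.
my $with_m = () = $text =~ /^foo/mg;

print "without /m: $without_m\n";
print "with /m:    $with_m\n";
```

    So if the file is slurped into a single string, /m is needed for /g to do anything useful with ^; when processing one line at a time, /g can simply be dropped.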
