agent00013 has asked for the wisdom of the Perl Monks concerning the following question:

I'm working with web page links in my Perl script. I need to take a link to a web page and expand it to the full absolute address.

Ex: http://www.something.com/something
or http://www.something.com/something/
and expanding it to
http://www.something.com/something/index.html

What regex would handle this easily? Thanks in advance.

Replies are listed 'Best First'.
Re: What regex will swap somewebpage/ with somewebpage/index.html ??
by dimmesdale (Friar) on Jun 19, 2001 at 01:54 UTC
    srawls, your .= solution doesn't work cleanly for http://www.something.com/something/ (one of the formats the poster was expecting), since it leaves a double slash before index.html. Perhaps a conditional statement would fix your solution:

    if($address =~ m!/$!) { $address .= 'index.html' } else { $address .= '/index.html' }

    As in srawls' last solution, you can also add a simple regex check to the condition if only certain $address values should be changed.
Re: What regex will swap somewebpage/ with somewebpage/index.html ??
by I0 (Priest) on Jun 19, 2001 at 09:34 UTC
    s!/?$!/index.html!
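
As a quick check of the substitution above, here is a minimal sketch that runs it against both formats from the question. The helper name `add_index` is made up for illustration, and the sketch assumes every address passed in should get index.html appended.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the one-line substitution.
sub add_index {
    my ($address) = @_;
    # Replace an optional trailing slash at the end of the string
    # with '/index.html'.
    $address =~ s!/?$!/index.html!;
    return $address;
}

# Both input formats from the original question.
for my $url ('http://www.something.com/something',
             'http://www.something.com/something/') {
    print "$url -> ", add_index($url), "\n";
}
```

Both inputs come out as http://www.something.com/something/index.html, because the optional slash is consumed when present and the empty match at the end of the string is used otherwise.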
Re: What regex will swap somewebpage/ with somewebpage/index.html ??
by srawls (Friar) on Jun 19, 2001 at 00:30 UTC
    If you know you will always add 'index.html', a regex is overkill. Try this:
    $address .= '/index.html';
    If you won't always add it, try this as a regex:
    $address =~ s!http://www.(?:[^/]*/*)*!/index.html!
Re: What regex will swap somewebpage/ with somewebpage/index.html ??
by srawls (Friar) on Jun 19, 2001 at 00:52 UTC
    I guess it won't let me edit my post ...

    Change that last regex to:

    $address =~ s!^(http://www\.[^/]+(?:/[^/]+)*)/?$!$1/index.html!
Re: What regex will swap somewebpage/ with somewebpage/index.html ??
by Hofmator (Curate) on Jun 19, 2001 at 14:13 UTC

    If you only want to change exactly http://www.something.com/something (with or without a trailing slash) to http://www.something.com/something/index.html, try:

    s{
        http://www\.something\.com/something   # the URL you are looking for
                                               # remember to backslash '.'
        (?!\w)                                 # a negative look-ahead assertion
                                               # do not allow a dirname of 'somethinglong'
        /?                                     # an optional slash
    }                                          # end of search pattern
                                               # replace with
    {http://www.something.com/something/index.html}gx

    -- Hofmator

Re: What regex will swap somewebpage/ with somewebpage/index.html ??
by agent00013 (Pilgrim) on Jun 19, 2001 at 01:31 UTC
    eesh, well, that ended up tacking /index.html onto all the links, not just the ones that needed it...
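
One way to avoid tacking /index.html onto every link is to append it only when the link looks like a directory. The sketch below is not taken from any reply in this thread: the helper name `add_index_if_needed` is hypothetical, and it assumes that a dot in the last path segment means the link already names a file and that links with a query string or fragment should be left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: append 'index.html' only when the URL appears to
# name a directory, i.e. its last path segment contains no dot.
sub add_index_if_needed {
    my ($address) = @_;

    # Split into scheme+host and path; return anything else unchanged.
    my ($host, $path) = $address =~ m!^(https?://[^/]+)(/.*)?$!
        or return $address;
    $path = '/' unless defined $path;

    # Assumption: leave links with a query string or fragment alone.
    return $address if $path =~ /[?#]/;

    # A dot in the last segment is taken to mean a file name.
    return $address if $path =~ m!/[^/]*\.[^/]*$!;

    # Otherwise normalise the trailing slash and append index.html.
    $path =~ s!/?$!/index.html!;
    return $host . $path;
}

for my $url ('http://www.something.com/something',
             'http://www.something.com/something/',
             'http://www.something.com/something/page.html') {
    print "$url -> ", add_index_if_needed($url), "\n";
}
```

The dot test is only a heuristic: a directory whose name contains a dot would be skipped, so the check may need adjusting for real link data.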