JimStoneyBurk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm a beginner in Perl and have looked all over for a way to standardize URLs. In other words, I want to write a small program that checks a string containing a URL and changes any of the forms example.org, www.example.org, or http://example.org to the standard form http://www.example.org, whatever the domain name. Any advice on the best way to do this would be appreciated.

Replies are listed 'Best First'.
Re: Standardizing URLs
by Your Mother (Archbishop) on Jul 27, 2015 at 03:04 UTC

    My advice is don't do it; or don't do it that way. WWWdot are the most ridiculous and unnecessary 10 syllables in English.

    The tools you might want to look at to do what you want include but are not limited to: HTML::LinkExtor (extract links), URI::Find (find URIs in plain text), URI (deal with URIs properly, as objects), HTML::TreeBuilder (parse HTML to find attributes like href and src), XML::LibXML (same but different approach, works great with HTML with proper settings). See also: http://learn.perl.org/faq/perlfaq9.html#How_do_I_extract_URL

    If you provide what you've tried already + maybe sample input and desired output, you'll likely get more concrete assistance.
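
As a rough illustration of the URI approach mentioned above, here is a minimal sketch. It assumes each input is either a bare host name or an http/https URL; the helper name `standardize_url` is hypothetical and not from this thread, and URI must be installed from CPAN.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not from the thread): turn example.org,
# www.example.org, or http://example.org into http://www.example.org.
sub standardize_url {
    my ($raw) = @_;

    # Assumption: input without a "scheme://" prefix is a bare host/path,
    # so prepend http:// to let URI parse the host part correctly.
    $raw = "http://$raw" unless $raw =~ m{^[a-z][a-z0-9+.-]*://}i;

    my $uri = URI->new($raw);

    # Only server-style URIs (http, https, ...) provide a host method;
    # anything else is returned unchanged.
    return $raw unless $uri->can('host');

    my $host = $uri->host;
    return $raw unless defined $host && length $host;

    # Lowercase the host and add the www. prefix when it is missing.
    $host = lc $host;
    $host = "www.$host" unless $host =~ /^www\./;
    $uri->host($host);

    return $uri->as_string;
}

print standardize_url($_), "\n"
    for qw(example.org www.example.org http://example.org);
```

Going through a URI object rather than a bare regex keeps any path and query string intact. Blindly adding www. is still a policy decision: hosts such as mail.example.org would become www.mail.example.org under this sketch.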

Re: Standardizing URLs
by 1nickt (Canon) on Jul 27, 2015 at 03:14 UTC

    Hi JimStoneyBurk, please show us what you have tried so far. Posting a short piece of working code (it counts as working if it compiles and runs, even if all it does is produce an error) will lead to better answers faster.

    Also it is expected that you provide a link to any cross-posting you may have done on other sites, such as your post on StackOverflow. This is so that monks don't spend time answering a question in the monastery that has been answered elsewhere.

    Have you looked at URI? It has everything you need to build a solution for your problem.

    Alternatively, one of our brothers is working right now on a new module to rewrite URLs; you may want to check it out.

    Update: Cross-posting

    The way forward always starts with a minimal test.
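
A short, runnable script of the kind requested above might look like the following sketch. It uses only core modules; the regex-based `standardize_url` is a hypothetical stand-in for whatever routine is being debugged, and it assumes each input is a single URL with an optional http:// and optional www. prefix.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the routine under test: strip any leading
# http:// and www. prefix, then put the standard prefix back.
sub standardize_url {
    my ($url) = @_;
    $url =~ s{^(?:http://)?(?:www\.)?}{http://www.}i;
    return $url;
}

# Sample input mapped to desired output, taken from the forms in the question.
my %expect = (
    'example.org'            => 'http://www.example.org',
    'www.example.org'        => 'http://www.example.org',
    'http://example.org'     => 'http://www.example.org',
    'http://www.example.org' => 'http://www.example.org',
);

is( standardize_url($_), $expect{$_}, "standardizes $_" )
    for sort keys %expect;

done_testing();
```

Posting something this size, with the failing case included, gives readers both sample input and desired output in one place.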
Re: Standardizing URLs
by Anonymous Monk on Jul 27, 2015 at 03:08 UTC
Re: Standardizing URLs
by soonix (Chancellor) on Jul 27, 2015 at 08:23 UTC
Re: Standardizing URLs
by jellisii2 (Hermit) on Jul 27, 2015 at 14:19 UTC
    This is what Apache's mod_rewrite was made to do. I'm sure other web servers have similar options.
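
For reference, a commonly used mod_rewrite fragment for this looks roughly like the sketch below. It assumes mod_rewrite is enabled and that the rules sit in the server or virtual-host configuration. Note that it redirects incoming requests to the www. host; it does not rewrite URL strings inside text, which is what the original question asks about.

```apache
# Sketch only: send any request whose Host header lacks a leading "www."
# to the same path on the www. host with a permanent redirect.
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```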