Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi.

I've got a system that processes URLs and opens them. The following is a sample link (the system + the external url):
agent_site.pl?dir=$dir&lang=$lang&des=$des&page=https://www.consumerinfo.com/cic/form_online_a1.asp?sc=00030000&af=&br=&cl=0105
Now, the agent_site.pl script messes up because the ampersand (&) signs in the consumerinfo.com URL get in the way. How can I tell the Perl script to ignore the ampersands after the &page one?

Thanks,
Ralph.

edited: Wed Aug 21 21:33:44 2002 by jeffa - code tags

•Re: Problem with query string
by merlyn (Sage) on Aug 21, 2002 at 21:35 UTC
    You'll need to create the URL with the URI module:

    use URI;

    my ($dir, $lang, $des) = qw(DIR LANG DES); # sample
    my $u = URI->new("agent_site.pl");
    $u->query_form(
        dir  => $dir,
        lang => $lang,
        des  => $des,
        # pass the value only; query_form supplies the "page=" key itself
        page => "https://www.consumerinfo.com/cic/form_online_a1.asp?sc=00030000&af=&br=&cl=0105",
    );
    print "$u\n";

    which prints:

    agent_site.pl?dir=DIR&lang=LANG&des=DES&page=https%3A%2F%2Fwww.consumerinfo.com%2Fcic%2Fform_online_a1.asp%3Fsc%3D00030000%26af%3D%26br%3D%26cl%3D0105

    Next, if you're including it as part of an HTML page (like for a link), you'll need to HTML-escape it, as:

    use HTML::Entities qw(encode_entities);
    print encode_entities($u);

    which prints out:

    agent_site.pl?dir=DIR&amp;lang=LANG&amp;des=DES&amp;page=https%3A%2F%2Fwww.consumerinfo.com%2Fcic%2Fform_online_a1.asp%3Fsc%3D00030000%26af%3D%26br%3D%26cl%3D0105

    Although all of this is actually done for you if you're using CGI.pm, if you generate the link by mucking with the current "param" set. I have an example of that in one of my columns which shows how to generate a "self" url that has modified parameters.
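
As a rough illustration of the CGI.pm route mentioned above (not the example from the column, just a minimal sketch), the snippet below builds the link from a parameter set. The sample string passed to CGI->new and the variable names are assumptions made for the demonstration; a live script would normally let CGI->new read the current request.

```perl
use strict;
use warnings;
use CGI;

# Start from a sample parameter set; in a live script CGI->new would
# read the current request instead of this string
my $q = CGI->new('dir=DIR&lang=LANG&des=DES');

# Add the external URL as the plain, unencoded value of "page"
my $page = 'https://www.consumerinfo.com/cic/form_online_a1.asp?sc=00030000&af=&br=&cl=0105';
$q->param('page', $page);

# query_string re-encodes every parameter for use in a link
my $link = 'agent_site.pl?' . $q->query_string;
print "$link\n";

# Parsing the encoded string again recovers the original URL intact
my $round_trip = CGI->new($q->query_string)->param('page');
print "$round_trip\n";
```

Depending on the CGI.pm version, the generated query string may separate pairs with ";" rather than "&"; CGI.pm accepts either when parsing. The link still needs HTML-escaping before it goes into a page.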

    -- Randal L. Schwartz, Perl hacker

Re: Problem with query string
by chromatic (Archbishop) on Aug 21, 2002 at 21:37 UTC

    The best approach is to fix whatever generates the input URLs to encode URIs correctly. If you have bad data, fix it. Don't blame the tool.

Re: Problem with query string
by Aristotle (Chancellor) on Aug 21, 2002 at 21:37 UTC
    You should URI-encode your parameters.

    agent_site.pl?dir=$dir&lang=$lang&des=$des&page=https%3A%2F%2Fwww.consumerinfo.com%2Fcic%2Fform_online_a1.asp%3Fsc%3D00030000%26af%3D%26br%3D%26cl%3D0105

    See the URI or URI::Escape modules. You should probably also escape $dir, $lang and $des.
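
One possible way to do this with URI::Escape is sketched below. The values DIR, LANG and DES are placeholders, and $link is a name chosen only for this example.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Placeholder values, for illustration only
my ($dir, $lang, $des) = ('DIR', 'LANG', 'DES');
my $page = 'https://www.consumerinfo.com/cic/form_online_a1.asp?sc=00030000&af=&br=&cl=0105';

# Escape every value so that reserved characters (&, =, ?, /, :)
# cannot be mistaken for query-string delimiters
my $link = 'agent_site.pl?' . join(
    '&',
    'dir='  . uri_escape($dir),
    'lang=' . uri_escape($lang),
    'des='  . uri_escape($des),
    'page=' . uri_escape($page),
);

print "$link\n";
```

Only the values are escaped; the "&" and "=" that separate the parameters are added afterwards, so they keep their meaning as delimiters.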

    Makeshifts last the longest.

Re: Problem with query string
by tachyon (Chancellor) on Aug 21, 2002 at 21:42 UTC

    You need to URL encode the bit you want to remain untouched
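
To see why the encoded part survives, here is a small sketch of the receiving side, assuming the link was built with URI::Escape. The hash %param and the hand-rolled parsing loop are illustrative only; a real script would usually leave the decoding to CGI.pm or the URI module.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_unescape);

my $target = 'https://www.consumerinfo.com/cic/form_online_a1.asp?sc=00030000&af=&br=&cl=0105';

# The sender encodes the embedded URL before appending it
my $query = 'dir=DIR&lang=LANG&des=DES&page=' . uri_escape($target);

# The receiver can now split on & safely: the only ampersands left
# are the ones that separate the four parameters
my %param;
for my $pair (split /&/, $query) {
    my ($name, $value) = split /=/, $pair, 2;
    $param{$name} = uri_unescape(defined $value ? $value : '');
}

print "$param{page}\n";
```

This simplified loop does not handle "+" as an encoded space or repeated parameter names, which the established modules do.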

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Problem with query string
by Anonymous Monk on Aug 21, 2002 at 23:09 UTC
    Hi, thanks for your replies!

    I tried using those modules, but it just became too complicated for such a simple job. It wouldn't resolve the URLs well.

    So here's my invention, and it works:

    @varsec = split(/\&/,$query);
    $dirraw = $varsec[0];
    $langraw = $varsec[1];
    $desraw = $varsec[2];
    shift(@varsec);
    shift(@varsec);
    shift(@varsec);
    $wgo = join('&',@varsec);
    ($dirtag,$dir) = split(/=/,$dirraw);
    ($langtag,$lang) = split(/=/,$langraw);
    ($destag,$des) = split(/=/,$desraw);
    $ptdel = "page=";
    $ptrep = "";
    $wgo =~ s/$ptdel/$ptrep/;


    In this way, I'm able to get the full URL without having the ampersands 'eaten' by the Perl script. It's a little brutish, but it gets the job done.
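
The same idea can be written more compactly with the LIMIT argument of split, as in the sketch below. It keeps the variable names from the code above, uses a sample value for $query, and rests on the same assumption: the parameters always arrive in the order dir, lang, des, page.

```perl
use strict;
use warnings;

# Sample unencoded query string in the fixed order dir, lang, des, page
my $query = 'dir=DIR&lang=LANG&des=DES&page=https://www.consumerinfo.com/cic/form_online_a1.asp?sc=00030000&af=&br=&cl=0105';

# LIMIT 4 splits at the first three ampersands only, so the fourth
# field keeps every remaining ampersand
my ($dirraw, $langraw, $desraw, $wgo) = split /&/, $query, 4;

# Take the part after the first "=" of each name=value pair
my $dir  = (split(/=/, $dirraw,  2))[1];
my $lang = (split(/=/, $langraw, 2))[1];
my $des  = (split(/=/, $desraw,  2))[1];

# Strip the leading "page=" to leave the bare external URL
$wgo =~ s/^page=//;

print "$wgo\n";
```

Like the original, this breaks if the parameter order changes or if dir, lang or des ever contains an ampersand; encoding the values avoids both problems.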

    Thanks,
    Ralph.

    Edit kudra, 2002-08-22 Replaced br tags with code tags