Hello Monks,

I have a list of UTF-8 encoded names in a file:

$ file -i names.txt
names.txt: text/plain charset=utf-8
$ cat names.txt
Ján Slota
Peter Kažimír
Alojz Hlina
František Mikloško
Ján Počiatek

I want to check whether these names are associated with any Slovak companies by running this script against the Business Register (there is no API, AFAIK). The problem is that I don't get the expected results for all names: the script works only for the third name (line) in the file. I suspect the cause is the URI encoding done by URI::Encode (lines 26 and 27 in the script). For example, for the second name in the file I get:

http://www.orsr.sk/hladaj_osoba.asp?PR=Ka%C3%85%C2%BEim%C3%83%C2%ADr&MENO=Peter&SID=0&T=f0&R=on

and the portal is expecting (I get this by filling in the form on the portal):

http://www.orsr.sk/hladaj_osoba.asp?PR=Ka%9Eim%EDr&MENO=Peter&SID=0&T=f0&R=on
I have read that I shouldn't need URI::Encode most of the time. I have tried going without it, and I have tried URI::Escape instead, both without success. Can you show me the way? Thanks.
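
For reference, here is a minimal sketch of one way the expected URL could be produced. It rests on an assumption inferred only from the two URLs above: %9E and %ED are the Windows-1250 (cp1250) bytes for ž and í, so the portal appears to expect cp1250 rather than UTF-8, while %C3%85%C2%BE looks like UTF-8 bytes that were encoded to UTF-8 a second time. That would also fit the third name working, since it is plain ASCII. The helper name escape_cp1250 is hypothetical, and the name is hard-coded here instead of being read from names.txt.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use URI::Escape qw(uri_escape);

# Hypothetical helper: turn a decoded character string into cp1250 bytes,
# then percent-encode those bytes. ASSUMPTION: the portal expects cp1250.
sub escape_cp1250 {
    my ($chars) = @_;
    return uri_escape( encode('cp1250', $chars) );
}

# Stand-in for one line of names.txt: raw UTF-8 bytes decoded to characters.
# When reading the real file, opening it with '<:encoding(UTF-8)' does the
# same decoding step for every line.
my $line = decode('UTF-8', "Peter Ka\xC5\xBEim\xC3\xADr");
my ($first, $last) = split ' ', $line, 2;

# Parameter names and fixed values are copied from the URLs in the post.
my $url = 'http://www.orsr.sk/hladaj_osoba.asp'
        . '?PR='   . escape_cp1250($last)
        . '&MENO=' . escape_cp1250($first)
        . '&SID=0&T=f0&R=on';

print "$url\n";
```

The key point of the sketch is the order of operations: decode the input once, encode to the target charset once, and only then percent-encode the resulting bytes. Whether the search then returns the expected results still depends on the portal itself.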

Excellence is an art won by training and habituation: we do not act rightly because we have virtue or excellence, but we rather have these because we have acted rightly. -- Will Durant


In reply to Percent encoding of URIs with UTF-8 characters by reisinge
