2.2. URL Character Encoding Issues

   URLs are sequences of characters, i.e., letters, digits, and special
   characters. A URLs may be represented in a variety of ways: e.g., ink
   on paper, or a sequence of octets in a coded character set. The
   interpretation of a URL depends only on the identity of the
   characters used.

   In most URL schemes, the sequences of characters in different parts
   of a URL are used to represent sequences of octets used in Internet
   protocols. For example, in the ftp scheme, the host name, directory
   name and file names are such sequences of octets, represented by
   parts of the URL.  Within those parts, an octet may be represented by
   the chararacter which has that octet as its code within the US-ASCII
   [20] coded character set.

##</code><code>##

  Reserved:

   Many URL schemes reserve certain characters for a special meaning:
   their appearance in the scheme-specific part of the URL has a
   designated semantics. If the character corresponding to an octet is
   reserved in a scheme, the octet must be encoded.  The characters ";",
   "/", "?", ":", "@", "=" and "&" are the characters which may be
   reserved for special meaning within a scheme. No other characters may
   be reserved within a scheme.

##</code><code>##

3.1. Common Internet Scheme Syntax

   While the syntax for the rest of the URL may vary depending on the
   particular scheme selected, URL schemes that involve the direct use
   of an IP-based protocol to a specified host on the Internet use a
   common syntax for the scheme-specific data:

        //<user>:<password>@<host>:<port>/<url-path>

   Some or all of the parts "<user>:<password>@", ":<password>",
   ":<port>", and "/<url-path>" may be excluded.  The scheme specific
   data start with a double slash "//" to indicate that it complies with
   the common Internet scheme syntax. The different components obey the