7stud has asked for the wisdom of the Perl Monks concerning the following question:

When the Apache Document Root is set to htdocs (in httpd.conf), e.g.

DocumentRoot "/Library/Apache2/htdocs"

what path should a cgi script use for an image located, say, here:

htdocs/my_imgs/pic.jpg

For example, suppose the cgi script is located in Apache's cgi-bin directory, and the cgi script has this line in it:

print '<img src="/my_imgs/blue_square.jpg" />';

That absolute path works fine for me: Apache is able to find the image and the image displays on the web page. It's my understanding that the leading slash refers to the Document Root.

However, this relative path also works:

print '<img src="../my_imgs/blue_square.jpg" />';

Why does that work? On the face of it, that path shouldn't work. The leading '..' says to go up one directory from the current directory. Presumably, the current directory is the cgi-bin directory, and going up one directory from that is the Apache2 directory. Then because there is no subdirectory in Apache2 called my_imgs, the path should fail.

Does Apache consider the cgi-bin directory to be a subdirectory of htdocs--even though in the filesystem the cgi-bin directory is outside of htdocs? Something else? Which is preferable, the absolute path or the relative path?

If someone were looking for tutorial ideas, I think it would be of great help to write a tutorial that lists the directory structure of Apache2, gives a brief description of each directory--emphasizing the importance of the logs/error_log--and then explains which paths will work in a cgi script and which paths are preferable. I've been searching the Apache web site and google for three days, and I can't find anything that addresses this question. Thanks.

Replies are listed 'Best First'.
Re: (OT) cgi: relative v. absolute paths, Apache
by merlyn (Sage) on Nov 19, 2009 at 16:45 UTC
    Relative URLs are interpreted and calculated by the browser, not by the server. So you don't need to consider the actual disk layout of your directories—just look at the URL. If you refer to an image at "../my_imgs/pic.jpg" from a page that was fetched at "/cgi-bin/somescript", the browser subtracts /cgi-bin, and requests "/my_imgs/pic.jpg".

    It's all about the browser.

    -- Randal L. Schwartz, Perl hacker

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

      As merlyn states above, it is all about the browser.

      Your apache configuration defines a mapping that creates a virtual directory tree in web space. In its stock form, I believe that apache maps something like this (images directory from your example):

      Web path           Disk path
      /                  $SERVER_ROOT/htdocs
      /cgi-bin           $SERVER_ROOT/cgi-bin
      /my_imgs/pic.jpg   $SERVER_ROOT/htdocs/my_imgs/pic.jpg

      So, when your cgi script (href=/cgi-bin/myscript.cgi) runs, the relative path "../my_imgs/pic.jpg" would refer, from the browser's perspective, to /cgi-bin/../my_imgs/pic.jpg, or /my_imgs/pic.jpg, which is what you have as the absolute path.
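
The path arithmetic described above can be sketched in a few lines of Perl. This is only a simplified illustration of what a browser does (loosely following the dot-segment rules of RFC 3986), not any browser's actual code. The helper name resolve_path is hypothetical, and it handles plain paths only: no scheme, host, query string, or fragment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: resolve a relative path reference against the path
# of the page that contains it, roughly the way a browser does.
sub resolve_path {
    my ($base_path, $ref) = @_;

    # A reference with a leading slash is already absolute. Otherwise,
    # replace everything after the last "/" of the base path with it.
    my $merged = $ref;
    if ($ref !~ m{^/}) {
        ($merged = $base_path) =~ s{[^/]*\z}{};
        $merged .= $ref;
    }

    # Walk the segments: "." is dropped, ".." removes the previous segment.
    # $out[0] is the empty string before the leading "/", so it is never popped.
    my @out;
    for my $seg (split m{/}, $merged, -1) {
        if    ($seg eq '.')  { next }
        elsif ($seg eq '..') { pop @out if @out > 1 }
        else                 { push @out, $seg }
    }
    return join '/', @out;
}

# prints /my_imgs/pic.jpg
print resolve_path('/cgi-bin/myscript.cgi', '../my_imgs/pic.jpg'), "\n";

# More ".." segments than there are directories: the extras are dropped.
# prints /my_imgs/pic.jpg
print resolve_path('/cgi-bin/myscript.cgi', '../../../my_imgs/pic.jpg'), "\n";
```

Note that nothing here touches the filesystem: the result depends only on the URL of the page and the reference inside it.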

      Hope this helps.

      Update (2009/11/20 11:36 GMT-0500): Whoops. s/\$DOCUMENT_ROOT/$SERVER_ROOT/g; $SERVER_ROOT == ServerRoot setting in httpd.conf. $DOCUMENT_ROOT = $SERVER_ROOT/htdocs.

      --MidLifeXis

        Hi,

        Thanks for the responses. I don't get the url math, though. Can you direct me to a tutorial?

        The only way I can understand it is if I say to myself, "The current directory is cgi-bin, and the ../ directory refers to cgi-bin's parent directory. Then according to the virtual directory structure, cgi-bin's parent directory is the root directory, which is htdocs, and then you descend from htdocs to the my_imgs directory--where the file is found."

        Web path	 Disk path
        /	         $DOCUMENT_ROOT/htdocs
        /cgi-bin	 $DOCUMENT_ROOT/cgi-bin
        /my_imgs/pic.jpg $DOCUMENT_ROOT/htdocs/my_imgs/pic.jpg
        
        Hope this helps.

        I don't think that is correct. My DocumentRoot is set to /Library/Apache2/htdocs, so it wouldn't make sense to say that the web path for / is the disk path $DOCUMENT_ROOT/htdocs, which would be /Library/Apache2/htdocs/htdocs. I found this on the apache website:

        DocumentRoot directive
        
        Syntax: DocumentRoot directory-path
        Default: DocumentRoot /usr/local/apache/htdocs
        Context: server config, virtual host
        Status: core
        
        This directive sets the directory from which httpd will serve files. 
        Unless matched by a directive like Alias, the server appends the
        path from the requested URL to the document root to make the
        path to the document. Example:
        
            DocumentRoot /usr/web
        
        then an access to http://www.my.host.com/index.html refers to /usr/web/index.html.
        
        There appears to be a bug in mod_dir which causes problems
        when the DocumentRoot has a trailing slash (i.e., "DocumentRoot
        /usr/web/") so please avoid that.
        

        One thing I discovered: when a browser converts a relative path to an absolute path prior to requesting a resource, if a relative path tries to move up the hierarchy of a url too far with ../../../, the extra ones are ignored. For instance, if the page's url is:

        http://localhost/cgi-bin/prog1.pl

        the current directory is cgi-bin. However, in that url cgi-bin sits directly under the root, so there is nothing above it to move up to. Therefore, if the cgi script produces a page with an image that uses this relative path:

        <img src="../../../../my_imgs/blue_square.jpg" />

        the ../../../../ part of the relative path just gets you:

        http://localhost

        then the rest of the path, /my_imgs/blue_square.jpg, gets appended to that, giving you:

        http://localhost/my_imgs/blue_square.jpg

        Subsequently, when apache receives the request for that url, as the passage from the apache website above says, everything after the host gets appended to the document root, which in my case yields this:

        /Library/Apache2/htdocs/my_imgs/blue_square.jpg

        That is a real path on the filesystem. To summarize, there is a two-step process:

        1) The browser converts a relative path (used by an html element on a page) to an absolute path by looking at the page's url, then sends a request for that url to the Apache server.

        2) Apache takes the part of the url after the host name and appends it to the DocumentRoot (as specified in httpd.conf). For instance, if apache receives a request for this url

        http://www.mysite.com/dir1/dir2/page.htm

        the host name is www.mysite.com, and with my DocumentRoot (= /Library/Apache2/htdocs) Apache would create the following path to the requested resource:

        /Library/Apache2/htdocs/dir1/dir2/page.htm
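
Step 2 can be sketched in Perl as well. This is a rough model that assumes nothing like Alias or ScriptAlias matches the request, which is the caveat in the Apache documentation quoted earlier. The helper name url_to_disk_path is hypothetical, and the DocumentRoot value is the one from this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: mimic the default URL-to-file mapping by appending
# the path part of the URL to the DocumentRoot.
# Assumption: no Alias, ScriptAlias, or rewrite rule applies to the request.
sub url_to_disk_path {
    my ($document_root, $url) = @_;

    # Drop the scheme and host, then any query string or fragment.
    (my $path = $url) =~ s{^[a-z][a-z0-9+.-]*://[^/]*}{}i;
    $path =~ s{[?#].*}{}s;
    $path = '/' if $path eq '';

    return $document_root . $path;
}

my $document_root = '/Library/Apache2/htdocs';

# prints /Library/Apache2/htdocs/my_imgs/blue_square.jpg
print url_to_disk_path($document_root, 'http://localhost/my_imgs/blue_square.jpg'), "\n";

# prints /Library/Apache2/htdocs/dir1/dir2/page.htm
print url_to_disk_path($document_root, 'http://www.mysite.com/dir1/dir2/page.htm'), "\n";
```

A request for /cgi-bin/prog1.pl would typically not follow this rule, because a ScriptAlias directive usually maps /cgi-bin/ to a directory outside htdocs. That is how cgi-bin can appear under / in the url while living next to htdocs on disk.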

        That's my current mental model of what's going on. I'll adjust it as required.

Re: (OT) cgi: relative v. absolute paths, Apache
by moritz (Cardinal) on Nov 19, 2009 at 16:52 UTC

    Update: I was completely on the wrong track, please ignore

    Just use

        #!/usr/bin/perl
        use Cwd;
        # A CGI response needs a header followed by a blank line
        print "Content-Type: text/plain\n\n";
        print getcwd(), $/;

    to find your current working directory. I'm sure it's somewhere in the Apache docs, but this could be a lot easier than finding the appropriate place in the docs :-)
    Perl 6 - links to (nearly) everything that is Perl 6.