I've been working on a script to test the total sizes of web pages.

It's easy enough to do in a fairly straightforward way for old-fashioned HTML: take the size of the page itself and add the size of every file its tags pull in.

So now I'm trying to do it in the modern world of CSS.

My first problem is that there are two kinds of CSS files which might be imported into a page.

One is via the LINK REL tag, which is easy enough to find with a parser, but the other is via the @import url(URLGOESHERE) statement.

I don't think there's a parser which will read that as a link, is there?

Never mind, it's easy enough to parse for in a regex (I know, I know).
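A minimal sketch of that regex approach, using only core Perl. The function name `css_imports` is made up for illustration, and the pattern only covers the common `@import url(...)` and `@import "..."` forms; it is a rough filter, not a CSS parser.

```perl
use strict;
use warnings;

# Return the URLs named by @import rules in a chunk of CSS.
# Handles url(foo.css), url("foo.css"), url('foo.css') and the
# bare-string form @import "foo.css".
sub css_imports {
    my ($css) = @_;
    $css =~ s{/\*.*?\*/}{}gs;    # drop comments so commented-out imports are ignored
    my @urls;
    while ($css =~ m{\@import\s+(?:url\(\s*)?(["']?)([^"'()\s;]+)\1}gi) {
        push @urls, $2;
    }
    return @urls;
}

print "$_\n" for css_imports(<<'CSS');
@import url(base.css);
/* @import url(old.css); */
@import "print.css" print;
CSS
```

The URLs come back exactly as written in the stylesheet, so relative ones still have to be resolved against the URL of the file they were found in.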

But linked files can have nested linked files; that is, a CSS file that you import can contain one or more further @import url(URLGOESHERE) statements, so I'll have to do that recursively. But that's not the problem, because...
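The recursive part might look something like the sketch below. Everything here is an assumption for illustration: `all_stylesheets` is a hypothetical name, the fetcher is passed in as a code ref (a real one would wrap an HTTP client), and URLs are treated as already absolute. The `%$seen` hash is there because two stylesheets can import each other.

```perl
use strict;
use warnings;

# Collect every stylesheet reachable from $url by following @import rules.
# $fetch takes a URL and returns the CSS text, or undef on failure.
# $seen remembers visited URLs so that import loops terminate.
sub all_stylesheets {
    my ($url, $fetch, $seen) = @_;
    $seen ||= {};
    return () if $seen->{$url}++;
    my $css = $fetch->($url);
    return () unless defined $css;
    $css =~ s{/\*.*?\*/}{}gs;    # ignore commented-out imports
    my @found = ($url);
    while ($css =~ m{\@import\s+(?:url\(\s*)?(["']?)([^"'()\s;]+)\1}gi) {
        my $next = $2;           # copy before recursing clobbers the captures
        push @found, all_stylesheets($next, $fetch, $seen);
    }
    return @found;
}

# Stand-in for the network: a hash of URL => CSS text (made-up data).
my %fake_site = (
    'main.css'    => '@import url(layout.css); @import url(colours.css); body { margin: 0 }',
    'layout.css'  => '@import url(main.css); div { float: left }',    # loops back
    'colours.css' => 'p { color: red }',
);
my @sheets = all_stylesheets('main.css', sub { $fake_site{ $_[0] } });
print "@sheets\n";
```

Each stylesheet appears once in the result, in the order it was first reached, so the sizes can simply be summed over the list.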

...the true weight of an HTML 4/CSS page is the weight of the page, any CSS-tag files, any SCRIPT-tag files, any IMG-tag files, and any images referenced in the CSS files which have to be loaded.

For instance if there's a DIV with the ID "foo" with a P inside it with the class "bar", and somewhere in one of the CSS files there's a declaration which includes DIV#foo P.bar and sets a background image, that image should be counted toward the total.
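To make that concrete, here is a rough sketch that pulls (selector, image URL) pairs out of a stylesheet. `css_images` is a hypothetical helper, and it is deliberately naive: it assumes flat `selector { declarations }` rules and does not understand @media blocks or any other nesting.

```perl
use strict;
use warnings;

# Return [selector, image URL] pairs for every rule that references url(...).
sub css_images {
    my ($css) = @_;
    $css =~ s{/\*.*?\*/}{}gs;         # strip comments
    $css =~ s{\@import[^;]*;}{}gi;    # imports are followed separately
    my @pairs;
    while ($css =~ /([^{}]+)\{([^{}]*)\}/g) {
        my ($selector, $body) = ($1, $2);
        $selector =~ s/^\s+|\s+$//g;  # trim surrounding whitespace
        while ($body =~ m{url\(\s*["']?([^"')\s]+)["']?\s*\)}gi) {
            push @pairs, [ $selector, $1 ];
        }
    }
    return @pairs;
}

my @pairs = css_images(<<'CSS');
div#foo p.bar { background-image: url(images/bar.png); }
h1 { color: navy }
#nav li a { background: #fff url("images/arrow.gif") no-repeat; }
CSS
printf "%-15s => %s\n", @$_ for @pairs;
```

This yields every image the stylesheet could ask for, which is an upper bound on what a given page loads; deciding which of those selectors actually match the page is the open question.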

But how will I know, without parsing the HTML and the CSS as well, which images are being loaded for that particular page?

The CSS file, if everything's going well, will be shared between multiple pages. Some of those pages will call on some of the images, but unless I parse the DOM and relate it to the CSS, I'm never going to know which images are being loaded on this particular page.
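Short of full selector matching, one cheap way to narrow things down is a necessary-condition check: a rule cannot apply if its selector names an id or class that never occurs in the page. The sketch below assumes simple HTML 4-era selectors (no attribute selectors, no `:not()`), and both `page_names` and `might_match` are made-up names. A "maybe" still needs real DOM matching to confirm, but a "no" lets that rule's images be dropped from the total.

```perl
use strict;
use warnings;

# Gather the id and class names that occur anywhere in the HTML.
# Regex-based and therefore over-inclusive, which is the safe direction here.
sub page_names {
    my ($html) = @_;
    my (%id, %class);
    $id{$1} = 1 while $html =~ /\bid\s*=\s*["']([^"']+)["']/gi;
    while ($html =~ /\bclass\s*=\s*["']([^"']+)["']/gi) {
        $class{$_} = 1 for split ' ', $1;
    }
    return (\%id, \%class);
}

# False only when every comma-separated part of the selector names
# an id or class that the page never uses.
sub might_match {
    my ($selector, $id, $class) = @_;
    PART: for my $part (split /,/, $selector) {
        while ($part =~ /([#.])([-\w]+)/g) {
            next PART if $1 eq '#' ? !$id->{$2} : !$class->{$2};
        }
        return 1;    # nothing in this part rules it out
    }
    return 0;
}

my $html = '<div id="foo"><p class="bar baz">Hello</p></div>';
my ($id, $class) = page_names($html);
for my $sel ('div#foo p.bar', 'div#sidebar p.bar', 'p.baz, p.missing') {
    printf "%-18s %s\n", $sel, might_match($sel, $id, $class) ? 'maybe' : 'no';
}
```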

So, is this at all possible? Should I give up now..?



($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss')
=~y~b-v~a-z~s; print

In reply to Testing Page Size with HTML 4/CSS by Cody Pendant
