2009
|-- 01
|   |-- 01.xml.gz
|   |-- 02.xml.gz
|   |-- 03.xml.gz
|   ...
`-- 02
    |-- 01.xml.gz
    |-- 02.xml.gz
    |-- 03.xml.gz
    ...
2009.tar.gz # contains whole year
2009.tar.gz.asc # gpg signature to verify data integrity
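For anyone who wants to script this layout, here is a minimal sketch of how the tree and the yearly tarball could be produced. The function names (day_path, store_day, pack_year) and the root directory are my own placeholders, not part of WXDaily, and the gpg signing step is left out because it is better done with gpg itself.

```python
import gzip
import tarfile
from datetime import date
from pathlib import Path


def day_path(root: Path, day: date, ext: str = "xml") -> Path:
    """Return <root>/YYYY/MM/DD.<ext>.gz for the given day (hypothetical helper)."""
    return root / f"{day.year:04d}" / f"{day.month:02d}" / f"{day.day:02d}.{ext}.gz"


def store_day(root: Path, day: date, payload: bytes, ext: str = "xml") -> Path:
    """Gzip one day's data into its slot in the tree and return the path."""
    target = day_path(root, day, ext)
    target.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(target, "wb") as fh:
        fh.write(payload)
    return target


def pack_year(root: Path, year: int) -> Path:
    """Bundle <root>/YYYY into <root>/YYYY.tar.gz for bulk download."""
    archive = root / f"{year:04d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the tarball relative, e.g. 2009/01/01.xml.gz
        tar.add(root / f"{year:04d}", arcname=f"{year:04d}")
    return archive
```

Calling store_day once per day and pack_year once the year is complete would give the tree above; the .asc file would then be created separately by signing the tarball.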
The data (HTML) that WXDaily.exe outputs is static and has a fixed size (10,443 bytes). It compresses well.
$ tree 1896/01
1896/01
|-- 18960101.txt.gz
|-- 18960102.txt.gz
|-- 18960103.txt.gz
|-- 18960104.txt.gz
|-- 18960105.txt.gz
|-- 18960106.txt.gz
|-- 18960107.txt.gz
|-- 18960108.txt.gz
|-- 18960109.txt.gz
|-- 18960110.txt.gz
|-- 18960111.txt.gz
|-- 18960112.txt.gz
|-- 18960113.txt.gz
|-- 18960114.txt.gz
|-- 18960115.txt.gz
|-- 18960116.txt.gz
|-- 18960117.txt.gz
|-- 18960118.txt.gz
|-- 18960119.txt.gz
|-- 18960120.txt.gz
|-- 18960121.txt.gz
|-- 18960122.txt.gz
|-- 18960123.txt.gz
|-- 18960124.txt.gz
|-- 18960125.txt.gz
|-- 18960126.txt.gz
|-- 18960127.txt.gz
|-- 18960128.txt.gz
|-- 18960129.txt.gz
|-- 18960130.txt.gz
`-- 18960131.txt.gz
$ du -h 1896
124K 1896/01
124K 1896/Copy (10) of 01
124K 1896/Copy (11) of 01
124K 1896/Copy (2) of 01
124K 1896/Copy (3) of 01
124K 1896/Copy (4) of 01
124K 1896/Copy (5) of 01
124K 1896/Copy (6) of 01
124K 1896/Copy (7) of 01
124K 1896/Copy (8) of 01
124K 1896/Copy (9) of 01
124K 1896/Copy of 01
1.5M 1896
$ du -hs 1896.tar.gz
372K 1896.tar.gz
So it takes 1.5M on disk to host a whole year (simulated here with twelve copies of January),
and 372K to download a whole year.
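The numbers can be sanity-checked with a bit of arithmetic. This is only a sketch using the figures from the du output and the 10,443-byte file size quoted earlier; the reading that each compressed file occupies a single 4K allocation unit is my assumption, not something du reports directly.

```python
# Figures taken from the listings above.
FILES_PER_MONTH = 31        # files in 1896/01
MONTH_ON_DISK_KIB = 124     # du -h for one month directory
MONTH_DIRS = 12             # 01 plus eleven copies
RAW_FILE_BYTES = 10_443     # fixed size of one uncompressed WXDaily page
TARBALL_KIB = 372           # du -hs 1896.tar.gz

# On-disk cost per compressed file: 124K / 31 files.
per_file_kib = MONTH_ON_DISK_KIB / FILES_PER_MONTH   # 4.0

# Whole-year on-disk size: 12 * 124K, which du -h rounds to 1.5M.
year_kib = MONTH_ON_DISK_KIB * MONTH_DIRS            # 1488

# Uncompressed size of the same 372 files, for comparison.
raw_bytes = RAW_FILE_BYTES * FILES_PER_MONTH * MONTH_DIRS   # 3884796
raw_kib = raw_bytes / 1024

print(f"per file on disk: {per_file_kib} KiB")
print(f"year on disk:     {year_kib} KiB")
print(f"raw vs tarball:   {raw_kib / TARBALL_KIB:.1f}x")
```

Because the test year is 12 x 31 = 372 files rather than 365, the real figures for a year would be slightly lower. The tarball comes out roughly ten times smaller than the raw data.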
WXDaily output is structured enough to convert easily into XML, which in turn is easily transformed into any format you require.
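As a rough illustration of the XML step, here is a sketch that turns one day's values into a small XML document. The WXDaily output format isn't shown here, so the parsing step is left out, and the element names ("day", "reading") and the field names in the usage example are purely hypothetical.

```python
import xml.etree.ElementTree as ET


def readings_to_xml(day: str, readings: dict) -> str:
    """Serialize already-parsed name/value pairs for one day as XML.

    The element and attribute names are placeholders, not a WXDaily schema.
    """
    root = ET.Element("day", date=day)
    for name, value in readings.items():
        child = ET.SubElement(root, "reading", name=name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical field names and values, for illustration only.
print(readings_to_xml("1896-01-01", {"max_temp": 41, "min_temp": 28}))
```

Once the data is in this shape, converting it to CSV, JSON, or back to HTML is a short transform rather than a scraping job.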
With this arrangement the web server does what it is optimized for, serving static files,
and WXDaily.exe doesn't become a bottleneck.
If one web server can't keep up with demand, a mirror can be set up with little effort.