There are already some quite good suggestions here. I'll repeat some here and add a couple. Take the repetition as a sign I agree with SuicideJunkie, the Anonymous monk, LanX, and MidLifeXis and not that I failed to read their nodes.
Disabling access time updates (noatime) is a good idea. If access times matter to some process you have but you still want better performance, the relatime option has been available under Linux for the past few years as well. It takes a much smaller performance hit than regular access time updates.
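As a sketch of what that looks like in /etc/fstab (the device names and mount points here are hypothetical placeholders, not anything from your system):

```shell
# /etc/fstab fragment -- hypothetical devices and mount points.
# relatime: only write an atime update when the old atime is older
# than the file's mtime/ctime, which eliminates most atime writes.
/dev/sda2   /          ext3   defaults,relatime   0 1
# noatime: never update access times on this filesystem at all.
/dev/sdb1   /srv/app   ext3   defaults,noatime    0 2
```

You can also apply either option without a reboot via `mount -o remount,noatime /srv/app`; the fstab entry just makes it stick across reboots.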
You should have enough memory in buffers, as already mentioned, to help with a lot of this. This is one more reason I think access times are one of the culprits (because they hit the disk even if the file was buffered).
A RAM disk for Perl might help. A RAM disk for the programs might help. Either one would help partly for the speed of initial access, although again buffering should already be helping here. Reading from the RAM disk would also help alleviate access time updates, but again you can eliminate most (relatime) or all (noatime) of those anyway. Fitting both into a RAM disk would probably help more than one or the other, but one may make a much bigger difference than the other.
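Under Linux the simplest RAM disk is tmpfs. A minimal sketch, assuming a hypothetical /usr/local/perl tree and a 512 MB size that you would adjust to fit:

```shell
# Mount a 512 MB tmpfs RAM disk (contents vanish on reboot).
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=512m tmpfs /mnt/ramdisk

# Copy the perl tree in, preserving permissions and symlinks.
cp -a /usr/local/perl /mnt/ramdisk/

# Equivalent permanent entry for /etc/fstab:
#   tmpfs  /mnt/ramdisk  tmpfs  size=512m  0 0
```

Remember that tmpfs is volatile, so anything you put there has to be repopulated from disk at boot, typically from an init script.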
If you're able to buy another spindle, especially another 15k spindle, then putting part of what you're accessing on that may help. When running systems with a lot of disk contention having the heaviest accesses to your directory hierarchy split onto file systems on different physical disks can make a huge difference.
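Before deciding what to split off, it helps to confirm which spindle is actually saturated. One way, assuming the sysstat package is installed, is iostat's extended statistics:

```shell
# Per-device extended stats, refreshed every 5 seconds.
# High %util and high await on one device point to the spindle
# (and therefore the filesystems on it) worth splitting up.
iostat -x 5
```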
I had a mail server once that had its incoming spool (for both MX and outbound SMTP), outgoing spool (for sending outbound from the domains and to the POP3 server), its swap space, and its local copy of syslog output all on the same spindle when I inherited it from another admin. Whenever a piece of mail moved through the system, the log was updated. If a piece of mail was sent from within the customer base (it was a server for a small ISP we took over) to another customer, it would hit the disk coming in, when being virus scanned and spam scanned, when going to the outbound spool on its way to the POP3 server, and once or more for the syslog for each of those actions.

In the short term, rather than waiting to order or build a new server, determine how to split the services, and make changes at the network level that would have to propagate, I just added another disk one Sunday morning about 3 AM. I kept the incoming spool, spam tests, and virus tests on the original disk. I moved the swap, the outgoing spool (which was rarely used except for network issues and MX servers for other domains being briefly unreachable), and the local copy of syslog output to the new disk. That disk came from a spares shelf and was probably only 5400 or 7200 RPM.

That server went from nearly unresponsive, with the drive access light for the primary storage drive on constantly, to running smoothly for another two or three months until we had a proper break-out plan for its services (which was actually an integration plan with our own server farm). This was a system I couldn't afford to babysit, because it sat in an unmanned data center half an hour away. I could babysit its fail-over server, but sending everything to the fail-over all the time is of course a bad idea.
RAID 1 is great for data redundancy. It can help a fair bit with reading if your controller is that smart. It's always going to tie up both spindles for every write, though. If you're thrashing disks and part of that thrashing is caused by writes (like access time writes to inodes), then RAID 1 just assures you of thrashing two disks rather than one. Anything that's thrashing your disks and can be replaced readily (like your Perl system) could be on another disk in the machine that doesn't necessarily even need to be in RAID. Even if you downloaded perl from source and built it, you can back that build up off-server. If necessary, you could put two more drives in and make it live on another RAID 1 array. You might even put in three more drives and make it a RAID 5 array to sit beside the RAID 1 you have now. ("Beside" is conceptual, of course. It doesn't need to be how the disks are physically arranged.) ;-)
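With Linux software RAID, building that second array is a few mdadm commands. A sketch, with hypothetical device names and a hypothetical mount point; note that this destroys any existing data on the listed partitions:

```shell
# Two new drives as a second RAID 1 array for the perl tree.
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
mkfs.ext3 /dev/md1
mount /dev/md1 /usr/local/perl

# Or, with three new drives, a RAID 5 array instead:
#   mdadm --create /dev/md1 --level=5 --raid-devices=3 \
#       /dev/sdc1 /dev/sdd1 /dev/sde1
```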
I know you've ruled out SSDs, but that may be premature, too. If you can afford 15k physical drives then you can probably afford small MLC SSDs. Your Perl system (without your programs that depend on it) would surely fit on a 32 GB SSD no matter how much you download from CPAN. Two cheap ones of those for a RAID 1 array cost about $100 from NewEgg right now. One of the tips for SSDs, though, is to use them with noatime to minimize writes. So there we're back to one of the original tips. If you can't afford to put $100 into your business-critical server, then you have some other serious issues as well.
I don't mean to be rude, but some of your assertions seem to be based on incorrect assumptions about disk and filesystem management. As I've alluded to several times already, your whole directory structure does not need to live on one filesystem. You can have mount points pretty much anywhere you want. You're not even limited to what the Linux installers suggest, such as /, /boot, /var, /home, and /usr, for mount points. You could have a separate /etc, and in some unusual cases that even makes sense. You can have /var/log as a mount point to separate it from the rest of the /var directory, and you could do the same thing with a spool directory for a mail server or a document root for a web server, for example. Sometimes /opt is a good choice for a separate filesystem, or even /tmp (and sometimes /tmp makes sense as a RAM disk). Sometimes on systems primarily for a single user I'll put their home directory (for example /home/chris) on its own partition and maybe even on a separate drive. Sometimes /lib, /usr, or /usr/local makes a lot of sense to separate out, especially if you generally need something like atime on most of the system but don't want to pay the price every time an executable is run or a library is opened.

One of the sweetest things about all this is that you don't need to reinstall the whole OS to do it. You can put in a new disk, partition it, make file systems on it, copy the data to the new filesystem, delete it from the old directory, and make the directory a mount point in your filesystem table (/etc/fstab for any Linux I've ever seen). How to do that in depth is probably more suited to some venue other than PerlMonks, though.
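For the curious, the migration steps just listed look roughly like this. This is a hedged sketch, not a recipe: the device name /dev/sdb1 and the choice of /var/log are hypothetical, and you'd want whatever writes to the directory stopped first:

```shell
# Move /var/log onto a new disk without reinstalling anything.
mkfs.ext3 /dev/sdb1                 # new filesystem on the new partition
mkdir /mnt/newlog
mount /dev/sdb1 /mnt/newlog
cp -a /var/log/. /mnt/newlog/       # copy, preserving owners and times
umount /mnt/newlog
rm -rf /var/log/*                   # old copy, only after verifying the new one
mount /dev/sdb1 /var/log            # the directory is now a mount point

# Make it permanent in /etc/fstab:
#   /dev/sdb1  /var/log  ext3  defaults,noatime  0 2
```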
In reply to Re: Techniques to cache perl scripts into memory ?
by mr_mischief
in thread Techniques to cache perl scripts into memory ?
by fx