in reply to Non-deterministic behaviour with simple array initialization

If the problem follows the image, then it's likely some part of the image that's been corrupted. The guest OS's kernel or libc are suspect, as is the perl installation.

Try comparing the kernel, the perl executable, the C library, and the libraries used by perl between the working image and the non-working image in more depth. When file corruption is suspected, du, cksum, md5sum, and diff are some of the tools that can verify the two sets of files are actually identical.

One way to speed this up is to create a tar file of the directories you suspect on each machine and compare the tar files. If the images are really identical, including file metadata such as timestamps and ownership (which tar records alongside the contents), then the tar files made from the same directories will be identical as well.
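
If the tar files differ only because of timestamps or ownership, a per-file comparison of contents may be more telling. Here is a minimal sketch of that idea in Python, not a tested tool: it hashes every regular file under a directory and reports which paths are missing or differ. The function names are made up for illustration, and SHA-256 stands in for md5sum/cksum.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            manifest[os.path.relpath(full, root)] = hash_file(full)
    return manifest


def compare_manifests(left, right):
    """Return (only_in_left, only_in_right, differing) as sorted lists."""
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    differing = sorted(p for p in set(left) & set(right) if left[p] != right[p])
    return only_left, only_right, differing
```

You would run build_manifest on the same directory on each VM, save the result (as JSON, for instance), copy one file across, and feed both to compare_manifests. Only file contents are compared, so differing timestamps will not show up as noise.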

Re^2: Non-deterministic behaviour with simple array initialization
by thkarcher (Novice) on Sep 25, 2008 at 23:24 UTC

    I md5sum'ed and cksum'ed the kernel image and all files that are accessed according to strace. The /boot/initrd and /etc/ld.so.cache differ, but all other files are identical, including /usr/bin/perl and /lib64/libc.so.6.

    In case that makes a difference: It's a 64-bit system.

    Are there other files I should check?

      It's odd that ld.so.cache would be different on identical systems. It's supposed to be an ordered list of the libraries found in the directories listed in the ld.so.conf file. If one system has additional libraries that the other lacks, that would explain the difference. Additional libraries don't by themselves explain the behavior at hand, though, if they aren't actually used in the test.

      The initrd file shouldn't be of any consequence once the system is booted and running, as it only provides a temporary root file system before the actual root is ready. Again, I'm not sure why identical images would have different versions of this file, but it shouldn't matter. The files could differ because the images were made separately on hardware with different capabilities. It is possible to leave things from the initrd running after the system is loaded, but I doubt SuSE does that on a typical installation.

      As for other files, I'm at a loss for the moment. I'll let you know if I think of anything, though.

        Perhaps I wasn't precise enough: the two VMs are roughly the same, meaning they run the same kernel on SuSE 10.3 with the distribution's base packages. Each VM also has some additional application-specific packages installed, some of which are not found on the other VM. 'ldconfig' walks through all the lib directories and builds the cache from the libs it finds there, and because not all libs are installed on both VMs, the resulting caches differ. But I don't think this leads to the problem, do you?

        Some kernel modules differ as well - do you think this could be an issue?

        Thanks,
        Thomas