in reply to Data structure challenge

This is an interesting question.

If I read you correctly, it is like this: you could get an only slightly slower solution by using a hash, or better, a binary tree instead of the uninitialized array A (with a binary tree, you get O(log n) instead of O(1) for each operation). In fact, this is true no matter how you use an uninitialized array, that is, you can always use a hash or a binary tree instead of one.

Notice however, that if you use such a data structure only once, and you use the efficent C implementation, the OS still has to initialize the array when the process gets the memory by calling brk or mmap.

Replies are listed 'Best First'.
Re: Data structure challenge
by Abigail-II (Bishop) on Mar 18, 2004 at 12:54 UTC
    with a binary tree, you get O(log n) instead of O(1) for each operation
    Well, only if the binary tree is balanced (and ordered). I am very well aware of the performance levels of a binary tree - if I wanted to settle for O (log n) performance, I wouldn't have posted the challenge. The challenge is to do all operations in O (1) - doing one or more in O (log n) is not a challenge.
    Notice however, that if you use such a data structure only once, and you use the efficent C implementation, the OS still has to initialize the array when the process gets the memory by calling brk or mmap.
    While some C libraries might initialize the memory claimed with mmap or brk, there's no reason they have to. The OS and/or the C libraries can always be patched that one can claim a segment of memory, and taking it "as is". For the scope of the challenge, you may assume that we run on a platform where memory is given out "as is".

    Abigail

      When saying binary tree, I thought of a trie actually. I know it's still O(log n) so does not satisfy the conditions, but a binary trie's code is not as complicated as that of a balanced binary tree.

      As for initializing memory: the C library certainly won't initialize the whole array when calling malloc. However, on multi-user OS, the system has to initialize all the pages allocated by a process, for security reasons. If it would not do that, you could get vulnerable informations by allocating much memory and searching for daat that other processes freed but hav not zeroed. This does not take much time because 1. malloc tries to reuse the memory freed by a process, so it doesn't have to ask for a new chunk evry time, 2. the os can do it in idle time, so you don't have to wait for it when you allocate the memory, 3. zeroing a chunk of memory is fast. The first reason may not count with very large arrays, as malloc gets thoes with mmap, and when they're freed they're munmapped.

      The question you asked still holds however, because using such a structure is good even in such OS'es, as you may use it many time, and you don't have to zero it evry time the same process clears and reuses one. I still don't know if it can be done in Perl.

Re: Re: Data structure challenge
by dragonchild (Archbishop) on Mar 18, 2004 at 12:30 UTC
    No, the OS doesn't necessarily have to initialize the array. Many compilers will do that, but all malloc() should be doing is saying "I will be using this space for a while". It's not guaranteed at all what will be there.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.