The features that Perl data structures provide over minimal C data structures include at least the following:
- Reference counting to ensure prompt and reliable garbage collection. Once the reference count hits 0, the structure can be reclaimed.
- Data structures provide storage for multiple form values. For instance, if a string is used as an integer, and as a number, further accesses of the string as either a string, an integer or a number do not require for the string to be converted each time.
- In-place upgrading of types. For instance, if a type is blessed, or if a hash is tied, or if an integer becomes a string, all references (active references, not the same as Perl scalar references) remain valid. For comparison, consider the C linked list your friend mentions. Now change the list to a b-tree. Will all data structures that referred to the list now be ok referring to the b-tree? What if the b-tree data structure is bigger than the list data structure and requires being moved?
Another thing to explain to your friend is that the HASH structure is a form of indexed linked list. First, the key is hashed into an integer form, then the integer is used to produce an index into the array of linked lists.
The overhead for a standard Perl string is at least 24 bytes (on a 32-bit architecture). This is each of: a reference count (4 bytes), flags (4 bytes), a reference to a more specific data structure (4 bytes), a reference to the string content (4 bytes), the length of the string (4 bytes), the space allocated for the string as it is likely less than the length (4 bytes).
The overhead for a standard Perl hash is at least: (again, for a 32-bit architecture)
60 bytes +
((20 + length of average key) * number_of_keys)) +
(4 * size of array used within hash) +
(24 * length of average value, assuming all string values)
When I referred to the amount you would save for not pre-allocating hash buckets, I was referring to "4 * size of array" number above, which is almost insignificant compared to the rest of the overhead. The above only describes a hash mapping strings to strings. In your case, you are mapping a string to a scalar reference to a hash that maps strings to numbers.
Your specific case is pushing for title of 'application that best makes Perl look like a pig.' Usually, Perl is able to achieve significant performance gains by using more memory. In your case, the sheer size of the data structures likely results in the opposite effect.