What you write is mostly true but not very relevant.
First, a correction: I originally wound up with 50 million strings averaging about 2.57 characters each (note that I split each line as it's read).
The fact that I'm replacing lots of undef with lots of strings isn't the problem. I observed that process, and when it was done, I had a certain amount of memory used. The problem is that memory usage continued to grow as I operated on (but did not add to) the arrays I'd created.
The problem is caused by the fact that Perl is converting all those strings to numbers for me on the fly: the first time a string scalar is used in numeric context, Perl caches the numeric value alongside the string in the same scalar, so every scalar in the set grows. My solution was to force them to be numbers in the first place (instead of strings). Now, when I operate on the set, it stays the same size.
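A minimal sketch of that fix (the data and variable names here are illustrative, not from my actual script): numify each field with `0 + $_` as the line is split, so the array holds numbers from the start rather than strings that later sprout cached numeric slots.

```perl
use strict;
use warnings;

# In-memory filehandle standing in for the real input file.
my $data = "1.5 2 3.25\n4 5.5 6\n";
open my $fh, '<', \$data or die $!;

my @values;
while ( my $line = <$fh> ) {
    # 0 + $_ forces numeric conversion up front, so each scalar is
    # stored as a number instead of a string that grows when first
    # used in arithmetic.
    push @values, map { 0 + $_ } split ' ', $line;
}

print scalar(@values), "\n";
```

The same effect can be had with `$_ += 0` after the fact, but doing it at split time avoids ever paying for the dual string-plus-number representation.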