There will be brief latency spikes as copy-on-write (CoW) page links are broken: the first write to a shared page triggers a page fault, and the kernel copies the page before the write proceeds. As long as there is enough RAM that the extra copies do not push the system into swap, there will be no lasting slowdown. The latency spikes can hit either the parent or the child process, whichever writes to a given CoW page first.
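To get a feel for the size of these spikes, here is a minimal sketch, assuming Linux and Python 3. It forks, then has the child write one byte into every inherited page twice. The first pass should pay for the CoW breaks and the second should not. The names `touch_pages` and `measure_cow_break` and the 64 MiB default are illustrative choices of mine, and the absolute numbers will vary a lot between machines.

```python
import os
import struct
import time

PAGE = os.sysconf("SC_PAGE_SIZE")


def touch_pages(buf):
    """Write one byte into every page of buf and return the elapsed seconds."""
    start = time.perf_counter()
    for offset in range(0, len(buf), PAGE):
        buf[offset] = 1
    return time.perf_counter() - start


def measure_cow_break(size_mib=64):
    """Fork, then time two write passes over inherited memory in the child.

    Returns (first_pass_seconds, second_pass_seconds) as measured by the child.
    """
    buf = bytearray(size_mib * 1024 * 1024)
    touch_pages(buf)  # make the parent own real pages before forking

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: every page of buf is still shared with the parent here.
        os.close(read_fd)
        first = touch_pages(buf)   # each write should fault and copy a page
        second = touch_pages(buf)  # pages are private now, so no CoW faults
        os.write(write_fd, struct.pack("dd", first, second))
        os._exit(0)

    # Parent: collect the two timings from the child and reap it.
    os.close(write_fd)
    data = os.read(read_fd, struct.calcsize("dd"))
    os.close(read_fd)
    os.waitpid(pid, 0)
    return struct.unpack("dd", data)


if __name__ == "__main__":
    first, second = measure_cow_break()
    print(f"first pass:  {first * 1000:.2f} ms")
    print(f"second pass: {second * 1000:.2f} ms")
```

The gap between the two passes is a rough upper bound on the one-off cost of breaking CoW for that much memory. In a real program the cost is spread over whenever each page happens to be written first.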
Note that Linux kernels since 2.6.32 also have a "kernel samepage merging" (KSM) feature that scans memory for pages that happen to have the same contents and replaces them with a single CoW page. KSM only runs when the administrator has enabled it, and it only considers anonymous memory that a process has opted in with madvise(MADV_MERGEABLE) (recent kernels also offer a process-wide opt-in via prctl). If it is active, CoW-break latencies can hit even unrelated processes, if the kernel happened to notice that they had pages with the same contents. Note also that a CoW break should be much faster than a swap-in, and any of these pages can be swapped out anyway, so this should not be a significant performance concern.
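Whether same-page merging is in play on a given machine can be checked from userspace. The following is a sketch, assuming Linux with the KSM sysfs directory at `/sys/kernel/mm/ksm` and Python 3.8 or later for `mmap.madvise`. The helper names `ksm_status` and `mergeable_region` are mine, not part of any library.

```python
import mmap
import os

KSM_DIR = "/sys/kernel/mm/ksm"


def ksm_status(base=KSM_DIR):
    """Return selected KSM counters as ints, or None if the directory is absent."""
    if not os.path.isdir(base):
        return None
    status = {}
    for name in ("run", "pages_shared", "pages_sharing"):
        try:
            with open(os.path.join(base, name)) as fh:
                status[name] = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # skip counters that are missing or unreadable
    return status


def mergeable_region(size):
    """Map private anonymous memory and try to opt it in to KSM.

    Returns (region, opted_in). opted_in is False when this Python build or
    the running kernel does not support MADV_MERGEABLE.
    """
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    opted_in = False
    if advice is not None and hasattr(region, "madvise"):
        try:
            region.madvise(advice)
            opted_in = True
        except OSError:
            pass  # kernel built without KSM, or the request was rejected
    return region, opted_in


if __name__ == "__main__":
    print("KSM status:", ksm_status())
    region, opted_in = mergeable_region(16 * mmap.PAGESIZE)
    print("opted in:", opted_in)
    region.close()
```

A `run` value of 0 means the KSM daemon is stopped, in which case no merging (and no KSM-induced CoW break) will happen regardless of what processes request.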
The Perl runtime itself is written in C, so it is compiled in advance and demand-loaded by mmapping libperl. Read-only mappings like those used for executable machine code are (or should be...) always shared between all processes that map the same file, so you should only have one copy of libperl's code in RAM no matter how many (unrelated) perl processes you have running. Each Perl interpreter, however, has considerable data structures that are built independently at run time and not mapped from the filesystem, and these will probably not be shareable between unrelated processes. fork will "copy" them (as CoW pages), and "same-page merging" could combine them if two processes happen to have byte-identical pages.
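The split between shared library pages and private interpreter data can be inspected through `/proc/<pid>/smaps` on Linux. Here is a rough sketch of a parser that totals the per-mapping counters (in kB) for mappings whose path contains a given substring. `summarize_smaps` is a hypothetical helper, and it assumes the usual smaps layout of one header line per mapping followed by `Name: value kB` lines.

```python
from collections import Counter

FIELDS = ("Rss", "Shared_Clean", "Shared_Dirty", "Private_Clean", "Private_Dirty")


def summarize_smaps(lines, needle):
    """Sum selected smaps counters (kB) over mappings whose path contains needle."""
    totals = Counter()
    selected = False
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if not parts[0].endswith(":"):
            # Header line: address range, perms, offset, device, inode, [path]
            path = " ".join(parts[5:])
            selected = needle in path
        elif selected and parts[0][:-1] in FIELDS:
            totals[parts[0][:-1]] += int(parts[1])
    return dict(totals)


if __name__ == "__main__":
    import sys

    # Usage: python smaps_sum.py <pid> [substring], e.g. the pid of a perl process
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    needle = sys.argv[2] if len(sys.argv) > 2 else "libperl"
    with open(f"/proc/{pid}/smaps") as fh:
        print(summarize_smaps(fh, needle))
```

Keep in mind that smaps reports a page as shared only while more than one process actually has it mapped. With a single perl process running, libperl's code shows up under `Private_Clean` even though it would be shared as soon as a second process maps it. Passing `[heap]` as the substring gives a comparable view of the interpreter's own data.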