in reply to pulling duplicate data from a complex structure

Your update code scans the entire result set so far for every new row added ("Can you say 'slow', boys and girls? Good, I knew you could."</Fred>). Your first item doesn't work because you're trying to push onto something that isn't an array reference (thanks to your incrementing it). You want a parallel %seen hash which tracks what items you've already pushed.
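
Something along these lines is what I have in mind. It's only a sketch, since I'm guessing at your data: @rows, %results, and the key/item pairs here are made-up stand-ins, not your actual structure.

```perl
use strict;
use warnings;

# Hypothetical input: [key, item] pairs, some of them duplicates.
my @rows = (
    [ fruit => 'apple'  ],
    [ fruit => 'pear'   ],
    [ fruit => 'apple'  ],   # duplicate, should be skipped
    [ veg   => 'carrot' ],
    [ veg   => 'carrot' ],   # duplicate, should be skipped
);

my %results;   # key => array ref of unique items, in first-seen order
my %seen;      # key => { item => count }, kept parallel to %results

for my $row (@rows) {
    my ($key, $item) = @$row;

    # Post-increment returns the old count, which is false only the first time.
    next if $seen{$key}{$item}++;

    # The slot in %results only ever holds an array ref (autovivified here),
    # so push always has something valid to push onto.
    push @{ $results{$key} }, $item;
}

print "$_: @{ $results{$_} }\n" for sort keys %results;
```

The counting lives in %seen and the data lives in %results, so nothing ever increments a slot you later try to push onto. Each row costs one hash lookup instead of a scan over everything collected so far.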