I was searching around for a way to remove duplicate lines from a text file that weren't necessarily adjacent to one another, while maintaining the line order. I found this solution by BrowserUk:
    #! perl -sw
    use strict;

    my %lines;
    #open DATA, $ARGV[0] or die "Couldn't open $ARGV[0]: $!\n";
    while (<DATA>) {
        print if not $lines{$_}++;
    }

    __DATA__
    this is a line
    this is another line
    yet another
    and yet another still
    this is a line
    more
    and more
    and even more
    this is a line
    and this
    and that
    but not the other cos its a family website:)
It works for me, but I don't fully understand why; the magic of hashes still escapes me. Can someone explain why this removes duplicates? Specifically, how does the if not $lines{$_}++ construct work? I don't see what the increment is doing there. Also, could if not be replaced with unless?
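From what I've pieced together so far (and please correct me if this is wrong), post-increment returns the value from before the increment, and a hash entry that has never been assigned holds undef, which counts as false. Here is the little test script I wrote to convince myself, with the output I expect in comments:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Post-increment returns the OLD value. A key that has never
    # been seen holds undef, which ++ treats as 0, so the first
    # test is false and every later test is true.
    my %seen;
    print $seen{x}++, "\n";    # prints 0, then stores 1
    print $seen{x}++, "\n";    # prints 1, then stores 2

    # If that's right, then unless should work too, since
    # "unless EXPR" means the same as "if not EXPR":
    my %lines;
    while (<DATA>) {
        print unless $lines{$_}++;
    }

    __DATA__
    this is a line
    another line
    this is a line

If I've got that right, the second "this is a line" is skipped because $lines{$_} is already 1 by the time it comes around again.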
Thanks!
In reply to "Why does this hash remove duplicate lines" by kangaroobin