kangaroobin has asked for the wisdom of the Perl Monks concerning the following question:
I was searching around for a way to remove duplicate lines from a text file that weren't necessarily adjacent to one another, while maintaining the line order. I found this solution by BrowserUk:
#! perl -sw
use strict;

my %lines;
#open DATA, $ARGV[0] or die "Couldn't open $ARGV[0]: $!\n";
while (<DATA>) {
    print if not $lines{$_}++;
}

__DATA__
this is a line
this is another line
yet another
and yet another still
this is a line
more
and more
and even more
this is a line
and this
and that
but not the other cos its a family website:)
It worked for me, but I don't know why. I still don't fully see the magic of hashes. Can someone explain why this works, specifically the "if not $lines{$_}++" construct? I really don't see what the incrementing accomplishes here. Also, could "if not" be replaced with "unless"?
Thanks!
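For reference, a minimal sketch of the mechanics (the %seen name and the sample data are illustrative, not from the original post): $lines{$_}++ is a post-increment, so it yields the hash entry's current value first and only then adds one. The first time a line is seen that value is undef, which is false, so "not" flips it to true and the line prints; every repeat finds a true count and is skipped. And since "unless EXPR" is exactly "if !EXPR" in Perl, the two spellings behave identically:

#!/usr/bin/perl
use strict;
use warnings;

my %seen;
while (my $line = <DATA>) {
    # Post-increment: $seen{$line}++ returns the OLD value (undef on
    # first sight, hence false), then bumps the count to 1. So the
    # first occurrence prints; later occurrences see a true count.
    print $line if not $seen{$line}++;
    # Equivalent: print $line unless $seen{$line}++;
}

__DATA__
one
two
one
three
two

Running this prints "one", "two", "three" in their original order, with the repeated lines suppressed.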
Replies are listed 'Best First'.
Re: Why does this hash remove duplicate lines
by mr_mischief (Monsignor) on Mar 06, 2008 at 05:31 UTC
    by chromatic (Archbishop) on Mar 06, 2008 at 08:18 UTC
    by mr_mischief (Monsignor) on Mar 06, 2008 at 17:19 UTC
Re: Why does this hash remove duplicate lines
by ysth (Canon) on Mar 06, 2008 at 05:31 UTC
Re: Why does this hash remove duplicate lines
by halfcountplus (Hermit) on Mar 06, 2008 at 05:59 UTC
    by olus (Curate) on Mar 06, 2008 at 11:38 UTC