Remove unique lines from file

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Remove unique lines from file by kennethk (Abbot) on Mar 20, 2015 at 22:24 UTC
What have you tried? What worked? What didn't? See "How do I post a question effectively?". Please see "How can I remove duplicate elements from a list or array?" in perlfaq4. This won't exactly fit your need, but it should give you the tools necessary to attack the problem. You may also need to consult split and/or print, depending on your level of experience.

First ask yourself "How would I do this without a computer?" Then have the computer do it the same way.
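For reference, the perlfaq4 idiom mentioned above can be sketched roughly as follows. The `@items` list is made-up sample data, not the poster's file, and this only removes duplicates; it is a building block, not a solution to the question as asked.

```perl
use strict;
use warnings;

# Hypothetical sample list, not taken from the thread.
my @items = qw(1 1 2 3 3 3 4 5 5);

# %seen counts how often each value has occurred so far;
# grep keeps a value only the first time it is encountered.
my %seen;
my @unique = grep { !$seen{$_}++ } @items;

print "@unique\n";    # prints "1 2 3 4 5"
```

As a side effect, `%seen` ends up holding the number of occurrences of each value, which is the piece of information the original question turns on.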
Re: Remove unique lines from file by Laurent_R (Canon) on Mar 21, 2015 at 09:33 UTC
"Can you help me to keep only these lines in which the preceding number appears only once? In the example above, I would keep the lines starting with 1, 3 and 5."

That seems contradictory. If you keep the lines starting with 1, 3 and 5, then you keep the lines where the preceding number appears more than once. Please explain better what you really want.

Je suis Charlie.
Re^2: Remove unique lines from file by Lotus1 (Vicar) on Mar 21, 2015 at 16:43 UTC
They seem to mean the lines where the previous line number only occurred one time. There was only one line with a 2 or a 4 at the front. That still doesn't explain why to keep lines starting with 1.
Re^2: Remove unique lines from file by BillKSmith (Monsignor) on Mar 21, 2015 at 19:23 UTC
Considering the title of the thread, I think he wants to keep lines whose leading number is unique. The confusion between "keep" and "remove" is probably one of viewpoint.

Bill
Re^3: Remove unique lines from file by GotToBTru (Prior) on Mar 23, 2015 at 01:15 UTC
The difference between "keep" and "remove" is as large a gap as exists! Perhaps more of a problem of vocabulary than viewpoint?

Dum Spiro Spero
Re: Remove unique lines from file by LanX (Saint) on Mar 20, 2015 at 22:25 UTC
What did you try so far? Any ideas?

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)

PS: Je suis Charlie!
Re: Remove unique lines from file by AppleFritter (Vicar) on Mar 20, 2015 at 23:22 UTC
Here's a quick and dirty solution:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines   = ();
my %numbers = ();

while (<DATA>) {
    my ($number) = m/^(\d+)/;
    push @lines, [$number, $_];
    $numbers{$number}++;
}

foreach (@lines) {
    print $_->[1] if $numbers{$_->[0]} > 1;
}

__DATA__
1 ahjewgfje
1 gopjregre
2 kkkkkkk
3 figjiorger
3 rekopfroeer
3 ejfjviknced
4 erjgirjgerio
5 eieuiee
5 reopjtfrpeoi
```

What this does is iterate through the data (from the special `DATA` filehandle), extract the number at the beginning of each line using a regular expression (in list context, so it returns the captured values), and populate an array of arrays where each element of the first array is an anonymous two-element array containing the extracted number and the entire line. It also keeps a running total of how often each number was seen.

Once all that's done it goes through the array of arrays; for each element (`$_`, representing a line), it checks whether the extracted number (`$_->[0]`, the first element of the anonymous array currently being looked at) has a running total of more than one, and if so, prints the line in question (`$_->[1]`, the second element of the anonymous array).

One downside is that this'll slurp the entire file into memory before printing anything, which may be a problem if your files are very large. It'll also work no matter whether lines with the same number are separated by lines with different numbers or whether they're not (as in your sample data). Whether this is a feature or a bug only you can say.
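If memory is a concern, and if lines with the same number are always adjacent (as in the sample data), a streaming variant is possible. The following is only a sketch under that assumption: `print_repeated_runs` is a hypothetical helper name, not something from the post above, and it holds just one run of same-numbered lines in memory at a time. Unlike the solution above, it would not catch repeats that are separated by other numbers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copies lines from $in to $out, keeping only runs
# of two or more consecutive lines that share the same leading number.
# Assumes lines with equal numbers are adjacent in the input.
sub print_repeated_runs {
    my ($in, $out) = @_;
    my @group;      # lines of the run currently being collected
    my $current;    # leading number of that run

    while (my $line = <$in>) {
        my ($number) = $line =~ m/^(\d+)/;
        next unless defined $number;    # ignore lines without a leading number

        # A new number closes the previous run; print it if it repeated.
        if (defined $current && $number ne $current) {
            print {$out} @group if @group > 1;
            @group = ();
        }
        $current = $number;
        push @group, $line;
    }
    print {$out} @group if @group > 1;    # flush the final run
    return;
}

# Sample data from the post above, read through an in-memory filehandle.
my $sample = join '', map { "$_\n" }
    '1 ahjewgfje', '1 gopjregre', '2 kkkkkkk', '3 figjiorger',
    '3 rekopfroeer', '3 ejfjviknced', '4 erjgirjgerio',
    '5 eieuiee', '5 reopjtfrpeoi';

open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
print_repeated_runs($in, \*STDOUT);
close $in;
```

On the sample data this prints the lines starting with 1, 3 and 5 and drops the lines starting with 2 and 4. For a real file, replace the in-memory handle with one opened on the file itself.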