Re^4: how to change this code into perl
by Laurent_R (Canon) on Aug 30, 2015 at 17:57 UTC
OK, here is a real script that should detect all lines having duplicate keys (a quick script, untested since I have no time right now, but based on something I do quite often, so hopefully I have it right).
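use strict;
use warnings;

# Assumption: the input file name is given as the first command-line argument.
my $infile = shift @ARGV or die "usage: $0 infile\n";
my ($previous_key, $previous_line);
open my $IN, "<", $infile or die "cannot open $infile: $!";
while (<$IN>) {
    # Key = first word of the line (undef if the line has none).
    my ($key) = /^(\w+)/;
    if (defined $key and defined $previous_key and $key eq $previous_key) {
        print $previous_line if defined $previous_line;
        print $_;
        undef $previous_line;    # already printed; do not repeat it
    } else {
        $previous_line = $_;
    }
    $previous_key = $key;
}
close $IN;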
Re^4: how to change this code into perl
by Laurent_R (Canon) on Aug 30, 2015 at 17:42 UTC
Sure: when there are two entries with the same key, it prints only the second one (the duplicate, not the original); when there are three, it prints only the second and the third. And of course, it works only if the lines are sorted so that equal keys are adjacent.
If you need to print all the lines that are duplicates, then it is slightly more complicated, because you need to keep track of recent history. And then, yes, it is probably better to write a real script.
Another way is to use a hash to keep track of everything in memory.
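A minimal sketch of that hash idea, assuming the key is the first whitespace-delimited field; the sub name duplicate_lines and the sample lines are illustrative only. It counts each key in a first pass and filters in a second, so the input does not need to be sorted, at the cost of holding every line in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line whose first field occurs more than once.
sub duplicate_lines {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        my ($key) = $line =~ /^(\S+)/ or next;    # skip lines without a key
        $count{$key}++;
    }
    return grep {
        my ($key) = /^(\S+)/;
        defined $key && $count{$key} > 1;
    } @lines;
}

# Unsorted sample input (made up for this example).
my @sample = ("bb foo\n", "aa blah\n", "cc x\n", "bb bar\n");
print duplicate_lines(@sample);    # prints the two "bb" lines
```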
Re^4: how to change this code into perl
by perlnewbie012215 (Novice) on Aug 30, 2015 at 19:07 UTC
Hi poj, thank you for the quick response. I tried the script and could not get the duplicate rows; the output came up with zero rows. Below is the script I tried:
open IN,'<','/home/scripts/imageoutcome.txt' or die "Could not open $infile : $!";
my %count = ();
my @lines = ();
while (<IN>){
  push @lines,$_;
  # print $_;
  if (/^(\S+)/){
    ++$count{$1};
  }
}
close IN;
open OUT,'>','/home/scripts/outcome.txt' or die "Could not open $outfile : $!";
#print @lines;
for (@lines){
  if (/^(\S+)/){
    print $count{$1};
    print OUT $_ if $count{$1} > 0;
  }
}
close OUT;
1 twenty
2 thirty
1 forty
1 fifty
Update: Does your file have spaces at the beginning of the lines?
poj
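Leading whitespace matters here because the pattern /^(\S+)/ is anchored at the start of the line: a line that begins with a space yields no key at all, so it is never counted or printed. A minimal sketch of the difference, using a hypothetical helper first_field that tolerates leading whitespace (the two sample lines are adapted from the data above; the indentation on the second is an assumption for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): first field of a line,
# ignoring any leading whitespace. Returns undef for blank lines.
sub first_field {
    my ($line) = @_;
    my ($key) = $line =~ /^\s*(\S+)/;
    return $key;
}

for my $line ("1 twenty\n", "  2 thirty\n") {
    my ($anchored) = $line =~ /^(\S+)/;    # pattern used in the scripts here
    printf "anchored=%s tolerant=%s\n",
        $anchored // 'none', first_field($line) // 'none';
}
# Output:
# anchored=1 tolerant=1
# anchored=none tolerant=2
```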
Re^4: how to change this code into perl
by perlnewbie012215 (Novice) on Aug 30, 2015 at 17:29 UTC
#!perl
use strict;
use warnings;

my $infile  = $ARGV[0];
my $outfile = $ARGV[1];

# First pass: keep every line and count how often each key (first field) occurs.
open my $in, '<', $infile or die "Could not open $infile : $!";
my %count = ();
my @lines = ();
while (<$in>) {
    push @lines, $_;
    if (/^(\S+)/) {
        ++$count{$1};
    }
}
close $in;

# Second pass: write only the lines whose key was seen more than once.
open my $out, '>', $outfile or die "Could not open $outfile : $!";
for (@lines) {
    if (/^(\S+)/) {
        print {$out} $_ if $count{$1} > 1;
    }
}
close $out;
poj
Re^4: how to change this code into perl
by perlnewbie012215 (Novice) on Aug 30, 2015 at 19:14 UTC
Thank you very much, Laurent_R. I tried the script and it's printing all the rows instead of only the duplicates. This code looks very interesting; can you please explain it?
#!/usr/bin/perl
my ($previous_key, $previous_line);
open my $IN, "<", '/home/scripts/imageoutcome.txt' or die "cannot open $infile $!";
while (<$IN>) {
    my $key = $1 if /^(\w+)/;
    if ($key eq $previous_key) {
        print $previous_line if defined $previous_line;
        print $_;
        undef $previous_line;
    } else {
        $previous_line = $_;
    }
    $previous_key = $key;
}
"I tried the script and it's printing all the rows"
Then you have to show me your input data. I've just tried that script with the following input data:
aa blah
bb blah
bb blahblah
bb foo
cc dlqskjf
cc cfkqs
dd dkls
ee dsjkqjs
ff blah
gg klsqdj
gg sqkl
and it prints only the lines whose first column is duplicated, as shown in this output:
bb blah
bb blahblah
bb foo
cc dlqskjf
cc cfkqs
gg klsqdj
gg sqkl
This seems to work perfectly.
Otherwise, the way it works is that it reads the file one line at a time and stores each line ($previous_line), as well as its comparison key ($previous_key), until the next line is read. If both lines have the same key, then I print the previous line (if defined) and the current one; in that case, I undef the previous line to prevent it from being printed twice when three or more lines share the same key.
If it does not work properly for you, please show your input and/or test data.
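To reproduce this test without any files, the same line-by-line logic can be wrapped in a sub and fed the sample input from this post. This is only a sketch: the sub name adjacent_duplicates is made up, and it assumes the lines are already sorted by key.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the adjacent-line comparison described above.
# Assumes the input lines are already sorted by key.
sub adjacent_duplicates {
    my @lines = @_;
    my ($previous_key, $previous_line, @out);
    for my $line (@lines) {
        my ($key) = $line =~ /^(\w+)/;
        if (defined $key and defined $previous_key and $key eq $previous_key) {
            push @out, $previous_line if defined $previous_line;
            push @out, $line;
            undef $previous_line;    # already emitted; avoid repeating it
        }
        else {
            $previous_line = $line;
        }
        $previous_key = $key;
    }
    return @out;
}

# Sample input from this post.
my @input = map { $_ . "\n" } (
    'aa blah', 'bb blah', 'bb blahblah', 'bb foo', 'cc dlqskjf', 'cc cfkqs',
    'dd dkls', 'ee dsjkqjs', 'ff blah', 'gg klsqdj', 'gg sqkl',
);
print adjacent_duplicates(@input);    # prints the bb, cc and gg lines
```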
Re^4: how to change this code into perl
by poj (Abbot) on Aug 30, 2015 at 17:23 UTC
Re^4: how to change this code into perl
by perlnewbie012215 (Novice) on Aug 30, 2015 at 19:40 UTC
Hi poj, you are correct: I forgot chomp. It's working now. Thank you so much for helping me.
Re^4: how to change this code into perl
by perlnewbie012215 (Novice) on Sep 01, 2015 at 22:39 UTC
Hi Laurent_R, that was my bad: I had hidden characters in the file, and that's why it did not work. Your script is working. Thank you so much for helping me and explaining it.
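For anyone hitting the same problem, one way to spot hidden characters is to print each line with everything outside printable ASCII shown as an escape. A minimal sketch, assuming ASCII input; the helper name show_hidden is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: make invisible characters visible for debugging.
# Anything outside printable ASCII (tabs, carriage returns, newlines, ...)
# is replaced by a \x{..} escape.
sub show_hidden {
    my ($line) = @_;
    (my $visible = $line) =~ s/([^\x20-\x7E])/sprintf('\\x{%02X}', ord $1)/ge;
    return $visible;
}

print show_hidden("bb blah\r\n"), "\n";    # prints: bb blah\x{0D}\x{0A}
```

A stray carriage return (Windows line ending) or tab then shows up directly in the output instead of silently changing the key.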