Re: Find duplicate digits
by friedo (Prior) on Feb 15, 2006 at 15:00 UTC
|
I would use a hash to count the occurance of each digit in the numbers. Then, if the number has a count of two for any digit, keep it. Otherwise, discard it. For example:
my @keep; # numbers to keep
while(chomp( my $line = <NUMBERS> ) ) {
my @digits = split '', $line;
my %count;
++$count{$_} for @digits;
if( grep { $_ == 2 } values %count ) {
push @keep, $line;
}
}
| [reply] [d/l] |
|
|
my @keep;
while ( chomp(my $num = <DATA>) ) {
for ( split //, $num ) {
my @a = ($num =~ m/$_/g); #count occurances
if (@a>1) { push @keep, $num; last; } #keep if more than one
}
}
| [reply] [d/l] |
|
|
:shrug:, seems like roughly the same amount of work to me. Unless by "work" you mean lines of code, in which case I was being rather more verbose than usual. My method could easily be written:
my @keep; # numbers to keep
while(chomp( my $line = <NUMBERS> ) ) {
my %count;
++$count{$_} for split '', $line;
push @keep, $line if grep { $_ == 2 } values %count
}
I also wouldn't be surprised if the regex method was slower, but I'm too lazy to do a benchmark right now. :) (And for only 10,000 numbers, it probably does not matter much.) | [reply] [d/l] [select] |
|
|
|
|
|
|
Strangely, nobody yet seems to have taken up the issue of this line in friedo's code:
while(chomp( my $line = <NUMBERS> ) ) {
Don't do this!!
Why? Because if, for example, the last line of the file does not end with a newline, your code will ignore it (in order to undertand why, type "perldoc -f chomp" at your command line).
For a more detailed discussion of this meme, see thread 303987, in particular the comments from ChemBoy and Abigail-II.
| [reply] [d/l] |
|
|
Works like magic, thaaaaaaaaaaaaaaanks you!
| [reply] |
Re: Find duplicate digits
by japhy (Canon) on Feb 15, 2006 at 15:08 UTC
|
So there are two rules:
- No digit can appear more than twice, and
- At least one digit must be duplicated.
So break it down into its cases:
m{
^
(?= \d\d\d\d $ ) # ensure it's only 4 digits long
(?:
# first digit is the duplicated one
(\d) \1 (?!\1) \d (?!\1) \d
| (\d) (?!\2) \d \2 (?!\2) \d
| (\d) (?!\3) \d (?!\3) \d \3
# second digit is the duplicated one
| (\d) (?!\4) (\d) \5 (?!\4|\5) \d
| (\d) (?!\6) (\d) (?!\6|\7) \d \7
# third digit is the duplicated one
| (\d) (\d) (?!\8|\9) (\d) \10
)
}x;
Now, that's just atrocious. It can be made a bit more efficient by grouping things together smarter:
m{
^
(?= \d\d\d\d $ ) # ensure it's only 4 digits long
(?:
# first digit is the duplicated one
(\d) (?:
\1 (?!\1) \d (?!\1) \d
| (?!\1) \d \1 (?!\1) \d
| (?!\1) \d (?!\1) \d \1
)
# second digit is the duplicated one
| (\d) (?!\2) (\d) (?:
\3 (?!\2|\3) \d
| (?!\2|\3) \d \3
)
# third digit is the duplicated one
| (\d) (\d) (?!\4|\5) (\d) \6
)
}x;
But we're still stuck with disgusting regexes. So... why use a regex? friedo's got a simple non-regex solution for you. The "pattern" you need to match isn't a pretty one.
Update: you could also use a far simpler regex that makes two passes at the string like so:
m{
^
(?= \d* (\d) \d* \1 )
(?! \d* \1 \d* \1 \d* \1 )
}x;
| [reply] [d/l] [select] |
|
|
Thanks for your enlightenment :) I didn't know how to approach the problem...regex was the only thing that came to mind. But, yes, I like friedo's simple solution.
| [reply] |
|
|
m{
^
(?= \d* (\d) \d* \1 )
(?! \d* \1 \d* \1 \d* \1 )
}x;
Could you explain what's going there? Tia :) | [reply] [d/l] |
|
|
The regex is anchored to the beginning of the string. The first assertion is a positive look-ahead that says "see if we can find zero or more digits, followed by a particular digit, followed by zero or more digits, and then that particular digit again". That satisfies the "at least one digit must be duplicated" rule. The second assertion is a negative look-ahead that says "make sure we can't match zero or more digits, that particular digit, zero or more digits, that particular digit, and then zero or more digits and that particular digit a THIRD time". This satisfies the "the digit can't appear more than three times" rule.
| [reply] |
|
|
Re: Find duplicate digits
by BrowserUk (Patriarch) on Feb 15, 2006 at 15:18 UTC
|
Not very satisfying as a solution, but it works.
#! perl -slw
use strict;
m[
(?:^[^0]*0[^0]*0[^0]*$)|
(?:^[^1]*1[^1]*1[^1]*$)|
(?:^[^2]*2[^2]*2[^2]*$)|
(?:^[^3]*3[^3]*3[^3]*$)|
(?:^[^4]*4[^4]*4[^4]*$)|
(?:^[^5]*5[^5]*5[^5]*$)|
(?:^[^6]*6[^6]*6[^6]*$)|
(?:^[^7]*7[^7]*7[^7]*$)|
(?:^[^8]*8[^8]*8[^8]*$)|
(?:^[^9]*9[^9]*9[^9]*$)
]x and print for '0000' .. '9999';
Update: ... but only for 4 digit numbers. By inverting the logic and only printing numbers that don't contain 3 repeated digits, it should work for numbers of any length.
#! perl -slw
use strict;
m[
(?:[^0]*0[^0]*0[^0]*0[^0]*)|
(?:[^1]*1[^1]*1[^1]*1[^1]*)|
(?:[^2]*2[^2]*2[^2]*2[^2]*)|
(?:[^3]*3[^3]*3[^3]*3[^3]*)|
(?:[^4]*4[^4]*4[^4]*4[^4]*)|
(?:[^5]*5[^5]*5[^5]*5[^5]*)|
(?:[^6]*6[^6]*6[^6]*6[^6]*)|
(?:[^7]*7[^7]*7[^7]*7[^7]*)|
(?:[^8]*8[^8]*8[^8]*8[^8]*)|
(?:[^9]*9[^9]*9[^9]*9[^9]*)
]x or m[(?=(\d).*\1)] and print for '000000' .. '999999';
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
|
|
| [reply] |
|
|
If you included -l on your shebang line as I have, it would add the newlines for you, but since many people do not like simple :), you can do this:
#! perl -sw
use strict;
m[
(?:^[^0]*0[^0]*0[^0]*$)|
(?:^[^1]*1[^1]*1[^1]*$)|
(?:^[^2]*2[^2]*2[^2]*$)|
(?:^[^3]*3[^3]*3[^3]*$)|
(?:^[^4]*4[^4]*4[^4]*$)|
(?:^[^5]*5[^5]*5[^5]*$)|
(?:^[^6]*6[^6]*6[^6]*$)|
(?:^[^7]*7[^7]*7[^7]*$)|
(?:^[^8]*8[^8]*8[^8]*$)|
(?:^[^9]*9[^9]*9[^9]*$)
]x and print "$_\n" for '0000' .. '9999';
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
|
|
Please ignore my earlier post below. I figured that out. Sorry about that...
| [reply] |
Re: Find duplicate digits
by inman (Curate) on Feb 15, 2006 at 15:35 UTC
|
Canonicalise the data and look for multiples. The grep tests for numbers where there are a number of multiples of which at least one is a pair. This solution works with numbers of any length and can be made to work with triples, quads etc. just by changing the grep test.
The OP didn't make it clear whether a number such as 0101 that has two pairs should be counted. You can always test the number of members of the list returned by the grep.
while (<DATA>){
print "$_" if grep {length $_ == 2} join('', sort split //, $_) =~
+ /((\d)\2+)/g;
}
__DATA__
0000
0001
0002
0003
0004
0005
0006
0007
0008
0009
0010
0011
0012
0013
0014
01511
451411
gives
0011
0012
0013
0014
451411
| [reply] [d/l] [select] |
|
|
Thanks, inman!
Yes, I would like 1010 to be counted.
I'm finding it hard to understand your grep. Where is the list that it's supposed to work on?
| [reply] |
|
|
The code is equivalent to
while (<DATA>)
{
my $a = join('', sort split //, $_);
my @a = $a =~ /((\d)\2+)/g;
print "$_" if grep {length $_ == 2} @a;
}
| [reply] [d/l] |
|
|
Re: Find duplicate digits
by Roy Johnson (Monsignor) on Feb 15, 2006 at 16:47 UTC
|
It looks almost paradoxical, but this does the trick:
#! perl
use strict;
use warnings;
while (<DATA>) {
print if /(\d)(?=.*\1)/g and not /($1)(?=.*\1)/g;
}
__DATA__
0000 (Don't pick)
0001 (Don't pick)
0012 (Pick)
0011 (Pick)
1001 (Pick)
1009 (Pick)
1234 (Don't pick)
9879 (Pick)
Because I use the /g, the second match continues after the first one, so its capture
group will actually be the 2nd occurrence of the digit. Then the backreference would
be the third (which would mean we don't want it).
It would not work for situations where some digit appears three times but another
appears only twice (so it would be a "pick"). But since your examples are only four
digits long, that doesn't come up here.
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: Find duplicate digits
by salva (Canon) on Feb 15, 2006 at 15:49 UTC
|
@m = grep {/(\d).*\1/ and !/(\d).*\1.*\1/} '0000' .. '9999' | [reply] [d/l] |
Re: Find duplicate digits
by GrandFather (Saint) on Feb 16, 2006 at 03:33 UTC
|
Here's a fairly tidy solution (well, I like it):
use strict;
use warnings;
my @nums = (1000..9999);
my @picks = grep {my %d; map {$d{$_}++} split ''; grep {$_ == 2} value
+s %d} @nums;
print join "\n", @picks;
DWIM is Perl's answer to Gödel
| [reply] [d/l] [select] |
Re: Find duplicate digits
by LanX (Saint) on Apr 07, 2010 at 14:41 UTC
|
straightforward, accept one repetition but not two.
print grep { /(\d).*\1/ && ! /(\d).*\1.*\1/ } map {sprintf "%04d",$_}
+(0..9999)
Tried to do it in one regex but I couldn't find out how to negate the match \1, [^\1] obviously doesnt work and all negative look aheads get confused by the end of string...
UPDATE: sorry nothing new :( ...Re: Find duplicate digits | [reply] [d/l] [select] |
|
|
print grep { /(\d).*\1/ and eval "2 == tr/$1//" } map {sprintf "%04d
+",$_} (0..9999)
| [reply] [d/l] |
Re: Find duplicate digits
by mroman (Initiate) on Apr 07, 2010 at 10:50 UTC
|
Hello Monks,
This is my first post, and I was delighted to find this thread, as it almost exactly addresses what I want to do, with the exception that I would like to narrow the numbers down even further in comparison to the OP's spec.
Thus, I want to take 0000 to 9999 and select only those that have one pair of duplicate digits.
For example:
- 0000 - no
- 0011 - no
- 0012 - yes
- 0120 - yes
I am a noob, and this has turned out to be a bit beyond my skill level at this time . . . I was going the grep + regex on Bash route, however, upon seeing the Perl solutions in this thread, I believe I will only use bash to admin my system on the command line from now on :) | [reply] |
|
|
perl -le"%h=(), undef @h{split'',$_}, keys %h == 3 and print for '0000
+'..'9999'"
| [reply] [d/l] |
|
|
syntax error at ns.pl line 1, near "-le"
Execution of ns.pl aborted due to compilation errors.
? | [reply] [d/l] |
|
|
|
|
|
|
|
|
|
|
#/usr/bin/perl -w
use strict;
use Algorithm::Loops qw< NestedLoops NextPermute >;
my( $uniq )= ( @ARGV, 3 ); # 3 unique digits plus one duplicate digit
my $digs= NestedLoops(
[ [0..9], ( sub { [1+$_..9] } ) x ($uniq-1) ],
);
my @digs;
while( @digs= $digs->() ) {
do {
for my $dup ( 0 .. $#digs ) {
for my $after ( $dup .. $#digs ) {
my @result= @digs;
splice @result, 1+$after, 0, $digs[$dup];
print @result, $/;
}
}
} while( NextPermute( @digs ) );
}
| [reply] [d/l] |
|
|
| [reply] |
|
|
You can try the following algorithm:
- pass through your input one digit per time
- use a hash to count how often each digit occurs
- once you are finished, pass through the hash (again) to see how many digits are ocuring two times
- print yes/no
Once you have done this, think how to adapt the algorithm to handle the special case 1111....
Have Fun! Rata
PS.: it is good that you looked for an old thread! However you'll get more answers if you'd opened a new thread and just had provided a link to the old one! well, next time ;-)
| [reply] |
|
|
| [reply] |
Re: Find duplicate digits
by Anonymous Monk on Feb 16, 2006 at 10:11 UTC
|
Is this a problem from project Euler (mathschallenge.net)? | [reply] |
|
|
| [reply] |