in reply to Help with a Regex

Update: When I wrote my original solution to this problem, I overlooked the requirement to ignore out-of-order numerals. The code below now checks for this (I also made some other minor changes). Thanks to Roy Johnson for the heads-up!

Update 2: Animator informs me that the new version of the code still fails (it does not give the desired output for "XIX", for example). Thanks! I have updated the code to show the failures.

On further thought, I think that this part of the spec is ambiguous:

I want to ignore characters that are "out of order" (the "I" in "IV") but allow subsequent matches of characters that are in order (the "V" in "IV").
...since it does not specify why it is the "I" and not the "V" that should be regarded as the out-of-order numeral. Why is "X" the desired output for "XL", but "V" the desired output for "IV"? The case of "XIX" is also problematic, because either "X" or "I" may be regarded as out of order (depending on how one chooses to interpret the specification), and yet the desired output for "XIX" is "....X.I". What should the output be for "VIX"? Or for "DLXVIM"?
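
To make the ambiguity concrete, here is a minimal sketch that applies one simple reading of "in order" in both scanning directions: a numeral is kept only if it is strictly in order relative to the last numeral kept. The sub name scan and its 'ltr'/'rtl' argument are hypothetical names chosen for illustration, and the rule itself is only one possible interpretation of the spec, not something the spec states.

```perl
use strict;
use warnings;

# Column of each numeral in the 7-character output template.
my %pos;
@pos{ qw( M D C L X V I ) } = ( 0 .. 6 );
my $n_keys = keys %pos;

# Hypothetical helper: scan $in left-to-right ('ltr') or right-to-left
# ('rtl'), keeping a numeral only when it is strictly in order relative
# to the last numeral kept. Assumes $in holds only the seven numerals.
sub scan {
    my ( $in, $dir ) = @_;
    my $out   = '.' x $n_keys;
    my @chars = split //, $in;
    @chars = reverse @chars if $dir eq 'rtl';
    my $last = $dir eq 'rtl' ? $n_keys : -1;
    for my $c ( @chars ) {
        my $p        = $pos{ $c };
        my $in_order = $dir eq 'rtl' ? $p < $last : $p > $last;
        next unless $in_order;
        substr( $out, $p, 1 ) = $c;
        $last = $p;
    }
    return $out;
}

for my $in ( qw( IV XL XIX VIX DLXVIM ) ) {
    printf "%-7s ltr: %s  rtl: %s\n",
        $in, scan( $in, 'ltr' ), scan( $in, 'rtl' );
}
```

Under this sketch the left-to-right scan gives the desired output for "XL" and "XIX" but not for "IV", while the right-to-left scan gives it for "IV" but not for the other two. Neither direction on its own reproduces all of the desired outputs, so the spec needs a further rule to settle which numeral counts as out of order.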


I don't think that regexes are the tool for this job:

    use strict;
    use warnings;

    # Map each numeral to its column in the output template.
    my %pos;
    @pos{ qw( M D C L X V I ) } = ( 0 .. 6 );
    my $n_keys   = keys %pos;
    my $template = '.' x $n_keys;

    while ( <DATA> ) {
        my ( $in, $desired ) = split;
        my $out = $template;
        my $ptr = $n_keys;

        # Scan right to left, keeping a numeral only if its column lies
        # to the left of the last numeral kept.
        for my $i ( reverse( 0 .. length( $in ) - 1 ) ) {
            my $c = substr $in, $i, 1;
            if ( $pos{ $c } < $ptr ) {
                substr( $out, $pos{ $c }, 1 ) = $c;
                $ptr = $pos{ $c };
            }
        }
        print "$in\t=> $out\t", ( $out eq $desired ? '' : 'not ' ), 'ok', $/;
    }

    __DATA__
    I        ......I
    IV       .....V.
    V        .....V.
    VI       .....VI
    IX       ....X..
    X        ....X..
    XI       ....X.I
    XIV      ....XV.
    XV       ....XV.
    XVI      ....XVI
    XIX      ....X.I
    X        ....X..
    XL       ....X..
    LX       ...LX..
    XC       ....X..
    CLXIX    ..CLX.I
    CDXLVI   .D..XVI
    MCMXCVI  M.C.XVI
    MDCLI    MDCL..I

Output:

    I        => ......I  ok
    IV       => .....V.  ok
    V        => .....V.  ok
    VI       => .....VI  ok
    IX       => ....X..  ok
    X        => ....X..  ok
    XI       => ....X.I  ok
    XIV      => ....XV.  ok
    XV       => ....XV.  ok
    XVI      => ....XVI  ok
    XIX      => ....X..  not ok
    X        => ....X..  ok
    XL       => ...L...  not ok
    LX       => ...LX..  ok
    XC       => ..C....  not ok
    CLXIX    => ..CLX..  not ok
    CDXLVI   => .D.L.VI  not ok
    MCMXCVI  => M.C..VI  not ok
    MDCLI    => MDCL..I  ok

the lowliest monk