At the risk of saying something stupid-but-obvious about Roman Numerals

Like everyone else, I have been reading the “10**21” threads with some academic interest – but also, I confess, some confusion when talk is made of “4 gigabyte lookup-tables.” If the poor Romans actually had to do that, they’d never have been able to count anything. An algorithm for converting Roman numerals to decimal is trivially easy: just reverse the string (say ...) so that you intepret it from right-to-left instead of left-to-right.

use strict;
use warnings;

my %letters = ('I' => 1, 'V' => 5, 'X' => 10, 'L' => 50, 
               'C' => 100, 'D' => 500, 'M' => 1000);

my $roman = "MCCMLXXIV";  #1874

my $result = 0;
my $high_score = 0;

foreach my $letter ( split(//, reverse($roman)) ) {
  my $score = $letters{$letter};
  if ($score >= $high_score) {
    $result += $score;
  }
  else {
    $result -= $score;
  }

  $high_score = $score if $score > $high_score;
}
print "$result\n";
[download]

When the string is viewed from left-to-right, parsing seems difficult because in a well-formed Roman string like MCCMLXXIV you don’t know what C means until you encounter M. However, that’s not the way the Romans did it. They probably also read the string from right-to-left, as was and is the standard for many human languages of the region. Now, the rule is one that any human could do in his head: “if the digit is as big or bigger than the biggest one you have seen so-far, reading from right to left, add it to the total; otherwise, subtract.” They did not use arbitrary combinations of letters, although the algorithm shown above would accept them. (As an example, I doubt that they actually used IM to represent 999.) In fact, the rules that they did use make the process similar to “making change” given various denominations of currency, and this can’t be accidental.

So, while the talk of this being a “10**21 problem” is an interesting one, and the latest installment (#3) about whacking the TLB of the processor is even more so, it seems like an incredible amount of work to unleash on a problem that has a trivial algorithmic solution. I am certainly not “dissing” that thread for its informative discussion of how to deal with problems that would require massive search, but Roman Numerals does not seem to be a good example to use here, unless I am entirely missing the point. (Which possibility I do not deny.) The proper solution for Roman Numerals would be to “find a better algorithm,” then optimize-the-hell out of that algorithm. I cannot imagine that any solution would be faster, especially when the time required to populate those gigabytes of data is included, but I say this only because a nearly cost-free algorithm to solve this problem readily exists, as just shown. For problems that lack such alternatives, the issues discussed in that thread are of course hugely relevant, and the solutions, very informative.

- - - - -

For the “true die-hard golfers” among us, the same algorithm can be expressed, even without any conditional branches at all other than for the scan-loop, as a finite-state machine (FSM) with twenty-six-element tables containing numbers that are either positive or negative depending on the table. The machine’s state would, of course, represent the largest digit, if any, yet seen when scanning from right to left. For instance, in the table for state LARGEST_SEEN_IS_V (i.e. all tables except LARGEST_SEEN_IS_I and INITIAL_STATE), the entry for I would now be (-1). A “real” Perl implementation would use a C-language subroutine. If an ERROR_STATE is provided, such an algorithm could also detect and reject “syntax error” inputs (as the algorithm shown above cannot do). Which is exactly why we build language-parsers and lexical-scanners this way . . .

- - - - -

Clearly, the intent of the main thread must be to present efficient solutions to problems that actually do consist of arbitrary-string inputs, for which all possible strings are equally probable, under conditions where arbitrarily-large amounts of physical RAM are presumed to be available. Problems that, unlike Roman Numerals, do not present a trivial algorithmic solution. Problems for which an efficient solution in the Perl language (or at least, the Perl environment) is required.

- - - - -

(Postscript: I recently helped with a geocaching-like team building challenge for highly-paid “alpha-male” executives, in which you had to use a GPS to find nearby things in an unfamiliar town and obtain clues. One of the things we took them to was a statue which had a date written in Roman Numerals. They just needed to get that date. No big deal, any third-grader could do that, right? Uh uh. I was speechless to watch several(!) of them busily searching-for and downloading a “Roman Numerals app” to let them decipher the clue ... which they had utterly no idea how to do on their own. They could not read Roman Numerals! So they sat there, locating and downloading – in a couple of cases, buying – an app for their cell-phones. “Oy vey!”)

Replies are listed 'Best First'.
Re: At the risk of saying something stupid-but-obvious about Roman Numerals by roboticus (Chancellor) on May 12, 2014 at 13:55 UTC
sundialsvc4: You're almost entirely missing the point. No-one is having any difficulty with roman numerals. Instead, he's presenting it as a demonstration of (1) a code-golfing technique where he tries to find a 'magic formula' to do the conversion in as few keystrokes as possible, and (2) optimizing code and/or algorithms to search an intractably large search space in a reasonable time period. Re-read the first installment again... ...roboticus When your only tool is a hammer, all problems look like your thumb.	[reply]
Re^2: At the risk of saying something stupid-but-obvious about Roman Numerals by Anonymous Monk on May 12, 2014 at 16:14 UTC
"You're almost entirely missing the point." Story of their existence.	[reply]
Re: At the risk of saying something stupid-but-obvious about Roman Numerals by Limbic~Region (Chancellor) on May 12, 2014 at 19:49 UTC
sundialsvc4, Something seems fishy with your code. `my $roman = "MCCMLXXIV"; #1874` [download] Shouldn't that be MDCCCLXXIV? I am pretty sure what you have isn't valid. Cheers - L~R	[reply] [d/l]
Re^2: At the risk of saying something stupid-but-obvious about Roman Numerals by Anonymous Monk on May 12, 2014 at 23:28 UTC
Or even, MDCCCLXXIIII. Romans probably treated the numbers like a bag of coins. To add two numbers, just lump them together. Substractive forms (IV, IX, XL, XC, CD, CM), and even double substractives (such as CCM) would also occur, according to wikipedia.	[reply]
Re: At the risk of saying something stupid-but-obvious about Roman Numerals by mr_mischief (Monsignor) on May 13, 2014 at 19:58 UTC
The problem presented is not about converting Roman numerals. It is about searching for a random program which converts Roman numerals. It's more akin to computational pharmacology than to grade-school math.	[reply]
A reply falls below the community's threshold of quality. You may see it by logging in.
A reply falls below the community's threshold of quality. You may see it by logging in.


Problems? Is your data what you think it is?
	PerlMonks