Back in the 1980's, the sci-fi wargame and role-playing game Traveller created thousands of "Universal World Profile" codes for thousands of worlds. These UWPs contain mainworld data, some bare system data (stellar class, gas giant presence...), etc etc. The UWP essentially is a text database which is somewhat human readable. Here's the canonical example:


Regina        1910 A788899-C  A Ri Cp              703 Im F7 V M8 D M6 V :1909,2007
World-name    Loc  Profile    Misc Codes                  Stellar Data    Trade routes

The field order has more or less remained fixed, while the spacing has varied slightly. This makes parsing harder for some languages, but easy for perl.

So, without further ado, here is my base UWP parser, written as a perl inside-out object.

Update: Added read-only methods to easily extract data out of conglomerate strings.

Rob
package UWP; { my ( %name, %loc, %uwp, %base, %codes, %zone, %pbg, %alleg, %stars, %xboat ); sub new { bless {}, shift; } # read-write methods sub name : lvalue {$name {+shift}} sub loc : lvalue {$loc {+shift}} sub uwp : lvalue {$uwp {+shift}} sub base : lvalue {$base {+shift}} sub codes : lvalue {$codes {+shift}} sub zone : lvalue {$zone {+shift}} sub pbg : lvalue {$pbg {+shift}} sub alleg : lvalue {$alleg {+shift}} sub stars : lvalue {$stars {+shift}} sub xboat : lvalue {$xboat {+shift}} # read-only methods sub col { $loc{+shift} =~ /^(..)/; } sub row { $loc{+shift} =~ /(..)$/; } sub starport { $uwp{+shift} =~ /^(.)/; } sub size { $uwp{+shift} =~ /^.(.)/; } sub atm { $uwp{+shift} =~ /^..(.)/; } sub hyd { $uwp{+shift} =~ /^...(.)/; } sub pop { $uwp{+shift} =~ /^....(.)/; } sub gov { $uwp{+shift} =~ /(.).-.$/; } sub law { $uwp{+shift} =~ /(.)-.$/; } sub tl { $uwp{+shift} =~ /(.)$/; } sub pm { $pbg{+shift} =~ /^(.)/; } sub belts { $pbg{+shift} =~ /^.(.)/; } sub ggs { $pbg{+shift} =~ /(.)$/; } sub primary { $stars{+shift}->[0]; } sub DESTROY { my $self = shift; delete $name {$self}, $hex {$self}, $uwp {$self}, $base {$self}, $codes {$self}, $zone {$self}, $pbg {$self}, $alleg {$self}, $stars {$self}, $xboat {$self}; } sub create { my $line = shift; # Convert really really old format into new format. $line = sprintf( "%-49s 001 ?? ??", $1) if $line =~ /^(.{29,50}\s*)G\s*$/; $line = sprintf( "%-49s 000 ?? ??", $1) if $line =~ /^(.{29,50})\s*$/; unless ( $line =~ # # name hex uwp - base codes zone pbg + allg. stellar xboat # /^(.*)(\d{4}) (\w)(\w)(\w)(\w)(\w)(\w)(\w).(\w) \s?(.) (.{15}) (.) \s +?(\d)(\d)(\d) (..) ([^:]*)?\s*(:.*)?$/ ) { #croak "Cannot decode UWP format: $line"; return 0; } my $self = new UWP; $self->uwp = $3.$4.$5.$6.$7.$8.$9.'-'.$10; $self->name = $1; $self->loc = $2; $self->base = $11; $self->codes = $12; $self->zone = $13; $self->pbg = $14.$15.$16; $self->alleg = $17; $self->stars = getStars( $18 ); $self->xboat = $19 || ''; $self->xboat =~ s/://; return $self; } # # There's going to be 1, 2, or 3 stars, I think. # But maybe 4. # sub getStars { my $stars = shift; chomp $stars; return [ "$1 $2", "$3 $4", "$5 $6", "$7 $8" ] if ( $stars =~ /(\w\d) (\w+) (\w\d) (\w+) (\w\d) (\w+) (\w\d) (\w+)/ ); return [ "$1 $2", "$3 $4", "$5 $6" ] if ( $stars =~ /(\w\d) (\w+) (\w\d) (\w+) (\w\d) (\w+)/ ); return [ "$1 $2", "$3 $4" ] if ( $stars =~ /(\w\d) (\w+) (\w\d) (\w+)/ ); return [ $stars ]; } } 1;

In reply to Inside-Out Traveller UWP Parser by rje

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.