Back in the 1980's, the sci-fi wargame and role-playing game Traveller created thousands of "Universal World Profile" codes for thousands of worlds. These UWPs contain mainworld data, some bare system data (stellar class, gas giant presence...), etc etc. The UWP essentially is a text database which is somewhat human readable. Here's the canonical example:


Regina        1910 A788899-C  A Ri Cp              703 Im F7 V M8 D M6 V :1909,2007
World-name    Loc  Profile    Misc Codes                  Stellar Data    Trade routes

The field order has more or less remained fixed, while the spacing has varied slightly. This makes parsing harder for some languages, but easy for perl.

So, without further ado, here is my base UWP parser, written as a perl inside-out object.

Update: Added read-only methods to easily extract data out of conglomerate strings.

Rob
package UWP; { my ( %name, %loc, %uwp, %base, %codes, %zone, %pbg, %alleg, %stars, %xboat ); sub new { bless {}, shift; } # read-write methods sub name : lvalue {$name {+shift}} sub loc : lvalue {$loc {+shift}} sub uwp : lvalue {$uwp {+shift}} sub base : lvalue {$base {+shift}} sub codes : lvalue {$codes {+shift}} sub zone : lvalue {$zone {+shift}} sub pbg : lvalue {$pbg {+shift}} sub alleg : lvalue {$alleg {+shift}} sub stars : lvalue {$stars {+shift}} sub xboat : lvalue {$xboat {+shift}} # read-only methods sub col { $loc{+shift} =~ /^(..)/; } sub row { $loc{+shift} =~ /(..)$/; } sub starport { $uwp{+shift} =~ /^(.)/; } sub size { $uwp{+shift} =~ /^.(.)/; } sub atm { $uwp{+shift} =~ /^..(.)/; } sub hyd { $uwp{+shift} =~ /^...(.)/; } sub pop { $uwp{+shift} =~ /^....(.)/; } sub gov { $uwp{+shift} =~ /(.).-.$/; } sub law { $uwp{+shift} =~ /(.)-.$/; } sub tl { $uwp{+shift} =~ /(.)$/; } sub pm { $pbg{+shift} =~ /^(.)/; } sub belts { $pbg{+shift} =~ /^.(.)/; } sub ggs { $pbg{+shift} =~ /(.)$/; } sub primary { $stars{+shift}->[0]; } sub DESTROY { my $self = shift; delete $name {$self}, $hex {$self}, $uwp {$self}, $base {$self}, $codes {$self}, $zone {$self}, $pbg {$self}, $alleg {$self}, $stars {$self}, $xboat {$self}; } sub create { my $line = shift; # Convert really really old format into new format. $line = sprintf( "%-49s 001 ?? ??", $1) if $line =~ /^(.{29,50}\s*)G\s*$/; $line = sprintf( "%-49s 000 ?? ??", $1) if $line =~ /^(.{29,50})\s*$/; unless ( $line =~ # # name hex uwp - base codes zone pbg + allg. stellar xboat # /^(.*)(\d{4}) (\w)(\w)(\w)(\w)(\w)(\w)(\w).(\w) \s?(.) (.{15}) (.) \s +?(\d)(\d)(\d) (..) ([^:]*)?\s*(:.*)?$/ ) { #croak "Cannot decode UWP format: $line"; return 0; } my $self = new UWP; $self->uwp = $3.$4.$5.$6.$7.$8.$9.'-'.$10; $self->name = $1; $self->loc = $2; $self->base = $11; $self->codes = $12; $self->zone = $13; $self->pbg = $14.$15.$16; $self->alleg = $17; $self->stars = getStars( $18 ); $self->xboat = $19 || ''; $self->xboat =~ s/://; return $self; } # # There's going to be 1, 2, or 3 stars, I think. # But maybe 4. # sub getStars { my $stars = shift; chomp $stars; return [ "$1 $2", "$3 $4", "$5 $6", "$7 $8" ] if ( $stars =~ /(\w\d) (\w+) (\w\d) (\w+) (\w\d) (\w+) (\w\d) (\w+)/ ); return [ "$1 $2", "$3 $4", "$5 $6" ] if ( $stars =~ /(\w\d) (\w+) (\w\d) (\w+) (\w\d) (\w+)/ ); return [ "$1 $2", "$3 $4" ] if ( $stars =~ /(\w\d) (\w+) (\w\d) (\w+)/ ); return [ $stars ]; } } 1;