in reply to Help me write a good reg-exp for this text

If you don't mind losing the numbers at the end of some of the fields (or at least storing them elsewhere), you could just use a split:
use strict;

my $str = <<TXT;
Total index B50001
Crude processing (capacity) B5610C
Primary & semifinished processing (capacity) B562A3C
Finished processing (capacity) B5640C
Manufacturing ("SIC") B00004
Manufacturing (NAICS) GMF
Durable manufacturing (NAICS) GMFD
Wood product G321 321
Nonmetallic mineral product G327 327
Primary metal G331 331
Iron and steel products G3311A2 3311,2
Fabricated metal product G332 332
Machinery G333 333
TXT

my (%hash, %numbers);
for (split "\n" => $str) {
    # Work from the right-hand end of the line: [number,] code, description words.
    my @fields = reverse split;

    # Peel off a trailing number (e.g. 321 or 3311,2) and store it under its code.
    # The match is anchored so codes that contain digits (B50001) are left alone.
    if ($fields[0] =~ /^\d+(?:,\d+)?$/) {
        my $number = shift @fields;
        $numbers{$fields[0]} = $number;    # $fields[0] is now the code
    }

    # Code => description, with the words put back in their original order.
    $hash{$fields[0]} = join ' ' => reverse @fields[1 .. $#fields];
}

my $code = 'GMF';
print "$code: $hash{$code}.\n";

__output__
GMF: Manufacturing (NAICS).
So that should give you the hash you want.
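If it helps, here is the same split-based idea as a small reusable sub. This is only a sketch: `parse_line` is a made-up name, and it assumes the code is always the last field once an optional trailing number (digits, optionally followed by a comma and more digits) has been removed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (code, description, trailing number or undef).
sub parse_line {
    my ($line) = @_;
    my @fields = split ' ', $line;    # split ' ' skips leading whitespace
    return unless @fields >= 2;

    # Only a purely numeric last field counts as the trailing number,
    # so a code such as B50001 is never mistaken for one.
    my $number = $fields[-1] =~ /^\d+(?:,\d+)?$/ ? pop(@fields) : undef;
    my $code   = pop(@fields);

    return ($code, join(' ', @fields), $number);
}

my ($code, $desc, $number)
    = parse_line('    Iron and steel products   G3311A2   3311,2');
print "$code: $desc ($number)\n";
```

Because everything is taken from the right-hand end of the line, spaces inside the description and indentation in front of it do not affect the result.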
HTH

_________
broquaint

Re2: Help me write a good reg-exp for this text
by dragonchild (Archbishop) on Sep 05, 2003 at 16:19 UTC
    That doesn't work. Many of the text descriptions have spaces in them, plus there are spaces at the beginning of most of the lines.

    As this is fixed-length data, use unpack. If you care about the indenting, you'll need some extra logic to deal with it so that the whitespace doesn't end up at the beginning of your description. If you don't care, it's easy enough to strip off the indenting whitespace. To get exactly what you wanted, do something like:

    # Change these to the actual column widths. Use a star at the end to get the rest.
    my @column_widths = (###, ###, '*');
    my $unpack_spec = join ' ', map { "A$_" } @column_widths;

    my %codes;
    while (<IN_FILE>) {
        chomp;
        my ($desc, $code, $other_thingy) = unpack $unpack_spec, $_;

        # If you want to remove the prepended whitespace on the description ...
        $desc =~ s/^\s+//;

        $codes{$code} = {
            Description => $desc,
            Other_Thing => $other_thingy,
        };
    }

    my $choice = 'GMF';
    print "$choice: $codes{$choice}{Description}\n";
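    For a version that can be run as-is, here is a minimal sketch of the same unpack approach with the placeholders filled in. The widths (30 and 10) are an assumption for illustration only, not the real layout of the file, and the sample lines are built with sprintf so that they are guaranteed to match those widths.

```perl
use strict;
use warnings;

# Assumed layout (NOT taken from the real file): a 30-column description,
# a 10-column code, then whatever is left on the line.
my @column_widths = (30, 10, '*');
my $unpack_spec   = join ' ', map { "A$_" } @column_widths;    # "A30 A10 A*"

# Build fixed-width sample lines so the columns really line up.
my @lines = map { sprintf '%-30s%-10s%s', @$_ } (
    [ 'Total index',             'B50001', ''    ],
    [ '  Manufacturing (NAICS)', 'GMF',    ''    ],
    [ '      Wood product',      'G321',   '321' ],
);

my %codes;
for my $line (@lines) {
    my ($desc, $code, $other) = unpack $unpack_spec, $line;

    # "A" already drops trailing spaces; this drops the indentation.
    $desc =~ s/^\s+//;

    $codes{$code} = { Description => $desc, Other_Thing => $other };
}

print "$_: $codes{$_}{Description}\n" for sort keys %codes;
```

    With real data, the widths would have to be measured from the file itself; everything else should carry over unchanged.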

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.