The following might not be exactly the structure you want, but it stores all of the pci.ids file content without losing information or structure. It differs from the layout in the OP (which may not have been what you really wanted either) in that the ID field values from the file are used as hash keys.

It's a bit clunky, because the "descriptive" fields are held as values of "special" hash keys ("vendor_name" and "device_name"), and these keys are siblings to the hash keys that happen to be ID fields. But the structure is both adequate and fairly simple.
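
To make the layout concrete, here is a small hand-built hash in the same shape, using a few values from the sample data. It also shows one possible way to pick out the ID keys while skipping the two "special" name keys. The sub name id_keys is my own invention for illustration, and the hex-only test assumes that IDs never contain anything but hex digits.

```perl
use strict;
use warnings;

# Hand-built sample in the same shape as the parsed structure:
# vendor ID -> device ID -> subvendor ID -> subdevice ID -> subsystem name
my %pci_id = (
    '1000' => {
        vendor_name => 'LSI Logic / Symbios Logic',
        '0001'      => {
            device_name => '53c810',
            '1000'      => { '1000' => 'LSI53C810AE PCI to SCSI I/O Processor' },
        },
        '0002' => { device_name => '53c820' },
    },
);

# Hypothetical helper: return only the keys that look like IDs (hex strings),
# which skips the descriptive "vendor_name" / "device_name" keys.
sub id_keys {
    my ($href) = @_;
    return sort grep { /^[0-9a-f]+$/i } keys %$href;
}

my @devices    = id_keys( $pci_id{'1000'} );
my @subvendors = id_keys( $pci_id{'1000'}{'0001'} );

print "devices: @devices\n";
print "subvendors of 0001: @subvendors\n";
```

Neither name key can be mistaken for an ID here, because both contain characters that are not hex digits.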

(I had to make sure the data sample contained tab characters as described.)

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my %pci_id;
my ( $curr_vendor, $curr_device ) = ( '', '' );

while ( <DATA> ) {
    next if ( /^#/ or /^\s*$/ );    # skip comments and blank lines
    chomp;
    if ( /^([0-9a-f]+)\s+(.*)/i ) {             # vendor line: no leading tab
        $curr_vendor = $1;
        $pci_id{$curr_vendor}{vendor_name} = $2;
    }
    elsif ( /^\t([0-9a-f]+)\s+(.*)/i ) {        # device line: one leading tab
        $curr_device = $1;
        $pci_id{$curr_vendor}{$curr_device}{device_name} = $2;
    }
    elsif ( /^\t\t([0-9a-f]+)\s+([0-9a-f]+)\s+(.*)/i ) {    # subsystem line: two tabs
        $pci_id{$curr_vendor}{$curr_device}{$1}{$2} = $3;
    }
    else {
        warn sprintf( "%s: not sure what to do with input line %d: %s\n",
                      $0, $., $_ );
    }
}

print Dumper( \%pci_id );  # just to see what it looks like

my @vendors = sort keys %pci_id;
for my $vendor ( @vendors ) {
    print "==== Data for vendor $pci_id{$vendor}{vendor_name}:\n";
    my @devices = sort grep( !/vendor_name/, keys %{$pci_id{$vendor}} );
    for my $device ( @devices ) {
        my @subvendors = sort grep( !/device_name/,
                                    keys %{$pci_id{$vendor}{$device}} );
        if ( @subvendors == 0 ) {
            printf( "== Device %s (%s) has no additional data\n",
                    $device, $pci_id{$vendor}{$device}{device_name} );
            next;
        }
        printf( "== Device %s (%s) has the following subdevices:\n",
                $device, $pci_id{$vendor}{$device}{device_name} );
        for my $subvendor ( @subvendors ) {
            my @subdevs = sort keys %{$pci_id{$vendor}{$device}{$subvendor}};
            print join( ' * ', $vendor, $device, $subvendor, $_,
                        $pci_id{$vendor}{$device}{$subvendor}{$_} )."\n"
                for ( @subdevs );
        }
    }
}

__DATA__
# Vendors, devices and subsystems. Please keep sorted.
# Syntax:
# vendor  vendor_name
#	device  device_name				<-- single tab
#		subvendor subdevice  subsystem_name	<-- two tabs
# Formerly NCR
1000  LSI Logic / Symbios Logic
	0001  53c810
		1000 1000  LSI53C810AE PCI to SCSI I/O Processor
	0002  53c820
	0003  53c825
		1000 1000  LSI53C825AE PCI to SCSI I/O Processor (Ultra Wide)
	0004  53c815
	0005  53c810AP
	0006  53c860
		1000 1000  LSI53C860E PCI to Ultra SCSI I/O Processor
	000a  53c1510
		1000 1000  LSI53C1510 PCI to Dual Channel Wide Ultra2 SCSI Controller (Nonintelligent mode)
	000b  53C896/897
		1000 1000  LSI53C896/7 PCI to Dual Channel Ultra2 SCSI Multifunction Controller
		1000 1010  LSI22910 PCI to Dual Channel Ultra2 SCSI host adapter
		1000 1020  LSI21002 PCI to Dual Channel Ultra2 SCSI host adapter
# multifunction PCI card: Dual U2W SCSI, dual 10/100TX, graphics
		13e9 1000  6221L-4U
	000c  53c895
		1000 1010  LSI8951U PCI to Ultra2 SCSI host adapter
		1000 1020  LSI8952U PCI to Ultra2 SCSI host adapter
		1de1 3906  DC-390U2B SCSI adapter
		1de1 3907  DC-390U2W
	000d  53c885
	000f  53c875
		0e11 7004  Embedded Ultra Wide SCSI Controller
		1000 1000  LSI53C876/E PCI to Dual Channel SCSI Controller
		1000 1010  LSI22801 PCI to Dual Channel Ultra SCSI host adapter
		1000 1020  LSI22802 PCI to Dual Channel Ultra SCSI host adapter
		1092 8760  FirePort 40 Dual SCSI Controller
		1de1 3904  DC390F/U Ultra Wide SCSI Adapter
		4c53 1000  CC7/CR7/CP7/VC7/VP7/VR7 mainboard
		4c53 1050  CT7 mainboard
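
Once the hash is built, looking up a name by ID is just a matter of walking down the keys. The following is only a sketch of how that might look; pci_name is a hypothetical helper (not part of the script above), and %sample is a hand-built stand-in for %pci_id so the snippet can run on its own. It assumes IDs are passed in the same letter case as they appear in the file.

```perl
use strict;
use warnings;

# Hypothetical helper: look up a name in a structure shaped like %pci_id.
# Pass 1, 2 or 4 IDs after the hash ref:
#   vendor | vendor, device | vendor, device, subvendor, subdevice
# Returns undef (in scalar context) when nothing matches.
sub pci_name {
    my ( $db, $vendor, $device, $subvendor, $subdevice ) = @_;

    return unless defined $vendor and exists $db->{$vendor};
    return $db->{$vendor}{vendor_name} unless defined $device;

    # The ref checks keep "vendor_name" / "device_name" from being
    # treated as IDs, since their values are plain strings.
    my $dev = $db->{$vendor}{$device};
    return unless ref $dev eq 'HASH';
    return $dev->{device_name} unless defined $subvendor;

    my $sub = $dev->{$subvendor};
    return unless defined $subdevice and ref $sub eq 'HASH';
    return $sub->{$subdevice};
}

# Stand-in for the parsed data (values taken from the sample above).
my %sample = (
    '1000' => {
        vendor_name => 'LSI Logic / Symbios Logic',
        '000c'      => {
            device_name => '53c895',
            '1de1'      => { '3907' => 'DC-390U2W' },
        },
    },
);

my $vendor_name = pci_name( \%sample, '1000' );
my $device_name = pci_name( \%sample, '1000', '000c' );
my $subsys_name = pci_name( \%sample, '1000', '000c', '1de1', '3907' );
my $missing     = pci_name( \%sample, '1000', 'ffff' );

print "$vendor_name / $device_name / $subsys_name\n";
print "no such device\n" unless defined $missing;
```

With the real data you would pass \%pci_id instead of \%sample.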

In reply to Re^3: pci.ids to a complex data structure by graff
in thread pci.ids to a complex data structure by wynnmc
