in reply to database design issue: one big table vs hundred of smaller tables

Are you perhaps looking for a data dictionary sort of application, one that tracks the locations, names, and field structures of other tables?

To that end, perhaps the following table structures would help:

    DIRPATHS
        PathID    - INT, NOT NULL, AUTOINC
        PathName  - VARCHAR(16)    # Adjust to taste
        ParentDir - INT            # PathID of the parent directory, NULL for /

    TABLELIST
        TableID   - INT, NOT NULL, AUTOINC
        TableName - VARCHAR(16)    # Adjust to taste
        PathID    - INT            # Linking field pointing to DIRPATHS
        # Add'l fields, e.g. Description, LastUpdate, BackedUpOn, etc.

    FIELDNAMES
        TableID   - INT, NOT NULL  # Link to match in TABLELIST
        FieldNo   - INT, NOT NULL  # Field position
        Name      - VARCHAR(32)    # Adjust to taste
        # Description and other fields as needed

    ENTITIES
        LocalID    - INT, NOT NULL, AUTOINC
        EntityName - VARCHAR(32)   # Adjust to need/taste
        EntityType - ENUMERATED SET (Note: I prefer lookup tables):
            0 - directory
            1 - table
            2 - field
            3 - Etc.
        EntityID   - INT           # ID from the appropriate table listed earlier
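
If it's any help, here's a minimal DBI sketch of creating two of those tables. The DSN, credentials, and MySQL-ish types (AUTO_INCREMENT and friends) are my assumptions; adjust all of them to your setup:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Hypothetical DSN/credentials; swap in your own.
    my $dbh = DBI->connect( 'dbi:mysql:datadict', 'user', 'pass',
                            { RaiseError => 1 } );

    $dbh->do(q{
        CREATE TABLE DIRPATHS (
            PathID    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            PathName  VARCHAR(16),
            ParentDir INT             -- PathID of the parent; NULL for /
        )
    });

    $dbh->do(q{
        CREATE TABLE ENTITIES (
            LocalID    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            EntityName VARCHAR(32),
            EntityType INT,           -- 0=directory, 1=table, 2=field, ...
            EntityID   INT            -- ID in whichever table EntityType names
        )
    });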

In this scenario, you let the user search ENTITIES for the stuff they're looking for, use the EntityType field to find out what kind of object matched, and then take the EntityID back to the appropriate table. You'll need custom code to trace a matching ID back to its original data and then handle that; however, it can be a very flexible scheme.
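
A minimal sketch of that search-and-dispatch step, assuming the $dbh handle from the sketch above; the %table_for dispatch map is hypothetical:

    # Map each EntityType code to its home table and key column.
    my %table_for = (
        0 => [ 'DIRPATHS',   'PathID'  ],
        1 => [ 'TABLELIST',  'TableID' ],
        2 => [ 'FIELDNAMES', 'TableID' ],  # FIELDNAMES also needs FieldNo
    );

    my $search_term = 'customer';          # whatever the user typed

    # Query 1: find every entity whose name matches.
    my $hits = $dbh->selectall_arrayref(
        'SELECT EntityType, EntityID FROM ENTITIES WHERE EntityName LIKE ?',
        { Slice => {} },
        "%$search_term%",
    );

    # Queries 2..n: chase each hit back to its original table.
    for my $hit (@$hits) {
        my ($table, $key) = @{ $table_for{ $hit->{EntityType} } };
        my $row = $dbh->selectrow_hashref(
            "SELECT * FROM $table WHERE $key = ?",
            undef, $hit->{EntityID},
        );
        # ... hand $row to whatever displays the match ...
    }

One wrinkle: a row in FIELDNAMES isn't uniquely keyed by TableID alone, so you'd either give fields a surrogate key or record FieldNo in ENTITIES as well.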

The main problem is that you essentially force yourself into a maintenance problem should something change, such as the directory structure or the storage scheme. Typically, this is a BAD THING®, as anything that gets hard-coded can make life hellish for your maintenance programmer. I avoid this whenever possible.

Perhaps a better approach is to create one table mapping tables (and paths) to arbitrary IDs, and then a second table listing all fields of every table mapped in the first. The fields are larger, but the scheme is easier to maintain in the long run. If a table gets deleted, you only need to DELETE two sets of data. And so on.
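
A sketch of that flatter layout, using the hypothetical names TABLEMAP and FIELDMAP; note the full path stored inline (hence the larger fields), and that dropping a table really is just two DELETEs:

    # One table maps path + table name to an arbitrary ID ...
    $dbh->do(q{
        CREATE TABLE TABLEMAP (
            MapID     INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
            DirPath   VARCHAR(128),   -- full path stored inline
            TableName VARCHAR(16)
        )
    });

    # ... and a second lists every field of every mapped table.
    $dbh->do(q{
        CREATE TABLE FIELDMAP (
            MapID   INT NOT NULL,     -- points back to TABLEMAP
            FieldNo INT NOT NULL,
            Name    VARCHAR(32)
        )
    });

    # If a table gets deleted, remove exactly two sets of data:
    my $gone = 42;                    # hypothetical MapID of the dead table
    $dbh->do('DELETE FROM FIELDMAP WHERE MapID = ?', undef, $gone);
    $dbh->do('DELETE FROM TABLEMAP WHERE MapID = ?', undef, $gone);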

In either case, you'll need multiple queries to get the user to the final entity that matches. However, since you're dealing with subsets after the first one, it shouldn't be that much slower than other approaches.
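
For example, once the first query has narrowed things to a single table, each follow-up query only walks that table's small subset of rows:

    # Query 1: pin the match down to a single table.
    my ($table_id) = $dbh->selectrow_array(
        'SELECT TableID FROM TABLELIST WHERE TableName = ?',
        undef, 'customers',           # hypothetical table name
    );

    # Query 2: search only within that table's fields.
    my $fields = $dbh->selectcol_arrayref(
        'SELECT Name FROM FIELDNAMES WHERE TableID = ? ORDER BY FieldNo',
        undef, $table_id,
    );
    print "$_\n" for @$fields;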

In the end, it doesn't seem that much different than creating a good search engine[1] for a web site. Instead of words in a document, you're trying to capture--and then search--the location of various types of data. The specifics change, but perhaps not the overall techniques. If we can record it, we can search it.[2]

As far as interesting ways to do this in Perl go, I find that someone has written an interesting column or few. (Note: They may not seem topical at first glance, but I think the techniques may help.)

--f

[1] - I do recommend getting the latest examples, though. Things have changed since this was published.

[2] - Or, as some guy named Dutch once said, "If it bleeds, we can kill it."
