
First off, I love this idea and I am eager to help out in any way that I can. That being said, I hope this doesn't come across as a criticism.

Should we scale down the project a little? An "Anything Indexer" faces the problem of normalization. I would assume that the user would enter all of the relevant fields for the data they wish to collect, and then the program would have to ask further questions to determine the relationships between those fields. The program would use those relationships to normalize the data. The problem is that, as far as I know, there is no reliable way to generate normalization on the fly for complex data. The more information the user wishes to track, the more difficult the normalization becomes.

For instance, assume that we're tracking CDs. Let's say that the user wants to track the record label. This has to go in a separate table as one record label will be on many CDs. This is a one-to-many relationship. This seems fairly straightforward. We create a CD table and a record_label table and the CD table has a field identifying the record_label ID.
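
As a rough sketch of that one-to-many layout: the thread doesn't name a database, so this assumes SQLite through Python's sqlite3 module, and the column names and sample rows are made up for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# One record label can appear on many CDs, so the label gets its own
# table and each CD row points at exactly one label.
conn.executescript("""
    CREATE TABLE record_label (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE cd (
        id              INTEGER PRIMARY KEY,
        title           TEXT NOT NULL,
        record_label_id INTEGER NOT NULL REFERENCES record_label(id)
    );
""")

# Hypothetical sample data: two CDs on the same label.
conn.execute("INSERT INTO record_label (id, name) VALUES (1, 'Example Records')")
conn.executemany(
    "INSERT INTO cd (title, record_label_id) VALUES (?, ?)",
    [("First Album", 1), ("Second Album", 1)],
)

# List every CD issued by a given label.
titles_on_label = [
    row[0]
    for row in conn.execute(
        "SELECT cd.title FROM cd "
        "JOIN record_label ON record_label.id = cd.record_label_id "
        "WHERE record_label.name = ? ORDER BY cd.title",
        ("Example Records",),
    )
]
print(titles_on_label)
```

The label name is stored once, so renaming the label is a single-row update no matter how many CDs carry it.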

What happens when the exact same CD is issued under another label? Then we realize that we have a many-to-many relationship and that we should have had a junction table tying the CD and record_label tables together. Oops. If the database isn't set up that way in the first place, the user may be forced to create another CD record with the new label, and we wind up with duplicate data in the database and, with it, the potential for modification anomalies.
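
Here is a minimal sketch of the junction-table version, again assuming SQLite through Python's sqlite3 module. The table name cd_record_label and the sample rows are my own inventions, not anything agreed on in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CD no longer carries a label ID. Instead, a junction table holds
# one row per (CD, label) pairing, so a CD can have any number of labels.
conn.executescript("""
    CREATE TABLE record_label (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE cd (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE cd_record_label (
        cd_id           INTEGER NOT NULL REFERENCES cd(id),
        record_label_id INTEGER NOT NULL REFERENCES record_label(id),
        PRIMARY KEY (cd_id, record_label_id)
    );
""")

# Hypothetical sample data: one CD issued under two labels.
conn.execute("INSERT INTO cd (id, title) VALUES (1, 'First Album')")
conn.executemany(
    "INSERT INTO record_label (id, name) VALUES (?, ?)",
    [(1, "Label A"), (2, "Label B")],
)
conn.executemany(
    "INSERT INTO cd_record_label (cd_id, record_label_id) VALUES (?, ?)",
    [(1, 1), (1, 2)],
)

# The CD is stored once, yet both labels are reachable from it.
cd_count = conn.execute("SELECT COUNT(*) FROM cd").fetchone()[0]
labels = [
    row[0]
    for row in conn.execute(
        "SELECT record_label.name FROM cd_record_label "
        "JOIN record_label ON record_label.id = cd_record_label.record_label_id "
        "WHERE cd_record_label.cd_id = ? ORDER BY record_label.name",
        (1,),
    )
]
print(cd_count, labels)
```

The composite primary key on the junction table also keeps the same CD and label from being paired twice.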

That may not be the best example, but the point holds. If we target the Indexer at a specific use, we can address these issues up front.

Update: A modification anomaly (as mentioned above) occurs when updating data leaves the database inconsistent. In this case, if we have one CD stored under two different labels, but the user got the name of the CD wrong, he/she may correct the name in one record without realizing that the same error exists in the other. We wind up with one CD with two different names.
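
To make the anomaly concrete, here is a small sketch under the same assumptions as before (SQLite through Python's sqlite3 module, made-up names and rows). It uses the design where each CD row carries a single label ID, so the CD issued under two labels has to be entered twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One-label-per-CD design: no junction table available.
conn.executescript("""
    CREATE TABLE record_label (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE cd (
        id              INTEGER PRIMARY KEY,
        title           TEXT NOT NULL,
        record_label_id INTEGER NOT NULL REFERENCES record_label(id)
    );
""")
conn.executemany(
    "INSERT INTO record_label (id, name) VALUES (?, ?)",
    [(1, "Label A"), (2, "Label B")],
)

# The same CD is entered twice, once per label, with the same typo.
conn.executemany(
    "INSERT INTO cd (id, title, record_label_id) VALUES (?, ?, ?)",
    [(1, "Frist Album", 1), (2, "Frist Album", 2)],
)

# The user spots the typo on one record and fixes only that record.
conn.execute("UPDATE cd SET title = 'First Album' WHERE id = 1")

# One physical CD now appears under two different titles.
titles = sorted(row[0] for row in conn.execute("SELECT DISTINCT title FROM cd"))
print(titles)
```

Nothing in this schema tells the database that rows 1 and 2 describe the same CD, so it has no way to catch the mismatch.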

In reply to RE: Community Teaching Project II - the Call to Arms by Ovid
in thread Community Teaching Project II - the Call to Arms by Ozymandias
