in reply to Organizing and presenting a cross-reference

A picture paints a thousand words. Or, a clearly labelled sample of the data, 2 or 3 rows, will be a far clearer explanation than:

"The first column is more or less the reference column: that's what was originally used for looking up the data in the other columns - and the number in that first column can be repeated on several lines if there's more than one cross-referenced product for any given vendor."

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Organizing and presenting a cross-reference
by oko1 (Deacon) on Sep 24, 2008 at 02:28 UTC

    OK. I hesitated to show the actual data because it's so broad, but I'll take a reasonable-width snip of it (whitespace reduced for better fit.)

    NGK_STK.  NGK_P/N        NGK ALT  ACCEL  AC_DELCO  AUTOLITE  BECK/ARNLEY
    1010      A6             N/A      N/A    C86       375       N/A
    1010      A6             N/A      N/A    C86       2697      N/A
    1010      A6             N/A      N/A    C86       375       N/A
    1010      A6             N/A      N/A    C86       377       N/A
    1010      A6             N/A      N/A    C86       375       N/A
    1010      A6             N/A      N/A    C86       386       N/A
    1010      A6             N/A      N/A    C86S*     386       N/A
    1010      A6             N/A      N/A    S85F      386       N/A
    1010      A6             N/A      N/A    C87       2775      N/A
    1010      A6             N/A      N/A    C86       2775      N/A
    1010      A6             N/A      N/A    C86       3035      N/A
    1010      A6             N/A      N/A    C86       3035      N/A
    1010      A6             N/A      N/A    C86       3035      N/A
    1010      A6             N/A      N/A    C86       3035      N/A
    1010      A6             N/A      N/A    C86S*     3136      N/A
    1010      A6             N/A      N/A    C86       376       N/A
    1010      A6             N/A      N/A    C85S      379       N/A
    1010      A6             N/A      N/A    C86       375       N/A
    1010      A6             N/A      N/A    C86       386       N/A
    1010      A6             N/A      N/A    18A*      379       N/A
    1010      A6             N/A      N/A    C86,M8    3116      N/A
    1010      A6             N/A      N/A    C87       3035      N/A
    1010      A6             N/A      N/A    C87       376       N/A
    1010      A6             N/A      N/A    C86,M8    3136      N/A
    1011      B7EB*          5122     142    R42XL*    403       N/A
    1024      AR6FS-11*      N/A      N/A    R83T      124       N/A
    1024      AR6FS-11*      N/A      N/A    R83T      804       N/A
    1024      AR6FS-11*      N/A      N/A    R83T      124       N/A
    1024      AR6FS-11*      N/A      N/A    R83T      584       N/A
    1024      AR6FS-11*      N/A      N/A    R83T      124       N/A
    1027      AP9FS          N/A      N/A    84TS      32        176-5178
    1027      AP9FS          N/A      N/A    84TS      32        N/A
    1029      BPMR6A-10      N/A      N/A    CS42S     2974      N/A
    1030      DPR8EV-9*      2872     N/A    N/A       4163      N/A
    1030      DPR8EV-9*      2872     N/A    N/A       4163      N/A
    1030      DPR8EV-9*      2872     N/A    N/A       4163      N/A
    1034      BP7ES          N/A      113    R41XLS    53        176-5075
    1034      BP7ES          N/A      113    R41XLS    4252      N/A
    1034      BP7ES          N/A      113    R41XLS    4252      176-5075
    1034      BP7ES          N/A      113    R41XLS    62        176-5075
    1034      BP7ES          N/A      113    R41XLS    52        176-5075
    1034      BP7ES          N/A      113    41XLS*    4252      N/A
    1034      BP7ES          N/A      113    41XLS*    4252      N/A
    1041      ZFR6A-11       N/A      N/A    N/A       5224      176-5204
    1043      BR8EVX SOLID*  6747     N/A    N/A       4063      N/A
    1049      B8EFS          N/A      N/A    N/A       AR474     N/A
    1052      B6HS-10        N/A      156    42F       4093      176-5006
    1059      R217-10        N/A      N/A    N/A       N/A       N/A

    --
    "Language shapes the way we think, and determines what we can think about."
    -- B. L. Whorf

      Despite my "a picture paints a thousand words", I'm still having trouble understanding your data.

      For example, in the following two lines, only the presence of a part number for the last manufacturer distinguishes them. Doesn't that make the second a duplicate of the first with missing information? I.e. redundant?

      NGK_STK.  NGK_P/N  NGK ALT  ACCEL  AC_DELCO  AUTOLITE  BECK/ARNLEY
      1027      AP9FS    N/A      N/A    84TS      32        176-5178
      1027      AP9FS    N/A      N/A    84TS      32        N/A

      There appears to be a significant column of information missing from the above table?

      I could imagine that the above data represents the recommendations by the different manufacturers for the plugs in their range that would be applicable to two different vehicles. Say the first is the normally aspirated version of some mark, and the second is the turbo-charged version. And whilst most of the manufacturers recommend the same plug for both, the BECK/ARNLEY plug is unsuitable for the latter variant. And they have no suitable alternative in their range.

      My point is that, whilst plugs from different manufacturers may be interchangeable for a given vehicle, each manufacturer's plugs have different ranges of operating parameters, which means that plugs from two different manufacturers are not interchangeable for all applications.

      The upshot is that the reason you are having so much trouble coming up with a normalisation schema is that the key field--the vehicle--is missing from your table. Any attempt at normalisation based upon grouping of part numbers without taking the vehicle into consideration is at best doing your customers a disservice. It could persuade them to purchase plugs that are unsuitable for their particular vehicle, that might have limited life due to (say) overheating. Or worse, that could damage their engines by (say) holing their pistons by burning too hot.

      At worst, it could be dangerous.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        "For example, in the following two lines, only the presence of a part number for the last manufacturer distinguishes them. Doesn't that make the second a duplicate of the first with missing information? I.e. redundant?"

        Grrr. That's what I was trying to avoid when presenting this data. Remember when I said I'd be trimming off a number of the columns? That's what happened.

        A lot of the data is redundant - that's just how it's organized - but not because it's for different vehicles or whatever; it's because whoever put it together didn't use an easier way to say "vendor X's plugs 123, 124, and 125 match vendor A's plug 999". Instead, they end up saying:

        A    B    C    D    X
        999  001  002  003  123
        999  001  002  003  124
        999  001  002  003  125

        It's dumb, and I know it's one of the things I have to factor out - but I figured that I'd be taking care of lots of redundancy anyway, and this would just get taken care of as part of the parsing process.
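
        For what it's worth, here is a minimal sketch of that factoring-out step, using the toy rows above. Python is used purely for illustration; the name `collapse_rows`, the list-of-lists layout, and the "N/A" placeholder handling are all assumptions, not anything taken from the real data file. It groups rows by the first (reference) column and keeps each vendor's distinct part numbers in first-seen order.

```python
# Toy rows from the example above: column A is the reference column,
# B, C, D and X are the cross-referenced vendors.
HEADER = ["A", "B", "C", "D", "X"]
ROWS = [
    ["999", "001", "002", "003", "123"],
    ["999", "001", "002", "003", "124"],
    ["999", "001", "002", "003", "125"],
]


def collapse_rows(header, rows, missing="N/A"):
    """Group rows by the reference column (hypothetical helper).

    Returns {reference: {vendor: [distinct part numbers, first-seen order]}}.
    Cells equal to `missing` are skipped.
    """
    vendor_cols = header[1:]
    grouped = {}
    for row in rows:
        ref = row[0]
        # Create an empty per-vendor list the first time a reference is seen.
        vendors = grouped.setdefault(ref, {name: [] for name in vendor_cols})
        for name, part in zip(vendor_cols, row[1:]):
            if part != missing and part not in vendors[name]:
                vendors[name].append(part)
    return grouped


if __name__ == "__main__":
    for ref, vendors in collapse_rows(HEADER, ROWS).items():
        print(ref, vendors)
```

        With the three rows above this leaves a single entry for 999, with X carrying 123, 124 and 125 and every other vendor carrying one part number.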

        "My point is that, whilst plugs from different manufacturers may be interchangeable for a given vehicle, each manufacturer's plugs have different ranges of operating parameters, which means that plugs from two different manufacturers are not interchangeable for all applications."

        I assure you that this is not the way it works; plugs with different operating parameters get a different part number. The parameter ranges vary widely, but they're standardized - for exactly the reasons you state - and they're comprehensible as data.

        I was at an auto parts store two days ago, and had them look up a plug for me - an NGK B7HS-10 - and they pulled up a screen that showed all the equivalents, which exactly matched the list that I had. They never asked me what vehicle I had, and it wouldn't have helped them if they did: it was a Yamaha 15HP outboard motor, and no auto parts store is going to list that as a vehicle. :)
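
        That kind of "type in one part number, get all the equivalents" lookup could be sketched roughly as below. This is only an illustration in Python under stated assumptions: the helper names `build_index` and `equivalents` are made up, the rows are a hand-copied handful from the snippet posted earlier in the thread, and it assumes that every part sharing an NGK stock number is equivalent, which is the premise of the cross-reference rather than something the code checks.

```python
from collections import defaultdict

# Vendor columns after the NGK stock number (names as in the posted snippet).
COLUMNS = ["NGK_P/N", "NGK ALT", "ACCEL", "AC_DELCO", "AUTOLITE", "BECK/ARNLEY"]

# A few rows copied by hand from the sample: (stock number, vendor cells).
ROWS = [
    ("1027", ["AP9FS", "N/A", "N/A", "84TS", "32", "176-5178"]),
    ("1027", ["AP9FS", "N/A", "N/A", "84TS", "32", "N/A"]),
    ("1034", ["BP7ES", "N/A", "113", "R41XLS", "53", "176-5075"]),
    ("1034", ["BP7ES", "N/A", "113", "R41XLS", "62", "176-5075"]),
    ("1034", ["BP7ES", "N/A", "113", "R41XLS", "52", "176-5075"]),
]


def build_index(columns, rows, missing="N/A"):
    """Build two maps (hypothetical helper): stock number -> set of
    (vendor, part) pairs, and part number -> set of stock numbers."""
    groups = defaultdict(set)
    by_part = defaultdict(set)
    for stock, parts in rows:
        for vendor, part in zip(columns, parts):
            if part != missing:
                groups[stock].add((vendor, part))
                by_part[part].add(stock)
    return groups, by_part


def equivalents(part, groups, by_part):
    """Return sorted (vendor, part) pairs that share a stock number with `part`."""
    found = set()
    for stock in by_part.get(part, ()):
        found |= groups[stock]
    return sorted(pair for pair in found if pair[1] != part)


if __name__ == "__main__":
    groups, by_part = build_index(COLUMNS, ROWS)
    for vendor, part in equivalents("84TS", groups, by_part):
        print(vendor, part)
```

        Keying the reverse index on the bare part number is a simplification: if two vendors ever reuse the same number for different plugs, the key would need to be the (vendor, part) pair instead.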


        --
        "Language shapes the way we think, and determines what we can think about."
        -- B. L. Whorf