Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

SQL JOINs vs WHERE statements

by Cody Pendant (Prior)
on Aug 18, 2003 at 04:07 UTC ( #284480=perlquestion: print w/replies, xml ) Need Help??

Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

Let me start off by saying this isn't a perl question as such but I hope you'll indulge me

Simple question: can all SQL JOIN statements be expressed either as JOINs or WHERE syntax? In other words, is the syntax "INNER JOIN ON foo = bar" unique, and can do things that can't be done any other way, or is it just a Tim-Toady kind of expression of the same thing?

This all started with my desire to make a database for a TV show here with proper normalisation of what's a kind of three-dimensional structure.

So now I have a table of people, a table of roles and a table of episodes, and a fourth table with three numeric columns, which contains the information that in episode n, person n plays role n.

So in order to find out who plays which role in episode one, I can (thanks to Jeffa) do this:

SELECT people.firstname, people.lastname, role.name FROM roles INNER JOIN eps_people_roles ON roles.id = eps_people_roles.role INNER JOIN people ON eps_people_roles.person = people.id WHERE eps_people_roles.ep = 1

but unless I'm completely crazy, I can also do this:

SELECT people.firstname, people.lastname, role.name FROM people, roles, eps_people_roles WHERE ( roles.id = eps_people_roles.role AND eps_people_roles.person = people.id ) AND eps_people_roles.ep = 1

which seems to give the same results. I'm asking, to be honest, because I just like that second kind of syntax better.



($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print

Replies are listed 'Best First'.
Re: SQL JOINs vs WHERE statements
by NetWallah (Canon) on Aug 18, 2003 at 04:45 UTC

      Thank you. But, just because I like everything spelled out as if I was hungover and not very smart, are the two statements exactly the same in effect?

      Is there perhaps some other kind of consideration like efficiency or speed involved? I'm guessing not.

      And why "inner" anyway? I just do not get the meaning of inner and outer in this context. I know what they mean, but I don't know why they have those names.



      ($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print
        I'm at the fringes of my SQL knowledge here (after all - my handle is NETWallah, not SQL-wallah). Anyway - Yes, AFAIK, the performance of both your SQL statements should be the same. To verify, you could have your SQL platform give you the SQL execution plan for each, and compare them.

        Here is the (Thankfully brief) INNER join vs others explanation from Microsoft - it may provide a little insight into the choice of words:

        INNER
        Specifies that all matching pairs of rows are returned. Discards unmatched rows from both tables. This is the default if no join type is specified.

        LEFT [ OUTER ]
        Specifies that all rows from the left table not meeting the specified condition are included in the result set in addition to all rows returned by the inner join. Output columns from the left table are set to NULL.

        RIGHT [ OUTER ]
        Specifies that all rows from the right table not meeting the specified condition are included in the result set in addition to all rows returned by the inner join. Output columns from the right table are set to NULL.

        FULL [ OUTER ]
        If a row from either the left or right table does not match the selection criteria, specifies the row be included in the result set, and output columns that correspond to the other table be set to NULL. This is in addition to all rows usually returned by the inner join.

        Is there perhaps some other kind of consideration like efficiency or speed involved? I'm guessing not.

        They should really be the same statement. The only difference is that in the second the join is implicit, but besides the "visual" things nothing else changes.

        Michele.

Re: SQL JOINs vs WHERE statements
by demerphq (Chancellor) on Aug 18, 2003 at 09:02 UTC

    My understanding is that the join syntax was a later addition to SQL to disambiguate certain classes of where clause. Not all SQL engines support the syntax. Notably Sybase. (Which I use so im not going to get into further as I dont have the option of the first SQL you posted :-)


    ---
    demerphq

    <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
      Actually Sybase does support ANSI JOIN clauses (as of ASE 11.x, I think). However, as you I'm more comfortable with implicit joins using the WHERE clause (including using the =* and *= operators for inner/outer joins).

      Michael

        (as of ASE 11.x, I think)

        Im looking at a Sybase 11.5 reference manual right now and I can't find it at all. And yes, im much more used to using *= and =* instead. That verbose join syntax always kinda gets me... :-)

        But it is nice to know its available in 12.5


        ---
        demerphq

        <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: SQL JOINs vs WHERE statements
by bean (Monk) on Aug 18, 2003 at 16:39 UTC
    Just to throw my 2 cents in here...
    1. Yes, they are absolutely equivalent.
    2. I have never seen the verbose syntax (INNER|LEFT|RIGHT|FULL JOIN) used in a professional, production system, probably because Oracle prior to 9i didn't support it. (I had completely forgotten that it existed.)
    3. If you can limit one of the tables on an indexed column before joining, it may be faster. So in your example, if eps_people_roles.ep is indexed, you should do this:
      SELECT people.firstname, people.lastname, role.name FROM people, roles, eps_people_roles WHERE eps_people_roles.ep = 1 AND ( eps_people_roles.role = roles.id AND eps_people_roles.person = people.id )

    Update
    If you have the JOIN keyword available and it makes the query easier to read, by all means use it. It was added to SQL syntax for a reason...
      >If you can limit one of the tables on an >indexed column before joining, it may be faster.

      That's a helpful tip, thank you.



      ($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print
Re: SQL JOINs vs WHERE statements
by glwtta (Hermit) on Aug 18, 2003 at 17:23 UTC
    From what I understand, in at least some engines it makes some difference, some of the time (emphasis on some). At least according to this page.

    I also vaguely remember having to do something silly in Oracle 8 to get the equivalient of a LEFT OUTER join; not sure what that was all about though.

      There is nothing silly or unusual about the left outer join in Oracle 8.

      Left outer join:
      table1.column1 (+) = table2.column2
      Right outer join:
      table1.column1 = table2.column2 (+)
      Full outer join:
      table1.column1 (+) = table2.column2 (+)
      Inner join:
      table1.column1 = table2.column2
      It isn't the *= syntax, but it is consistent and logical.
        Yeah, from what I remember I didn't know about the (+) syntax at the time, and what someone at work showed me was a UNION of an inner join with the values of a "null join" which would give you the rows that didn't join with the second table. Still weird, but not Oracle's fault :) In any case, I haven't done much with Oracle 8 and still know little about it.
      The resulting data set should be the same in both cases, but you are correct that the SQL optimizer might decide to choose different paths to resolve the query.

      FYI - in Sybase multi-table queries are optimized 4 tables at a time (by default - this is configurable for recent versions of ASE (12.0 and later, IIRC)). It should be noted that query optimization (and understanding why the server decides to use a certain query plan) is a bit of a black art, especially when working on really large data sets.

      Michael

Re: SQL JOINs vs WHERE statements
by CountZero (Bishop) on Aug 18, 2003 at 21:27 UTC

    Consider the following situation (slightly adapted from an example in the MySQL docs):

    You have a staff table that contains both staff and manager's details. If you wanted a list of staff, with their corresponding manager's details as well, you would refer to the same table twice, in essence joining the table to itself. eg.
    SELECT s1.StaffId, s1.StaffName, s1.Position, s1.ManagerId, s2.StaffNa +me AS ManagerName, s2.Position AS ManagerPosition FROM staff AS s1 INNER JOIN staff AS s2 ON(s1.ManagerId=s2.StaffId)

    Would you be able to do that with a WHERE clause instead of a JOIN?

    And how would you limit the results to all staff with position x?

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      Would you be able to do that with a WHERE clause instead of a JOIN?

      SELECT stf.StaffId, stf.StaffName, stf.Position, stf.ManagerId, mgr.StaffName AS ManagerName, mgr.Position AS ManagerPosition FROM staff AS stf, staff AS mgr where stf.ManagerId=mgr.StaffId

      I think... :-)


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...

        (Warning: possible duplicate posting)

        Yes indeed, but the "comma" is really only a join in disguise:

        INNER JOIN and , (comma) are semantically equivalent. Both do a full join between the tables used. Normally, you specify how the tables should be linked in the WHERE condition. MySQL manual

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: SQL JOINs vs WHERE statements
by EdwardG (Vicar) on Aug 19, 2003 at 17:31 UTC

    You're not crazy, you can indeed get an equivalent join using a WHERE clause.

    But I don't recommend it.

    It mixes up your different types of clauses (JOIN vs WHERE), which makes your SQL harder to understand, and worse, you can get unintuitive results.

    For example, what would you expect the following to return?

    create table people (person_id integer not null primary key, surname v +archar(40) not null) insert into people values (1,'Smith') insert into people values (2,'Jones') insert into people values (3,'Harry') insert into people values (4,'Peanut') create table town (person_id integer not null references people(person +_id), town varchar(40) null) insert into town values (1,'Bristol') insert into town values (2,'Washington') insert into town values (3,NULL) insert into town values (4,'Cairns') SELECT surname, town FROM people, town WHERE people.person_id *= town.person_id AND (town <> 'Washington' AND town IS NOT NULL)
    Would you be surprised to get the following results?
    surname town + ------------------------------ --------------------------------------- +- Smith Bristol Jones NULL Harry NULL Peanut Cairns

    (Hint: You should be surprised, the SELECT was trying to exclude NULLS)

    The source of this ambiguity is that the IS NOT NULL part of the WHERE clause got evaluated before the join part, and the join part then re-introduced the NULLs.

    In contrast, the LEFT OUTER JOIN makes the intention clearer (albeit somewhat self-defeating in this example)

    SELECT surname, town FROM people LEFT OUTER JOIN town ON people.person_id = town.person_id WHERE (town <> 'Washington' AND town IS NOT NULL)

    Results:

    surname town + ---------------------------------------- ----------------------------- +----------- Smith Bristol Peanut Cairns
    Update: It seems that Oracle gives different results to MS SQL Server 2000, where this code was tested.
      Actually, in Oracle I get the second set of results (no matter what order I put the where clauses). What database did you use, EdwardG?
      Update
      BTW, it's bad form not to name the columns in an insert. Instead of
      insert into people values (1,'Smith')
      you should do
      insert into people ( person_id, surname ) values ( 1,'Smith' )
        I agree about the named columns, normally insist on them when reviewing, but that piece of code wasn't the pertinent part of the demonstration, just the setup. Thanks though.

        Oh, and I used MS SQL Server to test that code. Looks like there's also a portability issue :)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://284480]
Approved by Zaxo
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (4)
As of 2022-10-06 07:23 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    My preferred way to holiday/vacation is:











    Results (26 votes). Check out past polls.

    Notices?