Friday, March 30, 2012

Query about Ansi-Syntax

I need to write a stored procedure where I need to joing 3 tables A, B and C having approximately 200K, 500K, 800K rows respectively.

Query:
1) If I use ansi-syntax (inner join) as against non-ansi syntax (A.col1 = B.col1), I get a better performance.
Any idea why?

2) If I write a query (shown below), it tries to join table A and B returning large number of rows.

Select A.Col1, A.Col2
from A, B
where A.Col3 = 'xyz'

Why does it try to join the table B with A though there is no join specified.As far as the second query is concerned this is the same as writing

Select A.Col1, A.Col2
from A join B
where A.Col3 = 'xyz'

Basically, don't put the table in the from clause if you are not going to use it because it makes the query REALLY inefficient. This is a cross join and will probably give you heaps of extra records.|||actually, ejustuss, your query generates an error

if we want a cross join, we have to say CROSS JOIN, not just JOIN

and yes, there are many situations where we want to join one table to another, and yet select columns only from one|||This old school syntax can cause a lot of heartburn when you get into multiple outer joins. I do not think MS SQL 2K supports *=

Select A.Col1, A.Col2
from A, B
where A.Col3 = 'xyz'|||I do not think MS SQL 2K supports *=yes, it (still) does
create table Oldschool1
( id tinyint not null primary key identity
, foo varchar(9)
)
insert into Oldschool1 (foo) values ('Curly')
insert into Oldschool1 (foo) values ('Larry')
insert into Oldschool1 (foo) values ('Moe')
insert into Oldschool1 (foo) values ('Shemp')
insert into Oldschool1 (foo) values ('Joe')
insert into Oldschool1 (foo) values ('Curly Joe')

create table Oldschool2
( id tinyint not null primary key identity
, bar varchar(9)
)
insert into Oldschool2 (bar) values ('Curly')
insert into Oldschool2 (bar) values ('Larry')
insert into Oldschool2 (bar) values ('Moe')

select t1.id,t1.foo,t2.id,t2,bar
from Oldschool1 as t1
, Oldschool2 as t2
where t1.foo *= t2.bar
order by 1 :)|||As far as your first question,

The difference between ANSI (inner join) syntax and non-ANSI is that the optimizer will not try to derive joins that are ANSI compliant. That's because you've explicitly defined the joins. The optimizer will have to derive non-ANSI compliant joins. In many cases the query plans will be the same. However there's a greater risk the optimizer won't use the optimal plan if it has to derive the joins.

There's probably some people here that can explain this a lot better than me. Hope this helps.|||As far as your second question,

The results you're getting from that query are called a cartesian product. Since a join hasn't been specified it will match each record from on table to each record from the other. Thus if you have two tables with 100 rows each 10,000 rows will be returned.|||What I was trying to say was that in example 2 of the original post the join was implicit. Sorry about the error in syntax. Where would you want a query of the form

Select A.Col1, A.Col2
from A cross join B
where A.Col3 = 'xyz'

?|||where would you want a cross join?

1. to generate test data, e.g. a large range of dates from integers

2. with a left outer join to find missing many-to-many rows

3. to join a one-row table with application constants such as today's interest rate|||I was actually refering to the query as originally put with no reference to any column in the second table AT ALL. The generation of test data seems like a legitimate use though the other 2 surely require a reference to a column from the 2nd table in the the query somewhere even if only in the join condition.

Its just that I have seen queries with additional tables in the from clause which were unnecessary and unused and actually increased the amount of time taken to run the query substanitially.|||oh, that question

okay, well, the join would be used to ensure existence of a related row, even if you didn't need to actually return any data from the second table

for example, assume car owners are in one table, and parking tickets are in another, you might say "give me the names of every owner who had a parking ticket" and not return any row from the parking ticket table at all|||Well in this case would you not say something along the lines of

select car.owner
from car,ticket
where car.owner = ticket.reciever

?

my point being that you are actually referencing a column from the ticket table in the where clause (or join condition depending on how you write the query) - which is not what is happening in the original example.|||...which is not what is happening in the original example.that's right, it's a cross join

No comments:

Post a Comment