<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: EJAPP Top 10 countdown: #1 &#8211; Incorrect database usage</title>
	<atom:link href="http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-database-usage/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-database-usage/</link>
	<description></description>
	<lastBuildDate>Thu, 18 Mar 2010 13:27:36 +0100</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Wouter van Reeven</title>
		<link>http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-database-usage/comment-page-1/#comment-7043</link>
		<dc:creator>Wouter van Reeven</dc:creator>
		<pubDate>Thu, 03 May 2007 07:56:26 +0000</pubDate>
		<guid isPermaLink="false">http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-usage-of-databases/#comment-7043</guid>
		<description>Hi Ed,


This remark originated from me. In your example, let&#039;s say we would like to execute a query like this

select * from A, B, C, D where A.b = B.id and B.c = C.id and A.d = D.id

Please note that A has a foreign key to both B and D. I noticed in several cases that the query execution time decreases a lot if size(A) &gt; size(B) &gt; size(D). If e.g. size(A) &gt; size(D) &gt; size(B) the query should be

select * from A, D, B, C where A.d = D.id and A.b = B.id and B.c = C.id

I do agree that this is only the case when one table has foreign keys to more than one other table and one of those other tables contains many more rows than the others.


Greets, Wouter van Reeven</description>
		<content:encoded><![CDATA[<p>Hi Ed,</p>
<p>This remark originated from me. In your example, let&#8217;s say we would like to execute a query like this</p>
<p>select * from A, B, C, D where A.b = B.id and B.c = C.id and A.d = D.id</p>
<p>Please note that A has a foreign key to both B and D. I noticed in several cases that the query execution time decreases a lot if size(A) &gt; size(B) &gt; size(D). If e.g. size(A) &gt; size(D) &gt; size(B) the query should be</p>
<p>select * from A, D, B, C where A.d = D.id and A.b = B.id and B.c = C.id</p>
<p>I do agree that this is only the case when one table has foreign keys to more than one other table and one of those other tables contains many more rows than the others.</p>
<p>Greets, Wouter van Reeven</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ed Kusnitz</title>
		<link>http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-database-usage/comment-page-1/#comment-6947</link>
		<dc:creator>Ed Kusnitz</dc:creator>
		<pubDate>Tue, 01 May 2007 17:56:38 +0000</pubDate>
		<guid isPermaLink="false">http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-usage-of-databases/#comment-6947</guid>
		<description>Our dba took exception to this same point about the ordering of tables:
&quot;Oracle 9i executes queries a LOT faster when the tables in both the FROM and the WHERE clauses are ordered from large to small.&quot; 
     Well, it depends what he means by this, but this is pretty much wrong. Let&#039;s say that I&#039;m joining parent table A to a couple of its child lookup tables:

 

select *

from A

inner join B on A.b=B.id

inner join C on A.c=C.id

inner join D on A.d=D.id

 

    In this example, it doesn&#039;t matter *at all* what order I list the 3 joins in. Since we&#039;re inner joining, the contents of B, C, and D each act as filters on A; we also want to do as little IO as possible to get the correct answer as quickly as possible. Thus, the fastest way to do this operation (imagine you had to do it by hand in Excel, if that helps) is to start with the smallest child table, use its index on &quot;id&quot; (hopefully) to quickly match a row in A to a value in the child table, and then remove everything that doesn&#039;t match from our buffered copy of A[1]. That means that when we then compare to the remaining 2 child tables, we&#039;ve got the smallest possible set of rows to join to the child&#039;s index. We repeat the procedure by picking the smaller (i.e., the better filter) of the 2 remaining child tables first, and we do the &quot;loosest&quot; join last.

 

    That&#039;s a simple example of finding an optimum execution plan. In order to find it, the 2 pieces of information I needed were (1) the sizes of the tables involved, and (2) whether or not the join columns were indexed. If you wanted to be more precise while trying to decide which join-filter to apply first, you might also take into consideration things like how easy the index is to use (if it&#039;s got a lot of columns you don&#039;t need in it, the IO in the index can be significant) and the cardinality (selectivity) of your join condition in the child table, if that condition isn&#039;t unique. 

 

    These are all things the optimizer looks at, guaranteed. Sizes are gathered by statistical sampling; you can see those stats in the system view all_tables, for example. Indexes are obviously in the data dictionary as well, and the optimizer has stats to refer to for those that tell it about cardinality and fragmentation.

 --Can you explain more what you mean?

    AFAIK, the only case where join order matters in something like this is when size and cardinality are the same, which is really an edge case...</description>
		<content:encoded><![CDATA[<p>Our dba took exception to this same point about the ordering of tables:<br />
&#8220;Oracle 9i executes queries a LOT faster when the tables in both the FROM and the WHERE clauses are ordered from large to small.&#8221;<br />
     Well, it depends what he means by this, but this is pretty much wrong. Let&#8217;s say that I&#8217;m joining parent table A to a couple of its child lookup tables:</p>
<p>select *</p>
<p>from A</p>
<p>inner join B on A.b=B.id</p>
<p>inner join C on A.c=C.id</p>
<p>inner join D on A.d=D.id</p>
<p>    In this example, it doesn&#8217;t matter *at all* what order I list the 3 joins in. Since we&#8217;re inner joining, the contents of B, C, and D each act as filters on A; we also want to do as little IO as possible to get the correct answer as quickly as possible. Thus, the fastest way to do this operation (imagine you had to do it by hand in Excel, if that helps) is to start with the smallest child table, use its index on &#8220;id&#8221; (hopefully) to quickly match a row in A to a value in the child table, and then remove everything that doesn&#8217;t match from our buffered copy of A[1]. That means that when we then compare to the remaining 2 child tables, we&#8217;ve got the smallest possible set of rows to join to the child&#8217;s index. We repeat the procedure by picking the smaller (i.e., the better filter) of the 2 remaining child tables first, and we do the &#8220;loosest&#8221; join last.</p>
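<p>A minimal Python sketch of the smallest-first ordering described above. The row counts and the function name greedy_join_order are hypothetical, chosen only for illustration; a real cost-based optimizer also weighs indexes and selectivity, which this sketch ignores.</p>

```python
def greedy_join_order(child_sizes):
    """Order child tables smallest first, so the tightest filter runs first.

    child_sizes maps a table name to its (assumed) row count.
    """
    remaining = dict(child_sizes)
    order = []
    while remaining:
        # Pick the smallest remaining child table as the next join.
        smallest = min(remaining, key=remaining.get)
        order.append(smallest)
        del remaining[smallest]
    return order


# Hypothetical row counts for the three child tables of A.
sizes = {"B": 50_000, "C": 200, "D": 3_000}
print(greedy_join_order(sizes))  # prints ['C', 'D', 'B']
```

<p>The result is the same whatever order the tables are listed in the input, which is the point: the ordering comes from the statistics, not from the query text.</p>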
<p>    That&#8217;s a simple example of finding an optimum execution plan. In order to find it, the 2 pieces of information I needed were (1) the sizes of the tables involved, and (2) whether or not the join columns were indexed. If you wanted to be more precise while trying to decide which join-filter to apply first, you might also take into consideration things like how easy the index is to use (if it&#8217;s got a lot of columns you don&#8217;t need in it, the IO in the index can be significant) and the cardinality (selectivity) of your join condition in the child table, if that condition isn&#8217;t unique. </p>
<p>    These are all things the optimizer looks at, guaranteed. Sizes are gathered by statistical sampling; you can see those stats in the system view all_tables, for example. Indexes are obviously in the data dictionary as well, and the optimizer has stats to refer to for those that tell it about cardinality and fragmentation.</p>
<p> &#8211;Can you explain more what you mean?</p>
<p>    AFAIK, the only case where join order matters in something like this is when size and cardinality are the same, which is really an edge case&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: James Stansell</title>
		<link>http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-database-usage/comment-page-1/#comment-6898</link>
		<dc:creator>James Stansell</dc:creator>
		<pubDate>Mon, 30 Apr 2007 20:49:06 +0000</pubDate>
		<guid isPermaLink="false">http://blog.xebia.com/2007/04/29/ejapp-top-10-countdown-1-incorrect-usage-of-databases/#comment-6898</guid>
		<description>The point about &quot;Changing the order of the tables in the FROM and WHERE clauses&quot; for oracle 9i can also apply to 8i and 10g when the rule-based optimizer (RBO) (RULE mode) is active.  In general the point doesn&#039;t apply when the cost-base optimizer (CBO) is active.

This gets back to 2 of your other points: 1) configure your DB properly; and 2) understand the query plans that are being used.

There&#039;s a LOT of outdated or just plain wrong information available on the internet.  I&#039;ve learned the hard way regarding Oracle.  The bottom line is to verify that the advice you follow actually does improve performance for your app.  Bonus points for understanding why the tip worked for you.

-james.</description>
		<content:encoded><![CDATA[<p>The point about &#8220;Changing the order of the tables in the FROM and WHERE clauses&#8221; for oracle 9i can also apply to 8i and 10g when the rule-based optimizer (RBO) (RULE mode) is active.  In general the point doesn&#8217;t apply when the cost-base optimizer (CBO) is active.</p>
<p>This gets back to 2 of your other points: 1) configure your DB properly; and 2) understand the query plans that are being used.</p>
<p>There&#8217;s a LOT of outdated or just plain wrong information available on the internet.  I&#8217;ve learned the hard way regarding Oracle.  The bottom line is to verify that the advice you follow actually does improve performance for your app.  Bonus points for understanding why the tip worked for you.</p>
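<p>One way to check such a tip, sketched in Python with the standard-library sqlite3 module as a stand-in. SQLite&#8217;s planner is not Oracle&#8217;s, so only the workflow carries over (look at the plan, time the query, compare); the table contents and the helper names build_db and plan_and_time are hypothetical.</p>

```python
import sqlite3
import time


def build_db():
    """Create a small in-memory database: A references both B and D."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table B (id integer primary key, c integer);
        create table D (id integer primary key);
        create table A (id integer primary key, b integer, d integer);
    """)
    conn.executemany("insert into B values (?, ?)", [(i, i) for i in range(100)])
    conn.executemany("insert into D values (?)", [(i,) for i in range(10)])
    conn.executemany(
        "insert into A values (?, ?, ?)",
        [(i, i % 100, i % 10) for i in range(5000)],
    )
    conn.commit()
    conn.execute("analyze")  # gather statistics for the planner
    return conn


def plan_and_time(conn, sql):
    """Return (plan detail lines, elapsed seconds, row count) for a query."""
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = [row[-1] for row in conn.execute("explain query plan " + sql)]
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return plan, time.perf_counter() - start, len(rows)


# The same join written with two different FROM/WHERE orderings.
Q_A_B_D = "select * from A, B, D where A.b = B.id and A.d = D.id"
Q_A_D_B = "select * from A, D, B where A.d = D.id and A.b = B.id"

conn = build_db()
for sql in (Q_A_B_D, Q_A_D_B):
    plan, seconds, count = plan_and_time(conn, sql)
    print(sql)
    print("  plan:", plan)
    print("  rows:", count, "seconds:", round(seconds, 4))
```

<p>If both orderings show the same plan, the rewrite changed nothing and any timing difference is noise; if the plans differ, that difference is the thing to understand before keeping the tip.</p>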
<p>-james.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
