See, it is much easier to understand the thing backwards.
First the 5NF; a table (relational variable) is in the 5NF if decomposing it would not remove any redundancies. So, it is final NF as far as removing redundancy is concerned.
The original table obviously has some redundancy. It claims that "Smith represents Ford." twice, and "Smith represents GM." twice.
So let's see is it possible to decompose this into two or more projections and reduce some redundancy.
Let's start backwards.
Company exists. {COMPANY}
Agent exists. {AGENT}
Product exists. {PRODUCT}
Company makes Product. {COMPANY, PRODUCT}
Agent represents Company. {AGENT, COMPANY}
A pause here; suppose a rule was "If an agent represents a company, and the company makes a product, then the agent sells that product".
This would be simply {AGENT, COMPANY} JOIN {COMPANY, PRODUCT}
; but this would generate an extra tuple, namely (Jones, Ford, truck)
; which is not true because Jones does not sell trucks.
So, not every agent sells every product, hence it is necessary to state that explicitly.
- Agent sells Product.
{AGENT, PRODUCT}
Now if we join
{AGENT, COMPANY} JOIN {COMPANY, PRODUCT} JOIN {AGENT, PRODUCT}
that extra tuple is eliminated by the join to the {AGENT, PRODUCT}
.
To grasp things intuitively, the rule can be modified a bit.
Original
If an agent sells a certain product, and he represents a company
making that product, then he sells that product for that company.
Modified (same meaning)
If an agent sells product, and agent represents company,
and the company makes that product, then agent sells that product for
that company.
Explained (substitute from bullet points above)
If {AGENT, PRODUCT}
and {AGENT, COMPANY}
and {COMPANY, PRODUCT}
then {AGENT, COMPANY, PRODUCT}
.
So, the rule allows for the join to happen -- and hence the decomposition.
Now compare that to the predicate of the original table:
Agent represents a Company and sells some Product that the company makes.
Not the same as the rule, so it is open to anomalies which would violate the rule -- see Bill Karwin's example.
EDIT (see comments below)
Suppose that we have the original table, but not the rule.
It is obvious that there is some redundancy in the table, so we may wonder if there is a way to remove that redundancy somehow -- usual way is decomposition into projections of the table.
So, after some tinkering, we figure out that it can be decomposed into {AGENT, PRODUCT}, {AGENT, COMPANY}, {COMPANY, PRODUCT}
. Current data certainly allows for that -- as per your example.
And we do that, and whenever interested in "Which agent sells which product from which company?" the answer is simply
{AGENT, COMPANY} JOIN {COMPANY, PRODUCT} JOIN {AGENT, PRODUCT}
Then Honda shows up, and they make cars and trucks too. Well, no problem there, just insert (Honda, truck) , (Honda, car)
into {COMPANY, PRODUCT}
.
Then Smith decides to sell Honda cars, but not trucks. Sorry, no way, oops! Because he already sells cars and trucks, if he wants to represent Honda, he has to sell both.
Because we would have tuples
(Smith, Honda) (Honda, truck) (Smith, truck)
(Honda, car) (Smith, car)
So we have introduced the rule! Really did not want to -- was just trying to get rid of some redundancy.
The question is now, was the original dataset just a fluke, or was it a result of a rule which was enforced somehow outside of the DB?
The author (Kent) claims that the rule exists and the design does not match it. Certainly, it would not be a problem for the original table to accept (Smith, Honda, car)
only -- not requiring (Smith, Honda, truck)
.
Theoretical point (ignore if boring)
The rule
If {AGENT, PRODUCT} and {AGENT, COMPANY} and {COMPANY, PRODUCT} then {AGENT, COMPANY, PRODUCT}
; for every (Agent, Company, Product)
triplet.
explicitly states that join dependency
* { {AGENT, COMPANY}, {COMPANY, PRODUCT}, {AGENT, PRODUCT} }
holds for the original table.
As often stated, cases like this are rare; actually so rare that even textbook examples have to introduce weird rules in order to explain the basic idea.
EDIT II (the fun part, but may help understanding)
Suppose that the rule does not exist, and there is explicit requirement that any agent can sell what ever he wants from any company -- hence the rule would be plain wrong.
In that case we have the original table
{AGENT, COMPANY, PRODUCT}
I would argue that:
Being all-key, it is in BCNF.
It can not be decomposed (current data may allow it, but future does not).
It is in BCNF, all key, it can not be decomposed, hence it is in 5NF.
It is in 5NF and is all-key, hence it is in 6NF.
So, it is the presence or non-existence of the rule that determines if the table is in BCNF or 6NF -- same table same data.