Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

postgresql - Are there any alternatives to a join table for many-to-many associations?

Let's say I have two tables with data that isn't supposed to change through the use of my application (i.e. domain tables) for something like the model of a car and the color of a car.

The models table lists different types of car models that exist.

The colors table lists different types of colors that a car can come in.

Now, it might not be evident right away, but there exists a relationship between the two. A color may only be available for a certain model, or more likely several models of the same make. A model certainly doesn't come in a specific color, but it does come with a selection, or choice, of colors that I do want to store in the database.

So there exists a many-to-many relationship between them, which suggests that I should store the details of that relationship in a join table. So, if you forgive my own notation, it would look something like this:

Volvo V70 <-> Pearl White
Volvo V70 <-> Emerald Green
Volvo V70 <-> Night Black
Volvo V70 <-> Salmon Pink
Volvo V70 <-> Ocean Blue
Volvo V70 <-> Raspberry Red

Volvo V60 <-> Pearl White
Volvo V60 <-> Emerald Green
Volvo V60 <-> Night Black
Volvo V60 <-> Salmon Pink
Volvo V60 <-> Ocean Blue
Volvo V60 <-> Raspberry Red

...

This is A LOT of repeated text. It would be easier if I could just write

[Volvo V70, V60, S60, S40] <-> [Pearl White, Emerald Green, Night Black, Salmon Pink, Ocean Blue, Raspberry Red]

and move on to the next car model and set of colors.

Are there any alternatives to a regular join table that can simplify this process in any way?



1 Answer

  • If models don't share colour groups then the design would be one table:

    model [model] comes in color [color]
    
  • If models share colour groups then have two tables:

    model [model] comes in the colors of group [group]
    group [group] has color [color]
    

    Joining these two tables and projecting onto model and color gives the single table of the first bullet:

    SELECT model, color FROM model_group NATURAL JOIN group_color
    
  • If a model can have exceptional available and/or unavailable colours in addition to or instead of a group then have exception tables. A model's group is now its default colours (if any):

    model [model] has default color group [group]
    group [group] has color [color]
    model [model] is exceptionally available in color [color]
    model [model] is exceptionally unavailable in color [color]
    

    The exception tables are then respectively UNIONed with and MINUSed/EXCEPTed from a JOIN-plus-PROJECT/SELECT to give the first table:

    SELECT model, color FROM model_default NATURAL JOIN group_color
    EXCEPT SELECT * FROM model_unavailable
    UNION SELECT * FROM model_available
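
The third design can be tried out end to end with a small sketch. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL, the group name 'volvo_std' and the sample rows are made up, and the group column is called color_group because GROUP is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- model [model] has default color group [color_group]
    CREATE TABLE model_default (model TEXT PRIMARY KEY, color_group TEXT NOT NULL);
    -- group [color_group] has color [color]
    CREATE TABLE group_color (color_group TEXT, color TEXT,
                              PRIMARY KEY (color_group, color));
    -- model [model] is exceptionally available in color [color]
    CREATE TABLE model_available (model TEXT, color TEXT, PRIMARY KEY (model, color));
    -- model [model] is exceptionally unavailable in color [color]
    CREATE TABLE model_unavailable (model TEXT, color TEXT, PRIMARY KEY (model, color));

    -- Hypothetical sample data: both models share one default group.
    INSERT INTO model_default VALUES
        ('Volvo V70', 'volvo_std'), ('Volvo V60', 'volvo_std');
    INSERT INTO group_color VALUES
        ('volvo_std', 'Pearl White'), ('volvo_std', 'Night Black'),
        ('volvo_std', 'Ocean Blue');
    INSERT INTO model_available VALUES ('Volvo V60', 'Salmon Pink');
    INSERT INTO model_unavailable VALUES ('Volvo V70', 'Ocean Blue');
""")

# Defaults from the group, minus the exclusions, plus the additions.
# EXCEPT and UNION are evaluated left to right here.
MODEL_COLORS = """
    SELECT model, color FROM model_default NATURAL JOIN group_color
    EXCEPT SELECT model, color FROM model_unavailable
    UNION SELECT model, color FROM model_available
"""

def colors_for(model):
    """Return the sorted list of colours the given model comes in."""
    rows = conn.execute(
        f"SELECT color FROM ({MODEL_COLORS}) WHERE model = ? ORDER BY color",
        (model,),
    )
    return [color for (color,) in rows]

print(colors_for("Volvo V70"))
print(colors_for("Volvo V60"))
```

With this sample data the V70 loses Ocean Blue from the shared defaults and the V60 gains Salmon Pink, without either change touching the other model's rows.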
    

"Redundancy" is not about values appearing in multiple places. It is about multiple rows stating the same thing about the application.

Every table (and query expression) has an associated fill-in-the-(named-)blanks statement template (aka predicate). The rows that make a true statement go in the table. If you have two independent predicates then you need two tables. The relevant values go in the rows of each one.

Re rows making statements about the application see this. (And search my other answers re a table's "statement" or "criterion".) Normalization helps because it replaces tables whose rows state things of the form "... AND ..." by other tables that state the "..." separately. See this and this.

If you share groups and only use a single two-column table for model and color then its predicate is:

FOR SOME group
    model [model] comes in the colors of group [group]
AND group [group] has color [color]

So the second bullet removes a single "AND" from this predicate, i.e. the source of a "multivalued dependency". Otherwise, if you change a model's group or a group's colours, you have to change multiple rows simultaneously and consistently. (The point is to reduce errors and complexity from redundancy, not to save space.)
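
The difference in update effort can be made concrete with a rough sketch. It assumes Python's built-in sqlite3 module in place of PostgreSQL, an invented group name 'volvo_std', and a column named color_group (GROUP being a reserved word). It adds one colour to a shared group in both designs and counts how many rows each design has to gain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two-table design from the second bullet.
    CREATE TABLE model_group (model TEXT PRIMARY KEY, color_group TEXT NOT NULL);
    CREATE TABLE group_color (color_group TEXT, color TEXT,
                              PRIMARY KEY (color_group, color));
    -- Single flat table holding the same information redundantly.
    CREATE TABLE model_color (model TEXT, color TEXT, PRIMARY KEY (model, color));

    INSERT INTO model_group VALUES
        ('Volvo V70', 'volvo_std'), ('Volvo V60', 'volvo_std'),
        ('Volvo S60', 'volvo_std');
    INSERT INTO group_color VALUES
        ('volvo_std', 'Pearl White'), ('volvo_std', 'Night Black');
    INSERT INTO model_color
        SELECT model, color FROM model_group NATURAL JOIN group_color;
""")

def count(table):
    """Number of rows currently in the named table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

before_grouped, before_flat = count("group_color"), count("model_color")

# Grouped design: the new colour is stated once, for the group.
conn.execute("INSERT INTO group_color VALUES ('volvo_std', 'Ocean Blue')")
# Flat design: the same fact must be restated for every model in the group.
conn.execute("""
    INSERT INTO model_color
    SELECT model, 'Ocean Blue' FROM model_group WHERE color_group = 'volvo_std'
""")

rows_touched_grouped = count("group_color") - before_grouped
rows_touched_flat = count("model_color") - before_flat

# Both designs still describe the same model/colour pairs.
derived = set(conn.execute(
    "SELECT model, color FROM model_group NATURAL JOIN group_color"))
flat = set(conn.execute("SELECT model, color FROM model_color"))
print(rows_touched_grouped, rows_touched_flat, derived == flat)
```

The flat table stays correct only as long as every one of those per-model rows is written consistently, which is the kind of error the two-table design rules out.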

If you don't want to repeat the strings for implementation(-dependent) reasons (space taken or speed of operations at the expense of more joins) then add a table of name ids and strings and replace your old name columns and values by id columns and values. (That's not normalization, that's complicating your schema for the sake of implementation-dependent data optimization tradeoffs. And you should demonstrate this is needed and works.)
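
For illustration only, such an id-based variant might look like the sketch below. It assumes Python's built-in sqlite3 module instead of PostgreSQL, and the table names model_name and color_name, the id columns and the sample rows are all hypothetical. Whether this layout actually saves space or time would still have to be measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup tables: each name string is stored exactly once.
    CREATE TABLE model_name (model_id INTEGER PRIMARY KEY, model TEXT NOT NULL UNIQUE);
    CREATE TABLE color_name (color_id INTEGER PRIMARY KEY, color TEXT NOT NULL UNIQUE);
    -- The relationship table now holds ids instead of strings.
    CREATE TABLE model_color (
        model_id INTEGER NOT NULL REFERENCES model_name,
        color_id INTEGER NOT NULL REFERENCES color_name,
        PRIMARY KEY (model_id, color_id)
    );

    INSERT INTO model_name VALUES (1, 'Volvo V70'), (2, 'Volvo V60');
    INSERT INTO color_name VALUES (1, 'Pearl White'), (2, 'Night Black');
    INSERT INTO model_color VALUES (1, 1), (1, 2), (2, 1);
""")

# Getting the readable pairs back costs two extra joins.
rows = conn.execute("""
    SELECT model, color
    FROM model_color NATURAL JOIN model_name NATURAL JOIN color_name
    ORDER BY model, color
""").fetchall()
print(rows)
```

The predicate of model_color is unchanged in substance; only the representation of the names differs, at the price of the extra joins in every query that needs the strings.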

