• Join tables with SQL

    My last article was digging is what SQL could do as advanced queries likes for example creating reports with basically a single query, using grouping and functions. However it seems that a little more basic talk about some more fundamentals operations are required. So I will talk here about JOINS, as before I will give you the needed tables here after so you can try step by step on your side how things works. Again I assume you know how to use phpMyAdmin or any other tools which let you "talk" to the database. I made also a relational database drawing to clarify a bit how the tables are related to each others:

    http://engine.nowhere-else.org/article5_tables.txt



    I must first start with a little history this time, and explain what the word database means. In the 1960s people started to "index" data to be able to access them more quickly, "indexes" are like rolodex which let you find quickly where something is stored inside a large library. For me those early indexing of the data is the real start of the database area. At that time there wasn't a common way to store and retrieve data from those databases. Each software / vendor was offering their own set of tools to handle the data, but it was already a great improvement over simple files. The real start-up of databases on the PC can be the time of dBase (2 or 3, and further), as it was offering complete tools to create even "forms" to help users fill in data or search. Those early databases were already quiet powerful but lacked what today is called a "relational" database. Basically it wasn't possible to mix the result from two tables directly (until later versions at least), and there wasn't "constraint" which checked that you would not delete a "master" row without deleting all the related data and so on.

    To come back to our own days, we (nearly) all use SQL to access a database, and this is already a HUGE step forward as you don't have to learn from scratch each time you need to work with a new database, and you may even have some chance to port your software from one database to the other without a complete rewrite (well this one is not completely working honestly). The second HUGE step forward is what makes a database a "relational" database. But what is this? Basically a relational database is a database which understand that the data may be split over multiple tables. For example you have one player, but this player do have multiple items in his inventory, to do so, you will have one row in the player table and multiple rows in the inventory table, yet those rows are somehow linked and this is a RELATION.

    But wait, what does it mean that a database understand what a relation is? Well this is something a bit more complex as there is multiple features involved here, and sadly, MySQL doesn't support them all with the default engine (myIsam), anyhow let's try to see them:
    - Possibility to retrieve with one query data from multiple tables (MySQL support it)
    - Possibility to define the relation on the database design level (myIsam doesn't support it)
    - Possibility to block the deletion of a "master" row if related data are still present, or delete all in "cascade" (myIsam doesn't support it)

    Well as we see, MySQL with myIsam offers only limited support, but it's actually the most important one. So yes, MySQL is a relational database, as it support relations between tables.

    Yes but in practice how does that work? Actually the concept is rather simple, you do a select on multiple tables instead of a single and in the where condition you specify how those tables are "glued" together. This is called inner join (or full join):

    SELECT PLAYER.USERNAME, OBJECT.NAME, INVENTORY.NB_ITEMS
    FROM PLAYER,OBJECT,INVENTORY
    WHERE PLAYER.PLAYER_ID=INVENTORY.PLAYER_ID
    AND INVENTORY.OBJECT_ID=OBJECT.OBJECT_ID

    This retrieves all the owned items, their name, and who owns them. And if you read it in English we could read it like "read from the table player and object and inventory joined where PLAYER.PLAYER_ID match INVENTORY.PLAYER_ID and where INVENTORY.OBJECT_ID match OBJECT.OBJECT_ID". Does that start to clarify a bit? Well, we could try to remove the "matching" part and just try to select from 2 tables without any where, and guess what, it will work, but produce some odd result. This is called a cross operation as it will take each row of the first table, and each of the second and mix those. Let's use the two small tables SMALL_A and SMALL_B for that:

    SELECT * FROM SMALL_A, SMALL_B

    SMALL_A have 3 rows and SMALL_B 4 and guess how many rows we have with this selected? Well 3*4 = 12 so we have 12 resulting rows. Honestly I never used such queries, as I don't find any use in this, but this shows how the database works under, as the inner join is just a "filtered" version of this cross join.

    As we saw, the inner join is really useful, as it allows to retrieve data from multiple tables in one shot. But this is not the only kind of join relational database support, as there is as second group called outer join, however you will need to decide if you want a LEFT or RIGHT outer join. I would strongly suggest that you DO NOT mix LEFT and RIGHT joins in the same query as it will make the query really hard to read. Choose one and you will be up and running as normally you can do the same thing with LEFT or RIGHT joins. If we sum up, here is the 3 different joins:



    Let's try to make use of those LEFT join to see how it works and what it does. RIGHT join works the same but taking the right side of the operation, therefore I will make my examples only with the LEFT one.

    To make the use of a outer join, we need to build a query where only some of the linked rows do have a match, otherwise we would not see the difference with a inner join.

    SELECT PLAYER.USERNAME, INVENTORY.NB_ITEMS FROM PLAYER LEFT JOIN INVENTORY ON (PLAYER.PLAYER_ID = INVENTORY.PLAYER_ID)
    WHERE INVENTORY.OBJECT_ID = 2 OR INVENTORY.OBJECT_ID IS NULL

    As you see from the results, some players don't have this item ID 2 in their inventory and therefore the number will appear as NULL, but still the row appears, if we do it with an inner join:

    SELECT PLAYER.USERNAME, INVENTORY.NB_ITEMS FROM PLAYER, INVENTORY WHERE PLAYER.PLAYER_ID = INVENTORY.PLAYER_ID AND (INVENTORY.OBJECT_ID = 2 OR INVENTORY.OBJECT_ID IS NULL)

    You will see that you get only 2 rows. I left the "INVENTORY.OBJECT_ID IS NULL" in this last query even if it's not useful.

    If we come back to our left join, as said, we have 2 rows with a NULL value, in PHP it's not such a big issue to handle it, but you may want to have something instead of NULL, in this case you can do the following.

    SELECT PLAYER.USERNAME, IFNULL(INVENTORY.NB_ITEMS,0) FROM PLAYER LEFT JOIN INVENTORY ON (PLAYER.PLAYER_ID = INVENTORY.PLAYER_ID)
    WHERE INVENTORY.OBJECT_ID = 2 OR INVENTORY.OBJECT_ID IS NULL

    With the IFNULL function, you will either get the real value or the value you want as default in case the field is NULL.
    Comments 6 Comments
    1. sniko's Avatar
      sniko -
      This was a really good read! I think it should be in a much more obvious place, as I found it on the main page > database, which I think is rarely explored.
    1. Spudinski's Avatar
      Spudinski -
      Please note that MySQL is case-sensitive about column names.
      Although it's OS dependent, most Linux distributions follow this practice.
    1. a_bertrand's Avatar
      a_bertrand -
      Table names you mean Spud, not column names.... I think.
    1. Spudinski's Avatar
      Spudinski -
      Quote Originally Posted by a_bertrand View Post
      Table names you mean Spud, not column names.... I think.
      *Digs up old MySQL book*

      Indeed you are correct.

      MySQL stores each database (also called a schema) as a subdirectory of its data directory
      in the underlying filesystem. When you create a table, MySQL stores the table
      definition in a .frm file with the same name as the table. Thus, when you create a
      table named MyTable, MySQL stores the table definition in MyTable.frm. Because
      MySQL uses the filesystem to store database names and table definitions, case sensitivity
      depends on the platform. On a Windows MySQL instance, table and database
      names are case insensitive; on Unix-like systems, they are case sensitive. Each storage
      engine stores the table’s data and indexes differently, but the server itself handles
      the table definition.
      Ref.: High Performance MySQL, 2nd Edition - O'Reilly Media
    1. Sim's Avatar
      Sim -
      Also, there are reserved words for column names as well as them being case sensetive.
    1. a_bertrand's Avatar
      a_bertrand -
      Sim: column names are NOT case sensitive... However yes some words are not allowed like "select":
      http://dev.mysql.com/doc/refman/5.5/...ved-words.html
  • Recent Articles