Category Archives: MySQL

Index Hinting

If you do not specify any indexes for a table, MySQL needs to search every row in the table in order to return the results. This is known a full table scan. It is extremely undesirable for tables with a large row count such as products, users, or session information as it can take a long time to pull the data you need.

Indexes/Indices

I’m still learning a bit about how indexes work, but my high level understanding of them is that they create “sub-tables” (that’s my own way to think about them). In these sub-tables it has a column for the column being indexed that is ordered low to high and then a second column that contains a reference the main table rows. For instance:
People

ID First Last Age
1 Levi Jackson 25
2 Lindsay Webster 26
3 Ellie Webster 1
CREATE INDEX person_age ON People (Age);

The index would look like this in my head:

Age ID
1 3
25 1
26 2

 

So by creating indexes you make it easier to find results by age because it can do something like a binary search to find the row quicker. Normally the query optimizer will pick out the best index to use, but sometimes you may find yourself needing to pick the index you want the query to use, or making sure it DOES NOT use a certain index.

Use Index

Tell MySQL to use a particular index. The optimizer can still choose to do a full table scan if it deems that to be quicker than using the index.

SELECT * FROM people USE INDEX(person_age) WHERE ID > 3 AND age > 20;

This will end up doing a full table scan instead of using the person_age index as it is quicker.

Force Index

Tell MySQL to use a particular index, but to only fall back on doing a full table scan if it absolutely cannot use the index to return results.

SELECT * FROM people FORCE INDEX(person_age) WHERE ID > 3 AND age > 20;

Without the FORCE INDEX, it would use the primary key to find the results.

Ignore Index

Tell MySQL not to use a particular index.

SELECT * FROM people IGNORE INDEX(person_age) WHERE ID > 3 AND age > 20;

This will use just the primary key to find the results. Without the IGNORE INDEX it would have used the person_age index to narrow the result set down a bit.

 

Something I found to be interesting is that when comparing USE INDEX and FORCE INDEX, that the plain ol’ queries would have used the primary key by default, but that by using USE INDEX it was casting aside the possibility of using the primary key and instead was choosing between the key you specified and a full table scan. So using a USE INDEX may have quite the negative repercussions should the query optimizer find it is not an efficient key to use.

MySQL Joins Rehashing

I was reading this post from Jeff Atwood about visually showing the different types of SQL joins. It got me thinking about the different MySQL joins and I wanted to learn more about them. So this post is my rehashing of the different type of joins. Enjoy!

I’ll be using tables like the following for my explanations:

People

ID Name
1 Levi
2 Lindsay
3 Ellie

Places

ID Location
1 Rhode Island
2 Vermont
3 Maine
4 New Hampshire

Inner Join

Most people have heard of an inner join before. In fact, i’d go as far as saying it’s the most commonly used join. This joins together all the rows specified into one big result set.

// This will return all of the columns from both tables
SELECT * FROM tableA INNER JOIN tableB ON tableA.ID = tableB.productID;

// This will return just the 3 fields specified
SELECT a.ID, a.Name, b.ProductID FROM tableA AS a INNER JOIN tableB AS b ON a.ID = b.productID;

Cross Join

A cross join will take two tables and mesh them together in every possible combination.
The output of this query would be:

SELECT * FROM People CROSS JOIN Places;
ID Name ID Location
1 Levi 1 Rhode Island
1 Lindsay 1 Rhode Island
1 Ellie 1 Rhode Island
1 Levi 2 Vermont
1 Lindsay 2 Vermont
1 Ellie 2 Vermont
1 Levi 3 Maine
1 Lindsay 3 Maine
1 Ellie 3 Maine
1 Levi 4 New Hampshire
1 Lindsay 4 New Hampshire
1 Ellie 4 New Hampshire

On the surface the Cross Join doesn’t seem very useful. I had to Google around quite a bit to find use cases for it. As best I can tell it is useful when you have a standard set of values in table 2 that pertain to all of the values from table 1, such as color options or times.

Straight Join

Normally the query optimizer will figure out the quickest path to join tables together. This may end up being that it parses the last table joined and working back to the first table. Straight joins are just a way to force MySQL to read the tables left to right.

SELECT People.Name, Places.Location FROM People STRAIGHT JOIN Places ON People.ID = Places.ID;

Left Join (or Left Outer Join)

Returns all of the rows from the left table along with the matching rows from the right table. If there is no match it will return null for the column.

SELECT People.Name, Places.Location FROM People LEFT JOIN Places ON People.ID = Places.ID;
Name Location
Levi Rhode Island
Lindsay Vermont
Ellie Maine

Right Join (or Right Outer Join)

The opposite of the Left Join, it returns all of the rows from the right table along with the matching rows from the left table.

SELECT People.Name, Places.Location FROM People RIGHT JOIN Places ON People.ID = Places.ID;
Name Location
Levi Rhode Island
Lindsay Vermont
Ellie Maine
null New Hampshire

Natural Join

A natural join is when MySQL looks for columns with the same name to join on. For instance:

States

ID State Abbreviation
1 Rhode Island RI
2 Vermont VT
3 Maine ME
4 New Hampshire NH

Addresses

ID Town
State
1 Warwick Rhode Island
2 Burlington Vermont
3 Berwick Maine
4 Laconia New Hampshire
SELECT * FROM Addresses NATURAL JOIN States;
ID State
Town
Abbreviation
1 Rhode Island Warwick RI
2 Vermont Burlington VT
3 Maine Berwick ME
4 New Hampshire Laconia NH

Natural Left/Right Join

Same as a normal left/right join, only it used the magic of the natural join to join on columns with the same name.

Thoughts

The most useful in my opinion seem to be inner, left, and right joining. Inner gives you combined results excluding null values, and left and right allow you to search for rows that are null in certain spots easily.

Natural joining is a bit too open for my liking. I prefer to explicitly list the columns to join on.

While I understand cross joining, I don’t think it will be useful in my day to day work at this point. It seems like it has a pretty specific use, and that use isn’t very often.