SQL Formatting

4 minute read

With all the abstraction layers between code & database, the need for writing SQL statements deminishes. But over the years I've developed my own way of formatting them. And before I'll have to wave queries goodbye forever ; ) I just want to put it out there.

Code spends more time being read then being written. I think naturally this is true for queries as well. So it might help if we teach ourselves some guidelines as how to nicely format them.

We can use a couple of instruments inside SQL to format our queries.

  • SQL whitespace is ignored by MySQL's parser, so let's use it to make our lives easier, right?
  • Dummy conditions have no effect in SQL statements. Things like: WHERE 1.

Ok let's just begin with some of my SQL layout habits.

All Clauses Get Their Newline

This will provide a clear seperation of different parts of the query, making it more easy to read & comprehend. And it will enable you to better maintain the query cause you can jump between different clauses bij just pressing up & down.

Before

SELECT * FROM `books` ORDER BY `title`

After

SELECT *
FROM `books`
ORDER BY `title`

N.b.: Whitespace is ignored by MySQL's parser, but not by MySQL's query cache on versions <5 , thanks to 'foobar' for pointing that out. So if you want query cache to pay off.. Pick one format & stick with it ; )

Accompanied Fieldnames Get Their Own Newline

This will allow you to very easily (or even dynamically) add or remove certain fields from the query.

Before

SELECT `id`, `title`, `rating`
FROM `books`
ORDER BY `title`

After

SELECT
  `id`,
  `title`,
  `rating`
FROM `books`
ORDER BY `title`

Lonely Fieldnames Stay on the Same Line

This will keep the query more or less compact and avoid that even the simplest query takes up 10 lines.

Before

SELECT
  `id`,
  `title`
FROM
  `books`
ORDER BY
  `title`

After

SELECT
  `id`,
  `title`
FROM `books`
ORDER BY `title`

Where 1

So now that we've played with whitespace a bit, there are other things that have no effect in SQL as well. Like 1. This short expression can really help us define conditions in a more uniform way.

Let's look at an example.

Before

SELECT `id`
FROM `books`
WHERE
  `published` = 'yes'
  AND `rating` > 5`

Hm, too bad, published = 'yes' is formatted differently from rating > 5 because it doens't have the AND word. We'd have to account for that every time we change the conditions, or if we were generating this query automatically: 'Always prefix with AND… UNLESS it's the first condition'. Bad for layout. Bad for automation.

After

SELECT `id`
FROM `books`
WHERE 1
  AND `published` = 'yes'
  AND `rating` > 5`

One other big advantage of writing your conditions in such a uniform format, is that it becomes really easy to temporarily turn off conditions with MySQL Comments:

SELECT `id`
FROM `books`
WHERE 1
  -- AND `published` = 'yes'
  AND `rating` > 5`

..without breaking SQL syntax. Cause remember that if I had done this with the before query, I would have gotten a syntax error.

The Performance Hit of 1

While this dummy syntax provides developers with some great comfort, of course we have to make sure this addition doesn't come at a price. MySQL guru Erwin Bleeker benchmarked on multiple occasions with 1 bilion queries (no cache), to find that the results only differed by one hundreds of a second on average.

This is how he benchmarked:

mysql> select benchmark(1000000000, (select SQL_NO_CACHE 1 from employees WHERE 1 limit 1));
+-----------------------------------------------------------------------------------+
| benchmark(1000000000, (select SQL_NO_CACHE 1 from medewerkers WHERE 1 limit 1)) |
+-----------------------------------------------------------------------------------+
| 0                                         |
+-----------------------------------------------------------------------------------+


1 row in set (24.84 sec)


mysql> select benchmark(1000000000, (select SQL_NO_CACHE 1 from employees limit 1));
+-------------------------------------------------------------------------+
| benchmark(1000000000, (select SQL_NO_CACHE 1 from medewerkers limit 1)) |
+-------------------------------------------------------------------------+
| 0                                     |
+-------------------------------------------------------------------------+


1 row in set (24.83 sec)

Where 0

As you may have guessed, our dummy condition also works for OR queries. Just negate the dummy condition: 0, look:

SELECT `id`
FROM `books`
WHERE 0
  OR `published` = 'yes'
  OR `rating` > 5`

Joins

Let me just show you how I go about this:

SELECT
  `authors`.`id`,
  `authors`.`name`,
  `authors`.`birthday`,
  COUNT(`books`.`id`) AS book_pub_cnt
FROM `authors`
LEFT JOIN `books` ON (1
  AND `books`.`author_id` = `authors`.`id`
  AND `books`.`published` = 'yes'
)
WHERE 1
  AND `authors`.`alive` = 'yes'
GROUP BY `authors`.`id`
  ORDER BY `authors`.`name`

Furthermore

You may have noticed that I:

Use Backticks to Enclose All Database Entities

That's just good habit, this way if you ever have an ambiguous fieldname ('active', or 'status', could be interpretted as statements), your database will know that you mean the fieldname or table, and not the statement.

Use Single Quotes to Enclose Strings

So you can enclose the entire query in double quotes, and be able to use variables from PHP without concatenation. On the other hand: You should really use prepared statements, and in my eyes concatination is better than let PHP automatically substitute your vars. But hey, if you need to choose anyway, might as well be single quotes.

Use Spaces Instead of Tabs

I tend to do this in PHP & other languages as well. It allows for consistent layout in all possible editors & views. The tab character sometimes also have unwanted (e.g. autocomplete) impact when you paste into a console. There's an interesting post about it here.

Conventions in General

Sometimes conventions rely more on taste than reason. In that case look at it this way:

It's irrelevant if people drive on the right or left side of the road. As long as they all do the same :)

Now I'm not implying there will be fatal accidents if your fellow programmers use different styles. But it will definitely contribute to the success for your project if you are able to agree on standards.

Leave a Comment Right Here