SQL Performance Explained: Everything Developers Need to Know about SQL Performance

Author: Markus Winand

Comments

by boshomi   2019-07-12
@MarkusWinand: Thank you for the websites [1], [2], [3] and the book »SQL Performance Explained« [4].

[1] https://www.amazon.de/SQL-Performance-Explained-Everything-p...

by adamnemecek   2018-01-07
You should try to understand how databases work in general; it will help you with your query writing.

One thing you have to realize is that once you get a little more advanced, you have to get into the details of the individual SQL implementations. At that point it's not about SQL but about Postgres.

I've found these books really valuable:

# SQL Performance Explained: Everything Developers Need to Know about SQL Performance

https://www.amazon.com/Performance-Explained-Everything-Deve...

This book is fundamentally about how to use SQL indexes effectively. It covers all the important implementations (Postgres, MySQL, Oracle, SQL Server).
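To give a rough feel for what "using an index" means, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is not one of the databases the book covers, and the `employees` table, its columns, and the index name are made up for illustration. It asks the database for its query plan before and after creating an index on the filtered column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the textual description.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table, invented for this example.
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, last_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (last_name, salary) VALUES (?, ?)",
    [(f"name{i}", i % 100) for i in range(1000)],
)

sql = "SELECT id FROM employees WHERE last_name = ?"

# Without an index the only option is to read every row.
plan_before = query_plan(conn, sql, ("name42",))

# With an index on the filtered column the database can look the value up.
conn.execute("CREATE INDEX idx_employees_last_name ON employees (last_name)")
plan_after = query_plan(conn, sql, ("name42",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, but the first plan should report a full table scan and the second a search that uses `idx_employees_last_name`. Postgres, MySQL, Oracle and SQL Server each have their own equivalent of this command and their own plan format.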

# Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

https://www.amazon.com/Designing-Data-Intensive-Applications...

This book gets mentioned a bunch around here, and for good reason. There aren't too many concrete resources on making your systems "webscale", and this one is really good.

# PostgreSQL 9.0 High Performance

https://www.amazon.com/PostgreSQL-High-Performance-Gregory-S...

Discusses all the different settings and tweaks available in Postgres. It's crazy how much of a perf gain you can get just by twiddling the parameters of the database, i.e. all the tricks you can do when single instances are the bottleneck.

There's a similar book for MySQL: https://www.amazon.com/High-Performance-MySQL-Optimization-R...

# PostgreSQL 9 High Availability Cookbook

https://www.amazon.com/PostgreSQL-9-High-Availability-Cookbo...

Discusses how you go from one Postgres instance to more than one. Talks about replication, monitoring, cluster management, avoiding downtime, etc., i.e. all the tricks you can do to manage multiple instances. Again, there's a similar book for MySQL: https://www.amazon.com/MySQL-High-Availability-Building-Cent...

Last but not least, check out the Postgres documentation; people consider it the standard for what good documentation looks like: https://www.postgresql.org/docs/9.6/static/index.html

Also last but not least, read up on relational algebra (the foundation of SQL) https://en.wikipedia.org/wiki/Relational_algebra. I've always found SQL to be extremely verbose (the syntax reminds me of, I don't know, COBOL or something), but there's another query language called Datalog that is, for our purposes, similar to SQL but with a much more legible syntax.

E.g. check out these snippets from these slides (page 29), and check out the whole class too:

https://pages.iai.uni-bonn.de/manthey_rainer/IIS_1617/IIS201...

Datalog:

s(X) <- p(X,Y).
s(X) <- r(Y,X).
t(X,Y,Z) <- p(X,Y), r(Y,Z).
w(X) <- s(X), not q(X).

SQL:

CREATE VIEW s AS (SELECT a FROM p)
UNION
(SELECT b FROM r);

CREATE VIEW t AS
SELECT a, b, c
FROM p, r
WHERE p.b = r.a;

CREATE VIEW w AS (TABLE s)
MINUS (TABLE q);
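
If you want to play with these, here is a rough sketch that runs the three views in SQLite via Python's built-in sqlite3 module. It is an adaptation, not the slides' code. I'm assuming p and r each have columns a and b and q has a single column a, and the sample rows are invented. SQLite has neither `TABLE s` nor `MINUS`, so the sketch uses `SELECT ... FROM` and `EXCEPT` instead, and it qualifies the columns in t so the names are unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed schemas and made-up sample data.
    CREATE TABLE p (a, b);
    CREATE TABLE r (a, b);
    CREATE TABLE q (a);
    INSERT INTO p VALUES (1, 2), (3, 4);
    INSERT INTO r VALUES (2, 5), (4, 6);
    INSERT INTO q VALUES (1), (5);

    -- s(X) <- p(X,Y).   s(X) <- r(Y,X).
    CREATE VIEW s AS SELECT a FROM p UNION SELECT b FROM r;

    -- t(X,Y,Z) <- p(X,Y), r(Y,Z).
    CREATE VIEW t AS
    SELECT p.a AS a, p.b AS b, r.b AS c
    FROM p, r
    WHERE p.b = r.a;

    -- w(X) <- s(X), not q(X).   (EXCEPT stands in for MINUS here.)
    CREATE VIEW w AS SELECT a FROM s EXCEPT SELECT a FROM q;
    """
)

s_rows = sorted(conn.execute("SELECT a FROM s").fetchall())
t_rows = sorted(conn.execute("SELECT a, b, c FROM t").fetchall())
w_rows = sorted(conn.execute("SELECT a FROM w").fetchall())

print("s:", s_rows)  # [(1,), (3,), (5,), (6,)]
print("t:", t_rows)  # [(1, 2, 5), (3, 4, 6)]
print("w:", w_rows)  # [(3,), (6,)]
```

Each Datalog rule maps to one relational operation: the two rules for s form a union, the shared variable Y in t is the join condition, and the negated atom in w becomes a set difference.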