Week #2: 100 Days of SQL Optimisation

How Well-Designed Query Structures Can Boost SQL Query Efficiency

Nov 14, 2024

Optimising SQL is like constructing a building. Without a solid blueprint, you might end up with inefficiencies or wasted resources. By carefully planning each join, filter, and aggregation, you ensure that your queries are built on a strong foundation, leading to a polished and efficient final result.

The examples in this post come from IMDb's Non-Commercial Datasets1, which provide a comprehensive look at movies, ratings, and more. This rich data allows us to explore real-world optimisations that bring tangible performance gains.

In this post, we dive into week two of our 100-day SQL optimisation journey.

Day 8: Choosing the right Join Type

Goal: Retrieve a list of movie titles and their corresponding average rating.

Details:

Day 9: Single-column vs. Multi-column Indexes

Goal: Retrieve a list of movies released after 2010, sorted by startYear and runtimeMinutes.

Details:

Day 10: Distinct vs. Window Functions

Goal: Retrieve the highest-rated movie for each genre.

Details:

Day 11: Index-Only Scans

Goal: Retrieve the primary title and start year of all movies released after 2000.

Details:

Day 12: Changing the Order of Joins

Goal: Retrieve a list of movie titles, their average ratings, and the names of directors for movies released after 2015.

Details:

Day 13: Proper Joins vs. Cartesian Product

Goal: Retrieve a list of movie titles and the names of directors for movies released after 2015.

Details:

Day 14: Temporary Tables

Goal: Retrieve a list of movie titles with high ratings (>8) that were released after 2015.

Details:

Conclusion

In this second week of our "100 Days of SQL Optimisation" journey, we explored several powerful techniques to improve query performance:

Switching LEFT JOINs to INNER JOINs to reduce unnecessary data processing.
Applying filters early to minimise rows involved in joins.
Creating multi-column indexes for efficient sorting and scanning.
Using window functions like ROW_NUMBER() to replace inefficient DISTINCT operations.
Leveraging Index-Only Scans to reduce I/O operations.
Reordering joins to avoid nested loops and reduce lookups.
Using temporary tables to avoid redundant computations and enhance data reuse.

These optimisations significantly reduced execution times, making our queries faster and more efficient.

If you're interested in learning more practical SQL optimisations, follow along with our 100-day journey. Subscribe to stay updated, each day, you'll gain new insights into making your SQL queries more powerful and efficient!

If you missed the Week 1 recap, you can catch up by checking it out here:

Week #1: 100 Days of SQL Optimisation

Pipeline to Insights

November 7, 2024

Think of each SQL query like a high-stakes puzzle. At first, you might try every piece, hoping something fits. But with the right strategies, indexing, filtering, and window functions, you solve it faster, with everything snapping perfectly into place.

Read full story

Join The Conversation!

We hope you found this post helpful. If you have any questions, comments, or experiences you'd like to share, we'd love to hear from you!

Pipeline To Insights

Week #1: 100 Days of SQL Optimisation

Discussion about this post