Week #2: 100 Days of SQL Optimisation
How Well-Designed Query Structures Can Boost SQL Query Efficiency
Optimising SQL is like constructing a building. Without a solid blueprint, you might end up with inefficiencies or wasted resources. By carefully planning each join, filter, and aggregation, you ensure that your queries are built on a strong foundation, leading to a polished and efficient final result.
The examples in this post come from IMDb's Non-Commercial Datasets1, which provide a comprehensive look at movies, ratings, and more. This rich data allows us to explore real-world optimisations that bring tangible performance gains.
In this post, we dive into week two of our 100-day SQL optimisation journey.
Day 8: Choosing the right Join Type
Goal: Retrieve a list of movie titles and their corresponding average rating.
Details:
Day 9: Single-column vs. Multi-column Indexes
Goal: Retrieve a list of movies released after 2010, sorted by startYear
and runtimeMinutes
.
Details:
Day 10: Distinct vs. Window Functions
Goal: Retrieve the highest-rated movie for each genre.
Details:
Day 11: Index-Only Scans
Goal: Retrieve the primary title and start year of all movies released after 2000.
Details:
Day 12: Changing the Order of Joins
Goal: Retrieve a list of movie titles, their average ratings, and the names of directors for movies released after 2015.
Details:
Day 13: Proper Joins vs. Cartesian Product
Goal: Retrieve a list of movie titles and the names of directors for movies released after 2015.
Details:
Day 14: Temporary Tables
Goal: Retrieve a list of movie titles with high ratings (>8) that were released after 2015.
Details:
Conclusion
In this second week of our "100 Days of SQL Optimisation" journey, we explored several powerful techniques to improve query performance:
Switching LEFT JOINs to INNER JOINs to reduce unnecessary data processing.
Applying filters early to minimise rows involved in joins.
Creating multi-column indexes for efficient sorting and scanning.
Using window functions like ROW_NUMBER() to replace inefficient DISTINCT operations.
Leveraging Index-Only Scans to reduce I/O operations.
Reordering joins to avoid nested loops and reduce lookups.
Using temporary tables to avoid redundant computations and enhance data reuse.
These optimisations significantly reduced execution times, making our queries faster and more efficient.
If you're interested in learning more practical SQL optimisations, follow along with our 100-day journey. Subscribe to stay updated, each day, you'll gain new insights into making your SQL queries more powerful and efficient!
If you missed the Week 1 recap, you can catch up by checking it out here:
Join The Conversation!
We hope you found this post helpful. If you have any questions, comments, or experiences you'd like to share, we'd love to hear from you!
Data Source: https://developer.imdb.com/non-commercial-datasets/