Deep SQL Questions Answered: A Complete Guide to Advanced Database Concepts

Published on October 27, 2024

SQL, or Structured Query Language, is a powerful tool for managing and querying databases. However, as you progress in your SQL journey, you might come across concepts that seem intimidating or confusing. This guide addresses some of the most commonly asked yet often misunderstood advanced SQL questions. Whether you're looking to deepen your database knowledge or clarify a few doubts, you'll find insightful answers and practical examples here.

1. WHERE vs HAVING: What's the Difference?

The WHERE clause filters records before grouping, while HAVING filters after grouping. You can use HAVING with aggregate functions like SUM(), AVG(), and others to filter groups, while WHERE is used to filter individual rows.

SELECT department, SUM(salary) FROM employees WHERE status = 'active' GROUP BY department HAVING SUM(salary) > 100000; 

2. What are "Window Functions"?

Window functions, such as ROW_NUMBER(), RANK(), and SUM() OVER(), allow you to perform calculations across a set of rows related to the current row, without collapsing them into a single output. Unlike GROUP BY, which aggregates data, window functions preserve the individual row structure.

SELECT employee_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS rank FROM employees; 

3. What are Common Table Expressions (CTEs)?

Common Table Expressions (CTEs) are temporary result sets defined within a query using the WITH clause. Unlike subqueries, CTEs improve readability, can be recursive, and can be referenced multiple times.

WITH DepartmentSalary AS ( SELECT department, SUM(salary) AS total_salary FROM employees GROUP BY department ) SELECT department FROM DepartmentSalary WHERE total_salary > 100000; 

4. What is a "Deadlock" in SQL?

A deadlock occurs when two transactions are waiting on each other to release resources. To avoid deadlocks, reduce lock duration, access resources in a consistent order, and implement retry logic when necessary.

Tips to avoid deadlocks:

  • Keep transactions short and efficient.
  • Always access tables in the same order.
  • Implement retry logic.

5. What Are "Partitioned Tables"?

Partitioned tables divide large tables into smaller segments, improving query performance by limiting the number of rows scanned, especially for large datasets.

CREATE TABLE sales ( sales_id INT, sale_date DATE, amount DECIMAL ) PARTITION BY RANGE (sale_date); 

6. What Are "Materialized Views"?

Materialized views store the results of a query physically. They can be faster to read but may become outdated.

CREATE MATERIALIZED VIEW mv_sales AS SELECT department, SUM(amount) AS total_sales FROM sales GROUP BY department; 

7. What Are "Covering Indexes"?

Covering indexes include all the columns required for a query, allowing the database to return results directly from the index.

CREATE INDEX idx_sales ON sales(department, amount); 

8. What Is a Subquery in SQL?

A subquery is embedded within another query to filter results or perform calculations. A join combines rows from two or more tables based on related columns.

Example of a Subquery:

SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees); 

Example of a Join:

SELECT e.name, d.department_name FROM employees e JOIN departments d ON e.department_id = d.department_id; 

9. What is the Purpose of the GROUP BY Clause?

The GROUP BY clause groups rows sharing the same values in specified columns into summary rows, often paired with aggregate functions.

SELECT department_id, COUNT(*) AS employee_count FROM employees GROUP BY department_id; 

10. What Is the Difference Between TRUNCATE and DELETE?

Both TRUNCATE and DELETE remove records, but TRUNCATE is more efficient and does not log each row's removal.

  • DELETE can be rolled back and supports WHERE clauses.
  • TRUNCATE is faster but often irreversible.

11. What Is a Foreign Key?

A foreign key is a column in one table referencing the primary key of another table, ensuring referential integrity.

CREATE TABLE orders ( order_id INT PRIMARY KEY, customer_id INT, FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ); 

12. What Is a Self Join?

A self join connects a table with itself, typically used for hierarchical structures.

SELECT e1.employee_id, e1.name, e2.name AS manager_name FROM employees e1 JOIN employees e2 ON e1.manager_id = e2.employee_id; 

13. What Are Index Types?

Indexes improve retrieval speed. Key types include B-tree, hash, and composite indexes.

CREATE INDEX idx_name ON employees(name); 

14. What Is Normalization?

Normalization organizes data to reduce redundancy and dependency, ensuring database efficiency, scalability, and maintainability.

15. What Is Denormalization?

Denormalization adds redundancy to improve read performance, especially in data warehouses but complicates data updates.

16. How Does SQL Server’s ROW_NUMBER() Function Differ from RANK()?

  • ROW_NUMBER() assigns a unique number to each row.
  • RANK() assigns ranks but skips on ties.
  • DENSE_RANK() does not skip any numbers.
SELECT employee_id, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num, RANK() OVER (ORDER BY salary DESC) AS rank, DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank FROM employees; 

17. How Do You Optimize SQL Queries?

To optimize SQL queries:

  • Use indexes for faster lookups.
  • Structure queries efficiently—avoid excessive subqueries.
  • Use the EXPLAIN keyword to analyze execution plans.
  • Limit returned data—avoid SELECT *.

Conclusion

SQL is a powerful language, and mastering advanced topics makes you an effective database administrator or developer. Understanding concepts like window functions, partitioned tables, and materialized views will enhance your query writing and database design skills. Explore these topics and apply them in your projects to gain confidence with advanced SQL concepts.