SQL Query to Get the List of Users by Email
In this article, we will explore a complex SQL query that fetches the list of friends’ email addresses based on a provided input email. We will start with understanding the sample data and then move on to explaining the given solution, its shortcomings, and how to improve it.
Understanding the Sample Data
We have two tables: users and user_relations. The users table contains user information such as user_id and email. The user_relations table stores the relationship between users, including the requestor_user_id, receiver_user_id, and friend_status.
| users |
| user_id | |
|---|---|
| 1 | [email protected] |
| 2 | [email protected] |
| 3 | [email protected] |
| 4 | [email protected] |
| 5 | [email protected] |
| 6 | [email protected] |
| 7 | [email protected] |
| user_relations |
| user_relation_id | requestor_user_id | receiver_user_id | friend_status |
|---|---|---|---|
| 1 | 2 | 4 | 1 |
| 2 | 2 | 6 | 1 |
| 3 | 2 | 7 | 1 |
| 4 | 5 | 2 | NULL |
| 5 | 5 | 7 | NULL |
| 6 | 7 | 2 | NULL |
| 7 | 7 | 4 | 1 |
| 8 | 7 | 5 | 1 |
| 9 | 7 | 6 | 1 |
| 10 | 4 | 2 | 1 |
| 11 | 4 | 3 | 1 |
| 12 | 4 | 5 | 1 |
| 13 | 4 | 6 | 1 |
| 14 | 4 | 7 | 1 |
We are given an example input email: [email protected]. We need to find the email addresses of users who have a friend status of 1 with the user having this email.
The Given Solution
The provided solution uses two JOINs, which seems inefficient. Let’s break down the query and understand its shortcomings:
SELECT
case when u.user_id = r.receiver_user_id then requestor.email else receiver.email end as friend_email
FROM users u
JOIN user_relations r
ON (r.requestor_user_id = u.user_id OR r.receiver_user_id = u.user_id)
AND r.friend_status = 1
LEFT JOIN users requestor ON requestor.user_id = r.requestor_user_id
LEFT JOIN users receiver ON receiver.user_id = r.receiver_user_id
WHERE u.email in ('<a>[email protected]</a>')
GROUP BY friend_email
HAVING COUNT(DISTINCT u.user_id) > 1
The query has several issues:
- It uses two LEFT JOINs, which can lead to incorrect results if the table is not properly indexed.
- The
WHEREclause filters on individual user emails instead of using a single email as an input parameter. This makes it difficult to reuse the query with different inputs.
Improving the Solution
As suggested in the provided solution, we don’t care whether friends are receivers or requestors. Therefore, we can simplify the query by removing the two JOINs and focusing on finding users who have a friend status of 1:
SELECT DISTINCT
u2.email as friend_email
FROM users u
INNER JOIN user_relations r
ON (r.requestor_user_id = u.user_id OR r.receiver_user_id = u.user_id)
INNER JOIN users u2
ON (r.requestor_user_id = u2.user_id OR r.receiver_user_id = u2.user_id)
AND u2.user_id <> u.user_id
WHERE u.email = '<a>[email protected]</a>'
AND r.friend_status = 1
However, we can further improve this query by removing the inner JOIN with u and only joining with r. This reduces the number of rows being processed:
SELECT DISTINCT
u2.email as friend_email
FROM user_relations r
INNER JOIN users u2
ON (r.requestor_user_id = u2.user_id OR r.receiver_user_id = u2.user_id)
AND u2.user_id <> r.requestor_user_id
WHERE r.friend_status = 1
AND u2.user_id IN (
SELECT user_id
FROM users
WHERE email = '<a>[email protected]</a>'
)
This query is more efficient and easier to understand.
Conclusion
In this article, we explored a complex SQL query that fetches the list of friends’ email addresses based on a provided input email. We analyzed the given solution, its shortcomings, and how to improve it. By removing unnecessary JOINs, using IN instead of filtering individual user emails, and reducing the number of rows being processed, we can create more efficient and easier-to-understand SQL queries.
Last modified on 2023-08-22