Question

Which item was the most popular for each customer?

Solution

WITH number_of_purchases_per_customer_and_product AS (
    SELECT customer_id, product_id, COUNT(*) num_of_purchases
    FROM sales
    GROUP BY customer_id, product_id
),
purchases_ranked_by_customer AS (
    SELECT 
        customer_id, 
        product_id, 
        num_of_purchases,
        DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY num_of_purchases DESC) AS rank
    FROM number_of_purchases_per_customer_and_product
)
SELECT customer_id, menu.product_id, menu.product_name, num_of_purchases
FROM purchases_ranked_by_customer
INNER JOIN menu ON menu.product_id = purchases_ranked_by_customer.product_id
WHERE rank = 1;
customer_idproduct_idproduct_namenum_of_purchases
C3ramen3
B3ramen2
A3ramen3

Walkthrough

Similar to Question 3, this looks like another “Top N per group” instance with N=1 where rows in a group share the customer_id, and we wish to rank the rows by the total number of purchases per product.

Let’s start by building an intermediate table that stores the number of purchases for every pair of (customer_id, product_id).

SELECT customer_id, product_id, COUNT(*) num_of_purchases
FROM sales
GROUP BY customer_id, product_id;
customer_idproduct_idnum_of_purchases
B32
A33
A11
C33
B12
B22
A22

Next, we’d like to form groups of these rows based on the shared value of customer_id and rank the rows in the groups by the descending order of num_of_purchases.

WITH number_of_purchases_per_customer_and_product AS (
    SELECT customer_id, product_id, COUNT(*) num_of_purchases
    FROM sales
    GROUP BY customer_id, product_id
)
SELECT 
    customer_id, 
    product_id, 
    num_of_purchases,
    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY num_of_purchases DESC) AS rank
FROM number_of_purchases_per_customer_and_product;
customer_idproduct_idnum_of_purchasesrank
A331
A222
A113
B321
B122
B223
C331

Finally, the rows of interest are those where rank = 1, so let’s filter it out. Seems like we’ll have to chain the CTE.

WITH number_of_purchases_per_customer_and_product AS (
    SELECT customer_id, product_id, COUNT(*) num_of_purchases
    FROM sales
    GROUP BY customer_id, product_id
),
purchases_ranked_by_customer AS (
    SELECT 
        customer_id, 
        product_id, 
        num_of_purchases,
        DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY num_of_purchases DESC) AS rank
    FROM number_of_purchases_per_customer_and_product
)
SELECT customer_id, menu.product_id, menu.product_name, num_of_purchases
FROM purchases_ranked_by_customer
INNER JOIN menu ON menu.product_id = purchases_ranked_by_customer.product_id
WHERE rank = 1;
customer_idproduct_idproduct_namenum_of_purchases
C3ramen3
B3ramen2
A3ramen3