Using a window function in SQL to find the shortest viable data period for multiple items

I have a large table of data, and the important columns for me are the clickdate, productid and clickcost columns. The cost represents how much I’ve spent on marketing the product. I want to look at every product once it’s had a cost of at least $50. (Before then, there’s not enough data for me to make an intelligent decision; for some products this happens within a week, but for other, more niche products it may take a month, or even longer. I don’t want to make decisions faster than once every seven days though, so the data doesn’t get too choppy because we’re not accounting for trending through each week.

A handy way to do this is with a window function in SQL:

SELECT productid, max(clickdate) as min_look_back
FROM
(
select productid, clickdate, sum( clickcost) over (partition by expediaid order by clickdate desc) as clicksum
FROM spend_report 
where clickdate >= '2014-01-14'::date - interval '180 days' -- look back over the past 6 months
) as a
WHERE a.clicksum >= 50
and clickdate <= '2014-01-07' 
GROUP BY productid;

I've done this before, and forgotten how to do it. This is thus advice for Future Me when I next need to remember it.

Please suggest more elegant solutions if you have them. One thing I'd like to improve here would be dynamically setting the date range (which I've not done here because that would be postgres dependent) instead of hardcoding it.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.