Using a window function in SQL to find the shortest viable data period for multiple items

I have a large table of data, and the important columns for me are the clickdate, productid and clickcost columns. The cost represents how much I’ve spent on marketing the product. I want to look at every product once it’s had a cost of at least $50. (Before then, there’s not enough data for me to make an intelligent decision; for some products this happens within a week, but for other, more niche products it may take a month, or even longer. I don’t want to make decisions faster than once every seven days though, so the data doesn’t get too choppy because we’re not accounting for trending through each week.

A handy way to do this is with a window function in SQL: Continue reading “Using a window function in SQL to find the shortest viable data period for multiple items”

A way to avoid some carelessness with postgres

Part of my day job involves loading data from lots of different files into a database, and one way this can go horribly wrong is if I attempt to load the same file twice. Either the database will get mad at me because I’ll be trying to violate referential integrity, or my end users will get mad at me when I manage to fill the database with duplicate data, or I’ll get mad at me for having to waste several hours sorting out a self made mess. Here’s my homebrewed solution for this, to allow me to take complicated code and entrust it to other people (or to myself, if it’s a Friday afternoon and I’m stressed and distracted with a hundred other things). If you’re not interested in databases, avert your eyes now.
Continue reading “A way to avoid some carelessness with postgres”