In this article, I’ll skim over the why and what, and dive into some examples of SQL queries that I’ve found particularly useful as a product manager, with examples to help make it more concrete.
This article is intended to show some basic useful examples. Additional resources are listed at the end.
What is SQL, and why should I care?
SQL stands for Standard Query Language. As a product manager, you may be able to use SQL to directly get answers to your questions from your company’s database (the place where app and customer data is stored). As I mentioned in my article on data tools I use, this is what I use most often in my day-to-day as a product manager; I explained some examples of what I use it for in that article.
I’ve got a few other caveats at the end, but really quickly: different database types have slightly different syntax, so you may get some errors depending on what your company is using! Unfortunately, I’m not a data expert (caveat #2!) so I’m not quite sure where all the distinctions lay. I’ve linked to some resources throughout that should help.
Our example for today
Special shout out to my almost-7-year-old, who really enjoyed helping me make this table!
Table name: fruits
name | color | inventory_count | is_sweet |
strawberry | red | 7 | true |
pineapple | yellow | 10 | true |
apple_r | red | 5 | true |
apple_y | yellow | 3 | |
apple_g | green | 2 | true |
banana | yellow | 15 | true |
blueberry | blue | 50 | false |
raspberry | red | 1 | true |
lemon | yellow | 10 | false |
Part 1 will cover a basic query structure as well as ‘where’ phrases and ‘count.’ Part 2 will include Group By, Having, Order By, and Inner Join.
Basic SQL query structure
SELECT column1, column2 FROM tablename;
For example, to get the first column in the table above, you would write:
SELECT name FROM fruits;
You can also use an asterisk to select all columns. To return the entire ‘fruits’ table, you could do:
SELECT * FROM fruits;
WHERE
This is a critical modifier that you’ll use pretty much every time you make a query.
Think of it like this: “show me these columns from this table where thus-and-such is true.”
You can probably think of all kinds of situations where this would be useful! For example, “show me accounts from account_table where credit cards are active,” or “show me customers where they have not signed in for three months.”
Let’s look for all the fruits that are red. We’re only asking for the ‘name’ column here, so that’s all we get back, but we’re asking the query to show us all names from rows where the color is red. Note the single quotes around the word ‘red.’
SELECT name FROM fruits WHERE color = 'red';
name |
strawberry |
apple_r |
raspberry |
You can also look for numbers that are equal. Here, no quotes are needed. We get back the only fruit with an inventory count of 15. I’ve also included the inventory count in the column list, so that comes back as well:
SELECT name, inventory_count FROM fruits WHERE inventory_count = 15;
name | inventory_count |
banana | 15 |
You can also use other modifiers. Some popular examples below; explanations use the base:
SELECT name FROM fruits WHERE [....]
;
Indicator | Meaning | Explanation |
!= | Not equals | color !='red' would have returned all the fruits EXCEPT the three that are red |
> | Greater than | inventory_count > 49 would return ‘blueberry’, as it has an inventory count of 50 |
>= | Greater than or equal to | inventory_count >=50 would also return ‘blueberry’! |
< | Less than | inventory_count<2 would return ‘raspberry’, as we sadly only have 1 raspberry |
<= | Less than or equal to | inventory_count<=2 would give me ‘raspberry’ and ‘apple_g’ (we have 1 raspberry and 2 green apples) |
IS FALSE | Where something is false | is_sweet IS FALSE would return ‘blueberry’ and ‘lemon’ . Only valid where the data is either true or false |
IS TRUE | Where something is true | is_sweet IS TRUE would return everything OTHER than ‘blueberry’ and ‘lemon’ |
IS NULL | ‘Null’ simply means empty, or blank | is_sweet IS NULL would return ‘apple_y’ since there’s no value for is_sweet for that row |
IS NOT NULL | Not empty or blank | is_sweet IS NOT NULL would return everything OTHER than ‘apple_y’ |
You can use AND and OR to string together multiple conditions, like this:
SELECT name FROM fruits WHERE color = 'red' AND inv_count>=5;
name |
strawberry |
apple_r |
LIKE
Another really useful tool is LIKE.
You can use this when you’re not sure exactly what you’re looking for, or are looking for only part of a word. The % here acts as a ‘wildcard’; in this case, ‘show me all words that end in berry, no matter what comes before that.’
SELECT name FROM fruits WHERE name LIKE '%berry';
name |
strawberry |
blueberry |
raspberry |
Learn more about wildcards, including how they differ between MS Access and a SQL server.
What if my table had just ‘berry’ in it, would that show? Yep, since % can include 0 characters.
Here’s something trickier: what if my table had ‘Berry’? That would NOT show, since SQL is case sensitive when it comes to searching text, meaning capitalization matters.
Tip: use “lower” (or “upper”) to get rid of capitalization issues.
If I had mixed cases in my table, I would have done WHERE LOWER(name) LIKE '%berry'
(or WHERE UPPER(name) LIKE '%BERRY'
). That would have made sure all the names are the same case when being compared to my phrase!
COUNT
This might be the most useful tool in your toolbox! Count the number of rows given a certain situation.
Using what we’ve learned so far, we could do:
SELECT COUNT(name) FROM fruits WHERE name like '%berry';
This would return 3, since there are three rows with names ending in ‘berry.’
You can do count (*), but you risk counting blank rows as well.
Tip: You can use DISTINCT to make sure you’re not getting any duplicates.
Let’s say I had “strawberry” in two rows in my table. The query above would then return 4. DISTINCT helps us avoid that.
SELECT COUNT(DISTINCT name) FROM fruits WHERE name LIKE '%berry';
Even in the case where I have strawberry twice, this would return 3.
Learn more about count basics.
COUNT CASE WHEN
I’m usually exploring data, so I often have a lot of situations that I want to get counts for. How many customers did x? What about y? How many did x and y? etc.
I could do a bunch of queries like the one above, each with a different ‘where’ clause, but that’s pretty cumbersome.
This is where “case when” comes in handy. Think of it as ‘count the cases when this is true’ within the table and parameters of your ‘where’ clause.
SELECT COUNT (CASE WHEN name LIKE '%berry' THEN name END) FROM fruits;
Just as above, this would return 3.
And if we’re worried about duplicates, we could do:
COUNT (DISTINCT CASE WHEN name LIKE '%berry' THEN name END);
That’s also why I like to do something like “THEN name” – you can do “THEN 1,” and that’s what I see most commonly, but then you can’t use DISTINCT.
Where this gets useful is when you want to do multiple counts. Note that just as you put commas between columns in your ‘select’ phrase, you do the same between count phrases until your last count.
SELECT
COUNT (DISTINCT CASE WHEN is_sweet IS FALSE THEN name END),
COUNT (DISTINCT CASE WHEN color='red' THEN name END),
COUNT (DISTINCT CASE WHEN color='yellow' AND is_sweet IS
FALSE THEN name END)
from fruits;
Count | Count1 | Count2 |
2 | 3 | 1 |
Now, ‘count’ isn’t a very helpful label! You can use AS to label your various queries. Note those phrases have double quotes around them.
SELECT
COUNT (DISTINCT CASE WHEN is_sweet IS FALSE THEN name END) AS "# not sweet",
COUNT (DISTINCT CASE WHEN color='red' THEN name END) AS "# of red fruits" ,
COUNT (DISTINCT CASE WHEN color='yellow' AND is_sweet IS FALSE THEN name END) AS "# yellow not sweet"
FROM fruits;
# not sweet | # of red fruits | # yellow not sweet |
2 | 3 | 1 |
Caveats
- As I mentioned, I’m no data expert. I learned a bit online, but mostly from looking at the queries of others at my company. There may be mistakes in this article.
- Details specific to your company and database can trip you up. For example, we have a ton of ‘test’ accounts that we all use when poking around in our software. If I forget to exclude those in my WHERE clause, that messes up my numbers considerably.
- Ask for help from others at your company! Start by asking to see their queries. Find someone knowledgable of whom you can ask questions. We have a Slack channel where I can ask questions – mostly SQL-related – of the product’s lead developers. Even if you become a SQL guru, you may run into situations where the data table has info such as statuses that are only in numbers, and you need a developer to tell you what those numbers map to (i.e. 1 is ‘in progress,’ 2 is ‘complete,’ etc.).
Other SQL for Product Manager articles:
- https://www.departmentofproduct.com/blog/sql-skills-for-product-managers/ (has a lot of background/context before getting into syntax)
- https://medium.com/pm101/basic-sql-skills-for-product-managers-eff2a9980c14
- https://medium.com/@joengreitz/sql-for-pms-a-crash-course-3d5382802d54
- https://www.xupler.com/sql-for-product-managers/
- W3schools is also a good resource to understand the various options, and has ‘try it yourself’ options as well as exercises to make sure you understood the concepts.
- TeamTreeHouse offers the basics; I like their ‘playground’ feature where you can play with SQL as you watch the videos.
What do you think, did this all make sense to you? Are there mistakes? Do you have other resources you’d recommend? Let me know your thoughts!
Part 2 is now up! It includes Group By, Having, Order By, and Inner Join.