How to slice, dice and visualize data with sql queries

In this post we are going to talk about how to use the query tool to query and visualize data.
If you aren't familiar with the query tool, you may want to read the first post about it here https://www.graphjson.com/guides/queries
Today we are going to construct a query where we can visualize a cohort of users and how they retain over time. Let's imagine we have a social network and we log two types of events
The first event is a register event. The JSON might something like this
{
user_id: "123",
event: "register"
}
The second event is a news feed view event. The JSON might look something like this
{
user_id: "123",
event: "feed_view"
}
Now the query we want to construct is one that will show for all users that registered in some month X. How many came back at X+1 month, X+2 month etc.
The first query we will construct is getting a unique list of user ids that signed up in a particular month. The query we will use is
SELECT
user_id
FROM
(
SELECT
toUnixTimestamp(
toDateTime(toStartOfMonth(toDateTime(event_timestamp, 'America/Los_Angeles')))
) AS month,
JSONExtractString(json, 'user_id') AS user_id
FROM
all_events
WHERE
JSONExtractString(json, 'event') = 'register'
AND month = 1635739200
LIMIT
10
)
The next query we'll construct is getting user activity grouped by month.
SELECT
toUnixTimestamp(
toDateTime(toStartOfMonth(toDateTime(event_timestamp, 'America/Los_Angeles')))
) AS month,
JSONExtractString(json, 'user_id') AS user_id
FROM
all_events
WHERE
JSONExtractString(json, 'user_id') IN user_ids
AND JSONExtractString(json, 'event') = 'feed_view'
GROUP BY
month,
json
Lastly we'll compose all of these queries together into a single query using WITH clauses
WITH user_ids AS (
SELECT
user_id
FROM
(
SELECT
toUnixTimestamp(
toDateTime(toStartOfMonth(toDateTime(event_timestamp, 'America/Los_Angeles')))
) AS month,
JSONExtractString(json, 'user_id') AS user_id
FROM
all_events
WHERE
JSONExtractString(json, 'event') = 'register'
AND month = 1635739200
LIMIT
10
)
), activity AS (
SELECT
toUnixTimestamp(
toDateTime(toStartOfMonth(toDateTime(event_timestamp, 'America/Los_Angeles')))
) AS month,
JSONExtractString(json, 'user_id') AS user_id
FROM
all_events
WHERE
JSONExtractString(json, 'user_id') IN user_ids
AND JSONExtractString(json, 'event') = 'feed_view'
GROUP BY
month,
json
), (
SELECT COUNT(*) FROM user_ids
) AS cohort_size
SELECT
COUNT(user_id) / cohort_size * 100 as percentage,
month AS timestamp
FROM
activity
GROUP BY
month
And bam! We have the results.
Now all we need to do is configure the visualize tool and we have the graph we want!
If we want we can now click on the "Export" button and export it to a dashboard. We can then share a link to this dashboard to our investors! We can also optionally add password protection on the dashboard!