originally written circa 2020, updated July 2023
A Quick Ecosystem Rundown
It’s all about leveraging time
General Overview
Imagine the data eng + data analytics team
Teams should spend their time doing the most complex tasks that their skill sets will allow
If two teams have some overlap, where the skill set of one ends at the skill set of the other,
| Skill | Complexity (0-5) | Data Eng | Analytics |
| --- | --- | --- | --- |
| Write async functions | 5 | ✅ | ❌ |
| Singer | 4 | ✅ | ❌ |
| LookML | 3 | ✅ | ✅ |
| Looker reporting | 2 | ❌ | ✅ |
| Manual reporting | 1 | ❌ | ✅ |
it makes the most sense for the more technically advanced team to focus on the more technically complex work. It’s also more economical.
Even in a case where the skill sets don’t overlap, you always have an opportunity to train up the less technical team (Team B) rather than using the more technical team (Team A) cost-inefficiently
| Skill | Complexity (0-4) | Data Eng | Analytics |
| --- | --- | --- | --- |
| Write async functions | 3 | ✅ | ❌ |
| Singer | 2 | ✅ | ❌ |
| LookML | 1 | ✅ | ❌ |
| Looker reporting | 0 | ❌ | ✅ |
But in the new world, you have analytics engineers as well
| Skill | Complexity (0-5) | Data Eng | Analysts | Analytics Engineers |
| --- | --- | --- | --- | --- |
| Write async functions | 5 | ✅ | ❌ | ❌ |
| Singer | 4.5 | ✅ | ❌ | ❌ |
| dbt | 4 | ✅ | ❌ | ✅ |
| LookML | 3 | ✅ | ❌ | ✅ |
| Looker reporting | 2 | ❌ | ✅ | ✅ |
| Manual reporting | 1 | ❌ | ✅ | ✅ |
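To make the leverage argument concrete, here is a minimal Python sketch that routes each task in the three-role table to the least technical role able to do it, which frees the more technical roles for the harder work. The capability matrix is copied from the table; the ordering of roles from least to most technical and the `route_tasks` helper are assumptions made for illustration, not something the original prescribes.

```python
# Capability matrix from the three-role table: task -> (complexity, roles that can do it).
TASKS = {
    "Write async functions": (5, {"Data Eng"}),
    "Singer": (4.5, {"Data Eng"}),
    "dbt": (4, {"Data Eng", "Analytics Engineers"}),
    "LookML": (3, {"Data Eng", "Analytics Engineers"}),
    "Looker reporting": (2, {"Analysts", "Analytics Engineers"}),
    "Manual reporting": (1, {"Analysts", "Analytics Engineers"}),
}

# ASSUMPTION (not in the original): roles ordered from least to most technical.
ROLES_BY_SKILL = ["Analysts", "Analytics Engineers", "Data Eng"]


def route_tasks(tasks, roles_by_skill):
    """Assign each task to the least technical role that is able to do it."""
    assignment = {}
    for task, (_complexity, capable_roles) in tasks.items():
        for role in roles_by_skill:
            if role in capable_roles:
                assignment[task] = role
                break
    return assignment


assignment = route_tasks(TASKS, ROLES_BY_SKILL)
for task, role in assignment.items():
    print(f"{task:<22} -> {role}")
```

Under these assumptions, dbt and LookML land with analytics engineers, reporting lands with analysts, and data engineering is left with only the work nobody else can do.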
Now, it may seem that analytics engineers (AE) are just more skillful analysts. That is not the case. What’s so powerful about analytics engineers is that they form a bridge between analytics and data eng and provide leverage. They provide this leverage by acting as a well-informed liaison and data steward between the EL of ELT and reporting, and they help to bring the best of both worlds together.
Read more about that here.
Note: if your team is using some variant of EL tool / dbt / BI tool, you’re likely operating in an ELT paradigm. (Examples of EL tools: Stitch, Fivetran; examples of BI tools: Mode, Looker.)
Analytics
- An analytics team that is busy with ad hoc reporting doesn’t have time to create self-serve tools
- When analytics has time to create self-serve tools, they can also focus on looking for insights proactively
- When they get ahead of reporting needs, they can do more exploratory and discovery work
- This is important because otherwise it’s a constant game of catch-up and there’s never a chance to statistically validate new datasets
- Analytics and DE have some knowledge overlap. Rather than duplicating effort, analytics should own BI and QA against BI logic, and the rest should be documented, instead of having two teams effectively reviewing at different points (e.g. DE reviews at the dbt level, then analytics takes a look once they’re building reports in Looker)
Why Looker is a powerful tool:
- Looker enables teams to automate reporting to a certain degree by creating dashboards that refresh their data and give users a view into a rolling lookback window
  - E.g. looking at a historic view of signups, conversion rates, and CX tickets for the last month, knowing that partway through the month there were changes to the product
- Dashboards can also be used to show the current health of a part of the business
  - E.g. changes to a wireless network are being deployed and you want to see whether service is being impacted by looking at CX ticket volume
- Automating reporting in this way enables analytics to spend less time manually querying for data and getting it presentable
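The rolling lookback idea can be sketched in a few lines of Python. This is only an illustration of the logic a dashboard applies on every refresh, not how Looker implements it; the `rolling_window` and `split_by_change` helpers, the row layout, and the sample numbers are all hypothetical.

```python
from datetime import date, timedelta


def rolling_window(rows, as_of, lookback_days=30):
    """Keep rows whose 'day' falls in the last `lookback_days` days ending at `as_of` (inclusive)."""
    start = as_of - timedelta(days=lookback_days - 1)
    return [r for r in rows if start <= r["day"] <= as_of]


def split_by_change(rows, change_date):
    """Total signups before vs. on/after a product change inside the window."""
    before = sum(r["signups"] for r in rows if r["day"] < change_date)
    after = sum(r["signups"] for r in rows if r["day"] >= change_date)
    return before, after


# Hypothetical daily signup counts; in practice these would come from a warehouse table.
rows = [
    {"day": date(2023, 6, 1), "signups": 40},   # older than the 30-day window below
    {"day": date(2023, 6, 20), "signups": 50},
    {"day": date(2023, 7, 1), "signups": 65},   # day of the (hypothetical) product change
    {"day": date(2023, 7, 10), "signups": 70},
]

window = rolling_window(rows, as_of=date(2023, 7, 10), lookback_days=30)
before, after = split_by_change(window, change_date=date(2023, 7, 1))
print(f"rows in window: {len(window)}, signups before change: {before}, after: {after}")
```

A live dashboard would pass `date.today()` as `as_of`, so the same view slides forward on each refresh without anyone re-running a query by hand.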
Data Engineering
- Strengths: building out new pipelines + automating data flows
- Data engineering should be leveraging their time by building new processes
- Any task that’s relatively simple should be automated
  - E.g. accelerating pipelines on a regular cadence if needed
- Creation of base/raw models that pull data relatively close to its source form, but out of a loading warehouse and into a prod one
- Plugging in Looker models
  - E.g. writing LookML for simple models
    - "Simple" being any model that goes directly from being a reporting/mart table to being an explore
    - So pulling it in as a view (bringing in fields by reference to the table) and referencing it in a model file (reference the view, plus any tables you’d like to join it to)
- Data engineering should have a strong understanding of quality for incoming data and should be testing for it before the data ever hits reporting
- Data eng should only own tools they have a good pulse on; if the team isn’t familiar with the part of the downstream workflow/output that gets handed off to users, QA is inefficient
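As a rough illustration of testing incoming data before it reaches reporting, the Python sketch below runs two basic checks on a batch of rows: the key column must be present and must be unique. These are similar in spirit to the `not_null` and `unique` tests that dbt offers, but the helper names, the `ticket_id` column, and the sample batch are hypothetical.

```python
def check_not_null(rows, column):
    """Return the rows where `column` is missing or None."""
    return [r for r in rows if r.get(column) is None]


def check_unique(rows, column):
    """Return the non-null values of `column` that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        value = r.get(column)
        if value is None:
            continue  # nulls are reported by check_not_null instead
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)


def run_checks(rows, key):
    """Run both checks and return only the ones that failed."""
    failures = {}
    null_rows = check_not_null(rows, key)
    if null_rows:
        failures["not_null"] = null_rows
    duplicate_values = check_unique(rows, key)
    if duplicate_values:
        failures["unique"] = duplicate_values
    return failures


# Hypothetical incoming batch of CX tickets.
batch = [
    {"ticket_id": 1, "status": "open"},
    {"ticket_id": 2, "status": "closed"},
    {"ticket_id": 2, "status": "closed"},   # duplicate key
    {"ticket_id": None, "status": "open"},  # missing key
]

failures = run_checks(batch, key="ticket_id")
if failures:
    print("Blocking load into reporting:", failures)
```

Failing the load at this point means a problem is caught by the team that knows the source data, rather than being discovered later by someone reading a dashboard.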
So where do you place the data team?
Curated articles
Essentials
- https://cultivating-algos.stitchfix.com/
  - Read this for a strong overview of the different aspects of org structure, culture, and how data fits in.
- https://blog.getdbt.com/data-team-structure-examples/
  - Read this to get a sampling of how different companies have chosen to structure their teams and why.
- https://medium.com/@itunpredictable/data-as-a-product-vs-data-as-a-service-d9f7e622dc55
  - This article provides a thought framework around the maturity of a data platform and how data platforms/teams can evolve.
- https://locallyoptimistic.com/post/one-size-fits-none/
  - Read this to get an idea of the different considerations to weigh when thinking about how a data org is structured.
Recommended
- https://blog.getdbt.com/analytics-engineers-operate-with-leverage/
- https://share.vidyard.com/watch/JdN5yHVhRTquB31weESqKJ
  - This is a panel featuring both leadership roles and ICs in the data space, discussing what the future of analytics looks like
Additional Reading
- https://medium.com/@djpardis/models-for-integrating-data-science-teams-within-organizations-7c5afa032ebd
- https://towardsdatascience.com/how-to-build-an-analytics-team-for-impact-in-an-organization-21bb05925587
- https://www.eckerson.com/articles/organizing-for-success-part-iii-how-to-organize-and-staff-data-analytics-teams
What does the community say?
First names and photos are omitted to minimize unconscious bias, as are any pronouns when referring to others in the conversation.