originally written circa 2020, updated July 2023

A Quick Ecosystem Rundown 

It’s all about leveraging time

General Overview

Imagine the data eng + data analytics team

Teams should spend their time doing the most complex tasks that their skill sets will allow

If two teams have some overlap, where the skill set of one ends at the skill set of another,

skill                 | Complexity (0-5) | Data Eng | Analytics
Write async functions | 5                |          |
singer                | 4                |          |
lookml                | 3                |          |
looker reporting      | 2                |          |
Manual reporting      | 1                |          |

it makes the most sense for the more technically advanced team to focus on the more technically complex work; it’s also more economical

Even in a case where they aren’t, you always have an opportunity to train up Team B rather than being cost-inefficient with Team A

skill                 | Complexity (0-4) | Data Eng | Analytics
Write async functions | 3                |          |
singer                | 2                |          |
lookml                | 1                |          |
looker reporting      | 0                |          |

But in the new world, you have analytics engineers as well

skill                 | Complexity (0-5) | Data Eng | Analysts | Analytics Engineers
Write async functions | 5                |          |          |
singer                | 4.5              |          |          |
dbt                   | 4                |          |          |
lookml                | 3                |          |          |
looker reporting      | 2                |          |          |
Manual reporting      | 1                |          |          |

Now, it may seem that analytics engineers (AE) are just more skillful analysts. That is not the case. What’s so powerful about analytics engineers is that they form a bridge between analytics and data eng, and that bridge provides leverage. They act as a well-informed liaison and data steward between the EL of ELT and reporting, which helps bring the best of both worlds together.

Read more about that here.

Note: if your team is using some variant of EL tool / dbt / BI tool, you’re likely operating in an ELT paradigm. (Examples of EL tools: Stitch, Fivetran; examples of BI tools: Mode, Looker.)

Analytics

  • An analytics team that’s busy with ad hoc reporting doesn’t have time to create self-serve tools
  • When analytics has time to create self-serve tools, they can also focus on looking for insights proactively
  • When they get ahead of reporting needs, they can do more exploratory and discovery work
  • This is important because otherwise it’s a constant game of catch-up and there’s never a chance to statistically validate new datasets
  • Analytics (AN) and data engineering (DE) have some knowledge overlap. Rather than duplicating effort, AN should own BI and QA against BI logic, and the rest should be documented. The alternative is two teams effectively reviewing the same work at different points (e.g. DE reviews at the dbt level, then AN takes a look once they’re building reports in Looker)

Why Looker is a powerful tool:

  • Looker enables teams to automate reporting to a certain degree by creating dashboards that refresh their data and give users a view into a rolling lookback window
    • E.g. looking at a historic view of signups, conversion rates, and CX tickets for the last month, knowing that partway through the month there were changes to the product
  • Dashboards can also be used to show current health of a part of the business
    • E.g. changes to a wireless network are being deployed and you want to see whether service is being impacted by looking at CX ticket volume
  • Automating reporting in this way enables analytics to spend less time manually querying for data and getting it presentable
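
To make the "rolling lookback" idea concrete, here is a minimal Python sketch of what a refreshing dashboard tile computes each time it reloads. It is an illustration only, not how Looker implements it: the function name `rolling_lookback`, the `signups` list, and the dates are all hypothetical, and in practice the dates would come from a warehouse query.

```python
from collections import Counter
from datetime import date, timedelta


def rolling_lookback(events, as_of, days=30):
    """Count events per day for the `days` days ending on `as_of` (inclusive)."""
    start = as_of - timedelta(days=days - 1)
    counts = Counter(d for d in events if start <= d <= as_of)
    # Include zero-count days so a trend line has no gaps.
    return {
        start + timedelta(days=i): counts.get(start + timedelta(days=i), 0)
        for i in range(days)
    }


# Hypothetical signup dates (assumption: one entry per signup).
signups = [date(2023, 7, 1), date(2023, 7, 1), date(2023, 7, 3), date(2023, 5, 1)]

# The window is anchored to `as_of`, so re-running tomorrow shifts it by a day.
window = rolling_lookback(signups, as_of=date(2023, 7, 3), days=3)
next_day = rolling_lookback(signups, as_of=date(2023, 7, 4), days=3)
```

Because the window is derived from the `as_of` date rather than hard-coded, nobody has to rewrite the query each month, which is the time saving the bullets above describe.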

Data Engineering

  • Strengths: building out new pipelines + automating data flows
  • Data engineering should be leveraging their time building new processes
  • Any task that’s relatively simple should be automated
    • E.g. accelerating pipelines on a regular cadence if needed
    • Creation of base/raw models that pull data relatively close to their source form, but out of a loading warehouse and into a prod one
    • Plugging in looker models
      • E.g. writing LookML for simple models
        • Simple being any model that goes directly from being a reporting/mart table to being an explore
          • So pulling it in as a view (bringing in fields by reference to the table) and referencing it in a model file (reference the view, then reference any tables you’d like to join it to)
  • Data engineering should have a strong grasp/understanding of quality for incoming data and should be testing for that before it ever hits reporting
  • Data eng should only own tools they have a good pulse check on. If they’re not familiar with part of the downstream workflow/output that gets handed off to users, QA is inefficient
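
As one way to automate the "simple model" case described above (a mart table that becomes an explore with no extra logic), the sketch below generates the LookML text from a column list. This is a hedged illustration: the helper names `lookml_view` and `lookml_explore`, the `mart.signups` table, and its columns are hypothetical, and the output should be validated in Looker before use.

```python
def lookml_view(view_name, table, columns):
    """Render a minimal LookML view with one dimension per (name, type) pair."""
    lines = [f"view: {view_name} {{", f"  sql_table_name: {table} ;;"]
    for name, lookml_type in columns:
        lines += [
            f"  dimension: {name} {{",
            f"    type: {lookml_type}",
            f"    sql: ${{TABLE}}.{name} ;;",  # field brought in by table reference
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines)


def lookml_explore(view_name, joins=()):
    """Render the explore for the model file, with optional joins."""
    lines = [f"explore: {view_name} {{"]
    for join_view, sql_on, relationship in joins:
        lines += [
            f"  join: {join_view} {{",
            f"    sql_on: {sql_on} ;;",
            f"    relationship: {relationship}",
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical mart table and columns (assumptions for illustration).
view_text = lookml_view(
    "signups",
    "mart.signups",
    [("signup_id", "number"), ("user_id", "number"), ("plan", "string")],
)
explore_text = lookml_explore(
    "signups",
    joins=[("users", "${signups.user_id} = ${users.id}", "many_to_one")],
)
print(view_text)
print(explore_text)
```

Anything beyond this shape (measures, derived tables, business logic) is no longer a "simple" model and is better left to whoever owns BI logic.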

So where do you place the data team?

Curated articles

Essentials

Recommended

Additional Reading

What does the community say?

First names and photos are omitted to minimize unconscious bias, as are any pronouns when referring to others in the conversation.