Modern Data 101
Analytics Heroes 🎧
From Big Tech to Startups: Redefining Roles of Data Engineers as Strategists and much more with Matthew Weingarten | S1:E8

Everything from Managing North Stars for Businesses and Mastering Data Modeling to Tackling Governance and Future-Proofing Data Infrastructures!

We added a summarised version below for those who prefer the written word, made easy for you to skim and record top insights! 📝

Additional note from community moderators: We’re presenting the insights as-is and do not promote any specific tool, platform, or brand. This is to simply share raw experiences and opinions from actual voices in the analytics space to further discussions.

Prefer watching over listening? Watch the Full Episode here ⚡️

Introducing Matthew Weingarten | Our Analytics Hero at Your Service 🦸🏻‍♂️

Matthew is a Senior Data Engineer with a rich background in building scalable data solutions across industry giants like Meta, Disney, and Nielsen. With a strong focus on IoT data, cloud technologies, and high-performance data pipelines, Matthew brings deep expertise in handling complex datasets and optimizing data infrastructure. Beyond his technical skills, he’s an active writer on Medium, sharing valuable insights with the data engineering community.

We’ve covered a RANGE of topics with Matthew. Dive in! 🤿

TOC

  • Work at Samsara and Experience with Big Brands

  • Managing North Star Metrics for Businesses

  • Ensuring Data Quality for Complex Data From Multiple Sources

  • Evolution of Governance from the POV of a Data Engineer

  • Fundamentals to Focus on to Grow as a Data Engineer

  • Keys to Mastering Programming and Data Modeling

  • Balancing Legacy with Modern Solutions

  • How a Data Engineer Should Assess Fitness of Data Tools

  • Projects that Solidified Understanding of Data Modeling

  • The Huge Gaps in Current Data Management and Analytics Stacks

  • Emerging Technologies in the Data Ecosystem

  • The High-Scale Impact of the Semantic Layer

  • Increasing Transparency on How Data Engineers Impact Business

  • Catalogs are Actually Reducing Ad-Hoc Requests

  • Future-Proofing Data Infrastructures

  • Experience with Data Products, What Makes a Good Data Product, and How Does Consuming Them Benefit Us

  • Go-to Resources to Stay Updated

  • Meeting the Real Matthew




Work at Samsara and Experience with Big Brands

At Samsara, my focus is on enabling data usage and literacy across the organization, ensuring high-quality, timely data for stakeholders in the IoT space. Having worked with large-scale data systems at Meta and Disney, I’ve seen common challenges everywhere—whether it’s gigabytes or petabytes of data, issues like quality and late-arriving data persist. The key is leveraging past experience to navigate them smoothly.


Managing North Star Metrics for Businesses

Defining North Star metrics starts with close collaboration with stakeholders to identify key success indicators. These must be backed by strong data queries and flexible slicing on meaningful dimensions. A metric is only as useful as the context it provides, so ensuring its stability means keeping an eye on its underlying dimensions and continuously refining them.
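As a minimal sketch of what "flexible slicing on meaningful dimensions" can look like, the snippet below sums one metric across any chosen combination of dimensions. The rows, the field names (`region`, `plan`, `active_devices`), and the helper names are hypothetical and only there for illustration; a real setup would more likely do this in SQL or a BI layer.

```python
from collections import defaultdict

# Hypothetical rows; the field names are illustrative assumptions only.
EVENTS = [
    {"region": "NA", "plan": "pro", "active_devices": 120},
    {"region": "NA", "plan": "free", "active_devices": 30},
    {"region": "EU", "plan": "pro", "active_devices": 80},
    {"region": "EU", "plan": "free", "active_devices": 20},
]


def headline(rows, metric):
    """Return the overall value of the metric across all rows."""
    return sum(row[metric] for row in rows)


def slice_metric(rows, metric, dimensions):
    """Sum `metric` for each distinct combination of `dimensions`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[metric]
    return dict(totals)


if __name__ == "__main__":
    print(headline(EVENTS, "active_devices"))
    print(slice_metric(EVENTS, "active_devices", ["region"]))
    print(slice_metric(EVENTS, "active_devices", ["region", "plan"]))
```

Because every slice is built from the same rows, the slices always add back up to the headline number, which is one simple way to keep an eye on the dimensions underneath a metric.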


Ensuring Data Quality for Complex Data From Multiple Sources

Data quality is non-negotiable. The goal is to detect issues before stakeholders do, using proper checks, observability, and alerts. Understanding data deeply—knowing primary keys, trends, and expected fluctuations—helps in setting up proactive monitoring. If issues arise, fast detection and clear communication ensure minimal disruption.
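To make the idea of proactive checks concrete, here is a small sketch of two of the checks mentioned above: primary-key uniqueness and a guard on unexpected swings in row volume. The function names, the `tolerance` default of 50%, and the sample data are assumptions made for illustration, not a description of any particular team's tooling.

```python
from collections import Counter
from statistics import mean


def duplicate_keys(rows, key):
    """Return the sorted primary-key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def volume_alert(history, today, tolerance=0.5):
    """Return True when today's row count deviates from the average of
    `history` by more than `tolerance`, expressed as a fraction."""
    baseline = mean(history)
    if baseline == 0:
        return today != 0
    return abs(today - baseline) / baseline > tolerance


if __name__ == "__main__":
    rows = [{"id": 1}, {"id": 2}, {"id": 1}]
    print(duplicate_keys(rows, "id"))
    print(volume_alert([100, 110, 90], today=40))
    print(volume_alert([100, 110, 90], today=95))
```

In practice the threshold would come from knowing the data's usual trends and fluctuations, and a failing check would feed an alert rather than a print statement.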


Evolution of Governance from the POV of a Data Engineer

Governance has evolved significantly, especially with regulations like GDPR. Early on, many teams scrambled to comply, but today, security and compliance are foundational. At Disney, handling sensitive customer data meant hashing and limiting exposure. Compliance isn't just a legal necessity—it’s key to building trust.
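The snippet below is one hedged illustration of "hashing and limiting exposure" in general terms. It is not a description of how any named company handles data. It assumes a keyed hash (HMAC-SHA256) for pseudonymizing identifiers and an allow-list of fields; the function names, field names, and the key are hypothetical, and a real key would live in a secrets manager rather than in code.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of a normalized identifier."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def limit_exposure(record, sensitive_fields, allowed_fields, secret_key):
    """Keep only allowed fields, hashing those marked as sensitive."""
    out = {}
    for field in allowed_fields:
        if field not in record:
            continue
        value = record[field]
        if field in sensitive_fields:
            out[field] = pseudonymize(str(value), secret_key)
        else:
            out[field] = value
    return out


if __name__ == "__main__":
    key = b"demo-key-for-illustration-only"  # assumption: not a real secret
    record = {"email": "User@Example.com", "country": "US", "notes": "free text"}
    print(limit_exposure(record, {"email"}, ["email", "country"], key))
```

A keyed hash keeps the same customer joinable across tables while making the raw value harder to recover than with a plain unsalted hash, and the allow-list means unlisted fields never leave the source.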


Fundamentals to Focus on to Grow as a Data Engineer

Start with software engineering fundamentals, which provide a strong foundation for data engineering. SQL, data modeling, and pipeline design are essential, but programming practices make the difference. Fundamentals of Data Engineering by Joe Reis and Matt Housley is a must-read, as core principles remain relevant despite rapid industry changes.


Keys to Mastering Programming and Data Modeling

Data modeling is harder to master because there’s no single "right" approach. Models must be scalable, efficient, and easy for stakeholders to use, which can be conflicting goals. It’s a skill learned over time through experience, not something that can be "mastered" overnight.


Balancing Legacy with Modern Solutions

Migrating legacy infrastructure to modern cloud solutions is challenging but essential. A Nielsen project required transitioning connected TV data from local servers to cloud-based solutions like AWS and Databricks to handle big data growth. Staying updated with evolving trends in data engineering is key, as today's modern platforms will be legacy tomorrow.


How a Data Engineer Should Assess Fitness of Data Tools

Small-scale experimentation is the best way to assess new tools. For instance, DuckDB offers fast local querying, making it useful for testing without cloud overhead. When evaluating tools, data engineers should balance immediate wins with long-term projects, considering migration effort and business priorities.


Projects that Solidified Understanding of Data Modeling

Leading a data migration project early in my career helped me grasp the full data pipeline—ingestion, transformation, and delivery to stakeholders. Hands-on experience, whether through professional projects or personal data sets, is crucial for mastering data engineering concepts.


The Huge Gaps in Current Data Management and Analytics Stacks

Three major challenges for data engineers:

  1. Inconsistent tooling causing unexpected failures,

  2. Difficulty in setting up robust local testing environments before production, and

  3. Ensuring data teams are well-integrated with the broader organization to maximize value.


Emerging Technologies in the Data Ecosystem

Generative AI and LLMs are reshaping the data ecosystem, but companies must be cautious in their adoption. While these tools can enhance processes like data cataloging, ensuring they retrieve the right datasets remains a challenge. They're not perfect, but when they work well, they can make life much easier.


The High-Scale Impact of the Semantic Layer

Semantic layers are gaining traction as they help align data with business context. While I haven’t fully developed one, we’re working toward it. It’s crucial to have a structured layer that allows people to easily locate and understand where data is housed.


Increasing Transparency on How Data Engineers Impact Business

The impact of data engineering is measured by how well insights drive business success. If teams can access timely, accurate data to inform decisions, we know we're making an impact. While it may not always be visible in direct sales figures, effective data enables high-performing teams and better business outcomes.


Catalogs Are Actually Reducing Ad-Hoc Requests

Yes, catalogs help reduce ad-hoc requests by centralizing data discovery, minimizing constant queries like "where is this data?" Though maintaining them is effort-intensive, they free up engineers to focus on development rather than answering repetitive questions. The tools are improving, and broader adoption enables more self-service analytics.


Future-Proofing Data Infrastructures

Nothing is ever truly future-proof in data. The best approach is designing for scalability and flexibility—anticipating evolving needs instead of solving just one problem at a time. Constant collaboration with stakeholders and awareness of upcoming changes help avoid rework and inefficiencies.


Experience with Data Products & What Makes a Good Data Product

Everyone interacts with data products—whether enabling clickstream analytics at Disney or building core datasets at Samsara. The key is creating structured, reusable data solutions tailored to stakeholder needs.

  • Defining Data Products: It starts with identifying use cases, collaborating with stakeholders, and iterating based on feedback. A data product can't be built in isolation—it requires continuous input to be effective.

  • What Makes a Good Data Product? It should be scalable, easily answer key questions, and be useful beyond just one team. Reusability and efficiency are critical indicators of success.

  • How Data Consumers Benefit: Domain analysts benefit when data is easily accessible, allowing them to pull insights without relying on engineers. Providing the right tools for self-service analytics fosters better data literacy across teams.


Go-to Resources to Stay Updated

Data Engineering Weekly is my go-to resource—shoutout to Ananth for that. I also follow company blogs from DoorDash, Meta, Spotify, and others leading in data engineering. Staying updated is key because no one works in isolation—ideas evolve and influence the industry.


Meeting the Real Matthew

I’m still working on having a life outside of work! But when I do, concerts, sports, and getting outdoors help me unwind—Seattle winters make that tough, though.

  • On a Superpower: If I could have one, it’d be freezing time or extending the day beyond 24 hours—there’s just never enough time.

  • On a Non-Work Goal: I’m passionate about bridge and have been playing since I was 13. It’s a mix of logic, strategy, and competition, and I even travel for tournaments. It keeps me sharp and gives me something to strive for beyond work.


📝 Note from Editor
The above insights are summarised versions of Matthew’s actual dialogue. Feel free to refer to the transcript or play the audio/video to capture the true essence and details of his as-is insights. There’s also a lot more information and hidden bytes of wonder in the interview, so listen in for a treat!




Guest Connect 🤝🏻

Connect with me on LinkedIn 🙌🏻
