Key Takeaways

Wayne Eckerson and Kevin Petrie offered their key takeaways from the March 30 event on Modern Data Pipelines hosted by Eckerson Group.

Keynote

  • Data environments get more complex by the day. New users, use cases, sources, and targets strain the pipelines that must deliver timely, accurate data for analytics.
  • The market for data pipeline management includes four segments:
    • Ingest: extract and load data from source to target
    • Transform: filter, merge, and format data for consumption
    • DataOps: optimize pipelines with CI/CD, testing, and monitoring
    • Orchestrate: schedule and execute workflows across pipelines and applications
  • Users of data pipelines include data engineers who specialize in pipelines, data analysts and data scientists who prepare data to support analytics, and analytics engineers who transform and validate data for consumption by analysts.
  • Data pipeline tools divide along two axes: suite vs. specialty, and cloud vs. hybrid focus. This yields four categories:
    • Suite cloud: products that manage ingestion, transformation, DataOps, and pipeline orchestration together for cloud environments
    • Suite hybrid: products that offer the same capabilities for hybrid environments
    • Specialty cloud: products that focus on specific aspects of data pipeline management for cloud environments
    • Specialty hybrid: products that focus on specific aspects of data pipeline management for hybrid environments
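
To make the four segments from the keynote concrete, here is a minimal Python sketch that maps each one to a small function. It is an illustration only, not a description of any product discussed at the event: the function names, the sample rows, and the checks are all hypothetical, and a real orchestrator would also handle scheduling, retries, and dependencies across pipelines.

```python
def ingest(source_rows):
    """Ingest: extract rows from a source and load them into a working list."""
    return [dict(row) for row in source_rows]


def transform(rows):
    """Transform: filter out rows with no amount and format the rest."""
    return [
        {
            "customer": row["customer"].strip().title(),
            "amount": round(float(row["amount"]), 2),
        }
        for row in rows
        if row.get("amount") not in (None, "")
    ]


def check(rows):
    """DataOps: a simple test that fails the run if the output looks wrong."""
    if not rows:
        raise ValueError("pipeline produced no rows")
    if any(row["amount"] < 0 for row in rows):
        raise ValueError("negative amount found")
    return rows


def orchestrate(steps, data):
    """Orchestrate: run the steps in order and record row counts for monitoring."""
    log = []
    for step in steps:
        data = step(data)
        log.append((step.__name__, len(data)))
    return data, log


# Hypothetical source data, invented for this example.
source = [
    {"customer": " alice ", "amount": "10.5"},
    {"customer": "BOB", "amount": ""},
    {"customer": "carol", "amount": "3"},
]

result, run_log = orchestrate([ingest, transform, check], source)
print(result)
print(run_log)
```

Keeping each segment as a separate step mirrors the way the market divides: a specialty tool would replace one function, while a suite would replace all four.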

Practitioners

  • Jennifer: agility, the need for real-time and unstructured data, and low code
  • Ranajay: managing costs and adopting prescriptive and predictive analytics

Panel

  • “Table stakes” requirements include security, sources/targets, and low code/no code
  • Data products must support any source, target, or channel, and be managed with metadata
  • Many enterprises and vendors focus their data warehouse initiatives on Snowflake
  • Shift happens! Data drift, meaning changes in schema, metadata, infrastructure, and possibly ML model outputs over time, is a fact of life. Users must adapt to it, preferably with automated tools.
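
As a rough illustration of how schema drift might be detected automatically, the Python sketch below compares the schema a pipeline expects with the schema inferred from an incoming batch. It covers only the schema part of data drift, and every name in it (`SchemaDrift`, `infer_schema`, `compare_schemas`, the sample columns) is a hypothetical choice for this example rather than anything presented by the panel.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDrift:
    added: dict = field(default_factory=dict)    # column -> new type
    removed: dict = field(default_factory=dict)  # column -> old type
    retyped: dict = field(default_factory=dict)  # column -> (old type, new type)

    @property
    def detected(self):
        return bool(self.added or self.removed or self.retyped)


def infer_schema(records):
    """Map each column to the type name of its first non-null value.

    Columns that are null in every record are not inferred, so they
    will be reported as removed by compare_schemas.
    """
    schema = {}
    for record in records:
        for column, value in record.items():
            if value is not None and column not in schema:
                schema[column] = type(value).__name__
    return schema


def compare_schemas(expected, observed):
    """Report columns that were added, removed, or changed type."""
    drift = SchemaDrift()
    for column, new_type in observed.items():
        if column not in expected:
            drift.added[column] = new_type
        elif expected[column] != new_type:
            drift.retyped[column] = (expected[column], new_type)
    for column, old_type in expected.items():
        if column not in observed:
            drift.removed[column] = old_type
    return drift


# Hypothetical expected schema and incoming batch, invented for this example.
expected_schema = {"order_id": "int", "amount": "float", "region": "str"}
batch = [
    {"order_id": 1, "amount": "19.99", "currency": "USD"},
    {"order_id": 2, "amount": "5.00", "currency": "EUR"},
]

drift = compare_schemas(expected_schema, infer_schema(batch))
if drift.detected:
    print("added:", drift.added)
    print("removed:", drift.removed)
    print("retyped:", drift.retyped)
```

A check like this could run before each load and either stop the pipeline or raise an alert, so that drift is caught when it appears rather than after it reaches analysts.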