Challenges & Product Evaluation Criteria
Modern data pipelines face five challenges. As described earlier, rising data volumes, driven by data transformation and proliferating data sources, put significant strain on traditional pipelines. An accelerating business environment, meanwhile, drives the need for real-time data delivery and access. In addition, data engineers, analytics engineers, data scientists, and analysts (especially the more business-oriented ones) can struggle to build the right pipeline scripts in SQL, Python, or other popular languages. All the while, data environments grow more diverse as teams add elements to accommodate new business demands and projects. A final challenge is compliance with stringent regulations, such as those intended to protect privacy and avoid bias.
With these challenges in mind, Eckerson Group recommends evaluating data pipeline products according to five criteria.
- Breadth of functionality. Your data pipeline tool should support—either natively or via easy integration with third-party tools—multiple data integration patterns, such as ETL, ELT, ELTL, and reverse ETL.
- Performance and scale. Your tool should put minimal processing burden on your current architecture, avoid source agents, use log-based change data capture, and meet your business's most rigorous service level agreements (SLAs).
- Ease of use. Your tool should require minimal training, automate basic tasks, provide an intuitive graphical interface, make pipeline elements easy to change, and improve the productivity of your data team.
- Open architecture. Your tool should work with various sources, targets, processors, and programming languages. Open data formats and APIs help ensure your tools interoperate with one another and give you the flexibility to migrate or integrate data as needed.
- Governance. Look for a tool that centralizes pipeline metadata, provides granular role-based access controls, masks sensitive data, tracks lineage, and audits user actions to assist compliance efforts.
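One way to apply the five criteria above when comparing products is a simple weighted scorecard. The sketch below is illustrative only: the weights, the 1-to-5 rating scale, and the names `CRITERIA`, `WEIGHTS`, and `weighted_score` are assumptions made for this example, not part of Eckerson Group's recommendation. Each team would set its own weights to reflect its priorities.

```python
# Illustrative scorecard for the five evaluation criteria.
# All weights and the 1-5 rating scale are hypothetical assumptions.

CRITERIA = (
    "breadth_of_functionality",
    "performance_and_scale",
    "ease_of_use",
    "open_architecture",
    "governance",
)

# Hypothetical weights reflecting one team's priorities; they sum to 1.0.
WEIGHTS = {
    "breadth_of_functionality": 0.20,
    "performance_and_scale": 0.25,
    "ease_of_use": 0.20,
    "open_architecture": 0.15,
    "governance": 0.20,
}


def weighted_score(ratings: dict[str, int]) -> float:
    """Return the weighted average of 1-5 ratings across all five criteria."""
    missing = set(CRITERIA) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for name in CRITERIA:
        if not 1 <= ratings[name] <= 5:
            raise ValueError(f"rating for {name} must be between 1 and 5")
    return sum(WEIGHTS[name] * ratings[name] for name in CRITERIA)


if __name__ == "__main__":
    # Hypothetical ratings for a made-up product under evaluation.
    example_ratings = {
        "breadth_of_functionality": 4,
        "performance_and_scale": 3,
        "ease_of_use": 5,
        "open_architecture": 2,
        "governance": 4,
    }
    print(f"Weighted score: {weighted_score(example_ratings):.2f}")
```

A scorecard like this does not replace a proof of concept against your own SLAs, but it can make trade-offs between candidates explicit and easier to discuss.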