A data pipeline is an automated sequence of processes that moves data from source systems through transformation and integration steps to a destination system — typically a data warehouse, reporting layer, or analytical platform. Data pipelines handle the extraction, transformation, and loading (ETL) or extraction, loading, and transformation (ELT) of data, applying data quality checks, business rules, and format standardisation along the way. The reliability and performance of data pipelines directly determine the currency, completeness, and accuracy of data available in reporting systems.
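The ETL pattern can be shown with a minimal sketch. The example below is illustrative rather than a reference implementation: it assumes a source file named orders.csv with order_id and amount columns, and a local SQLite file standing in for the warehouse, with a simple quality check and format standardisation folded into the transform step.

```python
# Minimal ETL sketch: extract rows from a source CSV, apply a quality check
# and a format standardisation, then load into a SQLite "warehouse".
# File name, table name, and column names are illustrative assumptions.
import csv
import sqlite3

def extract(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    cleaned = []
    for row in rows:
        # Data quality check: skip rows missing mandatory fields.
        if not row.get("order_id") or not row.get("amount"):
            continue
        # Format standardisation: normalise decimal commas and cast to float.
        row["amount"] = float(row["amount"].replace(",", "."))
        cleaned.append(row)
    return cleaned

def load(rows, db_path="warehouse.db"):
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS orders (order_id TEXT, amount REAL)")
    con.executemany(
        "INSERT INTO orders (order_id, amount) VALUES (?, ?)",
        [(row["order_id"], row["amount"]) for row in rows],
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```

In an ELT variant, the same raw rows would be loaded into the destination first and the quality checks and standardisation applied there, typically in SQL.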
Why This Matters
Data pipelines are the plumbing of a reporting infrastructure — invisible when working correctly, but immediately disruptive when they fail or produce incorrect outputs. A reporting environment is only as reliable as its least reliable pipeline: if any pipeline fails silently or applies incorrect transformations, the management reports that depend on it will contain incorrect data with no visible indication. Building robust, monitored, documented data pipelines is a foundational investment in reporting reliability.
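One common safeguard against silent failure is a reconciliation check that compares what was extracted with what was actually loaded and fails loudly on a mismatch. The sketch below is an assumption-laden illustration: it assumes the orders table from the previous example is rebuilt on each run, and the logger name and error handling are placeholders for whatever monitoring the pipeline platform provides.

```python
# Post-load reconciliation sketch: compare extracted and loaded row counts
# and raise on mismatch so the failure is visible, not silent.
import logging
import sqlite3

logger = logging.getLogger("pipeline.monitor")

def validate_load(expected_rows, db_path="warehouse.db"):
    """Check that the destination holds at least as many rows as were extracted."""
    con = sqlite3.connect(db_path)
    loaded = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    con.close()
    if loaded < expected_rows:
        # Raise rather than continue, so downstream reports are never
        # refreshed from a partially loaded table without anyone noticing.
        logger.error("Load incomplete: expected %d rows, found %d", expected_rows, loaded)
        raise RuntimeError("orders load failed reconciliation check")
    logger.info("Reconciliation passed: %d rows loaded", loaded)
```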
Where This Fits
This term sits within the BI & AI area of Performance & Control.
Related Terms
Related Knowledge
To be added when relevant Knowledge Hub articles are published