Modern Data Warehousing Is Moving Towards ELT
ELT (Extract, Load, Transform) is a variation of ETL (Extract, Transform, Load). With ELT, data transformations occur after the data has been loaded into a destination (a data lake or data warehouse). With advances in cloud technologies and data integration tools such as Azure Data Factory and Talend, ELT has seen increased adoption.
ELT is an orchestration (data integration) process that moves raw data from one or more source servers to a destination and then applies transformations there.
For the last couple of decades, ETL (extract, transform, load) has been the traditional approach to data warehousing and analytics. The ELT (extract, load, transform) approach changes that paradigm. But what actually happens when the “T” and “L” are switched? The difference is the point at which transformation occurs: with the traditional approach, data is transformed on the fly before it is loaded into the destination, whereas with ELT the transformation is applied after the data has been moved into the destination.
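The difference in ordering can be sketched in a few lines of Python. This is a minimal illustration, not a real pipeline: the source rows, the field names, and the dictionary standing in for the destination are all assumptions made for the example.

```python
# Hypothetical source rows; the field names are illustrative only.
SOURCE_ROWS = [
    {"FirstName": "Ada", "LastName": "Lovelace"},
    {"FirstName": "Alan", "LastName": "Turing"},
]

def extract():
    """Return copies of the rows from the (pretend) source system."""
    return [dict(row) for row in SOURCE_ROWS]

def transform(rows):
    """Derive a FullName field from FirstName and LastName."""
    return [{"FullName": f"{r['FirstName']} {r['LastName']}"} for r in rows]

def run_etl(destination):
    """ETL: transform in flight, so only the transformed rows are loaded."""
    destination["customers"] = transform(extract())

def run_elt(destination):
    """ELT: load the raw rows first, then transform inside the destination."""
    destination["raw_customers"] = extract()
    destination["customers"] = transform(destination["raw_customers"])

# A plain dict stands in for the data lake or data warehouse.
etl_destination, elt_destination = {}, {}
run_etl(etl_destination)
run_elt(elt_destination)

print(sorted(etl_destination))  # ['customers']
print(sorted(elt_destination))  # ['customers', 'raw_customers']
```

Both runs produce the same transformed table, but only the ELT destination still holds the raw rows, so a new transformation can be added later without going back to the source.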
ETL requires managing the raw data, including extracting the required information and running the right transformations to ultimately serve the business needs. Each stage (extraction, transformation, and loading) requires involvement from data engineers and developers, who must also deal with the capacity limitations of traditional data warehouses. With ETL, analysts and other BI users have become accustomed to waiting, because the information is not available to them until the whole ETL process has completed.
Examples of transformations include:
- Deriving a field based on some logic, such as deriving a full name from the FirstName and LastName fields in the raw data
- Aggregating numeric values, such as computing sums
- Removing duplicates
- Handling slowly changing dimensions
- Combining data from different tables and databases
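A few of the transformations listed above can be sketched in plain Python. The rows and the `Amount` field are made up for illustration, and treating exact copies of a row as duplicates is an assumption; real pipelines usually deduplicate on a business key. Slowly changing dimensions are not covered here.

```python
from collections import defaultdict

# Hypothetical raw rows, including one exact duplicate.
raw_orders = [
    {"FirstName": "Ada", "LastName": "Lovelace", "Amount": 120.0},
    {"FirstName": "Ada", "LastName": "Lovelace", "Amount": 120.0},  # duplicate
    {"FirstName": "Ada", "LastName": "Lovelace", "Amount": 30.0},
    {"FirstName": "Alan", "LastName": "Turing", "Amount": 75.5},
]

def remove_duplicates(rows):
    """Drop rows that are exact copies of an earlier row, keeping order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def add_full_name(rows):
    """Derive FullName from FirstName and LastName."""
    return [{**row, "FullName": f"{row['FirstName']} {row['LastName']}"} for row in rows]

def total_by_customer(rows):
    """Aggregate Amount into one sum per FullName."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["FullName"]] += row["Amount"]
    return dict(totals)

deduplicated = remove_duplicates(raw_orders)
named = add_full_name(deduplicated)
totals = total_by_customer(named)
print(totals)  # {'Ada Lovelace': 150.0, 'Alan Turing': 75.5}
```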
In the ELT approach, after you’ve extracted your data, you immediately start the loading phase, moving all the data sources into a single, centralized data repository. Today’s cloud infrastructure supports large storage and scalable compute, so the capacity to keep an expanding pool of extracted raw data and to process it quickly is virtually unlimited.
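As a rough sketch of this load-first flow, the example below copies two sources into one repository as raw tables and then runs the transformation inside that repository with SQL. An in-memory SQLite database stands in for the data lake or warehouse, and the CSV and JSON extracts, table names, and column names are all hypothetical.

```python
import csv
import io
import json
import sqlite3

# Hypothetical extracts from two different sources.
CUSTOMERS_CSV = "customer_id,first_name,last_name\n1,Ada,Lovelace\n2,Alan,Turing\n"
ORDERS_JSON = (
    '[{"customer_id": 1, "amount": 120.0},'
    ' {"customer_id": 1, "amount": 30.0},'
    ' {"customer_id": 2, "amount": 75.5}]'
)

conn = sqlite3.connect(":memory:")  # stands in for the central repository

# Load: copy both sources into raw tables without reshaping them.
conn.execute(
    "CREATE TABLE raw_customers (customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.execute("CREATE TABLE raw_orders (customer_id INTEGER, amount REAL)")
customers = list(csv.DictReader(io.StringIO(CUSTOMERS_CSV)))
conn.executemany(
    "INSERT INTO raw_customers VALUES (:customer_id, :first_name, :last_name)",
    customers,
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (:customer_id, :amount)",
    json.loads(ORDERS_JSON),
)

# Transform: runs inside the destination, using its own SQL engine.
conn.execute("""
    CREATE TABLE customer_totals AS
    SELECT c.first_name || ' ' || c.last_name AS full_name,
           SUM(o.amount) AS total_amount
    FROM raw_customers AS c
    JOIN raw_orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.first_name, c.last_name
""")

customer_totals = conn.execute(
    "SELECT full_name, total_amount FROM customer_totals ORDER BY full_name"
).fetchall()
print(customer_totals)  # [('Ada Lovelace', 150.0), ('Alan Turing', 75.5)]
```

Because the raw tables remain in the repository, further transformations can be defined later against the same loaded data.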
In this way, the ELT approach provides a modern alternative to ETL. However, it is still evolving, so the frameworks and tools that support the ELT process are not always mature enough to handle loading and processing large amounts of data. The upside is very promising: unlimited access to all of your data at any time, less effort for developers, and less waiting for BI users and analysts.