English
AnalyticsCreator Congress 2021-detailed demo case- customer in the construction engineering industry
This session shows how AnalyticsCreator can be used to implement SCD Type 2 historization quickly in a data warehouse. The demo uses simple CSV source files for orders and people, then shows how changes are detected across monthly loads and stored with valid-from and valid-to dates.
Questions
- How does AnalyticsCreator support SCD Type 2 historization?
- Why is historization important in a data warehouse?
- How can AnalyticsCreator reduce manual work when building historized tables?
- How are CSV source files loaded into AnalyticsCreator?
- How does AnalyticsCreator detect changed records?
- How can data lineage help during a data warehouse migration?
Key Takeaways
- AnalyticsCreator helps create SCD Type 2 historization without manually coding the full logic.
- Historization is important when source systems overwrite values but analytical reporting needs historical correctness.
- The customer project involved replacing an ERP source while keeping reporting structures compatible.
- Data lineage was a key reason for using AnalyticsCreator in the new architecture.
- The new data warehouse reduced cube load time from around four hours to around 20 minutes through delta loading.
- The demo uses two CSV files: orders and people.
- People are assigned to regions, and responsibility changes are tracked historically.
- AnalyticsCreator generates the staging, persisted staging, core, and deployment structures.
- SSIS packages are generated and executed through Visual Studio.
- SCD Type 2 can be configured per column.
- Some columns can be excluded from change detection or set to SCD Type 1.
- The demo shows how valid-from and valid-to dates change after monthly loads.
Transcript
Hello everyone, and welcome back after the break.
We are starting the second part of the day. First, we have the BI Automation presentation, followed by Adesso’s session on Data Vault with AnalyticsCreator. At one o’clock, we will have the Q&A session. If you have questions, please post them in the chat.
My name is Robert Buch, and I am a BI architect at BI Automation. In this session, I will first introduce a customer project where we implemented a data warehouse with AnalyticsCreator. After that, Elmir will show a short demo focused on historization.
The customer was an international engineering consultancy with around 2,500 employees, 40 offices, and operations across five continents. Their project-related and financial data was consolidated at the headquarters in Austria. The goal was to feed an existing controlling cube with data from a new ERP source.
The customer moved from Agresso to Deltek Maconomy. They wanted the existing controlling cube, reports, and dashboards to continue working with the new source data, while still preserving compatibility with historical Agresso data.
In practice, this was complex. The old architecture used several sources, including ERP data, intranet data, Active Directory information, and CSV files. Data lineage was difficult to understand, and the old cube used full loads.
We chose AnalyticsCreator because it provides visible data lineage, automatic documentation, and generated SSIS packages. The new controlling cube now loads in around 20 minutes using delta loading, while the old full-load cube takes around four hours.
A data warehouse is important when companies need to combine multiple data sources, protect production systems, calculate KPIs, and improve performance for many users.
Historization is also essential. Production systems often overwrite changed values, which can make historical reports incorrect. Slowly Changing Dimension Type 2, or SCD Type 2, solves this by storing validity periods with fields such as Date From and Date To. This allows the warehouse to show which value was valid at a specific point in time.
Because we cannot show customer data, we prepared a simple demo using two CSV files: Orders and People.
The People table contains four regions: North, South, East, and West, with one responsible person for each region. The Orders table references the region. When the responsible person changes, the warehouse should keep the old responsibility for historical orders and use the new responsibility for new data.
Elmir creates a new AnalyticsCreator repository and a CSV connector. The same connector is used for both source files. He creates the sources, reads their structures, defines primary keys, adds imports, and creates SSIS packages.
Next, he adds historization. For Orders, he uses SCD Type 2 and sets missing source behaviour to Do Not Close, because new monthly data is loaded without replacing all historical orders. For People, he also uses SCD Type 2, but one remark field is set to SCD Type 1 so it can update without creating a new historical version.
He then creates a reference from Orders to People using the Region column, adds a simple core and data mart structure, creates a calendar dimension, and uses a macro to convert the order date into a calendar key.
Elmir creates a deployment package that generates the database structures and SSIS packages. The load process is shown through a prepared video, because the demo simulates several months by changing the system date between runs.
In the first load, the initial People and Orders data is imported. In the November load, a new person becomes responsible for the West region. AnalyticsCreator detects the change, closes the old record, and inserts the new current record. In the December load, an employee in the South region changes surname, and AnalyticsCreator creates a new historical version.
A query then shows who was responsible for each order at the relevant point in time. This confirms that SCD Type 2 historization worked correctly.
The demo shows how quickly historization can be implemented with AnalyticsCreator. If requirements change later, columns can be switched between SCD Type 2, SCD Type 1, or no tracking with only a few configuration changes.