Data Automation for Beginners: automating data warehouses, data marts, and Power BI
AnalyticsCreator automates data warehouse and data mart development by generating SQL, ETL logic, Azure Data Factory pipelines, and Power BI models from a graphical metadata model. The webinar shows how a data platform can be created, historized, deployed to Azure, and connected to Power BI faster than manual development.
Questions
- What is AnalyticsCreator?
- How can AnalyticsCreator automate data warehouse development?
- How does AnalyticsCreator create data marts and Power BI models?
- How does AnalyticsCreator handle historization?
- Can AnalyticsCreator deploy to Azure?
- How does AnalyticsCreator generate Azure Data Factory pipelines?
Key Takeaways
- AnalyticsCreator orchestrates the full lifecycle of a data warehouse, data lakehouse, or data mart.
- The application uses a graphical metadata model to generate source code instead of manual programming.
- AnalyticsCreator supports Microsoft-focused deployment targets including SQL Server, Azure, Azure Data Factory, and Power BI.
- Data Warehouse Wizard can generate a draft warehouse model from source metadata.
- The demo uses AdventureWorks 2019 as a source database.
- AnalyticsCreator can generate staging, persisted staging, core, and data mart layers.
- Historization can be configured per column using SCD logic.
- Snapshot historization can expose current and previous versions of data.
- Deployment packages can generate DACPAC files, Azure Data Factory ARM templates, and Power BI model deployment scripts.
Transcript
Hello everybody, and thank you for joining today’s virtual classroom: Data Automation for Beginners. In this session, we will show how to automate data warehouses, data marts, Power BI models, and data lakehouses, and how AnalyticsCreator can help you create a data platform up to ten times faster.
AnalyticsCreator is a data automation technology that orchestrates the full lifecycle of a data warehouse, data lakehouse, or analytical data model. It supports design, development, change management, and deployment.
AnalyticsCreator is built for experts, but also for people who are starting their data journey and want to create data warehouses and data marts for analytical applications. It uses a holistic graphical data model to generate source code instead of requiring manual programming. It is mainly used to improve agility and replace traditional ETL technologies.
We have several customer success stories on our website. One example is Robert Bosch, which built a large Microsoft-based data warehouse with thousands of users using AnalyticsCreator. The original development estimate was around 12 months for the first version. With AnalyticsCreator, it was completed in three months.
Another example is mymuesli. They started as beginners in data warehousing and are now advanced AnalyticsCreator users. You can read the full story on our website. There is also a YouTube video showing how another customer created integration layers in Azure Storage 20 times faster than originally expected.
The vision behind AnalyticsCreator is independence. AnalyticsCreator is a pure design-time tool with no runtime component. First, you create a holistic data model. From that model, AnalyticsCreator generates source code and deploys it automatically to SQL Server, Azure, Power BI, or another supported target environment.
After deployment, AnalyticsCreator is no longer required to run the solution. There is no runtime dependency and no vendor lock-in. Customers and partners can use and modify the generated source code freely.
AnalyticsCreator focuses 100% on the Microsoft stack, including Azure, SQL Server, and the wider Microsoft environment. AnalyticsCreator can export to other environments, but code generation is focused on Microsoft technologies.
AnalyticsCreator also has an open repository. All generated metadata and model definitions are stored in SQL Server, allowing customers and partners to build add-ons or extend functionality.
The architecture starts with raw data from ERP systems and other sources. With partner connectivity tools, more than 250 sources can be connected. AnalyticsCreator then creates the data warehouse, lakehouse, or analytical structure using approaches such as Kimball or Data Vault.
It can also generate tabular models for Power BI, traditional OLAP models, Qlik models, and Tableau models. Other frontend tools can connect directly to the generated data warehouse. The holistic data model covers the full process and steers the data flows, transformations, and analytical models.
The process starts by connecting to a data source. AnalyticsCreator then helps define the data and extract metadata. If metadata is available, for example from SAP or another ERP system, it can be retrieved automatically. If no metadata is available, such as with a CSV file, the structure can be described manually.
Next, an intelligent wizard uses the selected data and metadata to create a draft analytical model and data warehouse structure, from the lower layers up to the data mart layer. You can then refine the model by adding dimensions, KPIs, historization logic, and other development details. Finally, AnalyticsCreator generates the source code and deploys the solution to Azure, SQL Server, or another supported Microsoft environment.
AnalyticsCreator is highly ranked in the BARC Data Management Survey and has won several KPI awards over multiple years. The survey is available for download on our website.
AnalyticsCreator supports many use cases, including building new data warehouses, modernizing existing environments, generating Data Vault models, creating data lakehouses, and working with Synapse-based architectures.
Dimitri now walks through the process of modeling a data warehouse with AnalyticsCreator.
He begins by creating a new data warehouse repository. This repository is a SQL Server database that contains the full definition of the data warehouse system.
For this demo, the AdventureWorks 2019 database is used as the source. Dimitri adds a connector to AdventureWorks. AnalyticsCreator supports connectors such as SQL Server, Oracle, flat files, ODBC, SAP, OData, Azure Blob Storage, and more.
After updating the connection string and testing the connection successfully, the connector is ready.
Dimitri starts the Data Warehouse Wizard, which automatically creates the first draft version of the data warehouse.
For this demo, tables from the Human Resources schema are selected, and a classical Kimball data warehouse with facts and dimensions is created. AnalyticsCreator also supports Data Vault and mixed architectures.
AnalyticsCreator imports the selected tables, historizes them, creates dimensions, and generates fact transformations from Employee Department History and Job Candidate. Naming conventions are configured, such as using a dim prefix for dimensions and a fact prefix for fact transformations. A calendar dimension is also created for a defined time period.
After the wizard finishes, AnalyticsCreator displays the generated layer diagram. The source layer shows the data sources. The staging layer imports the data. The persistent staging layer stores historized tables. The core layer contains transformations, facts, and dimensions. The data mart layer exposes the structure used later for Power BI.
In the source layer, Dimitri can inspect the structure of the Human Resources Department table, including columns and data types.
In the staging layer, imported tables keep the same structure as the source tables. The import package or pipeline defines how the data is loaded and maps source columns to target columns.
Filters, variables, and scripts can be added to the import process. For example, a timestamp variable can restrict imported data to records modified after a certain point, supporting differential loading. AnalyticsCreator can generate either SSIS packages or Azure Data Factory pipelines for the import.
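The differential-loading idea described above can be sketched as a filtered import query. This is a minimal illustration, not the code AnalyticsCreator generates; the `staging` schema name and the `@LastLoadTimestamp` variable (which the generated SSIS package or Data Factory pipeline would supply) are assumptions, while the table and columns follow AdventureWorks.

```sql
-- Hypothetical differential import: only load rows modified since the last run.
DECLARE @LastLoadTimestamp DATETIME = '2024-01-01';  -- supplied by the pipeline in practice

INSERT INTO staging.Department (DepartmentID, Name, GroupName, ModifiedDate)
SELECT DepartmentID, Name, GroupName, ModifiedDate
FROM   AdventureWorks2019.HumanResources.Department
WHERE  ModifiedDate > @LastLoadTimestamp;  -- restricts the import to changed records
```

On the next run, the variable would be advanced to the highest `ModifiedDate` already loaded, so each import picks up only the delta.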
The persistent staging layer stores historized data. It includes additional columns such as Date From, Date To, and SATZ_ID, a surrogate key. Each import compares new data with historized data, detects changes, and preserves previous versions.
Historization is configured in the historization package or pipeline. Each column can use a different historization type.
SCD Type 2 closes the old row and adds a new row when a change is detected. SCD Type 1 updates the current row without storing history. Columns can also be set to ignore changes.
Missing data behavior can also be configured. Missing rows can be closed, left open, or replaced with an empty record using defined values or values from the previous valid row. Filters can restrict historization to recent source data, while older historized data remains unchanged.
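The SCD Type 2 behavior described above can be sketched in two steps: close the current version of changed rows, then insert a new open-ended version. This is a simplified illustration, not the stored procedure AnalyticsCreator generates; the `dwh` and `staging` schema names and the `PS_Department` table are assumptions, while `Date_From`/`Date_To` follow the columns described earlier.

```sql
-- Simplified SCD Type 2 sketch (illustrative names, not generated code).
DECLARE @Now DATETIME = GETDATE();

-- 1. Close the currently valid version of rows whose tracked columns changed.
UPDATE h
SET    h.Date_To = @Now
FROM   dwh.PS_Department AS h
JOIN   staging.Department AS s ON s.DepartmentID = h.DepartmentID
WHERE  h.Date_To = '9999-12-31'                       -- currently valid version
  AND (h.Name <> s.Name OR h.GroupName <> s.GroupName);

-- 2. Insert a new open-ended version for new keys (changed rows, now closed,
--    no longer match the open-version join and are re-inserted the same way).
INSERT INTO dwh.PS_Department (DepartmentID, Name, GroupName, Date_From, Date_To)
SELECT s.DepartmentID, s.Name, s.GroupName, @Now, '9999-12-31'
FROM   staging.Department AS s
LEFT JOIN dwh.PS_Department AS h
       ON h.DepartmentID = s.DepartmentID AND h.Date_To = '9999-12-31'
WHERE  h.DepartmentID IS NULL;
```

An SCD Type 1 column would instead appear in a plain `UPDATE` of the current row, with no new version inserted.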
AnalyticsCreator can also historize sources that already contain historical data. Historization is implemented through generated stored procedures, which can be reviewed and manually modified if required.
A question was raised about whether staging, persistent staging, and warehouse tables are stored separately. They are separate tables. Staging tables are usually cleared during import, while persistent staging tables remain stable and keep the historized data.
Another question asked whether history can be stored only in staging tables. Historization is optional. The wizard proposes a typical Kimball architecture with staging and persistent staging, but users can define their own architecture without strict limitations.
The core layer contains transformations, typically implemented as SQL views. AnalyticsCreator supports regular transformations, manual views, SQL Server scripts, stored procedures, and external transformations such as SSIS packages.
In the Department transformation, every source column is exposed in the view. Predefined transformations can be applied, such as trimming strings or converting null strings to “N.A.” Users can also create their own reusable transformations.
Unknown members can also be generated. If a fact references a department that does not exist in the dimension table, the reference can be mapped to an unknown member. This avoids missing dimension references in the data warehouse.
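The unknown-member mapping can be sketched as a left join with a fallback surrogate key. The table names and the `-1` key for the unknown member row are assumptions for illustration, not the generated code.

```sql
-- Map a missing dimension reference to the "unknown" member (surrogate key -1).
SELECT f.EmployeeID,
       ISNULL(d.SATZ_ID, -1) AS FK_Department   -- -1 points at the unknown member row
FROM   dwh.EmployeeDepartmentHistory AS f
LEFT JOIN dwh.Dim_Department AS d
       ON d.DepartmentID = f.DepartmentID;
```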
The Employee Department History fact transformation is generated from foreign keys imported from AdventureWorks. AnalyticsCreator identifies relationships to tables such as Department, Employee, and Shift, and builds the fact transformation automatically.
Because the joined tables are historized, AnalyticsCreator uses snapshot historization. A snapshot table stores at least one snapshot date, usually the current date. Each historized table is filtered to the version valid for that snapshot date.
If the snapshot table contains only the current date, the fact transformation returns current facts. Additional snapshot dates, such as previous month-end dates, can be added to access historical versions of the data. The snapshot table can later be used as a snapshot dimension in the data mart layer.
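Snapshot historization can be sketched as a join that filters each historized table to the version valid at each snapshot date. The table names are assumptions; the `Date_From`/`Date_To` columns follow the persisted staging layer described earlier.

```sql
-- For every snapshot date, pick the version of each historized row valid at that date.
SELECT sn.SnapshotDate,
       e.EmployeeID,
       d.Name AS DepartmentName
FROM   dwh.SnapshotDates AS sn                    -- one row per snapshot, e.g. current date
JOIN   dwh.PS_EmployeeDepartmentHistory AS e
       ON sn.SnapshotDate BETWEEN e.Date_From AND e.Date_To
JOIN   dwh.PS_Department AS d
       ON d.DepartmentID = e.DepartmentID
      AND sn.SnapshotDate BETWEEN d.Date_From AND d.Date_To;
```

Adding a month-end date to `dwh.SnapshotDates` would then return the facts exactly as they looked at that month-end.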
Snapshot historization is optional. If only current data is needed, the historization type can be set to current only.
Transformations are usually views, but they can be persisted into physical tables to improve performance. This is called materialization or persisting.
AnalyticsCreator generates persisting packages or pipelines and stored procedures for this process. Full persisting reloads the table each time. Other options include merge, historical, and incremental persisting, where only changed, new, or deleted data is handled.
Scripts can be executed before or after the persisting process.
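Full persisting can be sketched as a generated stored procedure that reloads the physical table from the transformation view on each run. The procedure and object names are assumptions; merge, historical, and incremental variants would compare rows instead of truncating.

```sql
-- Minimal "full persisting" sketch: reload the physical table from the view.
CREATE OR ALTER PROCEDURE dwh.Persist_Department
AS
BEGIN
    TRUNCATE TABLE dwh.T_Department;              -- full reload clears the target first

    INSERT INTO dwh.T_Department (DepartmentID, Name, GroupName)
    SELECT DepartmentID, Name, GroupName
    FROM   dwh.V_Department;                      -- the transformation view
END;
```

Pre- and post-scripts would run immediately before the truncate and after the insert.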
Dimitri adds calendar dimension IDs to the Employee Department History fact transformation. Instead of using raw Start Date and End Date columns, he adds foreign keys to the calendar dimension.
AnalyticsCreator prepares a calendar dimension as a view containing dates for a specific time period. This dimension can be modified or extended if required.
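A calendar dimension over a defined period can be sketched with a recursive date generator. The column names and date range are assumptions; the generated view would carry many more attributes (quarter, weekday, fiscal periods, and so on).

```sql
-- Illustrative calendar dimension source for a fixed period.
WITH Dates AS (
    SELECT CAST('2010-01-01' AS DATE) AS CalendarDate
    UNION ALL
    SELECT DATEADD(DAY, 1, CalendarDate) FROM Dates
    WHERE  CalendarDate < '2030-12-31'
)
SELECT CONVERT(INT, CONVERT(CHAR(8), CalendarDate, 112)) AS Calendar_ID,  -- YYYYMMDD key
       CalendarDate,
       YEAR(CalendarDate)            AS CalendarYear,
       DATENAME(MONTH, CalendarDate) AS MonthName
FROM   Dates
OPTION (MAXRECURSION 0);   -- allow more than the default 100 recursion levels
```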
Calendar IDs are generated using macros. A macro is a reusable T-SQL statement with placeholders. When the macro is called, the placeholders are replaced by the supplied parameters. In this case, the macro converts Start Date and End Date into calendar dimension IDs.
The columns are then renamed to FK_StartDate and FK_EndDate.
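The macro mechanism can be sketched as a T-SQL fragment with a placeholder that is expanded per column. The exact macro syntax and expression are assumptions; the idea is that one reusable definition produces the calendar ID expression for each date column.

```sql
-- Hypothetical macro definition (placeholder in angle brackets):
--   CONVERT(INT, CONVERT(CHAR(8), <DateColumn>, 112))
-- After expansion for StartDate and EndDate, the transformation view contains:
SELECT CONVERT(INT, CONVERT(CHAR(8), StartDate, 112)) AS FK_StartDate,  -- YYYYMMDD
       CONVERT(INT, CONVERT(CHAR(8), EndDate,   112)) AS FK_EndDate
FROM   dwh.PS_EmployeeDepartmentHistory;
```

Style 112 formats a date as `yyyymmdd`, which matches the integer key format used for the calendar dimension.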
Dimitri adds additional fields to Employee Pay History, such as Rate Change Date and Rate, and then synchronizes the data warehouse.
Synchronization materializes the model into SQL Server and generates the database structure. It also validates the generated SQL. When Dimitri intentionally adds an invalid expression, AnalyticsCreator detects the error and highlights the affected transformation. After removing the invalid column, synchronization succeeds.
The data mart layer is the reporting interface of the data warehouse. It contains properties used for OLAP cubes and Power BI models.
Objects from the core layer are exposed in a data star. Employee Department History is used as a fact transformation, while Department is exposed as a dimension.
Because the fact contains multiple date references, Dimitri creates role-playing calendar dimensions for Start Date, End Date, and Rate Change Date. The model is synchronized again, and the new calendar dimensions appear in the data mart layer.
Measures are then added for facts, such as distinct counts for departments and employees, sums for rate, and a measure for Job Candidate. Measure names can be generated from templates to ensure uniqueness in Power BI or OLAP models.
The completed model can be stored in the AnalyticsCreator cloud. Each user has their own cloud storage, and previous versions remain available when a repository is saved again.
The repository can also be saved as a SQL file and later reloaded. Because it is a text-based SQL file, it can be stored in version control systems such as Git.
At this point, the warehouse exists as a model. To create the physical data warehouse, Dimitri generates a deployment package.
AnalyticsCreator can deploy on-premises or to the Microsoft cloud. It can generate SQL Server databases, Azure SQL databases, SSIS packages, Azure Data Factory pipelines, multidimensional or tabular OLAP cubes, and Power BI models.
For this demo, the warehouse is deployed to an Azure SQL database named AV. AnalyticsCreator generates DACPAC files for database deployment. Direct deployment from AnalyticsCreator is useful for test and development environments, while production deployments can be handled using Visual Studio solutions or Microsoft deployment tools such as sqlpackage.exe.
AnalyticsCreator generates an XMLA script containing the Power BI model definition and deploys it to a Power BI Premium workspace through the XMLA endpoint.
It also generates Azure Data Factory pipelines for import, historization, persisting, and workflow orchestration. The workflow pipeline executes the other pipelines in the correct order.
After deployment, the Azure SQL database contains the generated warehouse tables, and the Power BI workspace contains the new dataset with the same structure as the data mart layer.
The Azure Data Factory ARM template generated by AnalyticsCreator is then imported into Azure Data Factory, creating the required pipelines and resources.
In Azure Data Factory, the generated pipelines include import, historization, persisting, and workflow pipelines. Linked services are configured for the AdventureWorks source and Azure SQL target.
The workflow pipeline is executed, loading data from the on-premises AdventureWorks database into Azure SQL. Once the pipeline succeeds, the Power BI dataset credentials are configured and the model is refreshed.
Power BI then creates Quick Insights based on the loaded data. At this point, the data warehouse has been created from scratch, deployed to Azure, connected to Power BI, and populated with data.
The same model can also be deployed on-premises or in a mixed environment, for example with the data warehouse on-premises and Power BI in the cloud. The model does not need to change.
The model shown in the demo was generated by the Data Warehouse Wizard, but users can also build their own models. AnalyticsCreator supports complex data warehouse projects, including SAP-based environments with multiple data stars.
AnalyticsCreator is also a data lineage tool. Users can select a source table and immediately see where its data is used across transformations, facts, data marts, and reporting structures. This makes it easier to understand how data flows and transforms across large data warehouse projects.
You can request a trial version on the AnalyticsCreator website. The recording will also be available on YouTube, so you can follow the demo step by step during your trial.