March 9th, 2022 – The Incremental ETL Architecture by John O’Dwyer of Databricks

Welcome to this month’s meeting! As a member of the Oregon Data Community, you are also a member of the PNW Data Community. All of the local data groups are working together, with each of us taking turns hosting the monthly meetings. Yes, meetings! We will always have our “evening” meeting on the 2nd Wednesday of the month at 4 PM PST, and a second meeting on the 4th Wednesday of the month at 12 PM PST.

We are now using Teams! The link will be published closer to the meeting.

To encourage questions during the presentation, we ask people to use the raise-hand option in Teams rather than talk over each other. Please be mindful of the other attendees!

We will be starting at 4:00 PM PST/7:00 PM EST.

Schedule (may be fluid):
4:00 – 4:15 PM – Announcements and other information
4:15 PM – 1st Presentation

Our wonderful sponsors are providing gifts each month!

Title: The Incremental ETL Architecture

Abstract: Incremental ETL in a conventional Data Warehouse has been possible for some time, but scale, cost, accounting for state, and the lack of access for machine learning make it less than ideal. Until now, Incremental ETL in a Data Lake has not been possible because of difficulties such as updating data and identifying changed data in a big data table. Incremental ETL also makes the medallion table architecture possible and efficient, so that all consumers of data can have the correct curated data sets for their needs. We will discuss the advances in big data technology that make Incremental ETL possible, as well as the architecture as a whole.

Bio: John O’Dwyer is a Developer Advocate at Databricks, where he helps empower the Databricks, Spark, Delta Lake, and MLflow communities.

Here’s John!


