Ryan Chandler, Senior Data Scientist, Caterpillar

Rapid Graph ETL using Azure Functions

Most organizations have a wide variety of data sources, each with its own format for storing information. Reshaping this data so it can be loaded into Neo4j can prove to be a daunting task, especially as users continue to add to their data model.

The solution I developed for Security Data Intelligence at Microsoft addresses this problem. Using the Microsoft Azure platform, we're able to scale our graph Extract-Transform-Load (ETL) operations from the smallest to the largest datasets, pull from virtually any platform in any format, and refresh the data at whatever rate the end user desires. In this talk, I'll walk you through our use of Azure Function Apps and Azure Storage Accounts to accomplish this task. Plus, the code used in our solution will be made available on GitHub with a few examples so you can try it on your own.
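
To give a feel for the transform step in a graph ETL of this kind, here is a minimal sketch in Python. It is not the code from the talk: the `Device` label, the `id` key column, the sample data, and the `csv_to_batches` helper are all hypothetical, chosen only for illustration. It assumes the extract step has already read a CSV file (for example, from a storage account) into a string, and it shapes the rows into batches that could be passed as the `$rows` parameter of a Cypher `UNWIND ... MERGE` query.

```python
import csv
import io

# Hypothetical Cypher template; the Device label and id key are assumptions.
MERGE_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (n:Device {id: row.id}) "
    "SET n += row.props"
)


def csv_to_batches(csv_text, key_column="id", batch_size=2):
    """Yield lists of {'id': ..., 'props': {...}} dicts, batch_size rows at a time."""
    reader = csv.DictReader(io.StringIO(csv_text))
    batch = []
    for record in reader:
        # Separate the merge key from the remaining properties.
        key = record.pop(key_column)
        batch.append({"id": key, "props": dict(record)})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


# Made-up sample input standing in for a file pulled from storage.
sample = "id,hostname,os\n1,alpha,linux\n2,beta,windows\n3,gamma,linux\n"
batches = list(csv_to_batches(sample))
```

In a serverless setup, a function triggered on a schedule or by a new file could call a helper like this and send each batch to Neo4j together with `MERGE_QUERY`. Because `MERGE` matches on the key before creating a node, rerunning the load on refreshed data updates existing nodes rather than duplicating them.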

About

Cory Gehr is a Service Engineer in Microsoft's Digital Security and Risk Engineering group where he focuses on Data Intelligence. His primary goal is investigating the use of graph databases to mitigate threats to the company and to improve the incident response process.