Using Jupyter notebooks with Papermill and Dagster
In this tutorial, we'll walk you through integrating a Jupyter notebook with Dagster using an example project. Before we get started, let's cover some common approaches to writing and integrating Jupyter notebooks with Dagster:
- Doing standalone development in a Jupyter notebook. You could then create two Dagster assets: one for the notebook itself and another for data-fetching logic. This approach, which we'll use to start the tutorial, allows you to configure existing notebooks to work with Dagster.
- Using existing Dagster assets as input to notebooks. If the data you want to analyze is already a Dagster asset, you can directly load the asset's value into the notebook. When the notebook is complete, you can create a Dagster asset for the notebook and factor any data-fetching logic into a second asset, if applicable. This approach allows you to develop new notebooks that work with assets that are already a part of your Dagster project.
By the end of this tutorial, you will:
- Explore a Jupyter notebook that fetches and explores a dataset
- Create a Dagster asset from the notebook
- Create a second Dagster asset that only fetches the dataset
- Load existing Dagster assets into a new Jupyter notebook