Simplify your Team's Data Pipelines with the Xarray Workshop


Xarray Workshop Overview

This two-hour workshop introduces participants to the Xarray project for manipulating multi-channel data (which occur commonly in the geosciences, for example). Participants practice using Xarray for data analysis, extending techniques from Pandas and NumPy to high-dimensional labeled arrays.

We assume participants have prior experience with the Python language and, in particular, with standard Python tools for data analysis (notably NumPy, Pandas, and Jupyter). Some prior experience working with NetCDF, HDF, or related file formats for representing scientific datasets is useful but not required.

Learning Objectives

At the end of this workshop, participants should be able to:


  • Explore & analyze high-dimensional Xarray labeled data with NumPy- or Pandas-style operations (e.g., groupby, indexing, selection, broadcasting)

  • Design Dask-based pipelines with Xarray for out-of-core computation on large datasets

  • Persist or ingest Xarray core data structures using various standard file formats
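
To give a flavor of the first objective, here is a minimal sketch of label-based selection, broadcasting by dimension name, and groupby on a small in-memory array. It is not taken from the workshop materials: the station names, dates, and values are made up for illustration, and it assumes NumPy, Pandas, and Xarray are installed.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: 6 daily readings at 3 stations (all values made up).
times = pd.date_range("2024-01-30", periods=6, freq="D")
stations = ["a", "b", "c"]
values = np.arange(18, dtype=float).reshape(6, 3)

temps = xr.DataArray(
    values,
    coords={"time": times, "station": stations},
    dims=("time", "station"),
    name="temperature",
)

# Label-based selection (Pandas-style) rather than positional indexing.
station_b = temps.sel(station="b")

# Broadcasting by dimension name: the offset aligns on "station"
# and is applied to every time step.
offset = xr.DataArray([0.0, 1.0, 2.0], coords={"station": stations}, dims="station")
adjusted = temps + offset

# Groupby on a datetime component: mean over time within each calendar month.
monthly_mean = temps.groupby("time.month").mean()
```

Because every operation refers to dimensions and coordinates by name, the same code keeps working if the axis order of the underlying array changes.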

