Featured workshops

  • Instructors: Philipp Angerer, Lukas Heumos

    An introduction to the cookiecutter template and how to use it. We will also introduce the scverse ecosystem and explain how to get included in it.

  • Instructor: Mikaela Koutrouli

    An overview of how to make your first contribution to the scverse community, even if you are not a developer. Topics include replying to queries on Discourse, contributing to the documentation and the best practices book, posting and using issues (even for problems you have already solved), and where to find the contributing guidelines.

  • Instructor: Luca Marconato

    Spatially resolved omics technologies are revolutionizing our understanding of biological tissues by introducing new ways to characterize tissue architectures and identify cell-cell interactions. However, handling large single-cell and spatial omics datasets poses significant challenges due to data volume and heterogeneity. AnnData and SpatialData offer innovative, robust solutions for these issues.

    The workshop will provide an in-depth look at the SpatialData framework, a Python-based solution for managing spatial omics datasets, and its interplay with the AnnData Python package. Participants will learn to ingest and represent data from various commercial technologies in a unified format. The session will cover techniques for manipulating multi-sample, multi-modal datasets, including constructing common coordinate frameworks via coordinate transformations.

    Attendees will discover methods for creating detailed visualizations, interactively exploring large datasets, and storing the annotations in the AnnData and SpatialData formats. Additionally, the workshop will highlight data manipulation functions such as cropping and annotation transfer, as well as deep learning applications, and will discuss how to ensure interoperability across different software platforms and programming languages for scalable, reproducible workflows.

  • Instructor: Severin Dicks

    Learn how to incorporate NVIDIA GPU acceleration in your day-to-day data analysis. This workshop will demonstrate how to use RAPIDS-SingleCell to accelerate traditional Scanpy and other scverse workflows. Attendees will gain practical insights into leveraging GPUs outside of deep learning, exploring the capabilities of RAPIDS and CuPy. Discover how these tools can enhance your package and improve your single-cell data analysis efficiency.

  • Instructor: Roshan Sharma

  • Instructor: Ilan Gold

    As both individual datasets and atlases grow, so too should the capacities of our data structures. Listen in to find out how AnnData is addressing big data and preparing for the future.

  • Instructor: Yimin Zheng

    Biological data visualization is challenged by the growing complexity and size of datasets. Because single-plot visualization methods struggle to capture the full picture of a dataset, researchers turn to composable visualizations. These are usually specialized to a domain, so using them requires familiarity with multiple visualization tools. To unify the creation of composable visualizations, we introduce a novel and intuitive general visualization paradigm termed "cross-layout", which integrates multiple plot types in a cross-like structure. This paradigm allows for a central main plot surrounded by secondary plots, each capable of layering additional features for enhanced context and understanding. To operationalize this paradigm, we present "Marsilea", a Python library designed for creating composable visualizations with ease. Marsilea is notable for its modularity, diverse plot types, and compatibility with various data formats. This talk will give attendees insight into composable visualizations, and they will learn how to use Marsilea to express different aspects of their single-cell or spatial omics data in a composable visualization. Marsilea is accessible to everyone with basic knowledge of Python and is open source at https://github.com/Marsilea-viz/marsilea.

  • Instructor: Clarence Mah

    The goal of this workshop is to demonstrate how spatial transcriptomics data can be used to extract subcellular insights about cell biology. It primarily emphasizes analysis with Bento, a package in the scverse ecosystem that utilizes the SpatialData framework, enabling interoperability with Scanpy and Squidpy. The workshop will include a brief intro to spatial transcriptomics technologies capable of subcellular measurements. Then we will introduce hands-on exercises (Jupyter notebooks) for spatial analysis, such as annotating subcellular spatial patterns and domains, gene colocalization, and measuring cell morphology. We will conclude with a short Q&A session for more open-ended discussion.

  • Instructor: Chris Tastad

    Advancements in tools and data management ecosystems have led to a mature state for collaboration across research compute efforts. The refinement of code hosting, dependency management, DevOps, documentation, and automation has established a range of workflows that remove barriers and accelerate scientific endeavors across teams and communities. Still, many of these systems place an emphasis on the management and organization of code. The complexity and importance of code base management cannot be overstated, but there remains a comparable set of challenges around handling the evolution of a shared dataset. Much like shared code, shared data can be complex, iterative, transitory, and distributed. Collaborative efforts with shared data also retain the same needs for data traceability, reproducibility, and portability. Among all available collaborative resources, a common problem stands in the way of addressing these needs: data are large.

    To confront this, our group has implemented an approach to using Data Version Control (DVC) for team science with bioinformatics applications and single cell data. DVC is a data science toolset created by Iterative.ai for tracking changes to inputs in machine learning workflows. At its core, this resource offers a generic framework for codified data versioning and provenance which sits on top of well-established tools. A key element of the DVC paradigm is the separation of data from metadata in change tracking: object-linked metadata are managed through git, while the objects themselves can be stored in separate remote storage. This work offers a guide to the DVC framework that is intended to serve the specific needs of bioinformatics and single cell data. We show how to tailor the tracking of data so that it efficiently addresses the challenges posed by the diverse set of intermediary single cell outputs. We also place a particular emphasis on the need to bridge gaps in data hand-offs that may exist in hybrid collaborations between dry and wet lab scientists. Altogether, we hope this application of a lesser-known toolset offers an on-ramp to the adoption of stronger data management practices, enabling collaboration across single cell biology and beyond.

  • Instructors: Robrecht Cannoodt, Malte Luecken

    The "Benchmarking Open Problems in Single Cell Analysis" workshop aims to address critical challenges in the field by fostering community engagement in the development of robust benchmarks. This 90-minute session will introduce participants to the core mission of Open Problems in Single-Cell Analysis, followed by an interactive session where we will build a new benchmark from scratch. As part of this tutorial, participants will learn not only the technical aspects of setting up a benchmark within the Open Problems framework, but also essential best practices in benchmarking computational methods.

  • Instructors: Louise Deconinck, Benjamin Rombaut, Robrecht Cannoodt

    In order to use the best performing methods for each step of the single-cell analysis process, bioinformaticians need to use multiple ecosystems and programming languages. This is unfortunately not that straightforward. We will give an overview of the different levels of interoperability, and how it is possible to integrate them in a single workflow.

    For package developers, making methods accessible is important. We will provide information on how to do this well on the package and method level.

  • Instructors: Ryan Williams, Maximillian Lombardo

    As the field of single-cell RNA sequencing continues to evolve, researchers are increasingly interested in using these datasets to train foundational models for a wide range of applications. While training models on smaller datasets that fit into memory is relatively straightforward, scaling up beyond single machines presents significant technical challenges. In this workshop co-presented by TileDB and the Chan Zuckerberg Initiative, participants will receive hands-on experience training models on a large dataset comprising 70 million cells and learn about the key technologies and resources that make this possible.
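
The Data Version Control workshop above describes keeping small metadata files under git while the large data objects live in separate storage. The following is a minimal, standard-library-only sketch of that idea, not DVC's actual implementation or file format. The function names `track` and `restore`, the `.ptr.json` pointer file, and the store layout are all hypothetical and chosen for illustration only.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def track(data_file: Path, store: Path) -> Path:
    """Copy data_file into a content-addressed store and write a pointer file.

    The pointer file is small and text-based, so it is the part one would
    commit to git. The store stands in for remote object storage.
    """
    digest = file_digest(data_file)
    obj = store / digest[:2] / digest[2:]
    obj.parent.mkdir(parents=True, exist_ok=True)
    if not obj.exists():  # identical content is stored only once
        shutil.copy2(data_file, obj)
    pointer = data_file.with_name(data_file.name + ".ptr.json")
    meta = {
        "path": data_file.name,
        "hash": digest,
        "size": data_file.stat().st_size,
    }
    pointer.write_text(json.dumps(meta, indent=2))
    return pointer


def restore(pointer: Path, store: Path) -> Path:
    """Recreate the data file next to its pointer from the store."""
    meta = json.loads(pointer.read_text())
    digest = meta["hash"]
    target = pointer.with_name(meta["path"])
    shutil.copy2(store / digest[:2] / digest[2:], target)
    return target
```

In this sketch, a collaborator who checks out only the pointer file can call `restore` to fetch the matching data object, which is the kind of hand-off the abstract refers to.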


We also invite scientists, professionals, and developers working in the field of computational biology to submit proposals for tutorials and workshops at the scverse conference 2024. The purpose of the Tutorials and Workshops program is to build knowledge and provide hands-on training sessions on relevant packages within the scverse community.

Workshops can take any form, ranging from presentations to brief talks or panel discussions, but should also include hands-on exercises. Potential topics for tutorials and workshops include but are not limited to:

  • Data analysis of single-cell data across modalities

  • AI and Machine Learning for single-cell data

  • Guiding developers on how to make sustainable and reusable software

  • Best practices for data analysis and/or coding standards (e.g., GitHub Actions, Python coding standards, etc.)
