Core concepts
Before diving into development, let's explore the fundamental concepts of DipDup. Key terms are highlighted in bold and will be examined in detail in subsequent sections.
Overview
DipDup is a Python SDK for building customized blockchain indexers. These indexers are off-chain services that collect, process, and store data from blockchain networks in a structured database format.
Each DipDup indexer consists of two main components: a YAML configuration file and a Python package. The configuration file defines which contracts to index, what data to extract, and where to store the results. DipDup's configuration system supports advanced features like templates, environment variable substitution, and multi-file merging, enabling fully declarative indexer definitions. If you're familiar with The Graph, you'll notice similarities to Subgraph manifests.
An index represents a collection of contracts and processing rules treated as a single entity. Your configuration can include multiple indices that operate in parallel, but they cannot share data, because the order in which they execute isn't guaranteed.
The Python package contains several components:
- ORM models that define your domain-specific data structures
- Handlers (callbacks) that implement business logic for transforming blockchain data into your models
- Optional components such as typeclasses, SQL scripts, and GraphQL queries that extend DipDup's functionality
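To make the split between models and handlers concrete, here is a minimal sketch in plain Python. It deliberately avoids DipDup's own classes: `Holder`, `TransferEvent`, `InMemoryStore`, and `on_transfer` are hypothetical names chosen for illustration, with a dataclass standing in for an ORM model and a dictionary standing in for the database.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Holder:
    """Stand-in for an ORM model: one row per token holder."""
    address: str
    balance: int = 0


@dataclass
class TransferEvent:
    """Hypothetical decoded event, roughly what a typeclass would describe."""
    sender: str
    recipient: str
    amount: int


class InMemoryStore:
    """Toy replacement for the database, keyed by holder address."""

    def __init__(self) -> None:
        self.holders: dict[str, Holder] = {}

    def get_or_create(self, address: str) -> Holder:
        # Return the existing row or create an empty one.
        return self.holders.setdefault(address, Holder(address))


async def on_transfer(store: InMemoryStore, event: TransferEvent) -> None:
    """Handler: business logic that turns one event into model updates."""
    store.get_or_create(event.sender).balance -= event.amount
    store.get_or_create(event.recipient).balance += event.amount


store = InMemoryStore()
asyncio.run(on_transfer(store, TransferEvent("alice", "bob", 10)))
print(store.holders["bob"].balance)
```

In a real project the models would be ORM classes and the handler would receive a context object and typed blockchain data from the framework; the shape of the logic stays the same.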
The result is a service that populates your database with indexed blockchain data, which you can then use to build custom API backends or integrate with existing solutions. DipDup provides built-in Hasura GraphQL Engine integration, exposing your indexed data through REST and GraphQL APIs with zero configuration. Alternatively, you can use other API engines like PostgREST or develop your own.
Multi-chain support
DipDup now supports multiple blockchain ecosystems:
- EVM-compatible (Ethereum, Binance Smart Chain, etc.)
- Starknet
- Substrate-based blockchains (Polkadot, Kusama, etc.)
- Tezos
This multi-chain capability allows you to build indexers for a wide range of decentralized applications across different blockchain environments using a consistent development approach.
Storage architecture
DipDup uses PostgreSQL or SQLite as its database backend. All data is stored in a single database schema created during the initial run. This schema should be exclusively used by DipDup, as changes to index configuration or models trigger reindexing — a process that drops the entire schema and restarts indexing from scratch. You can, however, mark specific tables as immune to preserve them or configure custom actions for different reindexing scenarios.
We do not recommend using database schema migrations, as they add complexity and can cause data consistency issues. When DipDup detects changes to models or index definitions, it raises ReindexingRequiredException. If you really need migrations, see the Migrations section to enable aerich integration.
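The text above does not say how changes are detected. One plausible approach, shown here purely as an assumption, is to store a fingerprint of the model definitions and compare it on the next start. The `ReindexingRequired` class, `fingerprint`, and `check_schema` below are hypothetical helpers for this sketch, not DipDup's API.

```python
import hashlib
import json


class ReindexingRequired(Exception):
    """Local stand-in for this sketch; not DipDup's exception class."""


def fingerprint(definition: dict) -> str:
    # Serialize with sorted keys so that key order does not affect the hash.
    canonical = json.dumps(definition, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def check_schema(stored_hash, models: dict) -> str:
    """Return the current fingerprint, or raise if it differs from the stored one."""
    current = fingerprint(models)
    if stored_hash is not None and stored_hash != current:
        raise ReindexingRequired("model definitions changed since the last run")
    return current


models_v1 = {"holder": {"address": "text", "balance": "int"}}
models_v2 = {"holder": {"address": "text", "balance": "int", "turnover": "int"}}

stored = check_schema(None, models_v1)  # first run: nothing to compare against
check_schema(stored, models_v1)         # unchanged models: passes silently

try:
    check_schema(stored, models_v2)     # a new column changes the fingerprint
    changed = False
except ReindexingRequired:
    changed = True
```

Any scheme of this kind treats every difference as a reason to reindex, which is simpler than working out whether existing rows are still valid under the new definitions.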
Updates are applied atomically on a block-by-block basis, ensuring data integrity. If indexing is interrupted, DipDup will check the database state upon restart and continue from the last processed block. The DipDup state is stored in the database for each index and can be used by API consumers to determine the current indexer head.
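The block-by-block guarantee can be illustrated with the standard `sqlite3` module. This is a sketch of the general technique (writing a block's data and the index level in one transaction), not DipDup's internal code; the table names `transfers` and `index_state` and the functions `process_block` and `last_level` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (level INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE index_state (name TEXT PRIMARY KEY, level INTEGER)")
conn.execute("INSERT INTO index_state VALUES ('my_index', 0)")
conn.commit()


def process_block(conn, index_name, level, amounts):
    """Write a block's data and advance the index level in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        for amount in amounts:
            if amount < 0:
                raise ValueError("malformed transfer")
            conn.execute("INSERT INTO transfers VALUES (?, ?)", (level, amount))
        conn.execute(
            "UPDATE index_state SET level = ? WHERE name = ?",
            (level, index_name),
        )


def last_level(conn, index_name):
    """Read the last fully processed level for an index."""
    row = conn.execute(
        "SELECT level FROM index_state WHERE name = ?", (index_name,)
    ).fetchone()
    return row[0]


process_block(conn, "my_index", 1, [5, 7])
try:
    process_block(conn, "my_index", 2, [3, -1])  # fails midway through block 2
except ValueError:
    pass

resume_from = last_level(conn, "my_index") + 1
row_count = conn.execute("SELECT COUNT(*) FROM transfers").fetchone()[0]
```

Because the partial write from block 2 is rolled back together with the level update, the stored level never runs ahead of the data, and a restart simply resumes from `resume_from`. API consumers can read the same state row to learn the current indexer head.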
Handling chain reorganizations
Chain reorganizations occur when some blocks (including all their operations) are rolled back in favor of others with higher fitness. DipDup processes these reorganization events by restoring a previous database state. You can also implement custom rollback logic by modifying the on_index_rollback system hook in Python.
Keep in mind that automatic database rollback works only with ORM models. If you use raw SQL queries, you need to handle reorganizations manually.
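A conceptual sketch of how restoring a previous state can work: each tracked write records the value it replaced, tagged with the block level, so a rollback can undo levels in reverse order. This is an illustration of the idea only and makes no claim about DipDup's internals; `VersionedStore` and its methods are hypothetical.

```python
from collections import defaultdict

_MISSING = object()  # marks keys that did not exist before a write


class VersionedStore:
    """Key-value store that can undo every write made above a given level."""

    def __init__(self):
        self.data = {}
        self._undo = defaultdict(list)  # level -> [(key, previous value)]

    def set(self, level, key, value):
        # Remember what was there before, then apply the change.
        self._undo[level].append((key, self.data.get(key, _MISSING)))
        self.data[key] = value

    def rollback(self, to_level):
        """Revert all changes recorded at levels above `to_level`."""
        levels = sorted((lv for lv in self._undo if lv > to_level), reverse=True)
        for level in levels:
            for key, previous in reversed(self._undo.pop(level)):
                if previous is _MISSING:
                    self.data.pop(key, None)
                else:
                    self.data[key] = previous


store = VersionedStore()
store.set(1, "alice", 10)
store.set(2, "alice", 15)
store.set(2, "bob", 5)
store.rollback(1)  # block 2 was reorganized away
print(store.data)
```

The sketch also shows why raw SQL needs manual handling: a write that bypasses `set` leaves no undo record, so the rollback has no way of knowing it happened.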