Quickstart

This page will guide you through the steps to get your first selective indexer up and running in a few minutes without getting too deep into the details.

A selective blockchain indexer is an application that extracts and organizes specific blockchain data from multiple data sources, rather than processing all blockchain data. It allows users to index only relevant entities, reducing storage and computational requirements compared to full node indexing, and query data more efficiently for specific use cases. Think of it as a customizable filter that captures and stores only the blockchain data you need, making data retrieval faster and more resource-efficient. DipDup is a framework that helps you implement such an indexer.

Let's create an indexer for the USDt token contract. Our goal is to save all token transfers to the database and then calculate some statistics of its holders' activity.

Install DipDup

A modern Linux/macOS distribution with Python 3.12 installed is required to run DipDup.

The recommended way to install the DipDup CLI is with pipx. We also provide a convenient helper script that installs all necessary tools. Run the following command in your terminal:

Terminal
curl -Lsf https://dipdup.io/install.py | python3.12

See the Installation page for all options.

Create a project

DipDup CLI has a built-in project generator. Run the following command in your terminal:

Terminal
dipdup new

Choose "From template", then the "EVM" network and the "demo_evm_events" template.

Note
Want to skip the tutorial and start from scratch? Choose "Blank" at the first step instead and proceed to the Config section.

Follow the instructions; the project will be created in a new directory.

Write a configuration file

In the project root, you'll find a file named dipdup.yaml. It's the main configuration file of your indexer. We will discuss it in detail in the Config section; for now, it has the following content:

dipdup.yaml
spec_version: 3.0
package: demo_evm_events

datasources:
  subsquid:
    kind: evm.subsquid
    url: ${SUBSQUID_URL:-https://v2.archive.subsquid.io/network/ethereum-mainnet}
  etherscan:
    kind: abi.etherscan
    url: ${ETHERSCAN_URL:-https://api.etherscan.io/api}
    api_key: ${ETHERSCAN_API_KEY:-''}
  evm_node:
    kind: evm.node
    url: ${NODE_URL:-https://eth-mainnet.g.alchemy.com/v2}/${NODE_API_KEY:-''}
    ws_url: ${NODE_WS_URL:-wss://eth-mainnet.g.alchemy.com/v2}/${NODE_API_KEY:-''}

contracts:
  eth_usdt:
    kind: evm
    address: 0xdac17f958d2ee523a2206206994597c13d831ec7
    typename: eth_usdt

indexes:
  eth_usdt_events:
    kind: evm.events
    datasources:
      - subsquid
      - etherscan
      - evm_node
    handlers:
      - callback: on_transfer
        contract: eth_usdt
        name: Transfer

Generate types and stubs

Now it's time to generate typeclasses and callback stubs based on definitions from config. Examples below use demo_evm_events as a package name; yours may differ.

Run the following command:

Terminal
dipdup init

DipDup will create a Python package demo_evm_events with everything you need to start writing your indexer. Use the dipdup package tree command to see the generated structure:

Terminal
$ dipdup package tree
demo_evm_events [.]
├── abi
│   └── eth_usdt/abi.json
├── configs
│   ├── dipdup.compose.yaml
│   ├── dipdup.sqlite.yaml
│   ├── dipdup.swarm.yaml
│   └── replay.yaml
├── deploy
│   ├── .env.default
│   ├── Dockerfile
│   ├── compose.sqlite.yaml
│   ├── compose.swarm.yaml
│   ├── compose.yaml
│   ├── sqlite.env.default
│   └── swarm.env.default
├── graphql
├── handlers
│   └── on_transfer.py
├── hasura
├── hooks
│   ├── on_index_rollback.py
│   ├── on_reindex.py
│   ├── on_restart.py
│   └── on_synchronized.py
├── models
│   └── __init__.py
├── sql
├── types
│   └── eth_usdt/evm_events/transfer.py
└── py.typed

That's a lot of files and directories! But don't worry, we will only need the models and handlers directories in this guide.

Define data models

DipDup supports storing data in SQLite, PostgreSQL and TimescaleDB databases. We use a modified Tortoise ORM library as an abstraction layer.

First, you need to define a model class. DipDup uses model definitions both for database schema and autogenerated GraphQL API. Our schema will consist of a single model Holder with the following fields:

address: account address
balance: token amount held by the account
turnover: total amount of transfer/mint calls
tx_count: number of transfers/mints
last_seen: time of the last transfer/mint (stored as a block level)

Here's how to define this model in DipDup:

models/__init__.py
from dipdup import fields
from dipdup.models import CachedModel


class Holder(CachedModel):
    address = fields.TextField(primary_key=True)
    balance = fields.DecimalField(decimal_places=6, max_digits=20, default=0)
    turnover = fields.DecimalField(decimal_places=6, max_digits=20, default=0)
    tx_count = fields.BigIntField(default=0)
    last_seen = fields.BigIntField(null=True)

    class Meta:
        maxsize = 2**18

Using the ORM is not a requirement; DipDup also provides helpers to run SQL queries and scripts directly. See the Database page.

Implement handlers

Everything's ready to implement the actual indexer logic.

Our task is to index all the balance updates. Add the following code to the on_transfer handler callback to process matched logs:

handlers/on_transfer.py
from decimal import Decimal

from demo_evm_events import models as models
from demo_evm_events.types.eth_usdt.evm_events.transfer import TransferPayload
from dipdup.context import HandlerContext
from dipdup.models.evm import EvmEvent
from tortoise.exceptions import DoesNotExist


async def on_transfer(
    ctx: HandlerContext,
    event: EvmEvent[TransferPayload],
) -> None:
    amount = Decimal(event.payload.value) / (10**6)
    if not amount:
        return

    await on_balance_update(
        address=event.payload.from_,
        balance_update=-amount,
        level=event.data.level,
    )
    await on_balance_update(
        address=event.payload.to,
        balance_update=amount,
        level=event.data.level,
    )


async def on_balance_update(
    address: str,
    balance_update: Decimal,
    level: int,
) -> None:
    try:
        holder = await models.Holder.cached_get(pk=address)
    except DoesNotExist:
        holder = models.Holder(
            address=address,
            balance=0,
            turnover=0,
            tx_count=0,
            last_seen=None,
        )
        holder.cache()
    holder.balance += balance_update
    holder.turnover += abs(balance_update)
    holder.tx_count += 1
    holder.last_seen = level
    await holder.save()

And that's all! We can run the indexer now.

Next steps

Run the indexer in memory:

dipdup run

Store data in a SQLite database (the file is created in /tmp by default; set the SQLITE_PATH environment variable to change the location):

dipdup -c . -c configs/dipdup.sqlite.yaml run

Or spawn a Compose stack with PostgreSQL and Hasura:

cd deploy
cp .env.default .env
# Edit .env file before running
docker-compose up

DipDup will fetch all the historical data and then switch to realtime updates. You can check the progress in the logs.

If you use SQLite, run this query to check the data:

sqlite3 /tmp/demo_evm_events.sqlite 'SELECT * FROM holder LIMIT 10'
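
The same check can be scripted with Python's built-in sqlite3 module. This is a minimal sketch that assumes the table is named holder with columns matching the model fields. The helper names are hypothetical, and rows are ordered by tx_count because the storage type of the decimal columns may not sort numerically.

```python
import sqlite3
from decimal import Decimal


def open_readonly(path: str) -> sqlite3.Connection:
    """Open the database read-only so the running indexer is not disturbed."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def most_active_holders(
    conn: sqlite3.Connection,
    limit: int = 10,
) -> list[tuple[str, Decimal, int]]:
    """Return (address, balance, tx_count) rows, busiest holders first."""
    rows = conn.execute(
        "SELECT address, balance, tx_count FROM holder "
        "ORDER BY tx_count DESC LIMIT ?",
        (limit,),
    ).fetchall()
    # Convert via str() so the result does not depend on the column storage type.
    return [
        (address, Decimal(str(balance)), tx_count)
        for address, balance, tx_count in rows
    ]


# Usage, once the indexer has written some data:
# conn = open_readonly("/tmp/demo_evm_events.sqlite")
# for row in most_active_holders(conn):
#     print(row)
```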

If you run a Compose stack, open http://127.0.0.1:8080 in your browser to see the Hasura console (the exposed port may differ). You can use it to explore the database and build GraphQL queries.
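
Queries built in the console can also be sent from a script. The sketch below prepares a POST request for Hasura's /v1/graphql endpoint using only the standard library. The field names are assumed to mirror the Holder model, and the actual schema depends on how Hasura is configured for your project. The build_holders_request helper is hypothetical, and a secured instance may require an additional authentication header.

```python
import json
import urllib.request

# Assumed query shape: field names follow the Holder model defined earlier.
HOLDERS_QUERY = """
query MostActiveHolders($limit: Int!) {
  holder(limit: $limit, order_by: {tx_count: desc}) {
    address
    balance
    tx_count
  }
}
"""


def build_holders_request(endpoint: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request for the busiest holders."""
    payload = {"query": HOLDERS_QUERY, "variables": {"limit": limit}}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (requires the Compose stack to be running):
# request = build_holders_request("http://127.0.0.1:8080/v1/graphql")
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```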

Congratulations! You've just created your first DipDup indexer. Proceed to the Getting Started section to learn more about DipDup configuration and features.
