Scaling Kaspa's Data Infrastructure

What Kaspa development can provide, what exchanges and wallet providers must build themselves, and how other chains solved this problem.

March 2026 • Updated with exchange infrastructure analysis

The Problem — Reframed

This Isn't a One-Size-Fits-All Problem

The initial framing — "fix the explorer database" — assumed every exchange and wallet uses the same stack. They don't. Not even close.

Coinbase runs Kafka with 30 brokers processing billions of events daily into a Delta Lake. Kraken runs Rust microservices with Aeron for microsecond-latency messaging. Binance runs isolated Kafka clusters across regions. A small exchange might use a third-party API provider and never touch a node.

The core issue: Kaspa's existing indexers are opinionated — they output to one specific backend (PostgreSQL or a proprietary local store). Exchanges that use Kafka, Snowflake, ClickHouse, gRPC, or anything else have to build their entire ingestion pipeline from scratch, starting at the raw node RPC level.

Exchange Infrastructure

How Exchanges Actually Work

Exchange infrastructure varies enormously by scale. There is no standard stack.

Tier 1 — Large Exchanges

Binance, Coinbase, Kraken, OKX • 100+ engineers, fully custom
  • Run their own full/archive nodes for every supported chain
  • Custom internal indexers — not using community tools
  • Enterprise message queues — Kafka (Coinbase, Binance), Aeron (Kraken)
  • Multi-tier databases — operational DB + data warehouse (Snowflake, Delta Lake) + real-time analytics (StarRocks, ClickHouse)
  • Multiple ingestor instances connected to different nodes for redundancy
  • Deduplication + event ordering in the queue system before persistence
  • Custom deployment automation — Coinbase's Snapchain spins up new blockchain nodes in minutes via EBS snapshots
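The "multiple ingestors, deduplicate in the queue" pattern above can be sketched in a few lines of Rust. This is a minimal illustration, not production code: a real consumer would bound memory with a TTL or rely on the queue's exactly-once semantics, and the event-id format here is invented for the example.

```rust
use std::collections::HashSet;

/// Minimal event deduplicator: redundant ingestor instances connected
/// to different nodes emit the same event more than once; the queue
/// consumer keeps only the first occurrence, keyed by a stable id.
struct Deduplicator {
    seen: HashSet<String>,
}

impl Deduplicator {
    fn new() -> Self {
        Deduplicator { seen: HashSet::new() }
    }

    /// Returns true if the event is new and should be persisted,
    /// false if an identical event was already processed.
    fn accept(&mut self, event_id: &str) -> bool {
        self.seen.insert(event_id.to_string())
    }
}

fn main() {
    let mut dedup = Deduplicator::new();
    // The same UTXO-change event arrives from two ingestor instances.
    assert!(dedup.accept("utxo:abc123:0")); // first copy: persist
    assert!(!dedup.accept("utxo:abc123:0")); // duplicate: drop
}
```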

Tier 2 — Medium Exchanges

KuCoin, Gate.io, Bybit, Bitfinex • 10-50 engineers, mixed approach
  • Own nodes for top 20-50 chains, managed RPC services (QuickNode, Ankr) for long-tail
  • Simpler Kafka or RabbitMQ setups
  • PostgreSQL as primary, possibly with a data warehouse for analytics
  • Smaller blockchain teams, less custom tooling
  • More likely to use community-built indexers if available

Tier 3 — Small Exchanges & Wallets

Startups, regional exchanges • 1-5 engineers, third-party dependent
  • Almost entirely reliant on node-as-a-service providers (NowNodes, Ankr, QuickNode)
  • Managed blockchain APIs — services like CryptoAPIs.io handling deposit detection across 100+ chains
  • Standard web stack — PostgreSQL/MySQL, Redis, basic queue
  • May use white-label solutions or CCXT library
  • Most likely to be affected by explorer API lag — they depend on public APIs

Key insight: You can't hand every exchange the same fix — they all run different systems. What Kaspa can provide is a clean, universal entry point into the data. What each exchange does with that data is up to them.

Responsibility

Who Owns What

Not everything is Kaspa development's problem. Not everything is the exchange's problem either.

Component | Owner | Status
Full node software — rusty-kaspa with gRPC/wRPC | Kaspa | Done
UTXO index — address-based queries + subscriptions | Kaspa | Done
RPC notification system — block, virtual chain, UTXO change events | Kaspa | Done
Un-opinionated ingestor — node events → pluggable output sinks | Kaspa | Missing — the gap
Integration documentation — deposit detection, withdrawals, reorgs | Kaspa | Partial
SDKs — tx construction in Rust, Go, TS, Python | Kaspa + Community | Rust (core), Python (community), others partial
Community explorer — simply-kaspa-indexer + REST API | Community | Works (PostgreSQL-only)
Processing pipeline — Kafka, RabbitMQ, internal queues | Exchange | Their architecture
Database / data store — PostgreSQL, Snowflake, ClickHouse, etc. | Exchange | Their choice
Deposit detection + crediting | Exchange | Their business logic
Withdrawal construction | Exchange | Their key management
Hot/cold wallet architecture | Exchange | Their security model
Compliance / AML / KYT | Exchange | Their regulatory obligation
Redundancy + failover | Exchange | Their SLA

The Missing Piece

An Un-Opinionated Kaspa Ingestor

The node already provides excellent RPC subscriptions. The problem is there's no standardized bridge between raw node events and the diverse backends exchanges use.

Both existing Rust indexers are opinionated:

simply-kaspa-indexer

  • Outputs to PostgreSQL only
  • Opinionated schema design
  • Handles 10 BPS on high-end NVMe
  • Great for explorers, unusable for Kafka-based exchanges

Kasia Indexer

  • Writes locally to disk
  • Filters for Kasia protocol messages only
  • Application-specific, not general-purpose
  • Not usable for exchange integration

What's needed is a reference ingestor — a standalone Rust application using VSPCv2:

// Un-opinionated Kaspa ingestor architecture
┌──────────────────────────────────────────────────────────────┐
  Kaspa Ingestor

  INPUT
  ├── Subscribes to rusty-kaspa node via wRPC/gRPC
  ├── NotifyUtxosChanged (address subscriptions)
  └── Uses VSPCv2 for efficient sender address resolution

  NORMALIZE
  ├── Canonical protobuf/JSON schema for all events
  ├── Block events, tx events, UTXO events, acceptance
  └── DAG reorg / virtual chain change events

  OUTPUT — pluggable sinks, configured at runtime
  ├── Kafka topics      → exchange pipelines
  ├── gRPC stream       → low-latency consumers
  ├── Webhooks / HTTP   → simple integrations
  ├── PostgreSQL        → explorers / analytics
  ├── RabbitMQ / SQS    → queue-based systems
  ├── S3 / Parquet      → data lakes / research
  └── stdout / file     → dev / debugging
└──────────────────────────────────────────────────────────────┘
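The NORMALIZE stage implies one canonical event type that every sink consumes. A hypothetical Rust shape is sketched below; the variant names and fields are illustrative, not an existing Kaspa API (amounts are in sompi, where 1 KAS = 100,000,000 sompi).

```rust
/// Hypothetical canonical event emitted by the ingestor after
/// normalization. Every sink receives this same shape regardless
/// of the output backend. Fields are illustrative.
#[derive(Debug, Clone)]
#[allow(dead_code)]
enum IngestorEvent {
    BlockAdded { hash: String, daa_score: u64 },
    TransactionAccepted { tx_id: String, accepting_block: String },
    UtxoChanged { address: String, amount_sompi: i64 }, // negative = spent
    VirtualChainChanged { removed_hashes: Vec<String>, added_hashes: Vec<String> },
}

/// Serialize an event for a sink. A real ingestor would use protobuf
/// or serde_json; Debug formatting keeps this sketch dependency-free.
fn to_wire(event: &IngestorEvent) -> String {
    format!("{:?}", event)
}

fn main() {
    let ev = IngestorEvent::UtxoChanged {
        address: "kaspa:exampleaddress".to_string(), // illustrative address
        amount_sompi: 12_345,
    };
    println!("{}", to_wire(&ev));
}
```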

Design principle: Extract once, transform and load as needed. The ingestor outputs to whatever sink the consumer configures. Large exchanges run multiple instances connected to different nodes for redundancy, deduplicating events downstream in their queue system.
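The "pluggable sinks" idea reduces to a single trait that each backend implements, keeping the ingestor core backend-agnostic. A minimal sketch, with a stdout sink and an in-memory sink standing in for a real Kafka or gRPC adapter (all names here are assumptions, not an existing interface):

```rust
/// Pluggable output sink: each backend (Kafka, gRPC, Postgres, stdout)
/// implements this one trait; the ingestor core never knows which.
trait Sink {
    fn publish(&mut self, event: &str) -> Result<(), String>;
}

/// Simplest sink: write normalized events to stdout for debugging.
struct StdoutSink;

impl Sink for StdoutSink {
    fn publish(&mut self, event: &str) -> Result<(), String> {
        println!("{event}");
        Ok(())
    }
}

/// Collects events in memory; stands in for a Kafka producer here.
struct MemorySink {
    events: Vec<String>,
}

impl Sink for MemorySink {
    fn publish(&mut self, event: &str) -> Result<(), String> {
        self.events.push(event.to_string());
        Ok(())
    }
}

/// Fan one normalized event out to every sink configured at runtime.
fn dispatch(sinks: &mut [Box<dyn Sink>], event: &str) {
    for sink in sinks.iter_mut() {
        // A production ingestor would retry or buffer on error.
        let _ = sink.publish(event);
    }
}

fn main() {
    let mut sinks: Vec<Box<dyn Sink>> = vec![
        Box::new(StdoutSink),
        Box::new(MemorySink { events: Vec::new() }),
    ];
    dispatch(&mut sinks, r#"{"type":"block_added","hash":"abc"}"#);
}
```

Adding a new backend then means implementing one trait, not forking the ingestor, which is what makes the design work across all three exchange tiers.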

Integration Patterns

How Each Tier Would Use This

// TIER 1 — Large exchange (Binance/Coinbase pattern)
Kaspa Node A ──wRPC──► Ingestor 1 ─┐
Kaspa Node B ──wRPC──► Ingestor 2 ─┤──► Kafka / Aeron ──► Dedup
Kaspa Node C ──wRPC──► Ingestor 3 ─┘                        │
                                                            ▼
                                                  ┌── Queue Readers ──┐
                                                  │         │         │
                                                  ▼         ▼         ▼
                                             Hot Wallet  Data Lake  Compliance
                                              Monitor   (Snowflake)  (AML/KYT)

// TIER 2 — Medium exchange
Kaspa Node ──wRPC──► Ingestor ──► RabbitMQ ──► PostgreSQL
                                      │
                                      ▼
                                Internal API

// TIER 3 — Small exchange
Kaspa Node ──wRPC──► Ingestor ──► PostgreSQL (direct)
                or   Ingestor ──► Webhooks to backend
// OR just use the community explorer API (current default)

Precedent

How Other Chains Solved This

The solutions that work all share one pattern: separate ingestion from application-specific indexing.

Solana

Geyser Plugin System

Plugins loaded into the validator. Single trait interface. Community output adapters for PostgreSQL, Kafka, gRPC, RabbitMQ, SQS, BigTable. The gold standard.

Ethereum

EthPandaOps Xatu + The Graph

Xatu: pluggable sinks — ClickHouse, Kafka, PostgreSQL, Parquet. The Graph: decentralized indexing via GraphQL subgraphs.

Polygon

Chain Indexer Framework

Three-layer Kafka-centric: Block Producers → Kafka → Transformers → Consumers. Raw data replayable from Kafka indefinitely.

NEAR

Lake Framework

Events as JSON on S3. Framework libraries in Rust, JS, Python. 100-500MB RAM, ~$18/month. Centralized but effective.

Cosmos

Tendermint EventBus + tx_index

Built-in indexing: LevelDB or PostgreSQL backends. EventBus for custom indexers. Similar to Kaspa's notification system.

Bitcoin

Fragmented (cautionary tale)

No standard interface. Massive fragmentation: ElectrumX, Electrs, Esplora, Blockbook. Everyone reinvented the wheel. Avoid this.

The pattern: Chains that provide an output-agnostic ingestion layer early get healthy ecosystems. Chains that don't (Bitcoin) get fragmentation. Kaspa has a window to get this right.

Kaspa's Deliverables

What Kaspa Development Should Provide

Deliverable | Status
Stable node with documented RPC | Done
UTXO index — address queries + change notifications | Done
Notification event system — blocks, virtual chain, UTXOs | Done
Un-opinionated reference ingestor — Rust, VSPCv2, pluggable sinks | Not started
Exchange integration docs — deposits, withdrawals, DAG reorgs | Partial
Multi-language SDKs — Rust (core), Python (community), Go, TypeScript | Rust + Python done, others vary
Canonical protobuf schema for all ingestor event types | Not started
Docker Compose reference — node + ingestor + Kafka + PostgreSQL | Not started
Mempool notifications (GitHub #339) | Not started

Community Infrastructure

Where the Community Explorer Fits

The simply-kaspa-indexer + kaspa-rest-server stack fills an important role — it powers the public explorer and is the default integration point for Tier 3 exchanges and wallet providers.

Its PostgreSQL bottleneck is real, and the optimizations now shipping (v2.0 denormalization, v2.1 VSPCv2, batch tuning) directly help this audience.

But it was never going to be the solution for Tier 1 or Tier 2 exchanges — and it shouldn't try to be. Those exchanges need the un-opinionated ingestor to plug into their own infrastructure.

The community explorer is valuable. It serves the public block explorer, small integrations, and developers. It just shouldn't be treated as the universal exchange integration solution. That's the ingestor's job.

Recommended Path

What Should Happen

For Kaspa development:

  • Build a reference un-opinionated ingestor in Rust using VSPCv2 with pluggable output sinks. Start with Kafka + gRPC + stdout; the community can add more sinks.
  • Publish exchange integration documentation for DAG-specific concerns (virtual chain changes, UTXO handling).
  • Continue supporting community explorer optimizations for the public explorer and the Tier 3 use case.

For exchanges:

Run your own rusty-kaspa node(s) with --utxoindex. Use the ingestor to feed events into your existing pipeline. Build deposit detection, crediting, and withdrawal logic on top of your own infrastructure. The ingestor gives you the data — what you do with it is up to you.
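Deposit crediting on top of ingestor events usually amounts to "record the deposit when the UTXO change arrives, credit once it is deep enough." A minimal sketch using DAA-score depth as the confirmation measure; the struct, the field names, and the threshold are illustrative assumptions, not Kaspa-prescribed policy:

```rust
use std::collections::HashMap;

/// Hypothetical deposit tracker: a deposit observed in a UTXO-change
/// event is credited only after the virtual DAA score has advanced
/// past a confirmation threshold chosen by the exchange.
struct DepositTracker {
    confirmations_required: u64,
    pending: HashMap<String, u64>, // tx_id -> DAA score when first seen
}

impl DepositTracker {
    fn record(&mut self, tx_id: &str, seen_at_daa: u64) {
        self.pending.entry(tx_id.to_string()).or_insert(seen_at_daa);
    }

    /// Returns tx ids now deep enough to credit, removing them
    /// from the pending set.
    fn creditable(&mut self, current_daa: u64) -> Vec<String> {
        let required = self.confirmations_required;
        let ready: Vec<String> = self
            .pending
            .iter()
            .filter(|(_, &seen)| current_daa.saturating_sub(seen) >= required)
            .map(|(id, _)| id.clone())
            .collect();
        for id in &ready {
            self.pending.remove(id);
        }
        ready
    }
}

fn main() {
    let mut tracker = DepositTracker {
        confirmations_required: 100, // illustrative threshold
        pending: HashMap::new(),
    };
    tracker.record("tx1", 1_000);
    assert!(tracker.creditable(1_050).is_empty()); // only 50 deep: wait
    assert_eq!(tracker.creditable(1_100), vec!["tx1".to_string()]);
}
```

A production version would also react to virtual chain change events, un-recording deposits whose accepting blocks were reorged out before the threshold was reached.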

For wallet providers:

For real-time balances: connect directly to a node via wRPC using NotifyUtxosChanged subscriptions — no explorer dependency. For transaction history: use the community explorer REST API or run your own simply-kaspa-indexer instance.
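Driving a live balance from UTXO change notifications is mechanical: added outpoints increase an address balance, removed outpoints decrease it. A minimal sketch; the struct and the event shape are illustrative, not the actual wRPC types:

```rust
use std::collections::HashMap;

/// Minimal live-balance tracker fed by UTXO change notifications.
struct BalanceTracker {
    balances: HashMap<String, u64>, // address -> balance in sompi
}

impl BalanceTracker {
    /// A UTXO was added for this address (incoming funds).
    fn apply_added(&mut self, address: &str, amount_sompi: u64) {
        *self.balances.entry(address.to_string()).or_insert(0) += amount_sompi;
    }

    /// A UTXO was removed for this address (spent or reorged out).
    fn apply_removed(&mut self, address: &str, amount_sompi: u64) {
        if let Some(balance) = self.balances.get_mut(address) {
            *balance = balance.saturating_sub(amount_sompi);
        }
    }

    fn balance(&self, address: &str) -> u64 {
        *self.balances.get(address).copied().get_or_insert(0)
    }
}

fn main() {
    let mut tracker = BalanceTracker { balances: HashMap::new() };
    // Illustrative address string; 1 KAS = 100,000,000 sompi.
    tracker.apply_added("kaspa:exampleaddr", 500_000_000); // 5 KAS in
    tracker.apply_removed("kaspa:exampleaddr", 100_000_000); // 1 KAS spent
    assert_eq!(tracker.balance("kaspa:exampleaddr"), 400_000_000);
}
```

Because the node delivers both the added and removed sides of every change, the wallet never needs to poll an explorer to keep this view current.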

Kaspa • March 2026