Storage

Where code and data live—and how that determines what we can prove.

Storage is anywhere bytes can be retrieved: a blockchain, a p2p network, a cloud bucket, a git repo. To understand what can be verified, we decompose storage into five orthogonal properties:

  1. Addressability — How content is referenced. Determines whether verification is possible.
  2. Persistence — How long content exists. Determines auditability over time.
  3. Availability — Who hosts content. Determines liveness and censorship resistance.
  4. Access — Who is allowed to retrieve content. Determines permission requirements.
  5. Cost — What ongoing obligation exists. Determines protocol complexity.

Addressability

Addressability answers: "How do I reference this data?" The answer determines whether cryptographic verification is possible at all.

Content-Addressed

The identifier is derived from the content itself.
Scheme      Example identifier          Purpose (year)
MD5         d41d8cd98f00b204...         Download verification (1992)
SHA-1       da39a3ee5e6b4b0d...         Stronger hash (1995)
BitTorrent  magnet:?xt=urn:btih:...     P2P content lookup (2001)
Nix         /nix/store/aaaa-pkg...      Reproducible builds (2003)
Git         3f9a2c7b8d1e4f5a...         Commit SHA (2005)
Bitcoin     000000000019d668...         Block hash (2009)
BIP-21      bitcoin:1A1zP1eP5...        URI scheme (2012)
Ethereum    0xd4e56740f876ae...         Block/tx hash (2015)
IPFS        ipfs://QmYwAPJzv5...        Content ID (2015)
Docker      sha256:a3ed95caeb...        Image digest (2015)
SRI         sha384-oqVuAfXR...          Browser integrity (2016)
Arweave     ar://8a7c3f2e1b4d...        Permanent storage (2017)
BIP-173     bc1qar0srrr7xfkv...         Bech32 address (2017)
ENS         vitalik.eth → 0x...         Name → hash (2017)
Solana      So1111111111111...          Account address (2020)
Filecoin    bafybeiemxf5abjwj...        CID + storage deals (2020)
Nostr       note1qqqqqqqq...            Event hash (2020)
EIP-3770    eth:0xd8dA6BF269...         Chain prefix (2021)
Ordinals    ord://i0/sat/...            Bitcoin inscriptions (2023)

Trust Model: Cryptographic
The hash is the proof. Verify anywhere. Trust no one.
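
As a minimal sketch of what "the hash is the proof" means in practice, the Python below derives an identifier from bytes and checks retrieved bytes against it. The names `content_address` and `verify` are illustrative only and do not come from any of the systems listed above; real systems typically wrap the digest in their own encoding.

```python
import hashlib
import hmac


def content_address(data: bytes, algorithm: str = "sha256") -> str:
    """Derive an identifier from the content itself (a hex digest)."""
    return hashlib.new(algorithm, data).hexdigest()


def verify(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True only if the bytes hash to the identifier that was requested."""
    actual = content_address(data, algorithm)
    return hmac.compare_digest(actual, expected_hex.lower())


original = b"hello, storage"
address = content_address(original)

# Anyone holding the address can check bytes obtained from any untrusted source.
print(verify(original, address))           # True
print(verify(b"hello, st0rage", address))  # False
```

The check needs no cooperation from whoever served the bytes, which is why the trust model is cryptographic rather than reputational.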

Location-Addressed

The identifier points to a place. Content can change.
Scheme    Example identifier          Purpose (year)
FTP       ftp://nic.ddn.mil/rfc/...   File transfer (1971)
IP        192.168.1.1                 Machine address (1974)
DNS       example.com                 Name → IP (1983)
URL       http://info.cern.ch/...     Web resource (1994)
Git Ref   owner/repo@master           Branch pointer (2005)
S3        s3://bucket/key             Cloud object (2006)
AWS ARN   arn:aws:s3:::bucket/...     Resource name (2006)
npm       lodash@^4.17.0              Version range (2010)
Docker    ubuntu:latest               Mutable tag (2013)

Trust Model: Reputation
No proof—only promises. What you get can change.
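
The difference can be sketched with an in-memory stand-in for a mutable location. Everything here is hypothetical (`LOCATIONS`, `fetch`, `fetch_pinned`, and the `app:latest` name are made up for illustration); the point is only that a location lookup returns whatever is there now, while a digest recorded earlier lets the reader notice the change.

```python
import hashlib

# Hypothetical stand-in for a location-addressed store (a tag, branch, or URL).
LOCATIONS = {"app:latest": b"release 1"}


def fetch(location: str) -> bytes:
    """Location lookup: returns whatever currently sits at that place."""
    return LOCATIONS[location]


def fetch_pinned(location: str, expected_sha256: str) -> bytes:
    """Location lookup plus a content check against a digest recorded earlier."""
    data = fetch(location)
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"content at {location!r} changed: got {actual[:12]}...")
    return data


# Record the digest of what we reviewed.
pinned = hashlib.sha256(fetch("app:latest")).hexdigest()

# The operator republishes under the same name; the identifier is unchanged.
LOCATIONS["app:latest"] = b"release 2"

print(fetch("app:latest"))  # b'release 2', with no signal that anything changed
try:
    fetch_pinned("app:latest", pinned)
except ValueError as err:
    print("rejected:", err)
```

Pinning a digest next to a mutable name is the same idea as referencing a Docker image by digest rather than by tag.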

Persistence

Persistence answers: "How long will this data exist?" The answer determines whether the data will still be there when someone needs to audit it.

Durations, what they mean, and examples:

Indefinite: No expiration date.
Lasts as long as the network exists. But networks can fail, hard forks can orphan data, and "forever" is an aspiration.
  Bitcoin: Since 2009
  Ethereum: Since 2015
  Arweave: Since 2018

Fixed-term: Explicit expiration.
You know exactly when data will disappear unless renewed. Requires renewal logic.
  Ethereum (State Expiry): Untouched = expired
  Solana: Rent or pruned
  Filecoin: Negotiated deals
  Storj: Segment TTL
  Celestia: ~2 weeks
  EigenDA: ~2 weeks
  Avail: ~2 weeks
  S3: Pay or delete
  GitHub (Private): Pay or delete
  Docker (Private): Pay or delete
  ENS: Annual renewal
  DNS: Annual renewal

Best-effort: No guarantees.
Data exists while someone cares. Could be years, could be minutes.
  IPFS: Depends on pinners
  BitTorrent: Depends on seeders
  GitHub (Public): ToS can change
  Docker (Public): ToS can change
  CDN: ToS can change
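
Fixed-term storage "requires renewal logic". A minimal sketch of that logic follows, assuming a client that tracks an expiry time per item. The `StoredItem` type, the item names, the dates, and the 30-day margin are all hypothetical and not taken from any of the systems above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class StoredItem:
    name: str
    expires_at: datetime  # when the storage term ends unless renewed


def is_expired(item: StoredItem, now: datetime) -> bool:
    """True once the term has ended and the data may be gone."""
    return now >= item.expires_at


def needs_renewal(item: StoredItem, now: datetime, margin: timedelta) -> bool:
    """True when the remaining term is within the safety margin."""
    return item.expires_at - now <= margin


now = datetime(2025, 1, 1, tzinfo=timezone.utc)  # fixed clock for the example
items = [
    StoredItem("blob", now + timedelta(days=14)),     # short retention window
    StoredItem("domain", now + timedelta(days=365)),  # annual renewal
]

due = [i.name for i in items if needs_renewal(i, now, timedelta(days=30))]
print(due)  # ['blob']
```

The margin is the design choice that matters: it has to be longer than the time it takes to notice a failed renewal and retry.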

Availability

Availability answers: "Can I reach the infrastructure?" The answer determines resilience to outages and censorship.

Decentralized

No single point of failure. Data is replicated across independent nodes.
Network   Hosted by           Notes (year)
Bitcoin   Full nodes          Full replication (2009)
Ethereum  Full nodes          Full state (2015)
IPFS      DHT routing         Content routing (2015)
Arweave   Miners              Blockweave (2018)
Filecoin  Storage providers   Proof of storage (2020)
Storj     Storage nodes       Erasure coding (2018)
Solana    Validators          High throughput (2020)
Celestia  Light nodes         Data availability (2023)
EigenDA   Restaked operators  Data availability (2024)
Avail     Light nodes         Data availability (2023)
Nostr     Relays              Federated gossip (2020)

High Liveness
Survives node failures. No single operator to censor.

Centralized

Single operator controls availability. Simpler but fragile.
Service     Endpoint              Operator (year)
GitHub      github.com            Microsoft-operated (2008)
S3          s3.amazonaws.com      AWS regions (2006)
npm         registry.npmjs.org    GitHub-operated (2010)
Docker Hub  hub.docker.com        Docker Inc (2013)
CDN         cdn.example.com       Provider-dependent (1998)

Single Point of Failure
Operator downtime = your downtime. Can be censored.

Access

Access answers: "Can I retrieve and read this data?" The answer determines what validators need (nothing, credentials or payment, or decryption keys) before they can verify execution.

Levels, their barriers, and examples:

Public: Fetch and read freely.
No credentials needed. Anyone with a network connection can retrieve and understand the content.
  Bitcoin: Full nodes
  Ethereum: RPC endpoints
  IPFS: Public gateways
  Arweave: Open endpoints
  Celestia: Light nodes
  EigenDA: Disperser API
  Avail: Light nodes
  GitHub (Public): Public repos
  Docker (Public): Public images

Gated: Barrier to download, not to read.
Credentials or payment required to retrieve. Once downloaded, content is readable.
  Storj: Access grant
  S3: IAM credentials
  GitHub (Private): OAuth token
  Docker (Private): Registry auth
  Paywalled APIs: 402 Payment

Encrypted: Barrier is in the bytes.
Content may be publicly available, but is cryptographically sealed. Decryption keys required.
  Lit Protocol: Threshold decryption
  Encrypted IPFS: Symmetric key
  Client-side S3: User-managed keys
  TEE enclaves: Attestation

Cost

Cost answers: "What ongoing obligation exists?" The answer determines protocol complexity and failure modes.

Models, their obligations, and examples:

Upfront: Pay once, forget forever.
Cost is baked into the transaction fee or endowment. No renewal logic needed.
  Bitcoin: Tx fee
  Ethereum: Gas fee
  Arweave: Endowment
  Celestia: Blob fee
  EigenDA: Blob fee
  Avail: Blob fee
  Ordinals: Inscription

Recurring: Pay periodically.
Requires billing logic & failure handling. Miss a payment = data loss.
  Ethereum (State Expiry): Touch to keep
  Solana: Rent balance
  Storj: Per GB/month
  Filecoin: Deal renewal
  S3: Monthly bill
  GitHub (Private): Monthly bill
  Docker (Private): Monthly bill
  ENS: Annual fee
  DNS: Annual fee

Subsidized: It's free, but...
Provider pays because data has value (AI training, ecosystem lock-in).
  GitHub (Public): Training data
  Docker (Public): Public images
  HuggingFace: Models
  npm: Ecosystem
  CDN: Bandwidth

Operational: Run it yourself.
No direct fee, but requires running infra or trusting the community.
  IPFS: Pinning
  Nostr: Relaying
  BitTorrent: Seeding
  Git: Self-hosted
  Nix: Caching
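
To make the upfront versus recurring tradeoff concrete, here is a small sketch under stated assumptions. The prices are placeholder units rather than real fees for any network, and both function names are hypothetical. It shows the break-even point between the two models and the "miss a payment = data loss" failure mode.

```python
from typing import Sequence


def break_even_periods(upfront: int, price_per_period: int) -> int:
    """Smallest number of periods after which recurring payments total at least the upfront cost."""
    return -(-upfront // price_per_period)  # integer ceiling division


def periods_retained(payments_made: Sequence[bool]) -> int:
    """Number of periods the data survives: it is lost at the first missed payment."""
    retained = 0
    for paid in payments_made:
        if not paid:
            break
        retained += 1
    return retained


# Placeholder units only: 100 paid once versus 4 per period.
print(break_even_periods(100, 4))                   # 25
# A later payment does not bring back data lost in period 3.
print(periods_retained([True, True, False, True]))  # 2
```

The second function is why recurring models need failure handling: a later payment does not undo an earlier lapse.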

Property Matrix

We can map every technology in the ecosystem to these five properties to see their tradeoffs side-by-side.

Technology                Addressability  Persistence          Availability        Access           Cost
Bitcoin                   Content         Since 2009           Full nodes          Full nodes       Tx fee
Ethereum                  Content         Since 2015           Full nodes          RPC endpoints    Gas fee
Arweave                   Content         Since 2018           Miners              Open endpoints   Endowment
IPFS                      Content         Depends on pinners   DHT routing         Public gateways  Pinning
BitTorrent                Content         Depends on seeders   DHT routing         Open protocol    Seeding
Nostr                     Content         Depends on relays    Federated relays    Open protocol    Relaying
Ethereum (State Expiry)   Content         Untouched = expired  Full nodes          RPC endpoints    Touch to keep
Solana                    Content         Rent or pruned       Validators          RPC endpoints    Rent balance
Filecoin                  Content         Negotiated deals     Storage providers   Public deals     Deal renewal
Celestia                  Content         ~2 weeks             Light nodes         Light nodes      Blob fee
EigenDA                   Content         ~2 weeks             Restaked operators  Disperser API    Blob fee
Avail                     Content         ~2 weeks             Light nodes         Light nodes      Blob fee
Storj                     Content         Segment TTL          Storage nodes       Access grant     Per GB/month
GitHub (Public, SHA)      Content         ToS can change       Microsoft           Public repos     Training data
Docker (Public, Digest)   Content         ToS can change       Docker Inc          Public images    Public images
GitHub (Private, SHA)     Content         Pay or delete        Microsoft           OAuth token      Monthly bill
Docker (Private, Digest)  Content         Pay or delete        Docker Inc          Registry auth    Monthly bill
GitHub (Public, Branch)   Location        ToS can change       Microsoft           Public repos     Training data
Docker (Public, Tag)      Location        ToS can change       Docker Inc          Public images    Public images
CDN                       Location        ToS can change       Provider-dependent  Public           Bandwidth
S3                        Location        Pay or delete        AWS regions         IAM credentials  Monthly bill
GitHub (Private, Branch)  Location        Pay or delete        Microsoft           OAuth token      Monthly bill
Docker (Private, Tag)     Location        Pay or delete        Docker Inc          Registry auth    Monthly bill
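
One way to work with the matrix programmatically is to treat each row as a record with five fields. The sketch below encodes a handful of rows copied from the table; the `StorageProfile` type and the `select` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProfile:
    technology: str
    addressability: str
    persistence: str
    availability: str
    access: str
    cost: str


# A few rows copied from the matrix above.
MATRIX = [
    StorageProfile("Bitcoin", "Content", "Since 2009", "Full nodes", "Full nodes", "Tx fee"),
    StorageProfile("IPFS", "Content", "Depends on pinners", "DHT routing", "Public gateways", "Pinning"),
    StorageProfile("Celestia", "Content", "~2 weeks", "Light nodes", "Light nodes", "Blob fee"),
    StorageProfile("S3", "Location", "Pay or delete", "AWS regions", "IAM credentials", "Monthly bill"),
    StorageProfile("CDN", "Location", "ToS can change", "Provider-dependent", "Public", "Bandwidth"),
]


def select(rows, **criteria):
    """Return the rows whose fields equal every given criterion."""
    return [r for r in rows if all(getattr(r, k) == v for k, v in criteria.items())]


content_addressed = [r.technology for r in select(MATRIX, addressability="Content")]
print(content_addressed)  # ['Bitcoin', 'IPFS', 'Celestia']
```

Keeping the five properties as separate fields mirrors the claim that they vary independently: a query can fix one property and leave the others open.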

Verifiability

Verifiability answers: "Can validators confirm this data is correct?" The five properties combine to determine the answer.

To verify that a computation used specific data, validators need:

  1. Addressability: Trust the bytes
  2. Persistence: Data still exists
  3. Availability: Infra is reachable
  4. Access: Can retrieve it
  5. Cost: Economically viable

The Spectrum

Profile           Who can verify?
Bitcoin           Anyone, anytime, forever
Celestia          Anyone, within ~2 week window
Filecoin          Anyone, while deal is active
IPFS              Anyone, while pinned
GitHub (Private)  Credentialed parties, while paid
Encrypted IPFS    Key holders only, while pinned
S3 (Location)     Credentialed parties, content may change
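
The spectrum can be read as a function of the properties: access decides who, persistence decides for how long, and location addressing adds a caveat. The sketch below reproduces a few rows of the table under that reading. The enums and `who_can_verify` are hypothetical names, and the lifetime is passed in as free text rather than derived.

```python
from enum import Enum
from typing import Optional


class Addressing(Enum):
    CONTENT = "content"
    LOCATION = "location"


class Access(Enum):
    PUBLIC = "public"
    GATED = "gated"
    ENCRYPTED = "encrypted"


WHO = {
    Access.PUBLIC: "Anyone",
    Access.GATED: "Credentialed parties",
    Access.ENCRYPTED: "Key holders only",
}


def who_can_verify(addressing: Addressing, access: Access, lifetime: Optional[str]) -> str:
    """Combine access (who), persistence (how long), and addressing (caveat)."""
    parts = [WHO[access]]
    if lifetime:
        parts.append(lifetime)
    if addressing is Addressing.LOCATION:
        parts.append("content may change")
    return ", ".join(parts)


print(who_can_verify(Addressing.CONTENT, Access.PUBLIC, "within ~2 week window"))
print(who_can_verify(Addressing.CONTENT, Access.ENCRYPTED, "while pinned"))
print(who_can_verify(Addressing.LOCATION, Access.GATED, None))
```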

Weakening Properties

  Location Addressing: Proves you fetched something, not the right thing
  Limited Persistence: Verification has a deadline; miss it and data is gone
  Centralized Availability: Verification depends on specific infra being online
  Gated Access: Requires credentials, so not everyone can check
  Recurring Cost: Someone must keep paying or data disappears
  Encrypted Content: Bytes are public but meaningless without keys

Design Questions

Who verifies? Public validators or credentialed set?

When? Immediately, within a window, or on-demand?

How long? Minutes, weeks, years, or forever?

On failure? Retry, abort, or fraud proof?

Who pays? User, protocol, or storage provider subsidy?