feat(arns): resolve ArNS names to IPFS CIDs (ArNS→IPFS, Phase 2) #793

Open
vilenarios wants to merge 31 commits into PE-9067-add-ipfs-cid from feat/arns-ipfs-protocol

Conversation

@vilenarios

Contributor

Summary

Phase 2 of the IPFS work: an ArNS name whose ANT record targets an IPFS CID
(targetProtocol: ipfs) now resolves and serves through the gateway's IPFS path
— e.g. my-name.gateway.tld → IPFS content, no CID in the URL.

Stacked on #682 (base = PE-9067-add-ipfs-cid). It depends on that PR's
IPFS handler/service. Retarget to develop once #682 merges. Review only the
two commits here.

What it does

  • @ar.io/sdk 4.0.0 ANT records carry targetProtocol (0=Arweave, 1=IPFS) and
    the target may be a CID. The on-demand resolver reads targetProtocol,
    validates the target as a CID (vs a 43-char Arweave id), and surfaces
    protocol on the resolution (cached transparently).
  • The ArNS middleware routes protocol === 'ipfs' resolutions to the same
    IPFS handler used by the path/subdomain routes (sets ipfsCid/ipfsPath);
    everything else serves via the Arweave data path as before.
  • NameResolution.protocol is optional (undefined ⇒ arweave) — backward
    compatible.
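
The routing rule in the bullets above can be sketched as a pure function. This is only an illustration under stated assumptions: `chooseHandler` and the `Route` shape are hypothetical, and only `protocol`, `resolvedId`, `ipfsCid`, and `ipfsPath` are names taken from this description. The real middleware wiring may differ.

```typescript
// Hypothetical shapes for illustration; the real NameResolution has more fields.
type Protocol = 'arweave' | 'ipfs';

interface NameResolution {
  resolvedId: string;
  protocol?: Protocol; // optional: undefined is treated as 'arweave'
}

type Route =
  | { handler: 'ipfs'; ipfsCid: string; ipfsPath: string }
  | { handler: 'arweave'; id: string };

// Assumed helper: pick the data path that serves a resolved name.
function chooseHandler(resolution: NameResolution, requestPath: string): Route {
  const protocol: Protocol = resolution.protocol ?? 'arweave';
  if (protocol === 'ipfs') {
    // Hand off to the IPFS path with the CID and the remaining request path.
    return {
      handler: 'ipfs',
      ipfsCid: resolution.resolvedId,
      ipfsPath: requestPath,
    };
  }
  // Everything else keeps using the Arweave data path.
  return { handler: 'arweave', id: resolution.resolvedId };
}
```

A resolution without `protocol` (for example one cached before this change) falls through to the Arweave branch, which is the backward-compatibility property described above.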

Review hardening (2nd commit)

  • Cache-Control: ArNS→IPFS responses use the ArNS TTL (mutable name→CID
    binding), not immutable — a direct /ipfs/{CID} stays immutable. Prevents
    pinning a stale CID for ~a year after a record update (cf. PE-9072).
  • X-ArNS-Protocol: arweave|ipfs response header (signed; added to
    TRIGGER_HEADERS) + protocol/resolvedId in the /ar-io/resolver/:name
    JSON, so clients know whether X-ArNS-Resolved-Id is a TX id or a CID.
  • Content-Digest (RFC 9530) on IPFS responses: SHA-256 computed at
    cache-write time, emitted on cache hits, signed via CO_SIGNABLE_HEADERS.
  • Docs (ipfs-integration.md Phase 2 rewrite, glossary, CLAUDE.md) + a cache
    digest round-trip unit test.
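
As a rough sketch of the Content-Digest bullet above: RFC 9530 encodes the digest as a structured-field byte sequence, i.e. base64 wrapped in colons. The `CachedEntry` shape and the `writeEntry` / `headersForHit` helpers are hypothetical names for illustration, not the cache API in this PR; only `createHash` from Node's `crypto` module is a real call.

```typescript
import { createHash } from 'node:crypto';

// Build an RFC 9530 Content-Digest value: "sha-256=:<base64 digest>:".
function contentDigestSha256(body: Uint8Array): string {
  const digest = createHash('sha256').update(body).digest('base64');
  return `sha-256=:${digest}:`;
}

// Hypothetical cache entry; legacy entries may have no stored digest.
interface CachedEntry {
  body: Uint8Array;
  contentDigest?: string;
}

// Assumed write path: compute the digest once, when the entry is cached.
function writeEntry(body: Uint8Array): CachedEntry {
  return { body, contentDigest: contentDigestSha256(body) };
}

// Assumed hit path: emit the header only when a digest was stored.
function headersForHit(entry: CachedEntry): Record<string, string> {
  return entry.contentDigest !== undefined
    ? { 'Content-Digest': entry.contentDigest }
    : {};
}
```

Computing the hash at write time means a cache hit pays no hashing cost, and a digest-less legacy entry simply omits the header rather than failing.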

Known limitation

Protocol awareness lives in the on-demand resolver. The trusted-gateway
resolver doesn't yet propagate targetProtocol across hops, so keep on-demand
ahead of gateway in ARNS_RESOLVER_PRIORITY_ORDER for IPFS-targeted names.
(Real fix: a protocol resolution header across gateways — follow-up.)

Verification

Verified end-to-end live (gateway + Kubo sidecar) against a real ArNS undername
pointed at a CID with targetProtocol: 1:

  • X-Ar-Io-Source: ipfs, X-ArNS-Resolved-Id = the CID, X-ArNS-Protocol: ipfs
  • Cache-Control: public, max-age=<ttl> via ArNS; immutable for direct CID
  • Content-Digest present + signed on cache hits, matches served bytes
  • Normal ArNS names still serve via Arweave (no regression)
  • typecheck + lint + unit tests green

🤖 Generated with Claude Code

Ariel Melendez and others added 6 commits June 22, 2026 11:57
Windowed GraphQL `transactions` queries against the ClickHouse-backed
indexer could return a partial page with `pageInfo.hasNextPage = false`
and no error, silently stranding every subsequent page — a cursor client
paging until `hasNextPage` is false would under-report results for dense
wallets/ranges.

Root cause: the CH legs fetch `pageSize + 1` rows with a full-key
`LIMIT 1 BY height, block_transaction_index, is_data_item, id`, but the
composite then dedups by `id` alone. Stale rows that share an `id` and
height while differing on `block_transaction_index` (a placeholder bti=0
left by an earlier indexing pass alongside the real bti) are not folded by
the SQL `LIMIT 1 BY`, yet collapse in the id-dedup. When that shrinks the
merged set to `pageSize` or fewer, `edges.length > pageSize` reads false
even though the leg came back full.

Derive `hasNextPage` from each leg's raw result before the cross-leg
id-dedup: a CH leg returning more than `pageSize` rows, or the SQLite
leg's own `hasNextPage`, means more matching rows exist. Erring toward
true is safe — the worst case is one extra page fetch that comes back
empty. No SQL or cursor changes.
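
A minimal sketch of the fix described above, assuming simplified `Row`, `SqliteLeg`, and `mergePage` names that are not the composite's real types. The extra `deduped.length > pageSize` check is an assumption of this sketch for the case where several small legs together exceed one page.

```typescript
interface Row {
  id: string;
  height: number;
  blockTransactionIndex: number;
}

interface SqliteLeg {
  rows: Row[];
  hasNextPage: boolean;
}

interface Page {
  edges: Row[];
  hasNextPage: boolean;
}

// Each ClickHouse leg is assumed to have been fetched with LIMIT pageSize + 1.
function mergePage(chLegs: Row[][], sqliteLeg: SqliteLeg, pageSize: number): Page {
  // Decide "more rows exist" from the raw leg results, before any id-dedup.
  const rawHasMore =
    sqliteLeg.hasNextPage || chLegs.some((rows) => rows.length > pageSize);

  // Cross-leg dedup by id alone; this is the step that can shrink a full page.
  const seen = new Set<string>();
  const deduped: Row[] = [];
  for (const leg of [...chLegs, sqliteLeg.rows]) {
    for (const row of leg) {
      if (!seen.has(row.id)) {
        seen.add(row.id);
        deduped.push(row);
      }
    }
  }

  return {
    edges: deduped.slice(0, pageSize),
    hasNextPage: rawHasMore || deduped.length > pageSize,
  };
}
```

With `pageSize = 2` and a leg returning the same id at bti 0 and bti 12 plus one other row, the deduped list has only two edges, yet `hasNextPage` stays true because the raw leg came back with three rows.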

Adds a regression test reproducing the production shape (same id, bti 0
vs 12, both data items) collapsing a full page to `pageSize`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The function-level TSDoc still described the pre-fix behavior (hasNextPage
computed against the deduped edge list) as intentional — that was the
PE-9124 bug. Update it to describe deriving hasNextPage from each leg's
raw result before id-dedup.

Addresses CodeRabbit review feedback on PR #792.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ANT records now carry a targetProtocol (0=Arweave, 1=IPFS) and the record
target can be an IPFS CID instead of an Arweave TX ID (@ar.io/sdk 4.0.0).
Previously the on-demand resolver read only transactionId and validated it
as a 43-char Arweave ID, so a CID-targeted name failed to resolve.

- on-demand resolver: read targetProtocol; validate the target as a CID
  when protocol is IPFS, else as an Arweave ID; surface protocol on the
  resolution (cached transparently as part of NameResolution).
- NameResolution: optional protocol field ('arweave' | 'ipfs'); undefined
  treated as 'arweave' for backward compat (e.g. trusted-gateway hops).
- arns middleware: when a name resolves to an IPFS CID and IPFS serving is
  enabled, hand off to the IPFS handler (sets ipfsCid/ipfsPath, mirroring
  the IPFS subdomain middleware) instead of the Arweave data handler.

Completes the 'ArNS -> IPFS CID' phase the IPFS PR was foundation for.
Verified: typecheck + lint clean, resolver unit tests pass, and live
on-demand Solana resolution of existing names still serves via Arweave.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fix(gql): truthful hasNextPage when id-dedup collapses a full ClickHouse page
The turbo-s3 on-demand data source shared the default awsClient (configured
from AWS_REGION/AWS_ENDPOINT and the default credentials), so it could not
target a separate AWS account or endpoint.

Add a turboAwsClient that is instantiated as its own awsLite client only
when BOTH TURBO_AWS_REGION and TURBO_AWS_ENDPOINT are set; otherwise it
references the existing awsClient, preserving current behavior. Credentials
follow the same paradigm: TURBO_AWS_ACCESS_KEY_ID / TURBO_AWS_SECRET_ACCESS_KEY
/ TURBO_AWS_SESSION_TOKEN are used when provided and otherwise fall back to
the default AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN;
when none are set, aws-lite resolves credentials from the ambient provider
chain (e.g. IAM role), matching the default client. On init failure it falls
back to the shared awsClient.

Wire turboS3DataSource to use turboAwsClient and document the new vars in
docs/envs.md and docker-compose.yaml.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final-review hardening of the ArNS->IPFS feature:

- fix(cache-control): ArNS->IPFS responses no longer send immutable/1-year.
  The IPFS handler only sets `immutable` for direct /ipfs/{CID} (and {CID}.host)
  requests; when reached via an ArNS name (mutable name->CID binding) it keeps
  the ArNS-TTL Cache-Control the ArNS middleware set, so a record repoint isn't
  pinned in caches for ~a year (cf. PE-9072).
- feat(headers): emit signed `X-ArNS-Protocol: arweave|ipfs` on resolutions and
  add `protocol` (+ `resolvedId`) to the /ar-io/resolver/:name JSON, so clients
  know whether X-ArNS-Resolved-Id is a TX ID or a CID. Added x-arns-protocol to
  TRIGGER_HEADERS so it's part of the signature.
- feat(httpsig): body-bind IPFS responses with RFC 9530 Content-Digest. The
  SHA-256 is computed at cache-write time and emitted on cache hits (in
  CO_SIGNABLE_HEADERS, so HTTPSIG signs it). Misses stream without it; the
  signed ETag=CID still attests identity.
- test(ipfs): cache digest round-trip + legacy (digest-less) entry coverage.
- docs: rewrite the ipfs-integration.md Phase 2 section to the shipped
  targetProtocol design (was speculative), glossary Target Protocol entry,
  CLAUDE.md note. Documented the trusted-gateway-resolver protocol limitation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 22, 2026


Codecov Report

❌ Patch coverage is 83.31081% with 247 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.69%. Comparing base (86ecf16) to head (ca79d91).
⚠️ Report is 14 commits behind head on PE-9067-add-ipfs-cid.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/ipfs/ipfs-cache.ts | 62.96% | 100 Missing ⚠️ |
| src/ipfs/kubo-data-source.ts | 59.51% | 100 Missing ⚠️ |
| src/database/composite-clickhouse.ts | 89.94% | 39 Missing ⚠️ |
| src/routes/ar-io-info-builder.ts | 80.00% | 3 Missing ⚠️ |
| src/database/standalone-sqlite.ts | 96.77% | 2 Missing ⚠️ |
| src/store/fs-chunk-data-store.ts | 95.00% | 2 Missing ⚠️ |
| src/arweave/composite-client.ts | 99.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@                   Coverage Diff                    @@
##           PE-9067-add-ipfs-cid     #793      +/-   ##
========================================================
+ Coverage                 78.32%   78.69%   +0.36%     
========================================================
  Files                       132      136       +4     
  Lines                     49403    50845    +1442     
  Branches                   3691     3809     +118     
========================================================
+ Hits                      38697    40010    +1313     
- Misses                    10658    10787     +129     
  Partials                     48       48              


Compose defaults like `${TURBO_AWS_REGION:-}` pass an empty string rather than
undefined when unset on the host, so the previous `!== undefined` gate and `??`
credential fallback would wrongly enable a misconfigured dedicated Turbo client
and suppress fallback to the AWS_* credentials. Route all TURBO_AWS_* / AWS_*
reads through the existing env.varOrUndefined helper, which treats empty/
whitespace strings as unset. Also add TSDoc to hasTurboAwsConfig and
turboAwsClient.
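
The empty-string pitfall above can be illustrated with a small sketch. `readEnvVar` stands in for the existing `env.varOrUndefined` helper, whose real signature is not shown here; `hasDedicatedTurboConfig` and `turboAccessKeyId` are likewise hypothetical names, and the environment is passed in explicitly so the example stays self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Treat empty or whitespace-only values as unset (stand-in for env.varOrUndefined).
function readEnvVar(env: Env, name: string): string | undefined {
  const value = env[name];
  return value === undefined || value.trim() === '' ? undefined : value;
}

// A dedicated Turbo client is only warranted when BOTH region and endpoint are set.
function hasDedicatedTurboConfig(env: Env): boolean {
  return (
    readEnvVar(env, 'TURBO_AWS_REGION') !== undefined &&
    readEnvVar(env, 'TURBO_AWS_ENDPOINT') !== undefined
  );
}

// Credentials fall back from TURBO_AWS_* to AWS_*; undefined means "use the
// ambient provider chain".
function turboAccessKeyId(env: Env): string | undefined {
  return (
    readEnvVar(env, 'TURBO_AWS_ACCESS_KEY_ID') ??
    readEnvVar(env, 'AWS_ACCESS_KEY_ID')
  );
}
```

With a plain `!== undefined` check, a Compose default such as `TURBO_AWS_REGION=""` would count as configured; routing the read through the helper makes it count as unset, so `??` can fall through to the `AWS_*` value.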

Addresses CodeRabbit review on PR #794.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vilenarios

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Contributor

Warning

Review limit reached

@vilenarios, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 10 minutes and 37 seconds. Learn how PR review limits work.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: fea89f96-c08e-41ca-96e3-0297556bb6bc

📥 Commits

Reviewing files that changed from the base of the PR and between d383ca2 and 27316b1.

📒 Files selected for processing (13)
  • CLAUDE.md
  • docs/glossary.md
  • docs/ipfs-integration.md
  • src/constants.ts
  • src/ipfs/ipfs-cache.test.ts
  • src/ipfs/ipfs-cache.ts
  • src/ipfs/ipfs-service.ts
  • src/lib/httpsig.ts
  • src/middleware/arns.ts
  • src/resolution/on-demand-arns-resolver.ts
  • src/routes/arns.ts
  • src/routes/ipfs.ts
  • src/types.d.ts

vilenarios and others added 19 commits June 23, 2026 01:11
The ArNS->IPFS routing decision hinges on classifying an ANT record's target
as arweave vs ipfs and validating the id accordingly. Extracted that logic from
OnDemandArNSResolver into a pure, SDK-free helper (classifyResolvedTarget) and
unit-tested it: arweave/ipfs by targetProtocol, undefined+unknown protocol ->
arweave (fail-closed), CIDv0/v1 acceptance, and cross-format rejection
(CID under arweave, TX id under ipfs, garbage).
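
The classification rules listed above could look roughly like the following. This is a sketch, not the extracted helper: the name `classifyTargetSketch`, its signature, and the regex-based CID check are assumptions (a real implementation would more likely parse CIDs with a multiformats library), and the regexes only cover base58 CIDv0 and base32 CIDv1.

```typescript
type Protocol = 'arweave' | 'ipfs';

interface ClassifiedTarget {
  protocol: Protocol;
  id: string;
}

// 43-character base64url Arweave transaction id.
const ARWEAVE_ID = /^[A-Za-z0-9_-]{43}$/;
// Simplified CID shapes (assumption): CIDv0 "Qm" + 44 base58 chars,
// CIDv1 "b" + lowercase base32.
const CID_V0 = /^Qm[1-9A-HJ-NP-Za-km-z]{44}$/;
const CID_V1_BASE32 = /^b[a-z2-7]{58,}$/;

// Returns undefined when the target does not match its protocol's id format.
function classifyTargetSketch(
  target: string,
  targetProtocol: number | undefined,
): ClassifiedTarget | undefined {
  // 1 = IPFS; 0, undefined and unknown values all fall back to Arweave.
  const protocol: Protocol = targetProtocol === 1 ? 'ipfs' : 'arweave';
  if (protocol === 'ipfs') {
    const isCid = CID_V0.test(target) || CID_V1_BASE32.test(target);
    return isCid ? { protocol, id: target } : undefined;
  }
  return ARWEAVE_ID.test(target) ? { protocol, id: target } : undefined;
}
```

Because an unknown protocol is validated as an Arweave id, a CID published under a protocol value this gateway does not understand is rejected rather than served through the wrong path.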

The ArNS middleware routing itself can't be unit-tested in isolation (it imports
system.ts, booting the DI graph — no middleware has unit tests for this reason);
it stays covered by live e2e. Also documented the three root/apex cases in
ipfs-integration.md: a name's @ record and apex-via-APEX_ARNS_NAME route to IPFS;
apex-via-APEX_TX_ID is Arweave-only (bypasses protocol routing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat(s3): optional dedicated Turbo AWS client via TURBO_AWS_* vars
…ection

Owner-filtered GraphQL `transactions` queries (the ArDrive UserDriveEntities
pattern) scan tens of millions of rows because `transactions` is height-
ordered while the filter is on `owner_address`, so a sparse owner's rows
scatter across the full height range and finding a page trips
`max_rows_to_read` (Code 158) — measured 12.1M rows for an owner with only
22k rows / 46 drives.

Route eligible owner queries through the owner-ordered `owner_projection`
by emitting `optimize_use_projections = 1, optimize_read_in_order = 0`, which
lets the optimizer seek the owner's contiguous slice and sort the small
matched set in memory (measured 12.1M -> 451K rows). A reactive height-
windowing fallback retries on Code 158 for whale owners whose footprint
still exceeds the cap.

Gated by CLICKHOUSE_GQL_OWNER_PROJECTION_ROUTING_ENABLED (default off; when
off, queries plan exactly as before) and scoped to an Entity-Type allowlist
(CLICKHOUSE_GQL_OWNER_PROJECTION_ENTITY_TYPES, default drive,folder,snapshot)
so large-result `file` queries, bare-owner, and owner+other-tag queries are
excluded.

Verified against ClickHouse: the routed query reads 451K rows and returns the
correct page, and multi-page cursor pagination reproduces the canonical
ordered result exactly (no dup/gap, correct hasNextPage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The new owner-projection routing vars were added to config.ts and
docs/envs.md but not to docker-compose.yaml's core `environment:` allowlist,
so setting them in `.env` had no effect — the container never received them
and the feature stayed off. Add both vars alongside the other
CLICKHOUSE_GQL_* passthroughs (CLAUDE.md: keep compose and envs.md in sync).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- ownerProjectionApplies now requires that ALL tags are Entity-Type tags, so
  owner+other-tag shapes (e.g. owner + Entity-Type=drive + App-Name) fall back
  to the default plan instead of routing an untested shape through the
  projection (matches the documented allowlist contract).
- The windowed fallback drains a dense height window via a running cursor
  before advancing the frontier. Previously a window returning only pageSize+1
  raw rows whose dedup collapsed some ids would advance past the slice and
  strand the unique rows below it (short pages / wrong hasNextPage).
- Refresh the findings-doc status (implemented + env-gated + canary-validated)
  and drop public gateway names from it.

Adds tests for both behaviors.
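
For readers following the eligibility rule, here is a minimal sketch of the tightened check at this point in the history (before the later owners+ids extension). The `TxQuery` shape and the function name `ownerProjectionAppliesSketch` are assumptions; the flag and allowlist correspond to the two env vars named above but are passed in as plain arguments.

```typescript
interface TagFilter {
  name: string;
  values: string[];
}

interface TxQuery {
  owners: string[];
  tags: TagFilter[];
}

// enabled   ~ CLICKHOUSE_GQL_OWNER_PROJECTION_ROUTING_ENABLED
// allowlist ~ CLICKHOUSE_GQL_OWNER_PROJECTION_ENTITY_TYPES
function ownerProjectionAppliesSketch(
  query: TxQuery,
  enabled: boolean,
  allowlist: Set<string>,
): boolean {
  if (!enabled) return false;
  if (query.owners.length === 0) return false;
  // Bare-owner queries (no tags at all) are excluded.
  if (query.tags.length === 0) return false;
  // ALL tags must be Entity-Type tags whose values are on the allowlist, so
  // owner + Entity-Type + App-Name falls back to the default plan.
  return query.tags.every(
    (tag) =>
      tag.name === 'Entity-Type' &&
      tag.values.length > 0 &&
      tag.values.every((value) => allowlist.has(value)),
  );
}
```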

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat(gql): route owner-filtered ClickHouse queries through owner_projection
These six guides (~5,462 lines) were added in one batch on 2025-07-01
(43318d8, "docs: add comprehensive AI-generated documentation"). They were
never linked from docs/INDEX.md, never reviewed, and never corrected in the
~year since.

An audit against the code found them partially fabricated while presenting
as authoritative "Complete Technical Documentation": invented env vars
(CHUNK_POST_URLS — real is PREFERRED_CHUNK_POST_NODE_URLS;
SECONDARY_CHUNK_POST_MIN_SUCCESS_COUNT; CHUNK_POST_TIMEOUT_MS — real are
CHUNK_POST_RESPONSE_TIMEOUT_MS/_ABORT_TIMEOUT_MS), a non-existent "secondary
broadcast tier", and wrong defaults — including copy-pasteable .env examples
referencing vars that do nothing. They were also the only place the
now-removed dead ARWEAVE_PEER_CHUNK_POST_* vars were "documented".

Orphaned + unreviewed + net-misleading. Removing the series; accurate,
maintained docs live under docs/ (indexed by docs/INDEX.md).

Removed:
- ar-io-01-architecture-overview.md
- ar-io-02-data-retrieval-complete-guide.md
- ar-io-03-arweave-connectivity-complete-guide.md
- ar-io-04-arns-name-resolution-system.md
- ar-io-05-centralization-analysis.md
- ar-io-06-database-architecture.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XQPK4TXcVoXoyFp6sNW2Lr
These three vars (MIN_SUCCESS_COUNT, MAX_PEER_ATTEMPT_COUNT,
CONCURRENCY_LIMIT) were introduced in PE-7945 (f0d127a) for the original
background peer-broadcast path, which has since been superseded by
ArweaveCompositeClient.broadcastChunk and the live CHUNK_POST_* family.

They had zero code consumers but were still plumbed through
docker-compose.yaml, so an operator setting them got silence. Worse, the
startup validation (MAX < MIN throws) could crash boot on a "tuning" attempt
that otherwise did nothing.

Remove the consts + validation and the compose passthrough.

(The only docs referencing these vars were the orphaned AI-generated
ar-io-0X drafts, removed wholesale in a separate PR.)

Verified: zero remaining code references; config test 20/20; eslint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XQPK4TXcVoXoyFp6sNW2Lr
TxChunksDataSource's full-stream read loop terminated only on byte
accounting (bytes < size), advancing by chunkData.chunk.length. A source
or cache returning a zero-length chunk left bytes unchanged, so the loop
re-requested the same offset forever — observed in production as a
5.1M-span Honeycomb trace for tx QY5bDvdGa9Q_GcdxEFlvTJxUW5UPrSyvf6yULjbsv5g
(a 2805-byte, single-chunk L1 tx) made up of ~1.7M repeated ~0.1ms cache
hits on one offset.

Add two forward-progress guards in the read loop:
- abort on a zero-length chunk (primary cause)
- abort once the chunk count exceeds ceil(size / MAX_CHUNK_SIZE) + 1,
  a backstop against pathological tx geometry

Both increment chunk_stream_aborts_total{reason} and destroy the stream
with a descriptive error instead of spinning.
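
The two guards can be sketched in a simplified, synchronous form. The real loop is stream-based and increments a metric; here `readAllChunks` and its `getChunk` callback are hypothetical, aborts are modeled as thrown errors, and `MAX_CHUNK_SIZE` is assumed to be 256 KiB.

```typescript
// Assumed maximum chunk size (256 KiB).
const MAX_CHUNK_SIZE = 256 * 1024;

// Read chunks until `size` bytes are collected, aborting if no forward
// progress is possible. `getChunk` is a hypothetical source/cache lookup.
function readAllChunks(
  size: number,
  getChunk: (offset: number) => Uint8Array,
): Uint8Array[] {
  const maxChunks = Math.ceil(size / MAX_CHUNK_SIZE) + 1;
  const chunks: Uint8Array[] = [];
  let bytes = 0;

  while (bytes < size) {
    const chunk = getChunk(bytes);

    // Guard 1: a zero-length chunk would leave `bytes` unchanged forever.
    if (chunk.length === 0) {
      throw new Error(`zero-length chunk at offset ${bytes}`);
    }

    // Guard 2: backstop against pathological geometry (many tiny chunks).
    if (chunks.length + 1 > maxChunks) {
      throw new Error(`chunk count exceeded ${maxChunks}`);
    }

    chunks.push(chunk);
    bytes += chunk.length;
  }
  return chunks;
}
```

For the 2805-byte transaction in the trace, `maxChunks` is 2, so even a source that kept returning tiny non-empty chunks would be cut off on the third read instead of spinning.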

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Honeycomb trace 7fd5b41a (tx QY5bDvdGa9Q…, a 2805-byte single-chunk L1
tx) showed the first ReadThroughChunkDataCache.getChunkDataByAny span as a
cache HIT at relative_offset 0 with tx_size 2805, followed by ~1.7M
identical-offset reads. The chunk data store held a poisoned 0-byte file
for (dataRoot, 0): FsChunkDataStore.has() reports a hit on a 0-byte file
and get() serves an empty chunk, so the TxChunksDataSource stream loop
never advanced `bytes` and re-requested the same offset forever.

Root cause: nothing validated chunk length, so a source that once returned
an empty chunk was persisted (set() writes unconditionally) and re-served
indefinitely.

Harden every layer:
- FsChunkDataStore.set: refuse to persist zero-length chunks
- FsChunkDataStore.get / getByAbsoluteOffset: treat an existing 0-byte
  file as a miss, so already-poisoned entries self-heal on next refetch
- ReadThroughChunkDataCache: reject a zero-length chunk from the source
  (don't cache it; throw so the retrieval cascade falls through)
- TxChunksDataSource: only treat a zero-length chunk as fatal when size > 0

New metric chunk_zero_length_total{stage} tracks rejections at
source_fetch / cache_read / cache_write. Updates the prior store test that
asserted empty chunks round-trip (the poison contract).
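
To illustrate the store-level part of this hardening, here is a small in-memory stand-in. It is not `FsChunkDataStore`: the class name, the `Map` backing, and the `simulateLegacyPoison` hook are assumptions made so the behavior can be shown without a filesystem.

```typescript
// In-memory stand-in for a chunk store keyed by "<dataRoot>:<relativeOffset>".
class ChunkStoreSketch {
  private readonly files = new Map<string, Uint8Array>();

  // Refuse to persist zero-length chunks; report whether anything was written.
  set(key: string, chunk: Uint8Array): boolean {
    if (chunk.length === 0) return false;
    this.files.set(key, chunk);
    return true;
  }

  // Treat an existing zero-length entry as a miss so callers refetch it.
  get(key: string): Uint8Array | undefined {
    const chunk = this.files.get(key);
    return chunk === undefined || chunk.length === 0 ? undefined : chunk;
  }

  has(key: string): boolean {
    return this.get(key) !== undefined;
  }

  // Illustration-only hook: plant a 0-byte entry as an older version could have.
  simulateLegacyPoison(key: string): void {
    this.files.set(key, new Uint8Array(0));
  }
}
```

Because a poisoned entry reads as a miss, the next successful fetch overwrites it through `set`, which is the self-healing behavior described above.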

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… absolute-offset zero-length paths

Address CodeRabbit review on #799:
- Document chunkStreamAbortsTotal and chunkZeroLengthTotal (label meanings).
- Add FsChunkDataStore tests: a poisoned 0-byte by-absolute-offset entry
  reads as a miss, and a zero-length set() creates no absolute-offset index.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fix(chunks): reject and self-heal zero-length chunk cache poison
Multi-id transactions(ids:[...]) queries are ~99% of production TOO_MANY_ROWS
failures: `id` is the last sort-key column so `id IN (...)` has no seek and
relies on id_bloom, which lights up most granules (~100 ids reads ~308M rows).
An owner filter alone isn't enough (~13.6M on the main table for 100 ids), but
seeking the owner's slice via owner_projection drops it to ~556K (measured;
verified end-to-end at 590K rows / 35ms under the 10M cap).

ownerProjectionApplies now routes owners+ids through the projection regardless
of tags (the id list bounds the result). optimize_read_in_order=0 is a no-op
here (id queries carry no ORDER BY); the win is purely the owner seek. The
height-windowing fallback is disabled for id queries (it is height-ordered and
needs the cursor predicate, which id queries don't carry), so a whale owner
(>10M footprint) + ids still surfaces the 158 — rare, no worse than before.

Gated by the existing CLICKHOUSE_GQL_OWNER_PROJECTION_ROUTING_ENABLED flag.
Reduces live failures once clients add owners to their id queries; genuinely
ownerless batches still need the schema-level id_bloom / id-ordered-table fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Findings doc: the Implementation section's eligibility wording said `ids`
  must be absent, contradicting the owners+ids extension; describe both
  qualifying shapes and note the windowing fallback is no-id only.
- envs.md: call out that owners+ids does NOT get the height-windowing retry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
feat(gql): extend owner_projection routing to owners+ids queries
…ry search

Cold data retrieved by absolute offset must first locate the block
containing that offset. The chain binary search did this with
~log2(height) sequential GET /block/height/{h} requests to the trusted
node (~1.5s each), frequently exceeding CHUNK_SERVE_DEADLINE_MS and
turning into 504s.

Add a local index over stable_blocks.weave_size (getBlockByWeaveOffset,
backed by the new stable_blocks_weave_size_idx) and consult it first in
ArweaveCompositeClient.binarySearchBlocks. The local result is trusted
only when the immediately-preceding block is present and ends before the
offset (a tight bracket, so no missing block can hide the true
container), and the fetched block's weave_size is re-verified; any gap,
stale index, lookup error, or unstable-tip offset falls back to the
existing chain binary search. This changes only how the block is found
-- the block returned is identical.
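
The tight-bracket rule can be sketched over an in-memory list of blocks. Everything here is illustrative: `StableBlock` and `resolveBlockLocally` are hypothetical names, the real lookup is a SQL query over `stable_blocks.weave_size`, and the boundary convention (a block contains offsets greater than the previous block's `weaveSize` and up to its own) is an assumption of this sketch. Re-verification of the fetched block is not modeled.

```typescript
interface StableBlock {
  height: number;
  weaveSize: number; // cumulative weave size at the end of this block
}

// Returns the containing block, or undefined to signal "fall back to the
// chain binary search".
function resolveBlockLocally(
  blocks: StableBlock[],
  offset: number,
): StableBlock | undefined {
  const byHeight = new Map<number, StableBlock>();
  for (const block of blocks) byHeight.set(block.height, block);

  // Candidate: the lowest block whose weaveSize reaches the offset.
  let candidate: StableBlock | undefined;
  for (const block of blocks) {
    if (
      block.weaveSize >= offset &&
      (candidate === undefined || block.height < candidate.height)
    ) {
      candidate = block;
    }
  }
  if (candidate === undefined) return undefined; // beyond the local index

  // Tight bracket: the immediately preceding block must be present and must
  // end before the offset; otherwise a missing block could be the container.
  const previous = byHeight.get(candidate.height - 1);
  if (previous === undefined || previous.weaveSize >= offset) return undefined;

  return candidate;
}
```

If block 2 is missing from an index holding blocks 1 and 3, an offset that really belongs to block 2 would select block 3 as the candidate; the absent predecessor makes the sketch return `undefined` instead of a wrong answer.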

Resolves offset->block only. Per-transaction offsets remain chain-
authoritative (/tx/{id}/offset): per-tx weave offsets follow the block's
binary tx-ID sort order (not block_transaction_index) and v1 inline data
is not captured by data_size, so they cannot be derived from local
columns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01ERJ5pSmgZoLj4jEHhweq5B
…nk-post-config

chore(config): remove dead ARWEAVE_PEER_CHUNK_POST_* env vars
…drafts

docs: remove orphaned AI-generated ar-io-0X draft guides
vilenarios and others added 5 commits June 26, 2026 14:36
…c, abort test

- offsets.sql: add `b.height ASC` tiebreaker so an offset that lands exactly on
  a weave_size shared by consecutive empty blocks deterministically resolves to
  the lowest such block. This matches the chain binary search's smallest-height
  selection and keeps the local fast path available on exact end-of-block
  offsets (without it the bracket guard would reject the tie and fall back).
- Add TSDoc to BlockByWeaveOffsetResult and both getBlockByWeaveOffset methods
  documenting the bracket semantics and the undefined fallback contract.
- Add an AbortError regression test asserting the fast path rethrows aborts
  instead of silently degrading into a slow chain walk.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01ERJ5pSmgZoLj4jEHhweq5B
…st path

Adds a labeled counter recording the outcome of each offset->block
resolution in ArweaveCompositeClient.binarySearchBlocks: cache_hit,
local_index_hit, and fallback_{miss,untight,stale,error,no_index}. The
fast-path debug logs are only emitted at debug level, so without this
counter the local-index hit rate and fallback reasons are invisible at
the info level production runs at. Needed to measure the soak.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01ERJ5pSmgZoLj4jEHhweq5B
perf(chunk-offset): resolve offset→block locally to avoid chain binary search