
API server: fix duplicate token ids in token listing queries #2088

Open
mannutech wants to merge 1 commit into mintlayer:master from mannutech:fix/api-server-duplicate-token-ids

@mannutech mannutech commented Jul 2, 2026


Fixes #1982

What's wrong

GET /api/v2/token returns the same token id multiple times. This is still reproducible right now on the public testnet server — https://api-server-lovelace.mintlayer.org/api/v2/token?offset=0&items=100 returns the id from the original report (tmltk1rz9hmgm...sugcfq6) 4 times, and other ids up to 8 times.

Why it happens

ml.fungible_token and ml.nft_issuance store one row per token state change — their primary key is (id, block_height). Point queries already handle this correctly (get_fungible_token_issuance picks the latest row), but the two listing queries select ids straight off the tables. So a token appears once per issuance/mint/lock/authority change — which is exactly why the issue shows repeat counts of 2x, 3x, 4x rather than uniform duplication: the count is the number of state changes that token has had.

The in-memory backend keys its maps by token id and was never affected. The two backends had quietly diverged on these two methods, and no shared test covered them.
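The row-per-state-change layout can be modelled in a few lines. This is a minimal sketch, not the API server's code: `TokenRow`, `list_ids_naive`, `list_ids_distinct`, `latest_row`, and `sample_rows` are hypothetical names, and the ids and heights are made up for illustration.

```rust
use std::collections::BTreeSet;

/// One history row, mirroring a table keyed by (id, block_height).
/// Hypothetical type for illustration; not the server's schema types.
#[derive(Debug, Clone)]
struct TokenRow {
    id: &'static str,
    block_height: u64,
}

/// Naive listing: one entry per row, so an id repeats once per state change.
fn list_ids_naive(rows: &[TokenRow]) -> Vec<&'static str> {
    let mut ids: Vec<&'static str> = rows.iter().map(|r| r.id).collect();
    ids.sort();
    ids
}

/// Deduplicated listing: each id appears once, in sorted order.
fn list_ids_distinct(rows: &[TokenRow]) -> Vec<&'static str> {
    let unique: BTreeSet<&'static str> = rows.iter().map(|r| r.id).collect();
    unique.into_iter().collect()
}

/// Point query: the latest row for one id (highest block_height).
fn latest_row<'a>(rows: &'a [TokenRow], id: &str) -> Option<&'a TokenRow> {
    rows.iter()
        .filter(|r| r.id == id)
        .max_by_key(|r| r.block_height)
}

/// Made-up history: token "a" has three state changes, token "b" has one.
fn sample_rows() -> Vec<TokenRow> {
    vec![
        TokenRow { id: "a", block_height: 10 },
        TokenRow { id: "a", block_height: 20 },
        TokenRow { id: "a", block_height: 30 },
        TokenRow { id: "b", block_height: 15 },
    ]
}
```

On the sample rows, the naive listing returns "a" three times, which is the per-token repeat count described above, while the deduplicated listing returns each id once.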

Digging into the query exposed two more pagination bugs in the same place:

  • count_tokens, which decides where the NFT half of the UNION ALL starts, counted rows instead of distinct tokens — so the fungible→NFT handoff point drifts as tokens accumulate history, even with the ids deduplicated.
  • The NFT half's LIMIT ($2 + $1 - count) exceeds the requested page size once the offset points past the last fungible token. With 3 fungible tokens, offset=5&items=3 returns 5 NFTs instead of 3.
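Both pagination bugs reduce to simple arithmetic, sketched here with hypothetical helper names. The sketch reads `$1` as the offset and `$2` as the page size, which is an assumption inferred from how the limit is described; the actual query text is not reproduced here.

```rust
use std::collections::HashSet;

/// Old handoff count: one per history row, so it grows with every state change.
fn count_rows(fungible_row_ids: &[&str]) -> i64 {
    fungible_row_ids.len() as i64
}

/// Count the handoff should use: one per distinct token.
fn count_distinct(fungible_row_ids: &[&str]) -> i64 {
    fungible_row_ids.iter().collect::<HashSet<_>>().len() as i64
}

/// Old NFT-half LIMIT, `$2 + $1 - count`.
/// Assumption: $1 is the offset and $2 is the page size.
fn nft_limit_unclamped(offset: i64, items: i64, fungible_count: i64) -> i64 {
    items + offset - fungible_count
}
```

For three fungible tokens stored as five rows, `count_rows` reports 5 where `count_distinct` reports 3, and `nft_limit_unclamped(5, 3, 3)` evaluates to 5, matching the `offset=5&items=3` example.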

The fix

In get_token_ids and get_token_ids_by_ticker:

  • SELECT DISTINCT in both halves of the union (NFTs accumulate rows too, from owner changes)
  • count(DISTINCT token_id) in count_tokens
  • NFT LIMIT clamped to the page size: GREATEST(LEAST($2, $2 + $1 - count), 0)

A token's ticker never changes, so plain DISTINCT is enough for the ticker variant — no latest-row handling needed there.
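The combined effect of the three changes can be sketched as an in-memory model of the two-half listing. This is an illustration under stated assumptions rather than the server's query: the function names are hypothetical, `$1` is read as the offset and `$2` as the page size, and the NFT-half offset (`offset - count`, floored at 0) is assumed since the PR text does not spell it out.

```rust
use std::collections::BTreeSet;

/// Sorted, deduplicated ids from history rows (the effect of SELECT DISTINCT).
fn distinct_sorted(row_ids: &[&str]) -> Vec<String> {
    let set: BTreeSet<&str> = row_ids.iter().copied().collect();
    set.into_iter().map(str::to_string).collect()
}

/// NFT-half limit after the fix: GREATEST(LEAST($2, $2 + $1 - count), 0).
/// Assumption: $1 is the offset and $2 is the page size.
fn nft_limit_clamped(offset: i64, items: i64, fungible_count: i64) -> i64 {
    (items + offset - fungible_count).min(items).max(0)
}

/// One page of the listing: fungible tokens first, then NFTs.
fn list_page(
    fungible_rows: &[&str],
    nft_rows: &[&str],
    offset: usize,
    items: usize,
) -> Vec<String> {
    let fungible = distinct_sorted(fungible_rows);
    let nfts = distinct_sorted(nft_rows);
    // Distinct tokens, not rows, decide where the NFT half starts.
    let count = fungible.len() as i64;

    let mut page: Vec<String> = fungible.into_iter().skip(offset).take(items).collect();

    // Assumed NFT-half offset: whatever part of the offset the fungible half did not absorb.
    let nft_offset = (offset as i64 - count).max(0) as usize;
    let nft_limit = nft_limit_clamped(offset as i64, items as i64, count) as usize;
    page.extend(nfts.into_iter().skip(nft_offset).take(nft_limit));
    page
}
```

With three distinct fungible tokens, `nft_limit_clamped(5, 3, 3)` is 3 rather than 5, and a page that straddles the handoff (for example offset 2, page size 3) takes one fungible token followed by two NFTs.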

How it's tested

  • storage-test-suite: token_ids_dedup_and_pagination, running against both backends. It seeds tokens and NFTs with multi-row histories, checks the exact listings, walks every page size end to end, and covers a ticker that only NFTs use (fungible count = 0), offsets past the end, and zero-size pages. Without the fix it fails on Postgres with the same repeated-id pattern as the issue report and passes on in-memory, so it is both the regression test and the missing backend-parity test.
  • stack-test-suite: no_duplicate_ids_for_tokens_with_state_changes. Over HTTP, /token and /token/ticker/:ticker return a minted (multi-row) token exactly once.
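The "walk every page size end to end" check could look roughly like the following. This is a sketch of the idea only, not the code in the test suites: `reference_page`, `walk_pages`, and `check_all_page_sizes` are hypothetical names, and the listing under test is passed in as a closure taking `(offset, items)`.

```rust
/// Reference listing: a slice of an already deduplicated, ordered id list.
fn reference_page(ids: &[String], offset: usize, items: usize) -> Vec<String> {
    ids.iter().skip(offset).take(items).cloned().collect()
}

/// Walks a paged listing from offset 0 until an empty page comes back,
/// concatenating the pages. `list` is any function taking (offset, items).
fn walk_pages<F>(list: F, page_size: usize) -> Vec<String>
where
    F: Fn(usize, usize) -> Vec<String>,
{
    let mut all = Vec::new();
    if page_size == 0 {
        return all; // a zero-size page never makes progress
    }
    let mut offset = 0;
    loop {
        let page = list(offset, page_size);
        if page.is_empty() {
            break;
        }
        assert!(page.len() <= page_size, "page exceeds requested size");
        all.extend(page);
        // Advance the way a client would: by the requested page size.
        offset += page_size;
    }
    all
}

/// True if every page size from 1 to `expected.len() + 1` reproduces
/// `expected` exactly: no duplicates, no gaps, no reordering.
fn check_all_page_sizes<F>(list: F, expected: &[String]) -> bool
where
    F: Fn(usize, usize) -> Vec<String>,
{
    (1..=expected.len() + 1).all(|size| walk_pages(&list, size) == expected)
}
```

A listing that repeats an id fails the check at page size 1 already, because the concatenated pages no longer equal the expected id list.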

@mannutech mannutech force-pushed the fix/api-server-duplicate-token-ids branch from 03a8fc9 to bb6b240 Compare July 2, 2026 18:56
The Postgres fungible_token and nft_issuance tables keep one row per
token state change, but get_token_ids/get_token_ids_by_ticker selected
ids without collapsing history, so every token appeared once per state
change. Deduplicate both halves of the union, count distinct tokens
(not rows) when computing the NFT half's offset, and clamp its limit
to the requested page size, which used to be exceeded when the offset
landed past the fungible tokens.

Add a storage-test-suite trial covering dedup and full page walks on
both backends, and a stack-test-suite HTTP test for the /token and
/token/ticker/:ticker endpoints.

Fixes mintlayer#1982
@mannutech mannutech force-pushed the fix/api-server-duplicate-token-ids branch from bb6b240 to cb2c43e Compare July 2, 2026 19:03
@mannutech mannutech marked this pull request as ready for review July 2, 2026 19:09

mannutech commented Jul 2, 2026


@ImplOfAnImpl I measured the cost of the added DISTINCTs on the real schema with EXPLAIN ANALYZE.

At current production size (~400 tokens) the new query runs in 0.25 ms. At ~1000x that (100k tokens / 1.05M fungible rows, 100k NFTs / 350k rows), items=100:

| Query | Old | New |
| --- | --- | --- |
| offset=0 | 79 ms | 330 ms |
| offset=50000 | 29 ms | 355 ms |
| ticker %TK1% | 148 ms | 439 ms |

Almost all of the difference is count(DISTINCT token_id) in the CTE, which aggregates single-threaded while the old count(token_id) used a parallel scan (both were always full scans). The DISTINCT on the paged select is cheap: Unique runs over the PK index scan and stops after the page. If this ever matters, count(*) over a SELECT DISTINCT subquery gets parallelism back; happy to do that as a follow-up.


Development

Successfully merging this pull request may close these issues.

API server's /token endpoint returns duplicate token ids