fix: grow posting_lists before indexed access in FTS with_position builder#7330
Open
Siriapps wants to merge 2 commits into
…7313 Add a unit test that mirrors the production failure (posting_lists.len=1731, token_id=4456) when with_position indexing encounters a stale next_id. Co-authored-by: Cursor <cursoragent@cursor.com>
…ilder When token_id exceeds posting_lists.len() during with_position indexing (e.g. stale next_id from legacy FTS partitions), resize posting_lists to token_id + 1 before access instead of growing only on exact equality. Fixes lance-format#7313 Co-authored-by: Cursor <cursoragent@cursor.com>
Author
Hi @sinianluoye, this addresses #7313. The fix grows posting_lists before indexed access when token_id exceeds the current vector length (stale next_id from legacy partitions). Would appreciate a review when you have time.
sinianluoye
approved these changes
Jun 18, 2026
What does this PR do?
Fixes an `index out of bounds` panic in `IndexWorker::process_batch()` when FTS indexing runs with `with_position: true` (the default). The `with_position` branch now grows `posting_lists` with `resize_with(token_idx + 1, ...)` before indexing by `token_id`, matching the pattern already used in the non-position branch and in `merge_from`.
Why was this PR needed?
When `tokens.add()` returns a `token_id` greater than `posting_lists.len()` (e.g. after loading a legacy FTS partition with a stale `next_id` during `optimize_indices`), the old code only appended a posting list when `token_id == posting_lists.len()`. That skips growth for gaps and panics at `posting_lists[token_id]`.
Reported in production with `posting_lists.len()=1731` and `token_id=4456` (#7313). Changing `==` to `>=` with a single `push` is insufficient for that gap; `resize_with(token_idx + 1, ...)` is required.
What are the relevant issue numbers?
Closes #7313
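
For reviewers, the three growth strategies discussed under "Why was this PR needed?" can be compared in a small standalone sketch. This is not the code from the diff: `PostingList`, the `u32` token id type, and the three `slot_*` helper names are hypothetical stand-ins chosen for illustration, and the first two helpers return `None` where the real indexed access would panic.

```rust
// Hypothetical stand-in for the builder's posting list type; only the
// growth logic of the surrounding Vec matters for this sketch.
#[derive(Debug, Default, Clone, PartialEq)]
struct PostingList {
    // (row_id, position) pairs; the layout is an assumption.
    entries: Vec<(u64, u32)>,
}

/// Old behaviour as described in the PR: append only on exact equality.
/// Returns `None` where `posting_lists[token_id]` would have panicked.
fn slot_push_on_equal(
    posting_lists: &mut Vec<PostingList>,
    token_id: u32,
) -> Option<&mut PostingList> {
    let idx = token_id as usize;
    if idx == posting_lists.len() {
        posting_lists.push(PostingList::default());
    }
    posting_lists.get_mut(idx)
}

/// `>=` with a single push: grows by one element, so any gap larger
/// than one is still out of bounds.
fn slot_push_on_ge(
    posting_lists: &mut Vec<PostingList>,
    token_id: u32,
) -> Option<&mut PostingList> {
    let idx = token_id as usize;
    if idx >= posting_lists.len() {
        posting_lists.push(PostingList::default());
    }
    posting_lists.get_mut(idx)
}

/// Approach described in this PR: resize to `token_id + 1` before the
/// indexed access, filling any gap with empty posting lists.
fn slot_resize(posting_lists: &mut Vec<PostingList>, token_id: u32) -> &mut PostingList {
    let idx = token_id as usize;
    if idx >= posting_lists.len() {
        posting_lists.resize_with(idx + 1, PostingList::default);
    }
    &mut posting_lists[idx]
}
```

With the values from #7313 (a vector of length 1731 and `token_id` 4456), both push-based helpers fail to reach the slot, while `slot_resize` grows the vector to length 4457 and returns a usable empty list. The gap entries stay empty, which is the same trade-off the resize-based fix accepts.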
Does this PR meet the acceptance criteria?
Per CONTRIBUTING.md:
- Unit test: `test_process_batch_with_position_handles_token_id_gaps`
- Tests: `cargo test -p lance-index`
- Formatting: `cargo fmt --all`
- Commit prefix: `fix:`
Suggested label: `critical-fix` (crash during `optimize_indices` on FTS indexes with `with_position: true`)