KEMBAR78
New mutable null index by coszio · Pull Request #6954 · qdrant/qdrant · GitHub
Skip to content

Conversation

@coszio
Copy link
Contributor

@coszio coszio commented Jul 29, 2025

Current null index does not buffer updates at all, so there is no synchronization as to when data is written to disk.

This mutable null index keeps in-memory values in a roaring bitmap for efficient reads, and stores in a buffered dynamic mmap flags.

@coszio coszio force-pushed the use-mutable-null-index branch from 232ffdb to 5cd0b2c Compare August 1, 2025 22:54
Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I spotted some mistakes in this PR, for which I've submitted review remarks below.

I'll start to pick these up tomorrow morning if you did not handle them yet already.

In local testing I still see this branch failing quite a bit, like we did without deferring flag writes, though it is less often. I hope it'll be even better when these review remarks are resolved.

There's also a bunch of CI tests that still are not happy.

Thanks though!

@coszio coszio requested review from generall and timvisee August 4, 2025 19:07
@coszio coszio marked this pull request as ready for review August 4, 2025 19:20
@coderabbitai

This comment was marked as resolved.

@timvisee timvisee force-pushed the use-mutable-null-index branch from 09ca2b5 to 8486f00 Compare August 5, 2025 12:36
Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've extensively tested this together with #6975 and problems disappeared. 🙌

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/segment/src/index/field_index/null_index/mutable_null_index.rs (1)

159-177: Consider consistency in hardware counter usage.

The remove_point method correctly updates bitmaps and bumps total_point_count, but it creates a disposable hardware counter instead of accepting one as a parameter like add_point does. This inconsistency might make the API less predictable.

Consider updating the method signature to accept a hardware counter parameter:

-    pub fn remove_point(&mut self, id: PointOffsetType) -> OperationResult<()> {
+    pub fn remove_point(&mut self, id: PointOffsetType, hw_counter: &HardwareCounterCell) -> OperationResult<()> {
         // Update bitmaps immediately
         let storage = self.storage_mut()?;
         storage.has_values_flags.set(id, false);
         storage.is_null_flags.set(id, false);

         // ... existing logic ...

         // Account for I/O cost as if we were writing to disk now
-        let hw_counter = HardwareCounterCell::disposable();
         hw_counter.payload_index_io_write_counter().incr_delta(2);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8486f00 and eaf5712.

📒 Files selected for processing (1)
  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs (16 hunks)
🧰 Additional context used
🧠 Learnings (16)
📓 Common learnings
Learnt from: JojiiOfficial
PR: qdrant/qdrant#5951
File: lib/api/src/grpc/qdrant.rs:6493-6496
Timestamp: 2025-03-12T09:25:16.340Z
Learning: The `payload_index_io_read` field in `HardwareUsage` was introduced in PR #5951 and is safe to modify as it was not used in production before this PR.
Learnt from: timvisee
PR: qdrant/qdrant#6150
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:62-71
Timestamp: 2025-03-12T16:23:44.715Z
Learning: In the MutableIdTracker implementation in Qdrant, the error case where versions file exists but mappings file is missing is considered a system-level error (potential corruption) rather than a user error, and is handled via logging and debug assertions without surfacing user-facing errors.
Learnt from: generall
PR: qdrant/qdrant#6088
File: lib/segment/src/index/field_index/null_index/mmap_null_index.rs:141-151
Timestamp: 2025-03-04T10:38:46.049Z
Learning: For `DynamicMmapFlags` in the Qdrant codebase, out-of-bounds access is automatically interpreted as `false`. This is the expected behavior and does not require explicit bounds checking in methods that use `DynamicMmapFlags::get()`.
Learnt from: ffuugoo
PR: qdrant/qdrant#6352
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:289-299
Timestamp: 2025-04-17T10:52:59.516Z
Learning: In Qdrant's MutableIdTracker, FileVersionTracker flushers are designed to capture the pending version at creation time (not at flush time) to ensure they remain in sync with pending_mappings flusher. This design ensures consistency between data and version information during flush operations.
📚 Learning: in volatile sparse vector storage implementations, the `total_sparse_size` field should be updated w...
Learnt from: timvisee
PR: qdrant/qdrant#6569
File: lib/segment/src/vector_storage/sparse/volatile_sparse_vector_storage.rs:56-76
Timestamp: 2025-05-21T12:52:15.229Z
Learning: In volatile sparse vector storage implementations, the `total_sparse_size` field should be updated when vectors are inserted, updated, or deleted to maintain an accurate count of the total number of non-zero elements across all vectors.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in test code for qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified id ...
Learnt from: generall
PR: qdrant/qdrant#6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using `id.to_string().parse().unwrap()` is acceptable for testing purposes and doesn't need to match production code's `id_tracker.internal_id()` approach. Test code can use mock implementations that serve the testing goals.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in the qdrant codebase, storage backends may have different error handling approaches. specifically,...
Learnt from: timvisee
PR: qdrant/qdrant#6444
File: lib/segment/src/index/field_index/map_index/immutable_map_index.rs:373-375
Timestamp: 2025-05-08T10:34:42.223Z
Learning: In the Qdrant codebase, storage backends may have different error handling approaches. Specifically, the mmap variant of `remove_point` method does not return errors, while the RocksDB variant does propagate errors using the `?` operator. These differences are intentional by design.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in the bitpacking crate, the bitpacker::decompress_strictly_sorted function takes an option
Learnt from: coszio
PR: qdrant/qdrant#6528
File: lib/posting_list/src/view.rs:118-118
Timestamp: 2025-05-19T14:40:20.068Z
Learning: In the bitpacking crate, the BitPacker::decompress_strictly_sorted function takes an Option<PointOffsetType> as its first parameter, which means using checked_sub(1) without unwrapping is intentional and correct.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: the method `is_none_or` exists in the qdrant codebase and compiles correctly, despite not being part...
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/index/hnsw_index/point_scorer.rs:116-121
Timestamp: 2025-04-22T23:17:41.734Z
Learning: The method `is_none_or` exists in the Qdrant codebase and compiles correctly, despite not being part of standard Rust's Option type. Code reviews should verify compilation issues before reporting them.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in the blob trait implementations for gridstore (lib/gridstore/src/blob.rs), the maintainer prefers ...
Learnt from: timvisee
PR: qdrant/qdrant#6644
File: lib/gridstore/src/blob.rs:41-57
Timestamp: 2025-06-06T07:52:38.478Z
Learning: In the Blob trait implementations for gridstore (lib/gridstore/src/blob.rs), the maintainer prefers using assert! for input validation over Result-based error handling, as they don't expect invalid inputs in real scenarios and changing to Result would require updating the entire trait interface.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in the qdrant codebase, modules can be organized as directories with their own `mod.rs` file, rather...
Learnt from: generall
PR: qdrant/qdrant#6088
File: lib/segment/src/index/field_index/mod.rs:18-18
Timestamp: 2025-03-03T15:54:45.553Z
Learning: In the Qdrant codebase, modules can be organized as directories with their own `mod.rs` file, rather than single files, following standard Rust module organization patterns.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: pattern matching on `&option` with `some(x)` is valid and idiomatic in rust. the bound variable `...
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6663
File: lib/quantization/src/encoded_vectors_binary.rs:270-276
Timestamp: 2025-06-18T09:23:14.977Z
Learning: Pattern matching on `&Option<T>` with `Some(x)` is valid and idiomatic in Rust. The bound variable `x` will have type `&T` and can be used to access fields through automatic dereferencing. No need to use `.as_ref()` in such cases.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in qdrant's mutableidtracker, fileversiontracker flushers are designed to capture the pending versio...
Learnt from: ffuugoo
PR: qdrant/qdrant#6352
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:289-299
Timestamp: 2025-04-17T10:52:59.516Z
Learning: In Qdrant's MutableIdTracker, FileVersionTracker flushers are designed to capture the pending version at creation time (not at flush time) to ensure they remain in sync with pending_mappings flusher. This design ensures consistency between data and version information during flush operations.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in the qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measur...
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: the `populate()` method in `quantizedmmapstorage` does not return a `result` because the underlying ...
Learnt from: generall
PR: qdrant/qdrant#6323
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:12-16
Timestamp: 2025-04-07T23:31:22.515Z
Learning: The `populate()` method in `QuantizedMmapStorage` does not return a `Result` because the underlying `mmap.populate()` method doesn't return a Result either. Adding an unnecessary `Ok(())` would cause Clippy to complain.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: the `populate()` method in `quantizedmmapstorage` correctly returns void instead of a `result` becau...
Learnt from: generall
PR: qdrant/qdrant#6323
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:12-16
Timestamp: 2025-04-07T23:31:22.515Z
Learning: The `populate()` method in `QuantizedMmapStorage` correctly returns void instead of a `Result` because it directly implements the `Madviseable` trait which defines `populate(&self)` without a return type. Adding an unnecessary `Ok(())` return would trigger Clippy warnings about unnecessary Result wrapping.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in structpayloadindex, checking if a field exists in `self.config.indexed_fields` is sufficient to d...
Learnt from: generall
PR: qdrant/qdrant#6403
File: lib/segment/src/index/struct_payload_index.rs:485-497
Timestamp: 2025-04-20T18:32:43.471Z
Learning: In StructPayloadIndex, checking if a field exists in `self.config.indexed_fields` is sufficient to determine if an index should exist, as the initialization process in `load_all_fields()` ensures that `field_indexes` is synchronized with the configuration, automatically rebuilding any missing indexes.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: the `payload_index_io_read` field in `hardwareusage` was introduced in pr #5951 and is safe to modif...
Learnt from: JojiiOfficial
PR: qdrant/qdrant#5951
File: lib/api/src/grpc/qdrant.rs:6493-6496
Timestamp: 2025-03-12T09:25:16.340Z
Learning: The `payload_index_io_read` field in `HardwareUsage` was introduced in PR #5951 and is safe to modify as it was not used in production before this PR.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
📚 Learning: in the subgraph_connectivity method in graph_layers_builder.rs, the previous_visited_points vector a...
Learnt from: generall
PR: qdrant/qdrant#6479
File: lib/segment/src/index/hnsw_index/graph_layers_builder.rs:137-182
Timestamp: 2025-05-30T15:26:14.488Z
Learning: In the subgraph_connectivity method in graph_layers_builder.rs, the previous_visited_points vector and visited BitVec entry point marking are automatically re-initialized at the start of each retry loop iteration, so manual clearing is not needed.

Applied to files:

  • lib/segment/src/index/field_index/null_index/mutable_null_index.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
🔇 Additional comments (11)
lib/segment/src/index/field_index/null_index/mutable_null_index.rs (11)

22-35: LGTM! Well-structured design with appropriate abstractions.

The struct design correctly encapsulates the buffered flags approach, and the documentation accurately describes the use of roaring bitmaps for in-memory operations with buffered updates.


67-92: LGTM! Proper initialization and error handling.

The open_or_create method correctly initializes the BufferedDynamicFlags instances and handles directory creation appropriately. The approach aligns with the requirement to not create files when they shouldn't exist.


94-104: LGTM! Good defensive programming with error-returning accessors.

The storage() and storage_mut() methods properly handle the case where storage is uninitialized, returning meaningful error messages instead of panicking.


106-157: LGTM! Correct payload processing and total point count handling.

The add_point method correctly processes different JSON value types to determine null and value presence. The total_point_count bumping addresses the cardinality estimation requirements mentioned in past reviews.


179-235: LGTM! Safe accessor methods with appropriate error handling.

The accessor methods correctly handle uninitialized storage using safe patterns like is_ok_and() and map_or(). The populate() no-op and is_on_disk() returning false are appropriate for the buffered approach.


237-268: LGTM! Proper implementation of PayloadFieldIndex interface.

The implementation correctly handles uninitialized storage and provides appropriate flushing behavior by aggregating flushers from both flag structures.


284-326: LGTM! Correct filter logic for null index conditions.

The filter method correctly handles both is_empty and is_null conditions, returning appropriate iterators over the buffered flags. The logic properly distinguishes between points with/without values and points with/without null values.


328-390: LGTM! Sound cardinality estimation logic.

The cardinality estimation correctly calculates counts from the buffered flags and provides appropriate fallback estimates using the total_point_count. The primary clause annotations are properly set for query optimization.


402-426: LGTM! Builder correctly flushes buffered updates on finalize.

The builder implementation is well-designed, with the finalize() method properly ensuring that buffered updates are flushed to disk before returning the completed index. This is crucial for data persistence in the buffered approach.


437-548: LGTM! Comprehensive test coverage for null index functionality.

The main test thoroughly validates the null index behavior across different payload types (null values, null in arrays, empty payloads, and boolean values). The assertions correctly verify both filtering and cardinality estimation work as expected.


550-572: LGTM! Important test for manual buffer flushing behavior.

This test specifically validates the buffered approach by manually calling the flusher and verifying the data persistence. This is crucial for ensuring the new buffered design works correctly.

@timvisee timvisee force-pushed the use-mutable-null-index branch from eaf5712 to 1ba3a6a Compare August 5, 2025 14:41
@timvisee timvisee merged commit 9fb92ca into dev Aug 5, 2025
16 checks passed
@timvisee timvisee deleted the use-mutable-null-index branch August 5, 2025 15:04
timvisee added a commit that referenced this pull request Aug 11, 2025
* Introduce BufferedDynamicFlags

* clippy and extra changes

* fix flushing

* add mutable index impl

* use MutableNullIndex instead of MmapNullIndex

* use BufferedDynamicFlags

* remove mmap_null_index

* take updates before returning the flusher

* bump total_point_count on add and remove point

* improve test case

* Only clean up mutable null index files if index exists

* Remove mention of cached in context of length

* refactor flags into a single `Option<Storage>`

---------

Co-authored-by: timvisee <tim@visee.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants