Kerollmops
69c931334f
Fix the error messages categorization with invalid NDJson
2024-12-11 12:02:48 +01:00
Kerollmops
d683f5980c
Do not duplicate NDJson when unnecessary
2024-12-11 12:02:48 +01:00
ManyTheFish
c614d0dd35
Add context when returning an error
2024-12-11 10:55:39 +01:00
ManyTheFish
479607e5dd
Convert update files from OBKV to ndjson
2024-12-11 10:55:39 +01:00
Kerollmops
bb00e70087
Reintroduce the document addition logs
2024-12-11 10:39:04 +01:00
Kerollmops
aeb6b74725
Make sure we use an FxHashBuilder on the Value
2024-12-10 15:52:22 +01:00
Kerollmops
a751972c57
Prefer using a stable rather than a random hash builder
2024-12-10 14:25:53 +01:00
Kerollmops
6b269795d2
Update bumparaw-collections to 0.1.2
2024-12-10 14:25:13 +01:00
Louis Dureuil
d075be798a
Fix tests
2024-12-10 13:39:07 +01:00
Kerollmops
89637bcaaf
Use bumparaw-collections in Meilisearch/milli
2024-12-10 11:52:20 +01:00
Louis Dureuil
866ac91be3
Fix error messages
2024-12-10 11:06:58 +01:00
Louis Dureuil
e610af36aa
User failure for documents with docid of ==512 bytes
2024-12-10 11:06:24 +01:00
Louis Dureuil
7cf6707ed3
Extend test to add the ==512 bytes case
2024-12-10 11:05:42 +01:00
Kushal Kumar
34254b42b6
refactor: use test configuration on import
...
Signed-off-by: Kushal Kumar <kushalkumargupta4@gmail.com>
2024-12-10 00:00:43 +05:30
ManyTheFish
07f42e8057
Do not index a field count when no word is counted
2024-12-09 15:45:12 +01:00
ManyTheFish
71f59749dc
Reduce union impact in merging
2024-12-09 15:44:06 +01:00
meili-bors[bot]
3b0b9967f6
Merge #5141
...
5141: Use the right amount of max memory and not impact the settings r=curquiza a=Kerollmops
Fixes #5132. Related to #5125.
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-09 10:40:46 +00:00
meili-bors[bot]
123b54a178
Merge #5056
...
5056: Attach index name in error message r=irevoire a=airycanon
# Pull Request
## Related issue
Fixes #4392
## What does this PR do?
- ...
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: airycanon <airycanon@airycanon.me>
2024-12-09 09:59:12 +00:00
Kerollmops
f5dd8dfc3e
Rollback max memory usage changes
2024-12-09 10:26:30 +01:00
Kerollmops
bcfed70888
Revert "Merge #5125"
...
This reverts commit 9a9383643f, reversing
changes made to cac355bfa7.
2024-12-09 10:08:02 +01:00
Louis Dureuil
08f2c696b0
Allow xtask bench to proceed without a commit message
2024-12-09 09:36:59 +01:00
James Hiew
54e34beac6
Check attributes are filterable before evaluating search query
2024-12-07 21:13:13 +00:00
Kushal Kumar
c0aa018c87
tests: split test in separate file
...
Signed-off-by: Kushal Kumar <kushalkumargupta4@gmail.com>
2024-12-08 00:32:32 +05:30
airycanon
b75f1f4c17
fix tests
...
# Conflicts:
# crates/index-scheduler/src/batch.rs
# crates/index-scheduler/src/snapshots/lib.rs/fail_in_process_batch_for_document_deletion/after_removing_the_documents.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_bad_primary_key/fifth_task_succeeds.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_bad_primary_key/fourth_task_fails.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_multiple_primary_key/second_task_fails.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_multiple_primary_key/third_task_fails.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_multiple_primary_key_batch_wrong_key/second_and_third_tasks_fails.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_set_and_null_primary_key_inference_works/all_other_tasks_succeeds.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_set_and_null_primary_key_inference_works/second_task_fails.snap
# crates/index-scheduler/src/snapshots/lib.rs/test_document_addition_with_set_and_null_primary_key_inference_works/third_task_succeeds.snap
# Conflicts:
# crates/index-scheduler/src/batch.rs
# crates/meilisearch/src/search/mod.rs
# crates/meilisearch/tests/vector/mod.rs
# Conflicts:
# crates/index-scheduler/src/batch.rs
2024-12-06 02:03:02 +08:00
airycanon
95ed079761
attach index name in errors
...
# Conflicts:
# crates/index-scheduler/src/batch.rs
# Conflicts:
# crates/index-scheduler/src/batch.rs
# crates/meilisearch/src/search/mod.rs
2024-12-06 01:12:13 +08:00
meili-bors[bot]
4a082683df
Merge #5131
...
5131: Ignore documents whose selected fields didn't change r=dureuill a=dureuill
Attempts to improve the new indexer performance by ignoring documents whose selected fields didn't change:
- Add `Update::has_changed_for_fields` function
- Ignore documents whose searchable attributes didn't change for word docids and word pair proximity extraction
- Ignore documents whose faceted attributes didn't change for facet extraction
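The idea behind the list above can be illustrated with a minimal sketch. This is not the actual `Update::has_changed_for_fields` signature from milli: the `Document` alias, the free function, and its parameters are assumptions made for illustration only. It simply reports whether any of the selected fields differs between the stored and the incoming version of a document, so that an extractor could skip documents for which it returns `false`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a document: field name -> raw JSON value.
/// (Assumption: the real indexer uses its own document types.)
type Document = BTreeMap<String, String>;

/// Returns true if at least one of the `selected` fields differs between
/// the `current` and the `updated` version of a document.
/// A field that is present on one side only counts as changed.
fn has_changed_for_fields(current: &Document, updated: &Document, selected: &[&str]) -> bool {
    selected
        .iter()
        .any(|field| current.get(*field) != updated.get(*field))
}
```

With such a check, an extractor that only cares about searchable attributes would pass the searchable field names as `selected` and skip the document when nothing changed.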
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-12-05 16:04:16 +00:00
meili-bors[bot]
26be5e0733
Merge #5123
...
5123: Fix batch details r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5079
Fixes https://github.com/meilisearch/meilisearch/issues/5112
## What does this PR do?
- Mark the processing tasks as actually processing in the stats of the batch instead of enqueued
- Stop counting one extra task for all non-prioritized batches in the stats
- Add a test
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-12-05 15:21:55 +00:00
Louis Dureuil
bd5110a2fe
Fix clippy warnings
2024-12-05 16:13:07 +01:00
Louis Dureuil
fa8b9acdf6
Ignore documents that didn't change in facets
2024-12-05 16:12:52 +01:00
Louis Dureuil
2b74d1824b
Ignore documents that didn't change any field in word pair proximity
2024-12-05 15:56:22 +01:00
Louis Dureuil
c77b00d3ac
Don't extract word docids when no searchable changed
2024-12-05 15:51:58 +01:00
Louis Dureuil
c77073efcc
Update::has_changed_for_fields
2024-12-05 15:50:12 +01:00
meili-bors[bot]
1537323eb9
Merge #5119
...
5119: Settings opt out error msg r=Kerollmops a=ManyTheFish
# Pull Request
## Related issue
PRD: https://meilisearch.notion.site/API-usage-Settings-to-opt-out-indexing-features-fff4b06b651f8108ade3f858aeb16b14?pvs=4
## What does this PR do?
Add a new error code and message when the user tries a facet search on an index where the facet search is disabled:
```json
{
  "message": "The facet search is disabled for this index",
  "code": "facet_search_disabled",
  "type": "invalid_request",
  "link": "https://docs.meilisearch.com/errors#invalid_facet_search_disabled"
}
```
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-12-05 13:51:11 +00:00
ManyTheFish
a0a3b55700
Change error code
2024-12-05 14:48:29 +01:00
Tamo
214b51de87
try to fix the snapshot on demand flaky test
2024-12-05 14:45:54 +01:00
Tamo
95975944d7
fix the dumps missing the empty swap index tasks
2024-12-05 14:23:38 +01:00
meili-bors[bot]
9a9383643f
Merge #5125
...
5125: Change the default max memory usage to 5% of the total memory r=ManyTheFish a=Kerollmops
After thorough testing, we found that giving 5% of the total available memory to allocate resident memory (caches and channels) is the best approach.
The main reason is that the new indexer is highly memory-map oriented, with LMDB, and reads the database while performing the indexation. So, by allowing the maximum amount of memory available to LMDB and the OS, it will perform the key-value store reads and all other indexation operations faster by keeping more pages hot in the cache. In #5124, we also sorted the entries to merge to improve the read speed of LMDB.
This is common in database management systems: Reading stuff on the disk is much faster when done in lexicographic order (the default sorted order of key values). The entries have a great chance of already being in the OS memory cache, as they were loaded in a previous read, and reading stuff on the disk is very slow compared to reading memory.
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-05 10:11:25 +00:00
meili-bors[bot]
cac355bfa7
Merge #5124
...
5124: Optimize Prefixes and Merges r=ManyTheFish a=Kerollmops
In this PR, we plan to optimize LMDB reads by reading the entries in lexicographic order and making better use of the OS memory-mapping cache:
- Optimize the prefix generation for word position docids (@manythefish)
- Optimize the parallel merging of the caches to sort entries before merging the caches (@kerollmops)
## Benchmarks on 1cpu 2gb gpo3 (5k IOps)
Before on the tag meilisearch-v1.12.0-rc.3.
```
word_position_docids:merge_and_send_docids: 988s
compute_word_fst: 23.3s
word_pair_proximity_docids:merge_and_send_docids: 428s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 76.3s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 429s
```
After sorting the whole `HashMap`s in a `Vec` on this branch.
```
word_position_docids:merge_and_send_docids: 202s
compute_word_fst: 20.4s
word_pair_proximity_docids:merge_and_send_docids: 427s
compute_word_prefix_fid_docids:recompute_modified_prefixes: 65.5s
compute_word_prefix_position_docids:recompute_modified_prefixes:from_prefixes: 62.5s
```
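As a rough illustration of "sorting the whole `HashMap`s in a `Vec`", here is a minimal sketch. It is not the milli implementation: the function name, the `Vec<u8>` keys, and the `Vec<u32>` docid lists are assumptions chosen to keep the example self-contained. The point is only that entries coming out of unordered maps are sorted by key before being merged, so that the subsequent LMDB reads happen in lexicographic key order.

```rust
use std::collections::HashMap;

/// Hypothetical helper: flattens several unordered caches into a single Vec
/// sorted by key, merging the docids of identical keys.
fn sorted_merged_entries(
    caches: Vec<HashMap<Vec<u8>, Vec<u32>>>,
) -> Vec<(Vec<u8>, Vec<u32>)> {
    // Collect every (key, docids) pair from every cache.
    let mut entries: Vec<(Vec<u8>, Vec<u32>)> = caches
        .into_iter()
        .flat_map(|cache| cache.into_iter())
        .collect();

    // Comparing Vec<u8> keys is a byte-wise, lexicographic comparison.
    entries.sort_unstable_by(|a, b| a.0.cmp(&b.0));

    // Equal keys are now adjacent, so merging is a single linear pass.
    let mut merged: Vec<(Vec<u8>, Vec<u32>)> = Vec::with_capacity(entries.len());
    for (key, mut docids) in entries {
        let same_key = merged.last().map_or(false, |(last_key, _)| *last_key == key);
        if same_key {
            merged.last_mut().unwrap().1.append(&mut docids);
        } else {
            merged.push((key, docids));
        }
    }

    // Normalize each docid list: sorted and without duplicates.
    for (_, docids) in merged.iter_mut() {
        docids.sort_unstable();
        docids.dedup();
    }
    merged
}
```

A caller would then iterate over the returned `Vec` and read or write the matching database entries in that order.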
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-05 09:35:52 +00:00
Kerollmops
9020a50df8
Change the default max memory usage to 5% of the total memory
2024-12-05 10:14:46 +01:00
Kerollmops
52843123d4
Clean up and remove the non-sorted merge_caches function
2024-12-05 10:03:05 +01:00
meili-bors[bot]
6298db5bea
Merge #5113
...
5113: Fix the Minimum BBQueue channel threshold r=Kerollmops a=Kerollmops
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-12-05 09:01:02 +00:00
meili-bors[bot]
a003a0934a
Merge #5121
...
5121: Make the tasks pulling timeout configurable r=dureuill a=Kerollmops
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-04 17:04:14 +00:00
Louis Dureuil
3a11e39c01
Force max_memory to a min of 100MiB
2024-12-04 17:53:30 +01:00
Louis Dureuil
5f896b1050
Fix geo when spilling
2024-12-04 17:51:12 +01:00
Kerollmops
d0c4e6da6b
Make clippy happy
2024-12-04 17:39:10 +01:00
Kerollmops
2da5584bb5
Make the tasks pulling timeout configurable
2024-12-04 17:39:07 +01:00
Kerollmops
2e32d0474c
Lexicographically sort all the map to merge
2024-12-04 17:05:11 +01:00
Kerollmops
cb99ac6f7e
Consume vec instead of draining
2024-12-04 17:00:22 +01:00
Kerollmops
be411435f5
Use the merge_caches_alt function in the docids merging
2024-12-04 16:37:29 +01:00
Kerollmops
29ef164530
Introduce a new semi ordered merge function
2024-12-04 16:33:35 +01:00
ManyTheFish
739c52a3cd
Replace HashSets by BTreeSets for the prefixes
2024-12-04 16:16:48 +01:00
Tamo
7a2af06b1e
update the impacted snapshots
2024-12-04 15:52:24 +01:00
Tamo
cb0c3a5aad
stop adding one enqueued task to all unprioritized batches
2024-12-04 15:48:28 +01:00
Tamo
cbcf6c9ba3
mark the processing tasks as processing in a batch
2024-12-04 14:48:48 +01:00
Tamo
bf742d81cf
add a test
2024-12-04 14:47:02 +01:00
ManyTheFish
fc1df5793c
fix tests
2024-12-04 14:35:20 +01:00
meili-bors[bot]
3ded069042
Merge #5122
...
5122: Yield the BBQueue writing loop r=ManyTheFish a=Kerollmops
We prefer yielding to let the writing thread do its job instead of spin looping.
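The difference between yielding and spin looping can be sketched with the standard library alone. This is not the BBQueue writing loop itself: the `wait_until_ready` function and the `AtomicBool` flag are assumptions standing in for "wait until the other thread has made progress".

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Hypothetical waiting loop: blocks until `ready` becomes true and returns
/// how many times it had to yield.
fn wait_until_ready(ready: &AtomicBool) -> u64 {
    let mut yields = 0;
    while !ready.load(Ordering::Acquire) {
        // A spin loop would call `std::hint::spin_loop()` here and keep the
        // core busy. Yielding hands the time slice back to the scheduler so
        // that the other thread gets a chance to run.
        thread::yield_now();
        yields += 1;
    }
    yields
}
```

On a machine with few cores, the yielding variant avoids starving the very thread the loop is waiting for.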
Co-authored-by: Kerollmops <clement@meilisearch.com>
2024-12-04 13:33:51 +00:00
Kerollmops
261d2ceb06
Yield the BBQueue writer instead of spin looping
2024-12-04 14:16:40 +01:00
meili-bors[bot]
5b8cd68abe
Merge #5110
...
5110: Increase margin on deletion of task r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5077
## What does this PR do?
- Increase the margin we keep to enqueue task deletion
The issue was that we did not have enough space in the reserved memory to write both the batch and the deletion task we had just enqueued.
We could have fixed it only for this test, as it's not an issue in production where we have 10GiB of margin, but I thought it wasn't a bad idea to increase our margin a bit anyway since we're effectively writing more to LMDB.
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-12-04 12:54:48 +00:00
ManyTheFish
953a82ca04
Add new error message
2024-12-04 11:15:29 +01:00
Kerollmops
96831ed9bb
Send the WakeUp message if necessary in the reserve function
2024-12-04 11:03:01 +01:00
Kerollmops
0459b1a242
Change the reserve and grant function to accept a closure
2024-12-04 10:32:25 +01:00
Kerollmops
8ecb726683
Fix the minimum BBQueue channel threshold
2024-12-03 15:49:11 +01:00
Clément Renault
0ad2f57a92
Update bbqueue repo to point to the meilisearch org
2024-12-03 12:00:04 +01:00
Tamo
71d53f413f
increase the margin allowed to delete task
2024-12-03 11:07:03 +01:00
meili-bors[bot]
054622bd16
Merge #5094
...
5094: Implement a bbqueue channel between the extractors and the writer r=dureuill a=Kerollmops
This PR switches the communication between the extractors and the writer from a bounded crossbeam channel carrying only allocated entries to a [BBQueue](https://github.com/jamesmunns/bbqueue)-based system: a Single Producer Single Consumer channel built on circular (ring) buffers.
- [x] Implement the BBQueue channel system...
- [x] with a crossbeam channel to wake up the receiver.
- [x] Manage the BBQueue allocated memory dynamically.
- [x] Support content that doesn't fit in the bbqueues.
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-12-03 08:00:55 +00:00
Louis Dureuil
e905a72d73
remove mimalloc on Windows
2024-12-02 18:13:56 +01:00
meili-bors[bot]
2e879c1df8
Merge #5109
...
5109: Fix autobatch r=dureuill a=dureuill
Fixes most SDK tests and flaky failures
Changes:
- Make sure that the settings are not autobatched with document operations, as the new indexer no longer supports this operating mode
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-12-02 16:30:51 +00:00
Louis Dureuil
d040aff101
Stop allocating 1GiB for documents
2024-12-02 16:30:14 +01:00
Tamo
beeb31ce41
Update crates/index-scheduler/src/lib.rs
2024-12-02 15:32:16 +01:00
Louis Dureuil
057143214d
Fix warnings
2024-12-02 14:42:31 +01:00
Louis Dureuil
6a1d26a60c
Update autobatching tests
2024-12-02 14:15:15 +01:00
Louis Dureuil
d78f4666a0
Fix autobatching of documents and settings
2024-12-02 12:25:01 +01:00
Tamo
a439fa3e1a
While spamming the batches route we could see a processing batch go missing and then become finished; this commit ensures the batches go from processing to finished directly
2024-12-02 12:02:16 +01:00
Clément Renault
767259be7e
Prefer returning an abort indexation rather than throwing a panic
2024-12-02 11:53:42 +01:00
Clément Renault
e9f34fb4b1
Make the frame consumer pulling fair
2024-12-02 11:49:01 +01:00
Clément Renault
d5c07ef7b3
Manage key length conversion error correctly
2024-12-02 11:03:00 +01:00
Clément Renault
5e218f3f4d
Remove a sync_all (mark my words)
2024-12-02 11:03:00 +01:00
Clément Renault
bcab61ab1d
Do spurious wake ups on the receiver side
2024-12-02 11:03:00 +01:00
Clément Renault
263c5a348e
Move the spin looping for BBQueue frames into a dedicated function
2024-12-02 10:33:49 +01:00
Clément Renault
be7d2fbe63
Move the EntryHeader up in the file and document the safety related to the size
2024-12-02 10:19:11 +01:00
Clément Renault
f7f9a131e4
Improve copying bytes into aligned memory area
2024-12-02 10:15:58 +01:00
Clément Renault
5df5eb2db2
Clarify a method name
2024-12-02 10:10:48 +01:00
Clément Renault
30eb0e5b5b
Rename recv and read methods to recv_action and recv_frame
2024-12-02 10:08:01 +01:00
Clément Renault
5b860cb989
Fix english in the doc
2024-12-02 10:06:35 +01:00
Clément Renault
76d0623b11
Reduce the number of unwraps
2024-12-02 10:05:06 +01:00
Clément Renault
db4eaf4d2d
Rename serialize_into into serialize_into_writer
2024-12-02 10:03:27 +01:00
Clément Renault
13f21206a6
Call the serialize_into_writer method from the serialize_into one
2024-12-02 10:03:01 +01:00
Clément Renault
14ee7aa84c
Make sure the BBQueue is at least 50 MiB
2024-11-28 18:02:48 +01:00
Clément Renault
8a35cd1743
Adjust the BBQueue buffers to use 2% instead of 10%
2024-11-28 16:00:15 +01:00
meili-bors[bot]
8d33af1dff
Merge #5102
...
5102: Update mini-dashboard to v0.2.16 version r=curquiza a=curquiza
Fixes https://github.com/meilisearch/meilisearch/issues/5093
Fixes this bug: https://github.com/meilisearch/mini-dashboard/issues/563
Co-authored-by: curquiza <clementine@meilisearch.com>
2024-11-28 14:57:27 +00:00
Clément Renault
3c7ac093d3
Take the BBQueue capacity into account in the max memory
2024-11-28 15:43:14 +01:00
Clément Renault
b57dd5c58e
Remove the Vector variant and use the Vectors
2024-11-28 15:20:43 +01:00
ManyTheFish
90b428a8c3
Apply change requests
2024-11-28 15:16:13 +01:00
Clément Renault
096a28656e
Fix a bug around deleting all the vectors of a doc
2024-11-28 15:15:06 +01:00
curquiza
3dc87f5baa
Update mini-dashboard to v0.2.16 version
2024-11-28 14:33:05 +01:00
Clément Renault
cc4bd54669
Correctly construct the Embeddings struct
2024-11-28 13:53:25 +01:00
ManyTheFish
5383f41bba
Polish test_setting_routes!
2024-11-28 12:04:21 +01:00
Clément Renault
58eab9a018
Send large payload through crossbeam
2024-11-28 12:01:06 +01:00
ManyTheFish
9f36ffcbdb
Polish make_setting_routes!
2024-11-28 11:44:09 +01:00
ManyTheFish
68c4717e21
Change the settings tests and macros to avoid oversights
2024-11-28 11:34:35 +01:00
Clément Renault
5c488e20cc
Send the geo rtree through crossbeam channel
2024-11-27 18:03:45 +01:00
Clément Renault
da650f834e
Plug the NoPanicThreadPool in the tests and benchmarks
2024-11-27 17:04:49 +01:00
Clément Renault
e83534a430
Fix the indexer::index to correctly use the rayon::ThreadPool
2024-11-27 16:27:43 +01:00
Clément Renault
98d4a2909e
Fix the way we spawn the rayon threadpool
2024-11-27 16:05:44 +01:00
Clément Renault
a514ce472a
Make clippy happy
2024-11-27 14:59:04 +01:00
Clément Renault
cc63802115
Modify and return the IndexEmbeddings to write them later
2024-11-27 14:58:03 +01:00
Clément Renault
acec45ad7c
Send a WakeUp when writing data in the BBQueue buffers
2024-11-27 14:33:23 +01:00
Clément Renault
08d6413365
Fix result types
2024-11-27 14:32:42 +01:00
Clément Renault
70802eb7c7
Fix most issues with the lifetimes
2024-11-27 14:32:42 +01:00
Clément Renault
6ac5b3b136
Finish most of the channels types
2024-11-27 14:32:26 +01:00
Clément Renault
e1e76f39d0
Clean up dependencies
2024-11-27 14:30:34 +01:00
Clément Renault
2094ce8a9a
Move the arroy building after the writing loop
2024-11-27 14:30:33 +01:00
Clément Renault
8442db8101
Implement mostly all senders
2024-11-27 14:16:35 +01:00
Clément Renault
79671c9faa
Implement a first version of the bbqueue channels
2024-11-27 14:15:00 +01:00
meili-bors[bot]
a2f64f6552
Merge #5095
...
5095: Span to measure the part of db writes that is after the merge/extraction r=curquiza a=dureuill
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-27 11:10:00 +00:00
ManyTheFish
18a9af353c
Update Charabia version to v0.9.2
2024-11-27 11:12:08 +01:00
meili-bors[bot]
aae0dc715d
Merge #5063
...
5063: Fix pagination when embedding fails r=Kerollmops a=dureuill
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/5045
## What does this PR do?
- Use `return_keyword_results` function when embedding fails
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-27 09:13:28 +00:00
meili-bors[bot]
d0b2c0a523
Merge #5091
...
5091: Settings opt out r=Kerollmops a=ManyTheFish
# Pull Request
Related PRD: https://www.notion.so/meilisearch/API-usage-Settings-to-opt-out-indexing-features-fff4b06b651f8108ade3f858aeb16b14?pvs=4
## Related issue
Fixes #4979
- [x] Add setting opt-out
- [x] Add analytics
- [x] Add tests
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-11-26 15:50:28 +00:00
ManyTheFish
2e896f30a5
Fix PR comments
2024-11-26 16:06:33 +01:00
Louis Dureuil
8f57b4fdf4
Span to measure the part of db writes that is after the merge/extraction
2024-11-26 14:46:36 +01:00
Many the fish
f014e78684
Update crates/milli/src/index.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-11-26 14:46:01 +01:00
Many the fish
9008ecda3d
Update crates/meilisearch-types/src/settings.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-11-26 14:44:24 +01:00
ManyTheFish
d7bcfb2d19
fix clippy
2024-11-26 14:04:16 +01:00
meili-bors[bot]
fb66fec398
Merge #5092
...
5092: Precise spans for new indexer r=dureuill a=dureuill
- Separate extract and merge spans
- Add span around commit
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-26 10:59:40 +00:00
Louis Dureuil
fa15be5bc4
Add span around commit
2024-11-26 09:45:48 +01:00
Louis Dureuil
aa460819a7
Add more precise spans
2024-11-26 09:45:36 +01:00
meili-bors[bot]
e241f91285
Merge #5062
...
5062: Fix bugs for v1.12 r=Kerollmops a=ManyTheFish
# Pull Request
## Related issue
Fixes #4984
Fixes https://github.com/meilisearch/meilisearch/issues/4974
Fixes [SDK test](https://github.com/meilisearch/meilisearch/actions/runs/11886701996/job/33118278794)
## What does this PR do?
- add 3 tests
- fix bugs
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-11-26 08:10:50 +00:00
ManyTheFish
d66dc363ed
Test and implement settings opt-out
2024-11-25 18:23:22 +01:00
meili-bors[bot]
5560452ef9
Merge #5089
...
5089: Improve error handling when writing into LMDB r=dureuill a=Kerollmops
This PR exposes two new internal error variants, `StoreDelete` and `StorePut`, so that the error messages are clearer when we fail to write into LMDB.
Related to #5078
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-11-25 16:19:41 +00:00
Clément Renault
b4fb2dabd4
Use the grenad rayon feature
2024-11-25 16:31:21 +01:00
Clément Renault
5606679c53
Use the obkv and grenad crates.io versions
2024-11-25 16:24:59 +01:00
Clément Renault
a3103f347e
Fix the facet f64 database name
2024-11-25 16:05:31 +01:00
Clément Renault
25aac45fc7
Expose better error messages
2024-11-25 15:54:43 +01:00
meili-bors[bot]
98a785b0d7
Merge #5080
...
5080: Fix getting a single batch through the GET route r=Kerollmops a=dureuill
# Pull Request
## Related issue
Fixes a bug where getting a single batch does not work
Related to #5070
fix by `@Kerollmops`
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 17:08:46 +00:00
Louis Dureuil
ba7500998e
Fix getting a single batch through the GET route
2024-11-21 17:59:31 +01:00
meili-bors[bot]
19e6f675b3
Merge #4900
...
4900: Indexer edition 2024 r=Kerollmops a=dureuill
This PR implements the indexer edition 2024, largely inspired by [the ideas from this blog post](https://blog.kerollmops.com/meilisearch-is-too-slow).
Fixes https://github.com/meilisearch/meilisearch/issues/4985
## Features
- Stream-first approach to reading documents.
- Minimum disk write operations.
- RAM usage-first approach: common bitmaps are modified in memory rather than on disk.
- Reduced LMDB fragmentation by writing entries only once...
- ...computing the final version of the entries in parallel...
- ...and storing them in write-optimized data structures before sending them to the BTree (LMDB).
- Indexing in multiple transactions to improve large dataset support (dumps).
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 16:19:10 +00:00
Louis Dureuil
323ecbb885
Add span on document operation
2024-11-21 17:01:10 +01:00
Louis Dureuil
ffb60cb885
Add comment explaining why we fixed the version of insta
2024-11-21 16:56:56 +01:00
Louis Dureuil
dcc3caef0d
Remove TopLevelMap
2024-11-21 16:56:46 +01:00
Louis Dureuil
221e547e86
Slight changes
2024-11-21 16:47:44 +01:00
Clément Renault
61d0615253
Document the geo point extractor
2024-11-21 16:47:08 +01:00
Clément Renault
5727e00374
Remove useless geo skipped
2024-11-21 16:47:08 +01:00
Clément Renault
9b60843831
Remove commented lines
2024-11-21 16:47:07 +01:00
ManyTheFish
36962b943b
First batch of PR comment
2024-11-21 16:38:11 +01:00
Louis Dureuil
32bcacefd5
Changes Document::len to Document::top_level_fields_count
2024-11-21 15:01:07 +01:00
Louis Dureuil
4ed195426c
remove unused stuff in global.rs
2024-11-21 15:01:07 +01:00
Many the fish
ff38f29981
Update crates/index-scheduler/src/batch.rs
...
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-11-21 14:18:39 +01:00
ManyTheFish
94b260fd25
Remove orphan span
2024-11-21 12:12:07 +01:00
Clément Renault
ab2c83f868
Use the disk less when computing prefixes
2024-11-21 10:45:37 +01:00