Louis Dureuil
9b7764575b
openai: don't pass apiKey when it is empty
2024-07-31 15:03:44 +02:00
Louis Dureuil
0e68718027
Add detailed spans
2024-07-31 13:05:47 +02:00
Louis Dureuil
7c3fc8c655
Split settings and document facet string extractions
2024-07-31 10:57:46 +02:00
Louis Dureuil
8acd3f50bb
skip normalization when the locales and values are the same
2024-07-31 09:53:00 +02:00
Tamo
d262b1df32
craft an API over the Shared Server and Shared index to avoid hard to debug mistakes
2024-07-30 14:24:57 +02:00
meili-bors[bot]
c2c1ba39ee
Merge #4826
...
4826: Update Charabia v0.9.0 r=dureuill a=ManyTheFish
# Pull Request
## Related Changelog
https://github.com/meilisearch/charabia/releases/tag/v0.9.0
## Notable Change for Meilisearch
Adds all math symbols from https://www.compart.com/en/unicode/category/Sm to the default separator list.
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-07-25 14:08:38 +00:00
ManyTheFish
35567b2137
Update Charabia v0.9.0
2024-07-25 16:02:14 +02:00
Louis Dureuil
d4ea7cc2a9
fix clippy
2024-07-25 12:10:32 +02:00
Louis Dureuil
2413592bbf
Display docid when there are documents without manual embeddings for a manual embedder
2024-07-25 12:10:32 +02:00
Louis Dureuil
553440632e
Introduce Setting::some_or_not_set
2024-07-25 12:01:52 +02:00
Louis Dureuil
7a347966da
Allow explicit dimensions for ollama
2024-07-25 12:01:51 +02:00
Louis Dureuil
4654d51e05
Add custom headers for REST embedder
2024-07-25 12:01:51 +02:00
ManyTheFish
a918561ac1
Fix PR comments
2024-07-25 10:52:56 +02:00
ManyTheFish
70d71581ee
fix clippy
2024-07-25 10:52:56 +02:00
ManyTheFish
04fa44e7eb
Implement localized attributes settings
2024-07-25 10:51:27 +02:00
ManyTheFish
90c0a6db7d
Implement localized search
2024-07-25 10:51:27 +02:00
ManyTheFish
cc02920f2b
Update charabia
2024-07-25 10:51:27 +02:00
Tamo
988552e178
add tests on the rest embedder
2024-07-24 14:34:17 +02:00
Louis Dureuil
0d8199f3b7
Change parameters in milli settings
2024-07-24 14:34:17 +02:00
Louis Dureuil
4b74803dae
Change parameters in vector settings
2024-07-24 14:34:17 +02:00
Louis Dureuil
d731fa661b
ollama and openai use new EmbedderOptions
2024-07-24 14:34:17 +02:00
Louis Dureuil
a1beddd5d9
rest embedder: use json_template
2024-07-24 14:34:17 +02:00
Louis Dureuil
4109182ca4
Add json_template module
2024-07-24 14:34:12 +02:00
Louis Dureuil
1a297c048e
Error changes
2024-07-24 14:34:12 +02:00
Louis Dureuil
303e601b87
HuggingFace: Clearer error message when a model is not supported
2024-07-23 15:13:22 +02:00
meili-bors[bot]
ea73615abf
Merge #4804
...
4804: Implements the experimental contains filter operator r=irevoire a=irevoire
# Pull Request
Related PRD: (private link) https://www.notion.so/meilisearch/Contains-Like-Filter-Operator-0d8ad53c6761466f913432eb1d843f1e
Public usage page: https://meilisearch.notion.site/Contains-filter-operator-usage-3e7421b0aacf45f48ab09abe259a1de6
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/3613
## What does this PR do?
- Extract the contains operator from this PR: https://github.com/meilisearch/meilisearch/pull/3751
- Gate it behind a feature flag
- Add tests
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-07-17 15:47:11 +00:00
Tamo
02c61eabfa
fix the range reported when the experimental feature has not been set
2024-07-17 16:54:33 +02:00
Tamo
2af9481804
Implements the experimental contains filter operator
2024-07-17 11:13:37 +02:00
Louis Dureuil
24240934f9
Improve errors when indexing documents with a user provided embedder
2024-07-16 13:39:01 +02:00
Louis Dureuil
f4c94ac57f
manual embedders: limit max size of errors to 250
2024-07-16 13:39:01 +02:00
Louis Dureuil
4087a88dbe
rest|ollama|openai: increase tries to 10 + randomize retry duration
2024-07-16 13:39:00 +02:00
Louis Dureuil
5adacf2f45
OpenAI: embed only the first MAX_TOKENS tokens
2024-07-16 13:39:00 +02:00
Louis Dureuil
65d0c32aa7
Allow overriding OpenAI's url
2024-07-16 13:39:00 +02:00
Louis Dureuil
82647bcded
When retrieveVectors is true, retrieve _vectors.embedder even if there are no vector for that embedder
2024-07-16 13:39:00 +02:00
Louis Dureuil
e83da00446
Milli changes to match to allow for more flexible lifetimes
2024-07-11 16:29:35 +02:00
Louis Dureuil
7fb3e378ff
Do not fail sort comparisons when the field name or target point are different
2024-07-11 16:28:14 +02:00
meili-bors[bot]
29b44e5541
Merge #4626
...
4626: Edit Documents with Rhai r=ManyTheFish a=Kerollmops
This PR introduces a first version of [the _Update Documents with Function_ (internal)](https://www.notion.so/meilisearch/Update-Documents-by-Function-45f87b13e61c4435b73943768a490808). It uses [the Rhai programming language](https://rhai.rs/) to let users express the modifications they want to apply.
You can read more about how to use these functions on [the Usage PRD Page](https://meilisearch.notion.site/Edit-Documents-with-Rhai-0cff8fea7655436592e7c8a6de932062?pvs=25). The [prototype is available](https://github.com/meilisearch/meilisearch/actions/runs/9038384483) through Docker by using the following command:
```
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-edit-documents-with-rhai-3
```
## TODO
- [x] Support the `DocumentEdition` task in dumps.
- [x] Remove the unwraps and panics.
- [x] Improve error codes for the `function` parameter.
- [x] [Update Rhai to v1.19.0](https://github.com/rhaiscript/rhai/releases/tag/v1.19.0)
- [x] Make it an experimental feature (only restrict the HTTP calls).
- [x] It must be possible not to send a context.
- [x] Rebase on main.
- [x] Check that the script cannot do any io.
- [x] ~Introduce a `Documents.edit` action or~ require the `Documents.all` action.
- [x] Change the `editionCode` to the clearer `function` field name in the tasks.
- [x] Support a user provided context and maybe more (but keep function execution isolated for reproducibility).
- [x] Support deleting documents when the `doc` is `()` (nil, null).
- [x] Support canceling document edition.
- [x] Multithread document edition by using rayon (and [rayon-par-bridge](https://docs.rs/rayon-par-bridge/latest/rayon_par_bridge/)).
- [x] Limit the number of instructions per function execution.
- [ ] ~Expose the limit of instructions in the settings.~ Not sure, in fact.
- [x] Ignore unmodified documents in the tasks count.
- [x] Make the `filter` field optional (not forced to be `null`).
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-11 09:02:55 +00:00
Clément Renault
6e80364c50
Apply review comments
2024-07-11 11:00:27 +02:00
Clément Renault
3bac22fd87
We do not do intersections with the universe when it is related to cache
2024-07-10 16:49:36 +02:00
Clément Renault
ce61cb7fe6
Simplify and speedup an intersection pass
2024-07-10 16:49:36 +02:00
Clément Renault
1693d1a311
Simplify the check to decide to stop a loop
2024-07-10 16:49:36 +02:00
Clément Renault
febea735ca
Remove the unused universe parameter from resolve_negative_phrases
2024-07-10 16:49:36 +02:00
Clément Renault
93ba051094
Remove the invalid get_phrases_docids universe parameter
2024-07-10 16:49:35 +02:00
Clément Renault
cd7a20fa32
Make it work by avoiding storing invalid stuff in the cache
2024-07-10 16:49:35 +02:00
Clément Renault
41f51adbec
Do less useless intersections
2024-07-10 16:49:35 +02:00
Clément Renault
0ca1a4e805
Always do the intersections with the universe
2024-07-10 16:49:34 +02:00
Clément Renault
50a7393c55
Modify the compute_query_term_subset_docids function to accept the universe
2024-07-10 16:49:34 +02:00
Clément Renault
837274f853
Restrict even more the Rhai engine
2024-07-10 16:30:18 +02:00
Clément Renault
aace587dd1
Create errors for the internal processing ones
2024-07-10 16:29:18 +02:00
Clément Renault
f35d6710f3
Update rhai to v1.19.0
2024-07-10 16:29:17 +02:00
Clément Renault
81ec0abad1
Use the new rayon-par-bridge library
2024-07-10 16:29:04 +02:00
Clément Renault
b67d385cf0
Parallelize the edition functions
2024-07-10 16:28:54 +02:00
Clément Renault
dfecb25814
Disable the time package
2024-07-10 16:28:37 +02:00
Clément Renault
2eae2015d7
Support aborting documents edition by function
2024-07-10 16:28:15 +02:00
Clément Renault
33fa17bf12
Support deleting documents with functions
2024-07-10 16:28:15 +02:00
Clément Renault
400e6b93ce
Support user-provided context for documents edition
2024-07-10 16:28:15 +02:00
Clément Renault
f4add93043
Limit the number of script operations
2024-07-10 16:28:14 +02:00
Clément Renault
2fae96ac14
Show the actual number of actually edited documents
2024-07-10 16:28:14 +02:00
Clément Renault
45af18ae9c
Check the Rhai syntax before accepting the script
2024-07-10 16:28:13 +02:00
Clément Renault
2d97164d9f
It works perfectly with some Rhai
2024-07-10 16:28:13 +02:00
Clément Renault
efc156a4a4
Executing Lua works correctly
2024-07-10 16:27:36 +02:00
meili-bors[bot]
2099b4f0dd
Merge #4786
...
4786: Update dependencies r=Kerollmops a=irevoire
# Pull Request
## Related issue
Fixes #4753
## What does this PR do?
- Update all dependencies except rustls
- [x] Release charabia
- [x] Update charabia
- [x] Double check that the docker build works after updating charabia
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-07-10 13:23:54 +00:00
Clément Renault
9d6885793e
Upgrade dependencies
2024-07-10 13:46:24 +02:00
Clément Renault
5f4530ce57
Remove more unused dependencies
2024-07-10 13:36:34 +02:00
Tamo
4d5005b01a
make clippy happy
2024-07-10 10:06:59 +02:00
Tamo
952e742321
update charabia
2024-07-09 23:41:29 +02:00
hanbings
0a40a98bb6
Make milli use edition 2021 (#4770)
...
* Make milli use edition 2021
* Add lifetime annotations to milli.
* Run cargo fmt
2024-07-09 17:25:39 +02:00
Tamo
cd46ebd6b5
remove insta deprecating
2024-07-08 18:38:05 +02:00
Tamo
6afa578688
update most incompatible dependencies
2024-07-08 18:31:15 +02:00
Tamo
300bdfc2a7
update most dependencies
2024-07-08 18:09:12 +02:00
Louis Dureuil
128e6c7502
Search: spans with a finer granularity
2024-07-02 16:13:53 +02:00
ManyTheFish
015d90a962
merge main
2024-07-01 11:50:36 +02:00
Louis Dureuil
e53de15b8e
Fix behavior of limit and offset for hybrid search when keyword results are returned early
...
The test is fixed
2024-06-27 14:25:33 +02:00
Tamo
ce08dc509b
add more tests and improve the location of the error
2024-06-27 11:51:45 +02:00
Tamo
1daaed163a
Make _vectors.:embedding.regenerate mandatory + tests + error messages
2024-06-27 11:04:58 +02:00
meili-bors[bot]
7e3c306c54
Merge #4725
...
4725: Store primary key as String when Number exceeds i64 range r=irevoire a=JWSong
# Pull Request
## Related issue
Fixes #4696
## What does this PR do?
- When a Number value exceeding the range of i64 is received as a primary key, it will be stored as a String.
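As a rough illustration of the rule above (not the code from this PR), the fallback could be sketched as follows. `StoredKey` and `store_numeric_key` are hypothetical names invented for this example, and the input is assumed to be the integer literal as it appears in the document.

```rust
/// Hypothetical representation of a primary key after normalization.
#[derive(Debug, PartialEq)]
enum StoredKey {
    /// The literal fits in an i64 and is kept as a number.
    Integer(i64),
    /// The literal is a valid integer but exceeds the i64 range,
    /// so it is kept verbatim as a string.
    Text(String),
}

/// Normalizes an integer literal used as a primary key.
/// Returns `None` when the literal is not a plain (optionally negative) integer.
fn store_numeric_key(literal: &str) -> Option<StoredKey> {
    // Accept an optional leading minus sign followed by ASCII digits only.
    let digits = literal.strip_prefix('-').unwrap_or(literal);
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    match literal.parse::<i64>() {
        Ok(n) => Some(StoredKey::Integer(n)),
        // Parsing only fails here on overflow, since the shape was checked above.
        Err(_) => Some(StoredKey::Text(literal.to_string())),
    }
}
```

With this sketch, `store_numeric_key("42")` keeps the number, while `store_numeric_key("9223372036854775808")` (one past `i64::MAX`) falls back to the string form.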
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: JWSong <thdwjddn123@gmail.com>
2024-06-26 07:06:04 +00:00
JWSong
dcdc83946f
accept large number as string
2024-06-25 21:41:47 +09:00
meili-bors[bot]
3c4c46377b
Merge #4665
...
4665: Add missing Korean support r=ManyTheFish a=junhochoi
Some configurations are missing the `korean` feature; this PR adds them along with a test case in `milli/src/search/mod.rs`.
# Pull Request
## Related issue
#3443 #3882
## What does this PR do?
- Improvement on enabling Korean support
Inspired by the work in #3882, I tried to enable the Korean features but found some missing configurations.
This PR adds those missing configs (mostly in Cargo.toml) and one test case.
## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Junho Choi <jh.choi@catenoid.net>
2024-06-25 11:51:21 +00:00
Louis Dureuil
d75e0098c7
Fixes for Rust v1.79
2024-06-25 11:16:06 +02:00
Junho Choi
2e0ff56f3f
Add missing Korean support
...
Some configurations are missing the `korean` feature; add them along with
a test case in `milli/src/search/mod.rs`.
2024-06-25 12:45:21 +09:00
Tamo
1693332cab
Update arroy and always build the tree that need to be built
2024-06-24 10:14:03 +02:00
meili-bors[bot]
ddd564665b
Merge #4713
...
4713: Speed up facet distribution r=ManyTheFish a=Kerollmops
This PR is akin to #4682, but this time, the same logic is applied to the facets. Bitmaps are not decoded, and we do an intersection on the bytes with the search candidates instead of materializing the RoaringBitmap to destroy it just after the operation.
A prospect raised some slow requests when performing facet searches, and I found out that the disk optimization intersection wasn't performed on the facets.
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-06-24 05:23:46 +00:00
Clément Renault
9736e16a88
Make clippy happy
2024-06-20 13:02:44 +02:00
Clément Renault
6fa4da8ae7
Improve facet distribution speed in count mode
2024-06-20 12:58:51 +02:00
Clément Renault
19d7cdc20d
Improve facet distribution speed in lexico mode
2024-06-20 12:57:08 +02:00
Louis Dureuil
a04041c8f2
Only spawn the pool once
2024-06-19 16:25:33 +02:00
meili-bors[bot]
e580d6b98f
Merge #4693
...
4693: Introduce distinct attributes at search time r=irevoire a=Kerollmops
This PR fixes #4611.
### To Do
- [x] Remove the `distinguishableAttributes` settings (not even a commit about that).
- [x] Use the `filterableAttributes` to be able to use the `distinct` parameter at search.
- [x] Work on the errors and make tests.
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-06-18 07:45:03 +00:00
Tamo
43875e6758
fix bug around nested fields
2024-06-17 15:59:30 +02:00
meili-bors[bot]
e9bf4c43a4
Merge #4649
...
4649: Don't store the vectors in the documents database r=dureuill a=irevoire
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4607
## What does this PR do?
- Ensure that anything falling under `_vectors` is NOT searchable, filterable or sortable
- [x] per embedder, add a roaring bitmap of documents that provide "userProvided" embeddings
- [x] in the indexing process in extract_vector_points, set the bit corresponding to the document depending on the "userProvided" subfield in the _vectors field.
- [x] in the document DB in typed chunks, when writing the _vectors field, remove all keys corresponding to an embedder
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-17 12:32:03 +00:00
Louis Dureuil
0a8f50695e
Fixes for Rust v1.79
2024-06-13 17:47:44 +02:00
Louis Dureuil
e35ef31738
Small changes following review
2024-06-13 14:20:48 +02:00
Louis Dureuil
3bc8f81abc
user_provided => regenerate
2024-06-12 18:12:20 +02:00
Louis Dureuil
a89eea233b
Fix vectors injection
2024-06-12 17:10:19 +02:00
Louis Dureuil
f5cf01e7d1
Rework extraction to use EmbedderAction
2024-06-12 14:50:55 +02:00
Louis Dureuil
d1dd7e5d09
In transform for removed embedders, write back their user provided vectors in documents, and clear the writers
2024-06-12 14:50:55 +02:00
Louis Dureuil
d18c1f77d7
Update embedder configs with a finer granularity
...
- no longer clear vector DB between any two embedder changes
2024-06-12 14:50:55 +02:00
Louis Dureuil
d0b05ae691
Add EmbedderAction to settings
2024-06-12 14:50:54 +02:00
Louis Dureuil
e9bf4eb100
Reformulate ParsedVectorsDiff in terms of VectorState
2024-06-12 14:11:44 +02:00
Louis Dureuil
b368105272
Add EmbedderConfigs::into_inner
2024-06-12 14:11:44 +02:00
meili-bors[bot]
e0eff08095
Merge #4685
...
4685: Fix ci tests r=dureuill a=ManyTheFish
# Pull Request
Make all of the following CI runs succeed:
https://github.com/meilisearch/meilisearch/actions/runs/9477183091
## Related issue
Fixes #4629
## What does this PR do?
- Change the test behavior for `swedish-recomposition` feature flag
- Remove the `-v` parameter from grep
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Many the fish <many@meilisearch.com>
2024-06-12 07:58:33 +00:00
Clément Renault
39f60abd7d
Add and modify distinct tests
2024-06-11 17:53:53 -04:00
Clément Renault
1991bd03da
Distinct at search erases the distinct in the settings
2024-06-11 17:02:39 -04:00
Clément Renault
ee39309aae
Improve errors and introduce a new InvalidSearchDistinct error code
2024-06-11 16:03:39 -04:00
Clément Renault
0d31be1494
Make the distinct work at search
2024-06-11 11:39:35 -04:00
Louis Dureuil
7cef2299cf
Fix behavior when removing a document
2024-06-11 09:45:08 +02:00
ManyTheFish
57d066595b
fix Tests almost all features
2024-06-06 17:24:50 +02:00
Clément Renault
75b2e02cd2
Log more stuff around filtering
2024-06-06 11:00:07 -04:00
Clément Renault
52d0d35b39
Revert "Reduce the universe while exploring the facet tree" because it's slower this way
...
This reverts commit 14026115f21409535772ede0ee4273f37848dd61.
2024-06-06 09:17:51 -04:00
Clément Renault
5432776132
Reduce the universe while exploring the facet tree
2024-06-06 09:17:51 -04:00
Clément Renault
66470b27e6
Use the MultiOps trait for IN operations
2024-06-06 09:17:51 -04:00
Clément Renault
0a9bd398c7
Improve the NOT operator to use the universe when possible
2024-06-06 09:17:51 -04:00
Clément Renault
7967e93c16
Skip evaluating when a universe is empty, nothing can be found
2024-06-06 09:17:51 -04:00
Clément Renault
a6f3a01c6a
Expose the universe to do efficient intersections on deserialization
2024-06-06 09:17:51 -04:00
Clément Renault
4ca4a3f954
Make the CboRoaringBitmapCodec support intersection on deserialization
2024-06-06 09:17:51 -04:00
Clément Renault
e4a69c5ac3
Introduce the FacetGroupLazyValue type
2024-06-06 09:17:50 -04:00
Clément Renault
531e3d7d6a
MultiOps trait for OR operations
2024-06-06 09:17:50 -04:00
Tamo
2cdcb703d9
fix the deletion of vectors and add a test
2024-06-06 11:39:29 +02:00
Tamo
31a793d226
fix the regeneration of the embeddings in the search
2024-06-06 11:39:29 +02:00
Tamo
d85ab23b82
rename all occurrences of user_defined to user_provided for consistency
2024-06-06 11:39:29 +02:00
Tamo
b7349910d9
implements more review comments
2024-06-06 11:39:29 +02:00
Tamo
376b3a19a7
makes clippy and fmt happy
2024-06-06 11:39:29 +02:00
Tamo
b867829ef1
remove useless dbg
2024-06-06 11:39:29 +02:00
Tamo
5d50850e12
always push the user defined vectors in arroy
2024-06-06 11:39:29 +02:00
Tamo
a73ccc78a6
forward the embedding config to the extractors
2024-06-06 11:39:28 +02:00
Tamo
9eb6f522ea
wraps the index embedding config in a struct
2024-06-06 11:37:30 +02:00
Tamo
04f6523f3c
expose a new parameter to retrieve the embedders at search time
2024-06-06 11:36:11 +02:00
Tamo
84e498299b
Remove the vectors from the documents database
2024-06-06 11:36:11 +02:00
Tamo
7a84697570
never store the _vectors as searchable or faceted fields
2024-06-06 11:36:11 +02:00
Tamo
4148fbbe85
provide a method to get all the nested fields ids from a name
2024-06-06 11:36:11 +02:00
ManyTheFish
2e50c6ec81
Update Charabia
2024-06-06 10:18:43 +02:00
ManyTheFish
30293883e0
Fix condition mistake
2024-06-05 17:30:07 +02:00
ManyTheFish
b833be46b9
Avoid running proximity when only the exact attributes changes
2024-06-05 17:30:07 +02:00
ManyTheFish
0a4118329e
Put only_additional_fields to None if the difference gives an empty result.
2024-06-05 17:30:07 +02:00
ManyTheFish
261e92d7e6
Skip iterating over documents when the faceted field list doesn't change
2024-06-05 17:30:07 +02:00
ManyTheFish
5cd08979b1
iterate over the faceted fields instead of over the whole document
2024-06-05 17:30:07 +02:00
Clément Renault
a998b881f6
Cache a lot of operations to know if a field must be indexed
2024-06-05 17:30:07 +02:00
Clément Renault
b81953a65d
Add a span for the prepare_for_documents_reindexing
2024-06-05 17:30:07 +02:00
Clément Renault
091bb157f1
Add a span for the settings diff creation
2024-06-05 17:30:07 +02:00
Clément Renault
1b639ce44b
Reduce the number of complex calls to settings diff functions
2024-06-05 17:30:07 +02:00
Clément Renault
87cf8a3c94
Introduce a new way to determine the operations to perform on the fields
2024-06-05 17:30:07 +02:00
Clément Renault
0f578348f1
Introduce a dedicated function to write proximity entries in database
2024-06-05 17:30:07 +02:00
Clément Renault
fad4675abe
Give the settings diff to the write_typed_chunk_into_index function
2024-06-05 17:30:07 +02:00
Clément Renault
1ab03c4ede
Fix an issue with settings diff and * in the searchable attributes
2024-06-05 17:30:07 +02:00
Clément Renault
0c6e4b2f00
Introducing a new into_del_add_obkv_conditional_operation function
2024-06-05 17:30:07 +02:00
Clément Renault
42b3f52ef9
Introduce the SettingDiff only_additional_fields method
2024-06-05 17:30:07 +02:00
meili-bors[bot]
93f5defedc
Merge #4656
...
4656: Adding a new `searchableAttribute` no longer re-index all the attributes r=ManyTheFish a=Kerollmops
Fixes #4492.
## To Do
- [x] Do not call the `InnerSettingsDiff::only_additional_fields` function too many times
- [ ] Add tests
Co-authored-by: Clément Renault <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-06-05 14:51:14 +00:00
ManyTheFish
33241a6b12
Fix condition mistake
2024-06-05 16:00:24 +02:00
ManyTheFish
ff87b4db26
Avoid running proximity when only the exact attributes changes
2024-06-05 12:48:44 +02:00
ManyTheFish
ba9fadc8f1
Put only_additional_fields to None if the difference gives an empty result.
2024-06-05 10:51:16 +02:00
ManyTheFish
d29d4f88da
Skip iterating over documents when the faceted field list doesn't change
2024-06-04 15:31:24 +02:00
ManyTheFish
17c5ceeb9d
iterate over the faceted fields instead of over the whole document
2024-06-04 14:04:20 +02:00
meili-bors[bot]
fc584f1db3
Merge #4666
...
4666: Add a score threshold search parameter r=ManyTheFish a=dureuill
# Pull Request
## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4609
## What does this PR do?
- See [usage](https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=25#95b76ded400342ba9ab3d67c734836f0) and [the known limitation](https://meilisearch.notion.site/Filter-by-score-usage-224a183ce7b24ca99b6a9a8da755668a?pvs=25#e4e32195bf0e4195b5daecdbb7a97a17)
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-06-03 08:42:44 +00:00
Louis Dureuil
2b6db6541e
Changes after review
2024-06-03 10:30:00 +02:00
meili-bors[bot]
d6bd88ce4f
Merge #4667
...
4667: Frequency matching strategy r=Kerollmops a=ManyTheFish
# Pull Request
## Related issue
Fixes #3773
## What does this PR do?
- add test for matching strategy
- implement frequency matching strategy
See the [PRD for more details](https://www.notion.so/meilisearch/Frequency-Matching-Strategy-0f3ba08833a442a39590a53a1505ab00).
[Public API](https://www.notion.so/meilisearch/frequency-matching-strategy-89868fb7fc584026bc56e378eb854a7f).
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-05-30 14:53:31 +00:00
Clément Renault
b9a0ff0dd6
Cache a lot of operations to know if a field must be indexed
2024-05-30 16:18:23 +02:00
Clément Renault
75496af985
Add a span for the prepare_for_documents_reindexing
2024-05-30 12:14:22 +02:00
Clément Renault
0e9eb9eedb
Add a span for the settings diff creation
2024-05-30 12:08:27 +02:00
ManyTheFish
3f1a510069
Add tests and fix matching strategy
2024-05-30 12:02:42 +02:00
Clément Renault
3a78e988da
Reduce the number of complex calls to settings diff functions
2024-05-30 11:23:07 +02:00
Clément Renault
d9e5074189
Introduce a new way to determine the operations to perform on the fields
2024-05-30 11:23:07 +02:00
Clément Renault
bc210bdc00
Introduce a dedicated function to write proximity entries in database
2024-05-30 11:23:06 +02:00
Clément Renault
4bf83f701c
Give the settings diff to the write_typed_chunk_into_index function
2024-05-30 11:23:06 +02:00
Clément Renault
db3887929f
Fix an issue with settings diff and * in the searchable attributes
2024-05-30 11:22:50 +02:00
Clément Renault
9af103a88e
Introducing a new into_del_add_obkv_conditional_operation function
2024-05-30 11:22:49 +02:00
Clément Renault
99211eb375
Introduce the SettingDiff only_additional_fields method
2024-05-30 11:22:49 +02:00
Louis Dureuil
4f03b0cf5b
Add ranking score threshold to similar
2024-05-30 11:20:50 +02:00
Louis Dureuil
c26db7878c
Expose rankingScoreThreshold in API
2024-05-30 10:32:35 +02:00
ManyTheFish
1ab88e10b9
Merge branch 'main' into merge-release-v1.8.1-in-main
2024-05-29 16:24:00 +02:00
Louis Dureuil
aac1d769a7
Add ranking_score_threshold to milli
2024-05-29 14:17:09 +02:00
ManyTheFish
abdc4afcca
Implement Frequency matching strategy
2024-05-29 13:59:08 +02:00
Many the fish
e1fbfde6c4
Merge branch 'main' into merge-release-v1.8.1-in-main
2024-05-29 11:31:03 +02:00
ManyTheFish
27b75ec648
merge main into v1.8.1
2024-05-29 11:26:07 +02:00
Louis Dureuil
ca6cc4654b
Add similar route
2024-05-28 15:28:19 +02:00
Louis Dureuil
d35278320e
Add support functions for accessing arroy writers and readers
2024-05-28 15:27:43 +02:00
Louis Dureuil
02b3d82c60
filtered_universe accepts index and txn instead of SearchContext
2024-05-28 15:22:12 +02:00
Louis Dureuil
fd2c95999d
Change validate_document_id to public and remove extra layer of result
2024-05-28 15:21:19 +02:00
Clément Renault
dc949ab46a
Remove puffin usage
2024-05-27 15:59:14 +02:00
Clément Renault
7f3e51349e
Remove puffin for the dependencies
2024-05-27 15:53:06 +02:00
meili-bors[bot]
19acc65ad2
Merge #4646
...
4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops
This PR implements what is described in #4485. It reduces the number of disk writes and disk usage.
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-23 16:06:50 +00:00
Clément Renault
fe17c0f52e
Construct the minimal OBKVs according to the settings diff
2024-05-23 11:23:57 +02:00
Clément Renault
bc5663e673
FieldIdsMap no longer useful thanks to #4631
2024-05-22 16:06:15 +02:00
Louis Dureuil
8a941c0241
Smaller review changes
2024-05-22 14:44:42 +02:00
Louis Dureuil
3412e7fbcf
"[]" is deserialized as 0 embedding rather than 1 embedding of dim 0
2024-05-22 12:25:21 +02:00
Louis Dureuil
16037e2169
Don't remove embedders that are not in the config from the document DB
2024-05-22 12:24:51 +02:00
Louis Dureuil
8f7c8ca7f0
Remove now unused error variant
2024-05-22 12:23:43 +02:00
Clément Renault
500ddc76b5
Make the flattened sorter optional
2024-05-21 16:16:36 +02:00
Clément Renault
943f8dba0c
Make clippy happy
2024-05-21 14:58:41 +02:00
Clément Renault
1aa8ed9ef7
Make the original sorter optional
2024-05-21 14:53:26 +02:00
ManyTheFish
f762307838
Fix clippy
2024-05-21 13:44:20 +02:00
ManyTheFish
3e94a90722
Fixes
2024-05-21 13:39:46 +02:00
Louis Dureuil
b17cb56dee
Test array of vectors
2024-05-20 14:44:10 +02:00
ManyTheFish
fc7e817221
Index geo points based on the settings differences
2024-05-20 12:27:26 +02:00
Louis Dureuil
d05d49ffd8
Fix tests
2024-05-20 10:36:18 +02:00
Louis Dureuil
0462ebbe58
Don't write an empty _vectors field
2024-05-20 10:36:18 +02:00
Louis Dureuil
2f7a8a4efb
Don't write vectors that weren't autogenerated in document DB
2024-05-20 10:36:18 +02:00
Louis Dureuil
52d9cb6e5a
Refactor vector indexing
...
- use the parsed_vectors module
- only parse `_vectors` once per document, instead of once per embedder per document
2024-05-20 10:36:17 +02:00
Louis Dureuil
261de888b7
Add function to get the embeddings of a document in an index
2024-05-20 10:36:17 +02:00
Louis Dureuil
98c811247e
Add parsed vectors module
2024-05-20 10:25:59 +02:00
Tamo
273c6e8c5c
uses the latest version of heed to get rid of unsafe code
2024-05-16 18:31:32 +02:00
Tamo
897d25780e
update milli to latest version
2024-05-16 18:31:32 +02:00
Tamo
f2d0a59f1d
when no searchable attributes are defined, makes all the weight equals to zero
2024-05-16 01:06:33 +02:00
Tamo
c78a2fa4f5
rename method and variable around the attributes to search on feature
2024-05-15 18:04:42 +02:00
Tamo
5542f1d9f1
get back to what we were doing before in the DB cache and with the restricted field id
2024-05-15 18:00:39 +02:00
Tamo
ad4d8502b3
stops storing the whole fieldids weights map when no searchable are defined
2024-05-15 17:16:10 +02:00
Tamo
7ec4e2a3fb
apply all style review comments
2024-05-15 15:02:26 +02:00
Tamo
9fffb8e83d
make clippy happy
2024-05-14 17:36:32 +02:00
Tamo
caa6a7149a
make the attribute ranking rule use the weights and fix the tests
2024-05-14 17:36:32 +02:00
Tamo
a0082c4df9
add a failing test on the attribute ranking rule
2024-05-14 17:00:02 +02:00
Tamo
b0afe0972e
stop updating the fields ids map when fields are only swapped
2024-05-14 17:00:02 +02:00
Tamo
9ecde41853
add a test on the current behaviour
2024-05-14 17:00:02 +02:00
Tamo
685f452fb2
Fix the indexing of the searchable
2024-05-14 17:00:02 +02:00
Tamo
4e4a1ddff7
gate a test behind the required feature
2024-05-14 17:00:02 +02:00
Tamo
c22460045c
Stops returning an option in the internal searchable fields
2024-05-14 17:00:02 +02:00
Clément Renault
ac4bc143c4
Bump ureq to v2.9.7
2024-05-07 10:39:38 +02:00
meili-bors[bot]
4d5971f343
Merge #4621
...
4621: Bring back changes from v1.8.0 into main r=curquiza a=curquiza
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-05-06 13:46:39 +00:00
Louis Dureuil
f4dd73ec8c
Destructure EmbedderOptions so we don't miss some options
2024-05-02 15:39:36 +02:00
ManyTheFish
88174b8ae4
Update charabia v0.8.10
2024-04-30 14:30:23 +02:00
meili-bors[bot]
ebca29f3de
Merge #4597
...
4597: Fix embeddings settings update r=ManyTheFish a=ManyTheFish
# Pull Request
- add some conditions reducing the work done when changing the settings
- add some benchmarks on embedders
## Related issue
Fixes #4585
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-25 16:37:28 +00:00
meili-bors[bot]
c793b6ef6d
Merge #4600
...
4600: Fix embedders api r=ManyTheFish a=ManyTheFish
# Pull Request
## Related issue
Fixes #4594
Fixes #4595
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-04-25 13:16:33 +00:00
Clément Renault
d4aeff92d0
Introduce the ThreadPoolNoAbort wrapper
2024-04-24 16:40:12 +02:00
ManyTheFish
9b76501875
Display set API key for Ollama embedder
2024-04-24 12:33:07 +02:00
Clément Renault
b3173d0423
Remove useless dots in the error messages
2024-04-22 18:09:33 +02:00
Clément Renault
96cc5319c8
Introduce a new internal error type to categorize panics
2024-04-22 18:09:33 +02:00
Clément Renault
0c7003c5df
Introduce an atomic to catch panics in thread pools
2024-04-22 18:09:33 +02:00
ManyTheFish
a1aa999026
Add conditions reducing work
2024-04-22 14:18:35 +02:00
ManyTheFish
c71b5d09ff
Update charabia v0.8.9
2024-04-18 11:38:26 +02:00
writegr
ab43a8a949
chore: fix some typos in comments
...
Signed-off-by: writegr <wellweek@outlook.com>
2024-04-18 14:12:52 +08:00
meili-bors[bot]
4a8459b799
Merge #4576
...
4576: increase the default search time budget from 150ms to 1.5s r=ManyTheFish a=irevoire
# Pull Request
## Related issue
Fixes #4575
## What does this PR do?
- increase the default search time budget from 150ms to 1.5s
Co-authored-by: Tamo <tamo@meilisearch.com>
2024-04-17 16:04:47 +00:00
Clément Renault
c923adf222
Fix facet distribution for alpha on facet numbers
2024-04-17 16:31:16 +02:00
ManyTheFish
df29ba709a
Make some cleaning in Arcs
2024-04-17 12:33:25 +02:00
ManyTheFish
3acfab2eb7
Fix PR comments
2024-04-17 10:55:51 +02:00
Tamo
19137be0ea
increase the default search time budget from 150ms to 1.5s
2024-04-16 18:09:49 +02:00
ManyTheFish
87a93ba47d
fix clippy
2024-04-16 14:39:30 +02:00
ManyTheFish
eaf113ef34
Fix word pair proximity error when nothing has to be extracted
2024-04-16 14:39:30 +02:00
ManyTheFish
e5ae337aae
Comeback to sorters in extract_word_docids
...
using buffers and merge the keys manually is less efficient
2024-04-16 14:39:30 +02:00
ManyTheFish
a489b406b4
fix test
2024-04-16 14:39:06 +02:00
ManyTheFish
02c3d6b265
finish work
2024-04-16 14:39:06 +02:00
ManyTheFish
b5e4a55af6
refactor faceted and searchable pipeline
2024-04-16 14:39:06 +02:00
ManyTheFish
a7e368aaa6
Create InnerIndexSettingsDiffs struct and populate it
2024-04-16 14:39:06 +02:00
ManyTheFish
893200ab87
Avoid clearing documents in transform
2024-04-16 14:39:06 +02:00
ManyTheFish
aabce52b1b
Fix test
2024-04-16 14:39:06 +02:00
ManyTheFish
8fff5fc281
update tests
2024-04-16 14:39:06 +02:00
yudrywet
cf864a1c2e
chore: fix some typos in comments
...
Signed-off-by: yudrywet <yudeyao@yeah.net>
2024-04-14 20:11:34 +08:00
Louis Dureuil
89e72fab32
Update grenad to fix rare DB corruption
2024-04-11 21:06:59 +02:00
meili-bors[bot]
b1844b0c27
Merge #4548
...
4548: v1.8 hybrid search changes r=dureuill a=dureuill
Implements the search changes from the [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#40f24df3da694428a39cc8043c9cfc64)
### ⚠️ Breaking changes in an experimental feature:
- Removed the `_semanticScore`. Use the `_rankingScore` instead.
- Removed `vector` in the response of the search (output was too big).
- Removed all the vectors from the `vectorSort` ranking score details
  - target vector appearing in the name of the rule
  - matched vector appearing in the details of the rule
### Other user-facing changes
- Added `semanticHitCount`, indicating how many hits were returned from the semantic search. This is especially useful in the hybrid search.
- Embed lazily: Meilisearch no longer generates an embedding when the keyword results are "good enough".
- Graceful embedding failure in hybrid search: when doing hybrid search (`semanticRatio in ]0.0, 1.0[`), an embedding failure no longer causes the search request to fail. Instead, only the keyword search is performed. When doing a full vector search (`semanticRatio==1.0`), a failure to embed will still result in failing that search.
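
The failure handling described in the last item can be sketched as below. This is only an illustration of the stated rules, not the code from this PR: `SearchPlan` and `plan_search` are hypothetical names, and treating `semanticRatio == 0.0` as a pure keyword search that never embeds is an assumption.

```rust
/// Hypothetical outcome of deciding which searches to run.
#[derive(Debug, PartialEq)]
enum SearchPlan {
    KeywordOnly,
    Hybrid,
    VectorOnly,
}

/// Decides what to run given the semantic ratio and the embedding attempt.
/// `embedding` is the result of embedding the query (error as a message).
fn plan_search(
    semantic_ratio: f32,
    embedding: Result<Vec<f32>, String>,
) -> Result<SearchPlan, String> {
    // Assumption: a ratio of 0.0 means keyword search only, no embedding needed.
    if semantic_ratio <= 0.0 {
        return Ok(SearchPlan::KeywordOnly);
    }
    let full_vector = semantic_ratio >= 1.0;
    match (embedding, full_vector) {
        (Ok(_), true) => Ok(SearchPlan::VectorOnly),
        (Ok(_), false) => Ok(SearchPlan::Hybrid),
        // Full vector search: an embedding failure fails the request.
        (Err(e), true) => Err(e),
        // Hybrid search: degrade gracefully to keyword results.
        (Err(_), false) => Ok(SearchPlan::KeywordOnly),
    }
}
```

For example, `plan_search(0.5, Err(...))` yields `KeywordOnly`, while `plan_search(1.0, Err(...))` propagates the error.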
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-04-04 16:00:20 +00:00
Louis Dureuil
1ff2a2d6fb
Add semanticHitCount
2024-04-04 16:04:06 +02:00
Louis Dureuil
3c6e9851a4
Correct error formatting
2024-04-04 15:58:19 +02:00
Louis Dureuil
466d718a05
Fix test
2024-04-04 15:58:19 +02:00
Louis Dureuil
6ebb6b55a6
Lazily embed, don't fail hybrid search on embedding failure
2024-04-04 15:58:17 +02:00
Louis Dureuil
fabc9cf14a
milli: add Embedder::embed_one
2024-04-04 15:57:29 +02:00