**Changes:**
Search filters now use the FilterableAttributesFeatures from the FilterableAttributesRules to determine whether a field is filterable.
The FilterableAttributesFeatures is also more precise: an error is returned if an operator is used on a field that doesn't have the related feature.
Facet search now checks whether the feature is allowed in the FilterableAttributesFeatures and returns an error if the field doesn't have the related feature.
**Impact:**
- facet-search now relies on AttributePatterns to match the locales
- search using filters now relies on FilterableAttributesFeatures
- distinct attribute now relies on FilterableAttributesRules
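The per-operator check described above can be sketched as follows. This is a minimal illustration, not the actual Meilisearch code: the `Features`, `Operator`, and `FilterError` types and the `check_operator` function are hypothetical names, and the split into `equality` and `comparison` features is an assumption made for the example.

```rust
/// Hypothetical, simplified stand-in for the features enabled on a field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Features {
    pub facet_search: bool,
    pub equality: bool,
    pub comparison: bool,
}

/// Hypothetical subset of filter operators.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operator {
    Equal,
    GreaterThan,
    LowerThan,
}

/// Hypothetical error type for rejected filters.
#[derive(Debug, PartialEq, Eq)]
pub enum FilterError {
    /// No rule matched the field, so it is not filterable at all.
    NotFilterable { field: String },
    /// The field is filterable, but not with this operator.
    OperatorNotAllowed { field: String, operator: Operator },
}

/// Returns an error when `operator` needs a feature the field doesn't have.
/// `features` is `None` when no rule matched the field.
pub fn check_operator(
    field: &str,
    features: Option<Features>,
    operator: Operator,
) -> Result<(), FilterError> {
    let features = features.ok_or_else(|| FilterError::NotFilterable {
        field: field.to_string(),
    })?;
    let allowed = match operator {
        Operator::Equal => features.equality,
        Operator::GreaterThan | Operator::LowerThan => features.comparison,
    };
    if allowed {
        Ok(())
    } else {
        Err(FilterError::OperatorNotAllowed {
            field: field.to_string(),
            operator,
        })
    }
}
```

With this shape, a field that only enables equality accepts `Operator::Equal` and rejects `Operator::GreaterThan` with a dedicated error instead of being silently treated as filterable.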
**Changes:**
The filterableAttributes type has changed from a `BTreeSet<String>` to a `Vec<FilterableAttributesRule>`: a list of rules, each defining patterns to match the documents' fields and a set of features to apply to the matching fields.
The rule order given by the user now matters: the features applied to a filterable field are chosen based on the rule order, as is done for the LocalizedAttributesRules.
This means that the list is no longer reordered and keeps the user-defined order.
Duplicates, if any, are no longer removed either.
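To illustrate why the order now matters, here is a minimal sketch of first-match rule resolution. The `Rule` and `RuleFeatures` types, the `features_for` function, and the trailing-`*` wildcard syntax are assumptions made for this example and do not claim to mirror the real `FilterableAttributesRule` or AttributePatterns definitions.

```rust
/// Hypothetical feature set attached to a rule.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RuleFeatures {
    pub facet_search: bool,
    pub filter: bool,
}

/// Hypothetical rule: a list of patterns plus the features to apply.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Rule {
    pub patterns: Vec<String>,
    pub features: RuleFeatures,
}

/// Assumed pattern syntax: an exact field name, or a prefix followed by `*`.
fn pattern_matches(pattern: &str, field: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => field.starts_with(prefix),
        None => pattern == field,
    }
}

/// Walks the rules in the user-defined order and returns the features of
/// the first rule with a matching pattern. Later rules are ignored.
pub fn features_for(rules: &[Rule], field: &str) -> Option<RuleFeatures> {
    rules
        .iter()
        .find(|rule| rule.patterns.iter().any(|p| pattern_matches(p, field)))
        .map(|rule| rule.features)
}
```

Because the list is a `Vec` and not a sorted set, swapping two overlapping rules changes which features a field such as `price` receives, which is why reordering or de-duplicating the list would alter behavior.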
**Impact:**
- Settings API
- the database format of the filterable attributes changed
- may impact the LocalizedAttributesRules due to the AttributePatterns factorization
- OpenAPI generator
5355: Support fetching the pooling method from the model configuration r=Kerollmops a=dureuill
# Pull Request
## Related issue
Fixes #5354
## What does this PR do?
- Fetches the pooling configuration from the model repository
- Uses a pooling method that depends on the pooling configuration of that model.
- Allows overriding the pooling method with a new huggingFace embedder parameter `pooling`
  - for backward-compatibility with Meilisearch v1.13
  - for compatibility with embedders that exhibit the same behavior as Meilisearch v1.13
- Handles the default value of that new parameter
  - for compatibility, when importing a db/a dump, it should be set to `forceMean`
  - when (re)set from the settings for an embedder, it should be set to `useModel`
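The default-value handling listed above can be sketched as below. This is an illustrative sketch only: the `Pooling` and `SettingSource` enums and the `resolve_pooling` function are hypothetical names, and only the two values named in this description (`useModel` and `forceMean`) are modeled.

```rust
/// Hypothetical mirror of the two `pooling` values named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pooling {
    /// `useModel`: follow the pooling configuration of the model.
    UseModel,
    /// `forceMean`: always use mean pooling.
    ForceMean,
}

/// Hypothetical description of where the embedder settings come from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SettingSource {
    /// The embedder comes from an imported database or dump.
    ImportedDatabaseOrDump,
    /// The embedder was set or reset through the settings.
    SettingsApi,
}

/// Picks the pooling method: an explicit value always wins, otherwise the
/// default depends on where the embedder settings come from.
pub fn resolve_pooling(explicit: Option<Pooling>, source: SettingSource) -> Pooling {
    explicit.unwrap_or(match source {
        SettingSource::ImportedDatabaseOrDump => Pooling::ForceMean,
        SettingSource::SettingsApi => Pooling::UseModel,
    })
}
```

Keeping the two defaults separate means existing data keeps the pooling it was imported with, while embedders configured from the settings follow the model configuration.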
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
5351: Bring back v1.13.0 changes into main r=irevoire a=Kerollmops
This PR brings back the changes made in v1.13 into the main branch.
Co-authored-by: ManyTheFish <many@meilisearch.com>
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: Clémentine <clementine@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
5339: Add back timeout from v1.11.3 r=Kerollmops a=dureuill
# Pull Request
## Related issue
Fixes #5337
## What does this PR do?
- Fix regression compared with v1.11 by reintroducing the 30s timeout on all REST API calls.
Thanks to `@migueltarga` for reporting the issue
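As a generic illustration of bounding a blocking call with a deadline, the following standard-library sketch runs a call on a worker thread and gives up after a timeout. It is not the mechanism used in this PR: `call_with_timeout` and `REST_TIMEOUT` are hypothetical names, and a real HTTP client would normally be configured with its own timeout instead.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// The 30s limit mentioned above, as a hypothetical constant.
pub const REST_TIMEOUT: Duration = Duration::from_secs(30);

/// Runs `call` on a worker thread and waits at most `timeout` for its result.
/// Returns `None` on timeout. Note that the worker thread is not cancelled:
/// it keeps running in the background and its result is discarded.
pub fn call_with_timeout<T, F>(timeout: Duration, call: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the timeout elapsed.
        let _ = sender.send(call());
    });
    receiver.recv_timeout(timeout).ok()
}
```

A caller would pass `REST_TIMEOUT` as the first argument so that a stalled remote call cannot block indefinitely.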
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
5336: Meilitool Hair Dryer r=dureuill a=Kerollmops
This pull request introduces a new subcommand to hair dry a specific part of specific indexes. It is useful when [the memory-mapped pages are not hot in the cache](https://arc.net/l/quote/ixhcdwcq) and must be. Hair drying those interesting pages makes the search requests using the vector store much faster.
The previous technique used the "cat method," which consists of reading the whole LMDB data file and piping it into the null file descriptor. By doing that, the whole LMDB data file becomes hot in the cache. However, when the database is large, at least 30% of it consists of free and unused pages, and many other pages don't need to be hot, e.g., raw JSON documents or uninteresting parts of the inverted index.
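For reference, the "cat method" amounts to something like the sketch below, which reads a file from start to end and discards the bytes. The `warm_file_cache` function name is hypothetical, and the sketch only shows the whole-file approach, not the selective page reads done by the new subcommand.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Reads the whole file and throws the bytes away, which pulls every page
/// of the file through the operating system's page cache as a side effect.
/// Returns the number of bytes read.
pub fn warm_file_cache(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    // `io::sink()` discards everything written to it, like the null device.
    io::copy(&mut file, &mut io::sink())
}
```

The cost of this approach grows with the total file size, regardless of how many of those pages the search requests actually need.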
This new subcommand reads all the Arroy pages of a given index to make them hot, and only those. More coming...
The current algorithm is single-threaded and takes a lot of time. I am in the process of multithreading it. This is the time it takes to hair dry a 305GiB database with a single thread.
```
real 21m51.054s
user 0m3.155s
sys 0m19.393s
```
## To Do
- [ ] (optional) Do the reads in parallel.
Co-authored-by: Kerollmops <clement@meilisearch.com>