meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2025-01-31 15:31:53 +08:00

Author	SHA1	Message	Date
Louis Dureuil	393216bf30	Flatten embedders settings	2023-12-20 17:16:43 +01:00
Louis Dureuil	e249e4db7b	Change Setting::apply function signature	2023-12-20 17:15:24 +01:00
meili-bors[bot]	de2ca7006e	Merge #4272 4272: Don't pass default revision when the model is explicitly set in config r=Kerollmops a=dureuill # Pull Request ## Related issue Fixes #4271 ## What does this PR do? - When the `model` is explicitly set in the `embedders` setting, we reset the `revision` to `None`, such that if the user doesn't specify a revision, the head of the model repository is chosen. - Not changed: If the user specifies a revision, it applies, like previously. - Not changed: If the user doesn't specify a model, the default model with the default revision applies, like previously. ## Manual testing on a fresh DB 1. Enable experimental feature: ```sh curl \ -X PATCH 'http://localhost:7700/experimental-features/' \ -H 'Content-Type: application/json' -H 'Authorization: Bearer foo' \ --data-binary '{ "vectorStore": true }' ``` 2. Send settings with a specified model but no specified revision: ```sh curl \ -X PATCH 'http://localhost:7700/indexes/products/settings' \ -H 'Content-Type: application/json' --data-binary \ '{ "embedders": { "default": { "source": { "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" } }, "documentTemplate": { "template": "A product titled '{{doc.title}}'"} } } }' ``` 3. Check that the task was successful: ```sh curl 'http://localhost:7700/tasks/0' {"uid":0,"indexUid":"products","status":"succeeded","type":"settingsUpdate","canceledBy":null,"details":{"embedders":{"default":{"source":{"huggingFace":{"model":"sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"}},"documentTemplate":{"template":"A product titled {{doc.title}}"}}}},"error":null,"duration":"PT0.001892S","enqueuedAt":"2023-12-20T09:17:01.73789Z","startedAt":"2023-12-20T09:17:01.73854Z","finishedAt":"2023-12-20T09:17:01.740432Z"} ``` 4. Send documents to index: ```sh curl 'https://localhost:7700/indexes/products/documents' -H 'Content-Type: application/json' --data-binary '{"id": 0, "title": "Best product"}' ``` Co-authored-by: Louis Dureuil <louis@meilisearch.com> v1.6.0-rc.2	2023-12-20 14:27:51 +00:00
Louis Dureuil	333ce12eb2	Fixed issue where the default revision is always the one we picked for the default model	2023-12-20 10:17:49 +01:00
meili-bors[bot]	fb9db1eba6	Merge #4269 4269: Remove dependency that requires libstdc++ r=dureuill a=dureuill Removes the dependency that caused the additional runtime dependency on libstdc++ by disabling the default features of the hf tokenizer. ## Discussion - This removes a feature that is using a C++ dependency and is supposed to accelerate the tokenizer. As the tokenizer is likely to be a significant bottleneck for embedding texts using a HF model, this is an issue. - We should at least rerun the movies vector indexing and check that it still works correctly and that it has a runtime in the ballpark of what it used to be. Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net> v1.6.0-rc.1	2023-12-19 12:26:48 +00:00
Louis Dureuil	b2193e612f	Revert "Add libstdc++ in Dockerfile" as it is no longer needed This reverts commit 9df8cfc013452ecb5935d5501c96a4c465183a5d.	2023-12-18 22:17:29 +01:00
Louis Dureuil	942d49314c	Remove dependency that requires libstdc++	2023-12-18 22:17:18 +01:00
meili-bors[bot]	9a846e82bc	Merge #4268 4268: Add libstdc++ in Dockerfile r=curquiza a=sanders41 # Pull Request ## Related issue Fixes #4267 ## What does this PR do? - Add libstdc++ in the Dockerfile ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Paul Sanders <psanders1@gmail.com>	2023-12-18 18:35:53 +00:00
Paul Sanders	9df8cfc013	Add libstdc++ in Dockerfile	2023-12-18 13:05:46 -05:00
meili-bors[bot]	248aaa6d45	Merge #4262 4262: Update version for the next release (v1.6.0) in Cargo.toml r=curquiza a=meili-bot ⚠️ This PR is automatically generated. Check the new version is the expected one and Cargo.lock has been updated before merging. Co-authored-by: curquiza <curquiza@users.noreply.github.com> v1.6.0-rc.0	2023-12-18 14:00:19 +00:00
curquiza	50d6317ec0	Update version for the next release (v1.6.0) in Cargo.toml	2023-12-18 13:57:46 +00:00
meili-bors[bot]	b734bd9891	Merge #4261 4261: Set rust toolchain to 1.71.1 in dockerfile r=curquiza a=dureuill Fixes docker [CI](https://github.com/meilisearch/meilisearch/actions/workflows/publish-docker-images.yml) Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-12-18 12:32:26 +00:00
Louis Dureuil	9800d5a103	Set rust toolchain to 1.71.1 in dockerfile	2023-12-18 10:59:25 +01:00
meili-bors[bot]	7c4ed07617	Merge #4257 4257: Change proximity precision settings r=dureuill a=ManyTheFish - [x] Add proximity_precision value into the analytics - [x] Change the naming of `attributeScale` and `wordScale` into `byAttribute` and `byWord` - [x] Remove proximityPrecision from the experimental feature Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Many the fish <many@meilisearch.com>	2023-12-18 09:07:28 +00:00
ManyTheFish	3a99a555a2	Fix experimental features snapshots in tests	2023-12-18 10:05:51 +01:00
Many the fish	9e1b458010	Merge branch 'main' into change-proximity-precision-settings	2023-12-18 09:08:47 +01:00
meili-bors[bot]	2aede03bc2	Merge #4226 4226: Hybrid search r=dureuill a=dureuill Allows to perform hybrid search requests that combine the results of semantic and keyword search and automatically generate embeddings. ## How to use See [feature description](https://meilisearch.notion.site/v1-6-Hybrid-Search-Embedders-ea42c82f90cc4bc0be1eeb917c1118c8) ## Changes - work is based on #4213 - milli::new search now takes an input universe directly, rather than computing it from a filter. This adds flexibility to require results on a subset of documents - vector search is now a regular ranking rule (akin to sort and geosort) and reports its score as a ScoreDetail - separate keyword search and vector search functions, vector search now respects (geo)sort ranking rules - add automatic embedding - add hybrid search Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-12-14 16:24:56 +00:00
ManyTheFish	e741bc1c62	Add proximity_precision value into the analytics	2023-12-14 16:48:06 +01:00
ManyTheFish	6425996e36	Change the naming of attributeScale and wordScale into byAttribute and byWord	2023-12-14 16:31:00 +01:00
Louis Dureuil	eb5cb91da2	Switch default from hf to openai	2023-12-14 16:19:46 +01:00
Louis Dureuil	87bba98bd8	Various changes - fixed seed for arroy - check vector dimensions as soon as it is provided to search - don't embed whitespace	2023-12-14 16:08:42 +01:00
Louis Dureuil	217105b7da	hybrid search uses semantic ratio, error handling	2023-12-14 16:08:42 +01:00
ManyTheFish	1b7c164a55	Pass the semantic ratio to milli	2023-12-14 16:08:42 +01:00
ManyTheFish	f3f3944469	Fix error checking	2023-12-14 16:08:42 +01:00
ManyTheFish	93dcbf598d	Deserialize semantic ratio	2023-12-14 16:08:42 +01:00
ManyTheFish	ac68f33194	Add simple test	2023-12-14 16:08:42 +01:00
ManyTheFish	9991152bbe	Add TODOs	2023-12-14 16:08:42 +01:00
Louis Dureuil	a4536b1381	Small adjustments to respect the spec	2023-12-14 16:08:42 +01:00
Louis Dureuil	5b51cb04af	Remove some settings	2023-12-14 16:08:42 +01:00
Louis Dureuil	3c1a14f1cd	Add settings routes	2023-12-14 16:08:42 +01:00
Louis Dureuil	b8e4709dfa	Remove prompt strategy and fallback	2023-12-14 16:08:41 +01:00
Louis Dureuil	806e5b6899	Tests pass	2023-12-14 16:08:41 +01:00
Louis Dureuil	61bd2fb7a9	Update arroy	2023-12-14 16:08:41 +01:00
Louis Dureuil	e0cc775dc4	Various changes - DistributionShift in Search object (to be set from model in embed?) - Fix issue where embedder index wasn't computed at search time - Accept as default embedder either the "default" one, or the only embedder when there is only one	2023-12-14 16:08:41 +01:00
Louis Dureuil	12940d79a9	WIP - manual embedder - multi embedders OK - clippy + tests OK	2023-12-14 16:08:41 +01:00
Louis Dureuil	922a640188	WIP multi embedders fixed template bugs	2023-12-14 16:08:41 +01:00
Louis Dureuil	abbe131084	Cosmetic change	2023-12-14 16:08:41 +01:00
Louis Dureuil	d4715e0c4d	Fix same vector sort bug	2023-12-14 16:08:41 +01:00
Louis Dureuil	11e2a2c1aa	Fix geosort bug	2023-12-14 16:08:41 +01:00
Louis Dureuil	65e49b7092	Remove stuff, add distribution shift (WIP)	2023-12-14 16:08:38 +01:00
Louis Dureuil	e56f160032	Actually pass embedders on reindex	2023-12-14 16:07:49 +01:00
Louis Dureuil	687d92f217	prompt bifluor+	2023-12-14 16:07:49 +01:00
Louis Dureuil	fb539f61fe	WIP	2023-12-14 16:07:49 +01:00
Louis Dureuil	cb4ebe163e	WIP	2023-12-14 16:07:49 +01:00
Louis Dureuil	dde3a04679	WIP arroy integration	2023-12-14 16:07:49 +01:00
Louis Dureuil	13c2c6c16b	Small commit to add hybrid search and autoembedding	2023-12-14 16:07:48 +01:00
Louis Dureuil	21bcf32109	Add candle and hg_hub, updating a lot of deps in the process	2023-12-14 16:07:48 +01:00
ManyTheFish	35e1981488	Remove proximityPrecision from the experimental feature	2023-12-14 15:52:42 +01:00
meili-bors[bot]	e0f712b9d3	Merge #4254 4254: Bring back v1.5.1 changes into main r=ManyTheFish a=Kerollmops This pull request brings back changes from the _release-v1.5.1_ branch into _main_. Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com> Co-authored-by: curquiza <curquiza@users.noreply.github.com> Co-authored-by: Clément Renault <clement@meilisearch.com>	2023-12-14 09:41:57 +00:00
Clément Renault	56571f762a	Merge remote-tracking branch 'origin/main' into tmp-release-v1.5.1	2023-12-13 11:57:01 +01:00

8833 Commits