meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-11-27 20:45:06 +08:00

Author	SHA1	Message	Date
Louis Dureuil	d1dd7e5d09	In transform for removed embedders, write back their user provided vectors in documents, and clear the writers	2024-06-12 14:50:55 +02:00
Louis Dureuil	d18c1f77d7	Update embedder configs with a finer granularity - no longer clear vector DB between any two embedder changes	2024-06-12 14:50:55 +02:00
Louis Dureuil	7cef2299cf	Fix behavior when removing a document	2024-06-11 09:45:08 +02:00
Tamo	2cdcb703d9	fix the deletion of vectors and add a test	2024-06-06 11:39:29 +02:00
Tamo	d85ab23b82	rename all occurences of user_defined to user_provided for consistency	2024-06-06 11:39:29 +02:00
Tamo	b7349910d9	implements mor review comments	2024-06-06 11:39:29 +02:00
Tamo	376b3a19a7	makes clippy and fmt happy	2024-06-06 11:39:29 +02:00
Tamo	5d50850e12	always push the user defined vectors in arroy	2024-06-06 11:39:29 +02:00
Tamo	a73ccc78a6	forward the embedding config to the extractors	2024-06-06 11:39:28 +02:00
Tamo	9eb6f522ea	wraps the index embedding config in a struct	2024-06-06 11:37:30 +02:00
Tamo	84e498299b	Remove the vectors from the documents database	2024-06-06 11:36:11 +02:00
Tamo	7a84697570	never store the _vectors as searchable or faceted fields	2024-06-06 11:36:11 +02:00
ManyTheFish	30293883e0	Fix condition mistake	2024-06-05 17:30:07 +02:00
ManyTheFish	b833be46b9	Avoid running proximity when only the exact attributes changes	2024-06-05 17:30:07 +02:00
ManyTheFish	0a4118329e	Put only_additional_fields to None if the difference gives an empty result.	2024-06-05 17:30:07 +02:00
ManyTheFish	261e92d7e6	Skip iterating over documents when the faceted field list doesn't change	2024-06-05 17:30:07 +02:00
ManyTheFish	5cd08979b1	iterate over the faceted fields instead of over the whole document	2024-06-05 17:30:07 +02:00
Clément Renault	a998b881f6	Cache a lot of operations to know if a field must be indexed	2024-06-05 17:30:07 +02:00
Clément Renault	b81953a65d	Add a span for the prepare_for_documents_reindexing	2024-06-05 17:30:07 +02:00
Clément Renault	091bb157f1	Add a span for the settings diff creation	2024-06-05 17:30:07 +02:00
Clément Renault	1b639ce44b	Reduce the number of complex calls to settings diff functions	2024-06-05 17:30:07 +02:00
Clément Renault	87cf8a3c94	Introduce a new way to determine the operations to perform on the fields	2024-06-05 17:30:07 +02:00
Clément Renault	0f578348f1	Introduce a dedicated function to write proximity entries in database	2024-06-05 17:30:07 +02:00
Clément Renault	fad4675abe	Give the settings diff to the write_typed_chunk_into_index function	2024-06-05 17:30:07 +02:00
Clément Renault	1ab03c4ede	Fix an issue with settings diff and * in the searchable attributes	2024-06-05 17:30:07 +02:00
Clément Renault	0c6e4b2f00	Introducing a new into_del_add_obkv_conditional_operation function	2024-06-05 17:30:07 +02:00
Clément Renault	42b3f52ef9	Introduce the SettingDiff only_additional_fields method	2024-06-05 17:30:07 +02:00
ManyTheFish	1ab88e10b9	Merge branch 'main' into merge-release-v1.8.1-in-main	2024-05-29 16:24:00 +02:00
Many the fish	e1fbfde6c4	Merge branch 'main' into merge-release-v1.8.1-in-main	2024-05-29 11:31:03 +02:00
ManyTheFish	27b75ec648	merge main into v1.8.1	2024-05-29 11:26:07 +02:00
Louis Dureuil	d35278320e	Add support functions for accessing arroy writers and readers	2024-05-28 15:27:43 +02:00
Clément Renault	dc949ab46a	Remove puffin usage	2024-05-27 15:59:14 +02:00
meili-bors[bot]	19acc65ad2	Merge #4646 4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops This PR implements what is described in #4485. It reduces the number of disk writes and disk usage. Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-05-23 16:06:50 +00:00
Clément Renault	fe17c0f52e	Construct the minimal OBKVs according to the settings diff	2024-05-23 11:23:57 +02:00
Clément Renault	bc5663e673	FieldIdsMap no longer useful thanks to #4631	2024-05-22 16:06:15 +02:00
Louis Dureuil	8a941c0241	Smaller review changes	2024-05-22 14:44:42 +02:00
Louis Dureuil	16037e2169	Don't remove embedders that are not in the config from the document DB	2024-05-22 12:24:51 +02:00
Clément Renault	500ddc76b5	Make the flattened sorter optional	2024-05-21 16:16:36 +02:00
Clément Renault	1aa8ed9ef7	Make the original sorter optional	2024-05-21 14:53:26 +02:00
ManyTheFish	f762307838	Fix clippy	2024-05-21 13:44:20 +02:00
ManyTheFish	3e94a90722	Fixes	2024-05-21 13:39:46 +02:00
ManyTheFish	fc7e817221	Index geo points based on the settings differences	2024-05-20 12:27:26 +02:00
Louis Dureuil	d05d49ffd8	Fix tests	2024-05-20 10:36:18 +02:00
Louis Dureuil	0462ebbe58	Don't write an empty _vectors field	2024-05-20 10:36:18 +02:00
Louis Dureuil	2f7a8a4efb	Don't write vectors that weren't autogenerated in document DB	2024-05-20 10:36:18 +02:00
Louis Dureuil	52d9cb6e5a	Refactor vector indexing - use the parsed_vectors module - only parse `_vectors` once per document, instead of once per embedder per document	2024-05-20 10:36:17 +02:00
Tamo	897d25780e	update milli to latest version	2024-05-16 18:31:32 +02:00
Tamo	f2d0a59f1d	when no searchable attributes are defined, makes all the weight equals to zero	2024-05-16 01:06:33 +02:00
Tamo	ad4d8502b3	stops storing the whole fieldids weights map when no searchable are defined	2024-05-15 17:16:10 +02:00
Tamo	7ec4e2a3fb	apply all style review comments	2024-05-15 15:02:26 +02:00
Tamo	caa6a7149a	make the attribute ranking rule use the weights and fix the tests	2024-05-14 17:36:32 +02:00
Tamo	b0afe0972e	stop updating the fields ids map when fields are only swapped	2024-05-14 17:00:02 +02:00
Tamo	685f452fb2	Fix the indexing of the searchable	2024-05-14 17:00:02 +02:00
Tamo	4e4a1ddff7	gate a test behind the required feature	2024-05-14 17:00:02 +02:00
Tamo	c22460045c	Stops returning an option in the internal searchable fields	2024-05-14 17:00:02 +02:00
meili-bors[bot]	4d5971f343	Merge #4621 4621: Bring back changes from v1.8.0 into main r=curquiza a=curquiza Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com> Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-05-06 13:46:39 +00:00
meili-bors[bot]	ebca29f3de	Merge #4597 4597: Fix embeddings settings update r=ManyTheFish a=ManyTheFish # Pull Request - add some conditions reducing the work done when changing the settings - add some benchmarks on embedders ## Related issue Fixes #4585 Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-04-25 16:37:28 +00:00
Clément Renault	d4aeff92d0	Introduce the ThreadPoolNoAbort wrapper	2024-04-24 16:40:12 +02:00
Clément Renault	96cc5319c8	Introduce a new internal error type to categorize panics	2024-04-22 18:09:33 +02:00
Clément Renault	0c7003c5df	Introduce an atomic to catch panics in thread pools	2024-04-22 18:09:33 +02:00
ManyTheFish	a1aa999026	Add conditions reducing wrok	2024-04-22 14:18:35 +02:00
ManyTheFish	df29ba709a	Make some cleaning in Arcs	2024-04-17 12:33:25 +02:00
ManyTheFish	3acfab2eb7	Fix PR comments	2024-04-17 10:55:51 +02:00
ManyTheFish	87a93ba47d	fix clippy	2024-04-16 14:39:30 +02:00
ManyTheFish	eaf113ef34	Fix wod pair proximity error when nothing has to be extracted	2024-04-16 14:39:30 +02:00
ManyTheFish	e5ae337aae	Comeback to sorters in extract_word_docids using buffers and merge the keys manually is less efficient	2024-04-16 14:39:30 +02:00
ManyTheFish	a489b406b4	fix test	2024-04-16 14:39:06 +02:00
ManyTheFish	02c3d6b265	finish work	2024-04-16 14:39:06 +02:00
ManyTheFish	b5e4a55af6	refactor faceted and searchable pipeline	2024-04-16 14:39:06 +02:00
ManyTheFish	a7e368aaa6	Create InnerIndexSettingsDiffs struct and populate it	2024-04-16 14:39:06 +02:00
ManyTheFish	893200ab87	Avoid clearing documents in transform	2024-04-16 14:39:06 +02:00
ManyTheFish	aabce52b1b	Fix test	2024-04-16 14:39:06 +02:00
ManyTheFish	8fff5fc281	update tests	2024-04-16 14:39:06 +02:00
yudrywet	cf864a1c2e	chore: fix some typos in comments Signed-off-by: yudrywet <yudeyao@yeah.net>	2024-04-14 20:11:34 +08:00
Louis Dureuil	466d718a05	Fix test	2024-04-04 15:58:19 +02:00
meili-bors[bot]	56bf8503db	Merge #4537 4537: Expose distribution shift in settings r=ManyTheFish a=dureuill See [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#d652adc0890445658aaf36352dbc8802) # Changes - Distribution shift added to all embedders. - Exposed in settings - Changed the reindexing logic to not trigger a reindex operation when only the distribution shift or API key change Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-04-03 09:08:58 +00:00
redistay	182cb42953	chore: fix some typos in conments Signed-off-by: redistay <wujunjing@outlook.com>	2024-04-02 19:37:55 +08:00
meili-bors[bot]	92a049c2dd	Merge #4543 4543: Bring back changes from v1.7.4 into main r=Kerollmops a=dureuill Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com> Co-authored-by: dureuill <dureuill@users.noreply.github.com>	2024-03-28 16:53:51 +00:00
Louis Dureuil	796213af9a	Merge branch 'main' into tmp-release-v1.7.4	2024-03-28 10:51:49 +01:00
Louis Dureuil	ee8cbea810	Don't optimize reindexing when fields contain dots	2024-03-27 17:04:45 +01:00
Louis Dureuil	572fb3a51d	Finer granularity for embedder needs reindex	2024-03-27 12:01:34 +01:00
Louis Dureuil	afd1da5642	Add distribution to all embedders	2024-03-27 11:50:22 +01:00
Louis Dureuil	817ccc089a	also allow `api_key`	2024-03-25 11:50:00 +01:00
Louis Dureuil	4136630ea5	Use constants instead of raw strings in set_*set()	2024-03-25 11:39:33 +01:00
Louis Dureuil	58972f35cb	Allow `url` parameter for ollama embedder	2024-03-25 11:32:55 +01:00
Louis Dureuil	dfa5e41ea6	Check validity of the URL setting	2024-03-25 11:23:16 +01:00
Louis Dureuil	a1db342f01	Expose REST embedder to the API	2024-03-25 11:23:15 +01:00
Louis Dureuil	f87747f4d3	Remove unwraps	2024-03-25 11:23:04 +01:00
Louis Dureuil	ac52c857e8	Update ollama and openai impls to use the rest embedder internally	2024-03-25 11:23:03 +01:00
meili-bors[bot]	fc1c3f4a29	Merge #4466 4466: Implements the search cutoff r=irevoire a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4488 ## What does this PR do? - Adds a cutoff to the bucket sort after 150ms has been spent - Adds a new setting to customize the default value of 150ms - When the time is exceeded, we exit early with what we had the time to sort - If the cutoff has been reached, the search details are updated with a new `Skip` ranking details for the ranking rules that were skipped - Adds analytics to measure the total number of degraded search requests - Adds the number of degraded search requests to the Prometheus metrics and Grafana dashboard - The cutoff must not skip the filters; otherwise, we would leak documents to people who don’t have the right to see them Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-20 13:06:53 +00:00
Tamo	c5322df519	Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1""	2024-03-20 10:08:28 +01:00
Tamo	567194b925	Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1" This reverts commit `bd74cce86a`, reversing changes made to `d2f77e88bd`.	2024-03-19 16:56:21 +01:00
Clément Renault	bd74cce86a	Merge remote-tracking branch 'origin/main' into release-v1.7.1	2024-03-19 13:39:17 +01:00
Tamo	d1db495119	add a settings for the search cutoff	2024-03-19 10:28:23 +01:00
meili-bors[bot]	abd954755d	Merge #4476 4476: Make the `/facet-search` route use the `sortFacetValuesBy` setting r=irevoire a=Kerollmops This PR fixes #4423 by ensuring that the `/facet-search` route uses the `sortFacetValuesBy` setting. Note for the documentation team (to be moved in the tracking issue): Using the new `sortFacetValuesBy` setting can slow down the facet-search requests as Meilisearch iterates over the whole list of facet values and computes the count of documents on every entry. That is hardly or even impossible to optimize correctly. ### TODO - [x] Create a custom HashMap wrapper for the facet `OrderBy` settings. This wrapper will return the `OrderBy` setting of the facet, if not defined will use the default `*` one, and if not there either (strange) will fall back on the lexicographic one. - [x] Create a `ValuesCollection` wrapper that implements the logic for the lexicographic and count order by. - [x] Use it when there is no search query. - [x] Use it when there is a search query with and without allowed typos. - [x] Do not change the original logic, only use a wrapper. - [x] Add tests Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-03-13 14:36:14 +00:00
meili-bors[bot]	5ed7b6a0b2	Merge #4456 4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm # Pull Request ## Related issue [Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305) ## What does this PR do? - Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama` - Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance with a default value of `http://localhost:11434/api/embeddings` if no variable is set - Changes some of the structs and functions in `openai.rs` to be public so that they can be shared. - Added more error variants for Ollama specific errors - It uses the model `nomic-embed-text` as default, but any string value is allowed, however it won't automatically check if the model actually exists or is an embedding model Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Co-authored-by: Jakob Klemm <jakob@jeykey.net> Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>	2024-03-13 08:48:47 +00:00
Jakob Klemm	88bc9556a9	Add Ollama dimension inference and add clearer errors Instead of the user manually specifying the model dimensions it will now automatically get determined Just like with hf.rs the word "test" gets embedded to determine the dimensions of the output Add a dedicated error type for if the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user	2024-03-12 19:59:11 +01:00
Clément Renault	ca4876fd10	Do not reindex when modifying unknown faceted field	2024-03-12 16:18:58 +01:00
Clément Renault	d3a95ea2f6	Introduce a new OrderByMap struct to simplify the sort by usage	2024-03-12 13:56:56 +01:00
meili-bors[bot]	ee3076d5ba	Merge #4462 4462: Divide threshold by ten r=dureuill a=ManyTheFish Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-03-06 13:05:38 +00:00

1 2 3 4 5 ...

939 Commits