meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-12-02 18:15:38 +08:00

Author	SHA1	Message	Date
Clément Renault	1ab03c4ede	Fix an issue with settings diff and * in the searchable attributes	2024-06-05 17:30:07 +02:00
Clément Renault	0c6e4b2f00	Introducing a new into_del_add_obkv_conditional_operation function	2024-06-05 17:30:07 +02:00
Clément Renault	42b3f52ef9	Introduce the SettingDiff only_additional_fields method	2024-06-05 17:30:07 +02:00
ManyTheFish	1ab88e10b9	Merge branch 'main' into merge-release-v1.8.1-in-main	2024-05-29 16:24:00 +02:00
Many the fish	e1fbfde6c4	Merge branch 'main' into merge-release-v1.8.1-in-main	2024-05-29 11:31:03 +02:00
ManyTheFish	27b75ec648	merge main into v1.8.1	2024-05-29 11:26:07 +02:00
Louis Dureuil	d35278320e	Add support functions for accessing arroy writers and readers	2024-05-28 15:27:43 +02:00
Clément Renault	dc949ab46a	Remove puffin usage	2024-05-27 15:59:14 +02:00
meili-bors[bot]	19acc65ad2	Merge #4646 4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops This PR implements what is described in #4485. It reduces the number of disk writes and disk usage. Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-05-23 16:06:50 +00:00
Clément Renault	fe17c0f52e	Construct the minimal OBKVs according to the settings diff	2024-05-23 11:23:57 +02:00
Clément Renault	bc5663e673	FieldIdsMap no longer useful thanks to #4631	2024-05-22 16:06:15 +02:00
Louis Dureuil	8a941c0241	Smaller review changes	2024-05-22 14:44:42 +02:00
Louis Dureuil	16037e2169	Don't remove embedders that are not in the config from the document DB	2024-05-22 12:24:51 +02:00
Clément Renault	500ddc76b5	Make the flattened sorter optional	2024-05-21 16:16:36 +02:00
Clément Renault	1aa8ed9ef7	Make the original sorter optional	2024-05-21 14:53:26 +02:00
ManyTheFish	f762307838	Fix clippy	2024-05-21 13:44:20 +02:00
ManyTheFish	3e94a90722	Fixes	2024-05-21 13:39:46 +02:00
ManyTheFish	fc7e817221	Index geo points based on the settings differences	2024-05-20 12:27:26 +02:00
Louis Dureuil	d05d49ffd8	Fix tests	2024-05-20 10:36:18 +02:00
Louis Dureuil	0462ebbe58	Don't write an empty _vectors field	2024-05-20 10:36:18 +02:00
Louis Dureuil	2f7a8a4efb	Don't write vectors that weren't autogenerated in document DB	2024-05-20 10:36:18 +02:00
Louis Dureuil	52d9cb6e5a	Refactor vector indexing - use the parsed_vectors module - only parse `_vectors` once per document, instead of once per embedder per document	2024-05-20 10:36:17 +02:00
Tamo	897d25780e	update milli to latest version	2024-05-16 18:31:32 +02:00
Tamo	f2d0a59f1d	when no searchable attributes are defined, makes all the weight equals to zero	2024-05-16 01:06:33 +02:00
Tamo	ad4d8502b3	stops storing the whole fieldids weights map when no searchable are defined	2024-05-15 17:16:10 +02:00
Tamo	7ec4e2a3fb	apply all style review comments	2024-05-15 15:02:26 +02:00
Tamo	caa6a7149a	make the attribute ranking rule use the weights and fix the tests	2024-05-14 17:36:32 +02:00
Tamo	b0afe0972e	stop updating the fields ids map when fields are only swapped	2024-05-14 17:00:02 +02:00
Tamo	685f452fb2	Fix the indexing of the searchable	2024-05-14 17:00:02 +02:00
Tamo	4e4a1ddff7	gate a test behind the required feature	2024-05-14 17:00:02 +02:00
Tamo	c22460045c	Stops returning an option in the internal searchable fields	2024-05-14 17:00:02 +02:00
meili-bors[bot]	4d5971f343	Merge #4621 4621: Bring back changes from v1.8.0 into main r=curquiza a=curquiza Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com> Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-05-06 13:46:39 +00:00
meili-bors[bot]	ebca29f3de	Merge #4597 4597: Fix embeddings settings update r=ManyTheFish a=ManyTheFish # Pull Request - add some conditions reducing the work done when changing the settings - add some benchmarks on embedders ## Related issue Fixes #4585 Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-04-25 16:37:28 +00:00
Clément Renault	d4aeff92d0	Introduce the ThreadPoolNoAbort wrapper	2024-04-24 16:40:12 +02:00
Clément Renault	96cc5319c8	Introduce a new internal error type to categorize panics	2024-04-22 18:09:33 +02:00
Clément Renault	0c7003c5df	Introduce an atomic to catch panics in thread pools	2024-04-22 18:09:33 +02:00
ManyTheFish	a1aa999026	Add conditions reducing wrok	2024-04-22 14:18:35 +02:00
ManyTheFish	df29ba709a	Make some cleaning in Arcs	2024-04-17 12:33:25 +02:00
ManyTheFish	3acfab2eb7	Fix PR comments	2024-04-17 10:55:51 +02:00
ManyTheFish	87a93ba47d	fix clippy	2024-04-16 14:39:30 +02:00
ManyTheFish	eaf113ef34	Fix wod pair proximity error when nothing has to be extracted	2024-04-16 14:39:30 +02:00
ManyTheFish	e5ae337aae	Comeback to sorters in extract_word_docids using buffers and merge the keys manually is less efficient	2024-04-16 14:39:30 +02:00
ManyTheFish	a489b406b4	fix test	2024-04-16 14:39:06 +02:00
ManyTheFish	02c3d6b265	finish work	2024-04-16 14:39:06 +02:00
ManyTheFish	b5e4a55af6	refactor faceted and searchable pipeline	2024-04-16 14:39:06 +02:00
ManyTheFish	a7e368aaa6	Create InnerIndexSettingsDiffs struct and populate it	2024-04-16 14:39:06 +02:00
ManyTheFish	893200ab87	Avoid clearing documents in transform	2024-04-16 14:39:06 +02:00
ManyTheFish	aabce52b1b	Fix test	2024-04-16 14:39:06 +02:00
ManyTheFish	8fff5fc281	update tests	2024-04-16 14:39:06 +02:00
yudrywet	cf864a1c2e	chore: fix some typos in comments Signed-off-by: yudrywet <yudeyao@yeah.net>	2024-04-14 20:11:34 +08:00
Louis Dureuil	466d718a05	Fix test	2024-04-04 15:58:19 +02:00
meili-bors[bot]	56bf8503db	Merge #4537 4537: Expose distribution shift in settings r=ManyTheFish a=dureuill See [usage page](https://meilisearch.notion.site/v1-8-AI-search-API-usage-135552d6e85a4a52bc7109be82aeca42#d652adc0890445658aaf36352dbc8802) # Changes - Distribution shift added to all embedders. - Exposed in settings - Changed the reindexing logic to not trigger a reindex operation when only the distribution shift or API key change Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-04-03 09:08:58 +00:00
redistay	182cb42953	chore: fix some typos in conments Signed-off-by: redistay <wujunjing@outlook.com>	2024-04-02 19:37:55 +08:00
meili-bors[bot]	92a049c2dd	Merge #4543 4543: Bring back changes from v1.7.4 into main r=Kerollmops a=dureuill Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com> Co-authored-by: dureuill <dureuill@users.noreply.github.com>	2024-03-28 16:53:51 +00:00
Louis Dureuil	796213af9a	Merge branch 'main' into tmp-release-v1.7.4	2024-03-28 10:51:49 +01:00
Louis Dureuil	ee8cbea810	Don't optimize reindexing when fields contain dots	2024-03-27 17:04:45 +01:00
Louis Dureuil	572fb3a51d	Finer granularity for embedder needs reindex	2024-03-27 12:01:34 +01:00
Louis Dureuil	afd1da5642	Add distribution to all embedders	2024-03-27 11:50:22 +01:00
Louis Dureuil	817ccc089a	also allow `api_key`	2024-03-25 11:50:00 +01:00
Louis Dureuil	4136630ea5	Use constants instead of raw strings in set_*set()	2024-03-25 11:39:33 +01:00
Louis Dureuil	58972f35cb	Allow `url` parameter for ollama embedder	2024-03-25 11:32:55 +01:00
Louis Dureuil	dfa5e41ea6	Check validity of the URL setting	2024-03-25 11:23:16 +01:00
Louis Dureuil	a1db342f01	Expose REST embedder to the API	2024-03-25 11:23:15 +01:00
Louis Dureuil	f87747f4d3	Remove unwraps	2024-03-25 11:23:04 +01:00
Louis Dureuil	ac52c857e8	Update ollama and openai impls to use the rest embedder internally	2024-03-25 11:23:03 +01:00
meili-bors[bot]	fc1c3f4a29	Merge #4466 4466: Implements the search cutoff r=irevoire a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4488 ## What does this PR do? - Adds a cutoff to the bucket sort after 150ms has been spent - Adds a new setting to customize the default value of 150ms - When the time is exceeded, we exit early with what we had the time to sort - If the cutoff has been reached, the search details are updated with a new `Skip` ranking details for the ranking rules that were skipped - Adds analytics to measure the total number of degraded search requests - Adds the number of degraded search requests to the Prometheus metrics and Grafana dashboard - The cutoff must not skip the filters; otherwise, we would leak documents to people who don’t have the right to see them Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-20 13:06:53 +00:00
Tamo	c5322df519	Revert "Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1""	2024-03-20 10:08:28 +01:00
Tamo	567194b925	Revert "Merge remote-tracking branch 'origin/main' into release-v1.7.1" This reverts commit `bd74cce86a`, reversing changes made to `d2f77e88bd`.	2024-03-19 16:56:21 +01:00
Clément Renault	bd74cce86a	Merge remote-tracking branch 'origin/main' into release-v1.7.1	2024-03-19 13:39:17 +01:00
Tamo	d1db495119	add a settings for the search cutoff	2024-03-19 10:28:23 +01:00
meili-bors[bot]	abd954755d	Merge #4476 4476: Make the `/facet-search` route use the `sortFacetValuesBy` setting r=irevoire a=Kerollmops This PR fixes #4423 by ensuring that the `/facet-search` route uses the `sortFacetValuesBy` setting. Note for the documentation team (to be moved in the tracking issue): Using the new `sortFacetValuesBy` setting can slow down the facet-search requests as Meilisearch iterates over the whole list of facet values and computes the count of documents on every entry. That is hardly or even impossible to optimize correctly. ### TODO - [x] Create a custom HashMap wrapper for the facet `OrderBy` settings. This wrapper will return the `OrderBy` setting of the facet, if not defined will use the default `*` one, and if not there either (strange) will fall back on the lexicographic one. - [x] Create a `ValuesCollection` wrapper that implements the logic for the lexicographic and count order by. - [x] Use it when there is no search query. - [x] Use it when there is a search query with and without allowed typos. - [x] Do not change the original logic, only use a wrapper. - [x] Add tests Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-03-13 14:36:14 +00:00
meili-bors[bot]	5ed7b6a0b2	Merge #4456 4456: Add Ollama as an embeddings provider r=dureuill a=jakobklemm # Pull Request ## Related issue [Related Discord Thread](https://discord.com/channels/1006923006964154428/1211977150316683305) ## What does this PR do? - Adds Ollama as a provider of Embeddings besides HuggingFace and OpenAI under the name `ollama` - Adds the environment variable `MEILI_OLLAMA_URL` to set the embeddings URL of an Ollama instance with a default value of `http://localhost:11434/api/embeddings` if no variable is set - Changes some of the structs and functions in `openai.rs` to be public so that they can be shared. - Added more error variants for Ollama specific errors - It uses the model `nomic-embed-text` as default, but any string value is allowed, however it won't automatically check if the model actually exists or is an embedding model Tested against Ollama version `v0.1.27` and the `nomic-embed-text` model. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Co-authored-by: Jakob Klemm <jakob@jeykey.net> Co-authored-by: Louis Dureuil <louis.dureuil@gmail.com>	2024-03-13 08:48:47 +00:00
Jakob Klemm	88bc9556a9	Add Ollama dimension inference and add clearer errors Instead of the user manually specifying the model dimensions it will now automatically get determined Just like with hf.rs the word "test" gets embedded to determine the dimensions of the output Add a dedicated error type for if the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user	2024-03-12 19:59:11 +01:00
Clément Renault	ca4876fd10	Do not reindex when modifying unknown faceted field	2024-03-12 16:18:58 +01:00
Clément Renault	d3a95ea2f6	Introduce a new OrderByMap struct to simplify the sort by usage	2024-03-12 13:56:56 +01:00
meili-bors[bot]	ee3076d5ba	Merge #4462 4462: Divide threshold by ten r=dureuill a=ManyTheFish Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-03-06 13:05:38 +00:00
Louis Dureuil	b11df7ec34	Meilisearch: fix some wrong spans	2024-03-05 10:11:43 +01:00
ManyTheFish	eada6de261	Divide threshold by ten	2024-03-04 18:02:54 +01:00
Jakob Klemm	d3004d8040	Implemented Ollama as an embeddings provider Initial prototype of Ollama embeddings actually working, error handlign / retries still missing. Allow model to be any String and require dimensions parameter Fixed rustfmt formatting issues There were some formatting issues in the initial PR and this should not make the changes comply with the Rust style guidelines Because I accidentally didn't follow the style guide for commits in my commit messages I squashed them into one to comply	2024-03-04 15:09:43 +01:00
ManyTheFish	5e83bac448	Fix PR comments	2024-02-26 15:40:15 +01:00
ManyTheFish	a493a50825	Fix clippy	2024-02-22 14:53:33 +01:00
ManyTheFish	9d1f489a37	Fix facet incremental indexing	2024-02-21 18:42:16 +01:00
ManyTheFish	03bb6372af	Change is_batchable_with by mergeable_with	2024-02-14 11:50:22 +01:00
ManyTheFish	3beda8833d	Fix and add logs	2024-02-14 11:46:30 +01:00
ManyTheFish	48026aa75c	fix PR comments	2024-02-13 15:19:01 +01:00
Many the fish	e5e811e2c9	Update milli/src/update/index_documents/extract/mod.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-02-13 14:22:21 +01:00
Many the fish	55de96f74e	Update milli/src/update/facet/mod.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-02-13 14:22:10 +01:00
ManyTheFish	39c83cb3d9	fix clippy	2024-02-12 09:12:54 +01:00
Louis Dureuil	7efb1cae11	yield in loop when the channel is not disconnected	2024-02-12 09:12:54 +01:00
Louis Dureuil	7877788510	fix logs	2024-02-12 09:12:54 +01:00
ManyTheFish	be1b054b05	Compute chunk size based on the input data size ant the number of indexing threads	2024-02-08 17:28:37 +01:00
meili-bors[bot]	023c2d755f	Merge #4391 4391: Tracing r=dureuill a=irevoire # Pull Request - [ ] Hide the parameters of the process batch - [x] Make actix-web trace every call on every route - [x] Remove all `env_logger`/`logs` dependencies - [x] Be able to enable or disable the memory measurement using the `/logs` route parameters See the following product discussion: https://github.com/orgs/meilisearch/discussions/721 Supersedes https://github.com/meilisearch/meilisearch/pull/4338 ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4317 ## What does this PR do? Update the format of the logs from: ``` [2024-02-06T14:54:11Z INFO actix_server::builder] starting 10 workers ``` to ``` 2024-02-06T13:58:14.710803Z INFO actix_server::builder: 200: starting 10 workers ``` First, run meilisearch with the route enabled via the feature flag: - `cargo run --experimental-enable-logs-route` - Or at runtime by sending the following payload: ``` curl \ -X PATCH 'http://localhost:7700/experimental-features/' \ -H 'Content-Type: application/json' \ --data-binary '{ "logsRoute": true }' ``` Then gather data from meilisearch by calling for example: ``` curl \ -X POST http://localhost:7700/logs \ -H 'Content-Type: application/json' \ --data-binary '{ "mode": "fmt", "target": "milli=trace" }' ``` Once your operation is over, tell meilisearch to stop the route: ``` curl \ -X DELETE http://localhost:7700/logs ``` ---- In the case you’re profiling code, you will be interested by the next command that converts the output of the route to a format that the firefox profiler can understand. ```bash cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json ``` Then go to https://profiler.firefox.com and load it. Note that we can also share the profiles using the https://share.firefox.dev website. Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com>	2024-02-08 14:16:56 +00:00
Louis Dureuil	407ad753ed	rust fmt	2024-02-08 15:11:42 +01:00
Tamo	bf43a3f60a	fix typo	2024-02-08 15:04:06 +01:00
Tamo	1502382316	use debug instead of debug_span	2024-02-08 15:04:06 +01:00
Tamo	08af0e690c	Structures a bunch of logs	2024-02-08 15:04:06 +01:00
Louis Dureuil	db722d201a	Write entries into database downgraded to trace level	2024-02-08 15:04:05 +01:00
Tamo	e773dfa9ba	get rids of log in milli and add logs for the bucket sort	2024-02-08 15:04:05 +01:00
Louis Dureuil	5d7061682e	Add tracing to milli	2024-02-08 15:03:31 +01:00
meili-bors[bot]	72ebac1fbb	Merge #4388 4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time. This can result in higher memory usage than expected, even though it shouldn't consume more than the memory available. Fixes #4152. Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-02-08 13:19:28 +00:00

1 2 3 4 5 ...

915 Commits