meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-12-02 01:55:03 +08:00

Author	SHA1	Message	Date
Jakob Klemm	88bc9556a9	Add Ollama dimension inference and add clearer errors Instead of the user manually specifying the model dimensions it will now automatically get determined Just like with hf.rs the word "test" gets embedded to determine the dimensions of the output Add a dedicated error type for if the model doesn't exist (don't automatically pull it though) and set the fault of that error to be the user	2024-03-12 19:59:11 +01:00
Clément Renault	ca4876fd10	Do not reindex when modifying unknown faceted field	2024-03-12 16:18:58 +01:00
Clément Renault	d3a95ea2f6	Introduce a new OrderByMap struct to simplify the sort by usage	2024-03-12 13:56:56 +01:00
Clément Renault	69c118ef76	Extract the facet order before extracting the facets values	2024-03-12 10:35:39 +01:00
meili-bors[bot]	ee3076d5ba	Merge #4462 4462: Divide threshold by ten r=dureuill a=ManyTheFish Change the facet incremental vs bulk indexing threshold to better fit our user needs, it might be changed in the future if we have more insights Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-03-06 13:05:38 +00:00
meili-bors[bot]	ab1224bfa7	Merge #4458 4458: Replace logging timer by spans r=Kerollmops a=dureuill - Remove logging timer dependency. - Remplace last uses in search by spans Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-05 16:43:23 +00:00
meili-bors[bot]	eefc1c421e	Merge #4459 4459: Put a bound on OpenAI timeout r=dureuill a=dureuill # Pull Request ## Related issue Fixes #4460 ## What does this PR do? - Makes sure that the timeout of the openai embedder is limited to max 1min, rather than the prior 15min+ Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-03-05 15:18:51 +00:00
Louis Dureuil	0c216048b5	Cap timeout duration	2024-03-05 12:19:25 +01:00
Louis Dureuil	36d17110d8	openai: Handle BAD_GETAWAY, be more resilient to failure	2024-03-05 12:18:54 +01:00
Louis Dureuil	25f64ce7df	Replace logging timer by spans	2024-03-05 11:05:42 +01:00
Louis Dureuil	b11df7ec34	Meilisearch: fix some wrong spans	2024-03-05 10:11:43 +01:00
ManyTheFish	eada6de261	Divide threshold by ten	2024-03-04 18:02:54 +01:00
Jakob Klemm	d3004d8040	Implemented Ollama as an embeddings provider Initial prototype of Ollama embeddings actually working, error handlign / retries still missing. Allow model to be any String and require dimensions parameter Fixed rustfmt formatting issues There were some formatting issues in the initial PR and this should not make the changes comply with the Rust style guidelines Because I accidentally didn't follow the style guide for commits in my commit messages I squashed them into one to comply	2024-03-04 15:09:43 +01:00
Louis Dureuil	452a343a2b	Fix imports	2024-02-28 18:09:40 +01:00
meili-bors[bot]	b87485e80d	Merge #4433 4433: Enhance facet incremental r=Kerollmops a=ManyTheFish # Pull Request ## Related issue Fixes #4367 Fixes #4409 ## What does this PR do? - Add a test reproducing #4409 - Fix #4409 by removing a document from a level only if it is no more present in all the linked sub-level nodes - Optimize facet Incremental indexing by creating or deleting a complete level once per field id instead of for each facet value - Optimize facet Incremental indexing by doing the additions and the deletions in the same process instead of doing them separately Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-02-28 15:28:46 +00:00
ManyTheFish	5e83bac448	Fix PR comments	2024-02-26 15:40:15 +01:00
Louis Dureuil	55796406c5	Add GPU analytics	2024-02-26 10:41:47 +01:00
ManyTheFish	a493a50825	Fix clippy	2024-02-22 14:53:33 +01:00
ManyTheFish	9d1f489a37	Fix facet incremental indexing	2024-02-21 18:42:16 +01:00
ManyTheFish	03bb6372af	Change is_batchable_with by mergeable_with	2024-02-14 11:50:22 +01:00
ManyTheFish	3beda8833d	Fix and add logs	2024-02-14 11:46:30 +01:00
ManyTheFish	48026aa75c	fix PR comments	2024-02-13 15:19:01 +01:00
Many the fish	e5e811e2c9	Update milli/src/update/index_documents/extract/mod.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-02-13 14:22:21 +01:00
Many the fish	55de96f74e	Update milli/src/update/facet/mod.rs Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-02-13 14:22:10 +01:00
ManyTheFish	39c83cb3d9	fix clippy	2024-02-12 09:12:54 +01:00
Louis Dureuil	7efb1cae11	yield in loop when the channel is not disconnected	2024-02-12 09:12:54 +01:00
Louis Dureuil	7877788510	fix logs	2024-02-12 09:12:54 +01:00
ManyTheFish	be1b054b05	Compute chunk size based on the input data size ant the number of indexing threads	2024-02-08 17:28:37 +01:00
meili-bors[bot]	023c2d755f	Merge #4391 4391: Tracing r=dureuill a=irevoire # Pull Request - [ ] Hide the parameters of the process batch - [x] Make actix-web trace every call on every route - [x] Remove all `env_logger`/`logs` dependencies - [x] Be able to enable or disable the memory measurement using the `/logs` route parameters See the following product discussion: https://github.com/orgs/meilisearch/discussions/721 Supersedes https://github.com/meilisearch/meilisearch/pull/4338 ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4317 ## What does this PR do? Update the format of the logs from: ``` [2024-02-06T14:54:11Z INFO actix_server::builder] starting 10 workers ``` to ``` 2024-02-06T13:58:14.710803Z INFO actix_server::builder: 200: starting 10 workers ``` First, run meilisearch with the route enabled via the feature flag: - `cargo run --experimental-enable-logs-route` - Or at runtime by sending the following payload: ``` curl \ -X PATCH 'http://localhost:7700/experimental-features/' \ -H 'Content-Type: application/json' \ --data-binary '{ "logsRoute": true }' ``` Then gather data from meilisearch by calling for example: ``` curl \ -X POST http://localhost:7700/logs \ -H 'Content-Type: application/json' \ --data-binary '{ "mode": "fmt", "target": "milli=trace" }' ``` Once your operation is over, tell meilisearch to stop the route: ``` curl \ -X DELETE http://localhost:7700/logs ``` ---- In the case you’re profiling code, you will be interested by the next command that converts the output of the route to a format that the firefox profiler can understand. ```bash cargo run --release --bin trace-to-firefox -- 2024-01-17_17:07:55-indexing-trace.json ``` Then go to https://profiler.firefox.com and load it. Note that we can also share the profiles using the https://share.firefox.dev website. Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com>	2024-02-08 14:16:56 +00:00
Louis Dureuil	407ad753ed	rust fmt	2024-02-08 15:11:42 +01:00
Tamo	bf43a3f60a	fix typo	2024-02-08 15:04:06 +01:00
Tamo	1502382316	use debug instead of debug_span	2024-02-08 15:04:06 +01:00
Tamo	08af0e690c	Structures a bunch of logs	2024-02-08 15:04:06 +01:00
Louis Dureuil	db722d201a	Write entries into database downgraded to trace level	2024-02-08 15:04:05 +01:00
Tamo	e773dfa9ba	get rids of log in milli and add logs for the bucket sort	2024-02-08 15:04:05 +01:00
Louis Dureuil	5d7061682e	Add tracing to milli	2024-02-08 15:03:31 +01:00
meili-bors[bot]	72ebac1fbb	Merge #4388 4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time. This can result in higher memory usage than expected, even though it shouldn't consume more than the memory available. Fixes #4152. Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-02-08 13:19:28 +00:00
Louis Dureuil	a1caac9bfb	Correct distribution shifts for new models	2024-02-07 15:09:16 +01:00
Louis Dureuil	88d03c56ab	Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model	2024-02-07 11:52:09 +01:00
Louis Dureuil	32ee05ccef	Fix default dimensions for models	2024-02-07 11:52:09 +01:00
Louis Dureuil	74c180267e	pass dimensions only when defined	2024-02-07 11:52:08 +01:00
Louis Dureuil	517f5332d6	Allow actually passing `dimensions` for OpenAI source -> make sure the settings change is rejected or the settings task fails when the specified model doesn't support overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.	2024-02-07 11:51:44 +01:00
Louis Dureuil	9ac5750096	Retrieve the overriden dimensions from the configuration when fetching settings	2024-02-07 11:51:44 +01:00
Louis Dureuil	7ae4013478	Make sure the overriden dimensions are always used when embedding	2024-02-07 11:51:44 +01:00
Gosti	fb705116a6	feat: add new models and ability to override dimensions	2024-02-07 11:51:42 +01:00
Clément Renault	053306c0e7	Try with 500MiB	2024-02-07 11:24:43 +01:00
Clément Renault	9eeb75d501	Clamp the max memory of the grenad sorters to a reasonable maximum	2024-02-06 10:47:04 +01:00
Louis Dureuil	fbf5f2a392	Don't use a runtime in extract_embedder, use it only for OpenAI	2024-02-01 10:33:27 +01:00
Louis Dureuil	1555870088	Truncate HuggingFace vectors that are too long	2024-02-01 10:33:27 +01:00
Tamo	9f8f3105d5	make clippy happy	2024-02-01 10:33:27 +01:00
Tamo	318843aacd	add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there is malformed documents in the DB	2024-02-01 10:33:27 +01:00
Louis Dureuil	dff2707471	Use MatchingWords from keyword search instead of the one from vector search	2024-02-01 10:33:27 +01:00
Tamo	c1bf33a112	Revert "Remove panic on the geosearch"	2024-01-25 18:51:19 +01:00
Louis Dureuil	f692021bfc	Implement PR comments	2024-01-22 10:25:56 +01:00
Louis Dureuil	84f49d76cd	Add cuda feature	2024-01-22 10:25:16 +01:00
Tamo	0887186ecf	make clippy happy	2024-01-17 16:07:10 +01:00
Tamo	7d190d8078	add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there is malformed documents in the DB	2024-01-17 15:51:52 +01:00
Clément Renault	01e2c3d6bb	Bump arroy to v0.2.0	2024-01-16 16:45:55 +01:00
Clément Renault	9f9ad4cc05	Fix Clippy warnings	2024-01-16 15:27:24 +01:00
Clément Renault	3ee7682fa7	Fix some integer comparisons	2024-01-16 15:22:23 +01:00
meili-bors[bot]	e93d36d5b9	Merge #4313 4313: Fix document formatting performances r=Kerollmops a=ManyTheFish reduce the formatted option list to the attributes that should be formatted, instead of all the attributes to display. The time to compute the `format` list scales with the number of fields to format; cumulated with `map_leaf_values` that iterates over all the nested fields, it gives a quadratic complexity: `d*f` where `d` is the total number of fields to display and `f` is the total number of fields to format. Co-authored-by: ManyTheFish <many@meilisearch.com>	2024-01-11 14:19:44 +00:00
ManyTheFish	5f5a486895	Reduce formatting time	2024-01-11 11:36:41 +01:00
ManyTheFish	5f4fc6c955	Add timer logs	2024-01-11 09:44:16 +01:00
Clément Renault	3f3462ab62	Limit the number of values returned by the facet search	2024-01-10 16:54:08 +01:00
Tamo	54ae6951eb	fix warning	2024-01-02 15:19:30 +01:00
Louis Dureuil	0bf879fb88	Fix warning on rust stable	2023-12-20 17:48:09 +01:00
Louis Dureuil	6ff81de401	Fix tests	2023-12-20 17:16:46 +01:00
Louis Dureuil	9123370e90	Validate fused settings in settings task after fusing with existing setting	2023-12-20 17:16:46 +01:00
Louis Dureuil	14b396d302	Add new errors	2023-12-20 17:16:45 +01:00
Louis Dureuil	393216bf30	Flatten embedders settings	2023-12-20 17:16:43 +01:00
Louis Dureuil	e249e4db7b	Change Setting::apply function signature	2023-12-20 17:15:24 +01:00
Louis Dureuil	333ce12eb2	Fixed issue where the default revision is always the one we picked for the default model	2023-12-20 10:17:49 +01:00
Many the fish	9e1b458010	Merge branch 'main' into change-proximity-precision-settings	2023-12-18 09:08:47 +01:00
ManyTheFish	6425996e36	Change the naming of attributeScale and wordScale into byAttribute and byWord	2023-12-14 16:31:00 +01:00
Louis Dureuil	eb5cb91da2	Switch default from hf to openai	2023-12-14 16:19:46 +01:00
Louis Dureuil	87bba98bd8	Various changes - fixed seed for arroy - check vector dimensions as soon as it is provided to search - don't embed whitespace	2023-12-14 16:08:42 +01:00
Louis Dureuil	217105b7da	hybrid search uses semantic ratio, error handling	2023-12-14 16:08:42 +01:00
ManyTheFish	9991152bbe	Add TODOs	2023-12-14 16:08:42 +01:00
Louis Dureuil	a4536b1381	Small adjustments to respect the spec	2023-12-14 16:08:42 +01:00
Louis Dureuil	5b51cb04af	Remove some settings	2023-12-14 16:08:42 +01:00
Louis Dureuil	b8e4709dfa	Remove prompt strategy and fallback	2023-12-14 16:08:41 +01:00
Louis Dureuil	806e5b6899	Tests pass	2023-12-14 16:08:41 +01:00
Louis Dureuil	e0cc775dc4	Various changes - DistributionShift in Search object (to be set from model in embed?) - Fix issue where embedder index wasn't computed at search time - Accept as default embedder either the "default" one, or the only embedder when there is only one	2023-12-14 16:08:41 +01:00
Louis Dureuil	12940d79a9	WIP - manual embedder - multi embedders OK - clippy + tests OK	2023-12-14 16:08:41 +01:00
Louis Dureuil	922a640188	WIP multi embedders fixed template bugs	2023-12-14 16:08:41 +01:00
Louis Dureuil	d4715e0c4d	Fix same vector sort bug	2023-12-14 16:08:41 +01:00
Louis Dureuil	11e2a2c1aa	Fix geosort bug	2023-12-14 16:08:41 +01:00
Louis Dureuil	65e49b7092	Remove stuff, add distribution shift (WIP)	2023-12-14 16:08:38 +01:00
Louis Dureuil	e56f160032	Actually pass embedders on reindex	2023-12-14 16:07:49 +01:00
Louis Dureuil	687d92f217	prompt bifluor+	2023-12-14 16:07:49 +01:00
Louis Dureuil	fb539f61fe	WIP	2023-12-14 16:07:49 +01:00
Louis Dureuil	cb4ebe163e	WIP	2023-12-14 16:07:49 +01:00
Louis Dureuil	dde3a04679	WIP arroy integration	2023-12-14 16:07:49 +01:00
Louis Dureuil	13c2c6c16b	Small commit to add hybrid search and autoembedding	2023-12-14 16:07:48 +01:00
Clément Renault	56571f762a	Merge remote-tracking branch 'origin/main' into tmp-release-v1.5.1	2023-12-13 11:57:01 +01:00
ManyTheFish	467b49153d	Implement proximityPrecision setting on milli side	2023-12-06 15:49:02 +01:00
ManyTheFish	bddc168d83	List TODOs	2023-12-06 14:59:23 +01:00
ManyTheFish	3b3fa38f27	Put the restrict list in a sub-struct	2023-11-28 18:37:57 +01:00
ManyTheFish	d6c2ee15a9	Filter on attributes before computing the docids when attribute restriction is on	2023-11-28 14:55:29 +01:00
Clément Renault	ec9b52d608	Rename copy_to_path to copy_to_file	2023-11-28 14:32:30 +01:00

1898 Commits