meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2025-02-20 01:27:52 +08:00

Author	SHA1	Message	Date
Louis Dureuil	40215bfec3	TMP: check windows free disk space	2024-05-29 11:27:27 +02:00
Louis Dureuil	ca006e38ec	Basic tests	2024-05-28 15:28:19 +02:00
Louis Dureuil	e26bd87780	Error tests for similar routes	2024-05-28 15:28:19 +02:00
Louis Dureuil	c01e498a63	Test server can call similar	2024-05-28 15:28:19 +02:00
Louis Dureuil	ca6cc4654b	Add similar route	2024-05-28 15:28:19 +02:00
Louis Dureuil	3bd9d2478c	Add error codes	2024-05-28 15:27:43 +02:00
Louis Dureuil	54b15059a0	Analytics changes	2024-05-28 15:27:43 +02:00
Louis Dureuil	d35278320e	Add support functions for accessing arroy writers and readers	2024-05-28 15:27:43 +02:00
Louis Dureuil	e172e938e7	add search rules directly takes the filter rather than the searchquery	2024-05-28 15:22:25 +02:00
Louis Dureuil	02b3d82c60	filtered_universe accepts index and txn instead of SearchContext	2024-05-28 15:22:12 +02:00
Louis Dureuil	fd2c95999d	Change `validate_document_id` to public and remove extra layer of result	2024-05-28 15:21:19 +02:00
meili-bors[bot]	e248d2a1e6	Merge #4655 4655: Remove `exportPuffinReport` experimental feature r=Kerollmops a=Kerollmops This PR fixes #4605 by removing every trace of Puffin. Puffin is a great tool, but we use a better approach to measuring performance. Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-05-28 07:01:16 +00:00
Clément Renault	487431a035	Fix tests	2024-05-27 16:12:20 +02:00
Clément Renault	b6d450d484	Remove puffin experimental feature	2024-05-27 15:59:28 +02:00
Clément Renault	dc949ab46a	Remove puffin usage	2024-05-27 15:59:14 +02:00
Clément Renault	7f3e51349e	Remove puffin for the dependencies	2024-05-27 15:53:06 +02:00
meili-bors[bot]	19acc65ad2	Merge #4646 4646: Reduce `Transform`'s disk usage r=Kerollmops a=Kerollmops This PR implements what is described in #4485. It reduces the number of disk writes and disk usage. Co-authored-by: Clément Renault <clement@meilisearch.com>	2024-05-23 16:06:50 +00:00
meili-bors[bot]	3a3ab17714	Merge #4651 4651: Allow to comment with the results of benchmark invocation r=Kerollmops a=dureuill Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-05-23 15:32:09 +00:00
Louis Dureuil	eaf57056ca	comment with the results of benchmarks	2024-05-23 15:34:39 +02:00
Louis Dureuil	e340705634	Change benchmark outputs - logs to stderr instead of stdout - prints links to the dashboard when there is a dashboard	2024-05-23 15:29:06 +02:00
Clément Renault	fe17c0f52e	Construct the minimal OBKVs according to the settings diff	2024-05-23 11:23:57 +02:00
meili-bors[bot]	14bc80e3df	Merge #4633 4633: Allow to mark vectors as "userProvided" r=Kerollmops a=dureuill # Pull Request ## Related issue Fixes #4606 ## What does this PR do? [See usage in PRD](https://meilisearch.notion.site/v1-9-AI-search-changes-e90d6803eca8417aa70a1ac5d0225697#deb96fb0595947bda7d4a371100326eb) - Extends the shape of the special `_vectors` field in documents. - previously, the `_vectors` field had to be an object, with each field the name of a configured embedder, and each value either `null`, an embedding (array of numbers), or an array of embeddings. - In this PR, the value of an embedder in the `_vectors` field can additionally be an object. The object has two fields: 1. `embeddings`: `null`, an embedding (array of numbers), or an array of embeddings. 2. `userProvided`: a boolean indicating if the vector was provided by the user. - The previous form `embedder_or_array_of_embedders` is semantically equivalent to: ```json { "embeddings": embedder_or_array_of_embedders, "userProvided": true } ``` - During the indexing step, the subfields and values of the `_vectors` field that have `userProvided` set to false are added in the vector DB, but not in the documents DB: that means that future modifications of the documents will trigger a regeneration of that particular vector using the document template. - This allows importing embeddings as a one-shot process, while still retaining the ability to regenerate embeddings on document change. - The dump process now uses this ability: it enriches the `_vectors` fields of documents with the embeddings that were autogenerated, marking them as not `userProvided`. This allows importing the vectors from a dump without regenerating them. ### Tests This PR adds the following tests - Long-needed hybrid search tests of a simple hf embedder - Dump test that imports vectors. Due to the difficulty of actually importing a dump in tests, we just read the dump and check it contains the expected content. 
- Tests in the index-scheduler: this tests that documents containing the same kind of instructions as in the dump indexes as expected Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2024-05-23 08:17:54 +00:00
Clément Renault	bc5663e673	FieldIdsMap no longer useful thanks to #4631	2024-05-22 16:06:15 +02:00
Louis Dureuil	8a941c0241	Smaller review changes	2024-05-22 14:44:42 +02:00
Louis Dureuil	3412e7fbcf	"[]" is deserialized as 0 embedding rather than 1 embedding of dim 0	2024-05-22 12:25:21 +02:00
Louis Dureuil	16037e2169	Don't remove embedders that are not in the config from the document DB	2024-05-22 12:24:51 +02:00
Louis Dureuil	8f7c8ca7f0	Remove now unused error variant	2024-05-22 12:23:43 +02:00
Clément Renault	500ddc76b5	Make the flattened sorter optional	2024-05-21 16:16:36 +02:00
Louis Dureuil	eccbcf5130	Increase index-scheduler test timeouts	2024-05-21 14:59:08 +02:00
Clément Renault	943f8dba0c	Make clippy happy	2024-05-21 14:58:41 +02:00
Clément Renault	1aa8ed9ef7	Make the original sorter optional	2024-05-21 14:53:26 +02:00
meili-bors[bot]	abe29772db	Merge #4644 4644: Revert "Stream documents" and keep heed+arroy to the latest version r=Kerollmops a=irevoire Reverts meilisearch/meilisearch#4544 Fixes https://github.com/meilisearch/meilisearch/issues/4641 I didn’t realize that some http clients were not handling chunked http requests like you would expect (if you ask the body, it gives you the body), which made the previous PR breaking. There is no way to provide a good fix to the issue we initially wanted to fix without breaking meilisearch and that’s not planned for now. Co-authored-by: Tamo <irevoire@protonmail.ch> Co-authored-by: Tamo <tamo@meilisearch.com>	2024-05-21 10:21:47 +00:00
Tamo	c9ac7f2e7e	update heed to latest version	2024-05-20 15:19:00 +02:00
Tamo	7e251b43d4	Revert "Stream documents"	2024-05-20 15:09:45 +02:00
Louis Dureuil	9969f7a638	Add test on index-scheduler	2024-05-20 14:44:10 +02:00
Louis Dureuil	b17cb56dee	Test array of vectors	2024-05-20 14:44:10 +02:00
Louis Dureuil	afcd7b9f0c	Test hybrid search with hf embedder	2024-05-20 14:44:10 +02:00
Louis Dureuil	30cf972987	Add test with a dump	2024-05-20 10:36:18 +02:00
Louis Dureuil	d05d49ffd8	Fix tests	2024-05-20 10:36:18 +02:00
Louis Dureuil	0462ebbe58	Don't write an empty _vectors field	2024-05-20 10:36:18 +02:00
Louis Dureuil	2f7a8a4efb	Don't write vectors that weren't autogenerated in document DB	2024-05-20 10:36:18 +02:00
Louis Dureuil	02714ef5ed	Add vectors from vector DB in dump	2024-05-20 10:36:18 +02:00
Louis Dureuil	52d9cb6e5a	Refactor vector indexing - use the parsed_vectors module - only parse `_vectors` once per document, instead of once per embedder per document	2024-05-20 10:36:17 +02:00
Louis Dureuil	261de888b7	Add function to get the embeddings of a document in an index	2024-05-20 10:36:17 +02:00
Louis Dureuil	98c811247e	Add parsed vectors module	2024-05-20 10:25:59 +02:00
meili-bors[bot]	59ecf1cea7	Merge #4544 4544: Stream documents r=curquiza a=irevoire # Pull Request ## Related issue Fixes https://github.com/meilisearch/meilisearch/issues/4383 ### Perf 2M hackernews: main: Time to retrieve: 7s RAM consumption: 2+GiB stream: Time to retrieve: 4.7s RAM consumption: Too small Co-authored-by: Tamo <tamo@meilisearch.com>	2024-05-17 14:49:08 +00:00
Tamo	273c6e8c5c	uses the latest version of heed to get rid of unsafe code	2024-05-16 18:31:32 +02:00
Tamo	897d25780e	update milli to latest version	2024-05-16 18:31:32 +02:00
Tamo	c85d1752dd	keep the same rtxn to compute the filters on the documents and to stream the documents later on	2024-05-16 18:31:32 +02:00
Tamo	8e6ffbfc6f	stream documents	2024-05-16 18:31:32 +02:00
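
The description of #4633 above says that a bare value under an embedder name in `_vectors` is semantically equivalent to an object with `userProvided: true`, and that only user-provided vectors are kept in the documents DB. A minimal sketch of that rule, using only the standard library, could look like the following. The names `VectorsValue`, `Normalized`, `normalize`, and `keep_in_documents_db` are hypothetical and chosen for illustration; they are not taken from the Meilisearch code base, and JSON parsing is left out.

```rust
/// Embeddings for one embedder: `None` models JSON `null`,
/// `Some(..)` holds zero or more embedding vectors.
type Embeddings = Option<Vec<Vec<f32>>>;

/// Hypothetical model of the value stored under one embedder name in `_vectors`.
#[derive(Debug, Clone, PartialEq)]
enum VectorsValue {
    /// Previous form: `null`, an embedding, or an array of embeddings.
    Implicit(Embeddings),
    /// Form added by the PR: `{ "embeddings": ..., "userProvided": ... }`.
    Explicit { embeddings: Embeddings, user_provided: bool },
}

/// Both forms reduced to a single explicit shape.
#[derive(Debug, Clone, PartialEq)]
struct Normalized {
    embeddings: Embeddings,
    user_provided: bool,
}

/// The previous form is treated as `userProvided: true`, as the PR text states.
fn normalize(value: VectorsValue) -> Normalized {
    match value {
        VectorsValue::Implicit(embeddings) => Normalized { embeddings, user_provided: true },
        VectorsValue::Explicit { embeddings, user_provided } => {
            Normalized { embeddings, user_provided }
        }
    }
}

/// Per the PR text, vectors with `userProvided: false` go to the vector DB
/// but not to the documents DB, so they can be regenerated later.
fn keep_in_documents_db(vectors: &Normalized) -> bool {
    vectors.user_provided
}

fn main() {
    let legacy = normalize(VectorsValue::Implicit(Some(vec![vec![0.1, 0.2]])));
    let generated = normalize(VectorsValue::Explicit {
        embeddings: Some(vec![vec![0.3, 0.4]]),
        user_provided: false,
    });
    println!("legacy kept in documents DB: {}", keep_in_documents_db(&legacy));
    println!("generated kept in documents DB: {}", keep_in_documents_db(&generated));
}
```

Normalizing both forms into one struct keeps the storage decision in a single place; this is a design choice made for the sketch, not a claim about how the indexing code is organized.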

9407 Commits