Louis Dureuil
7b4ce468a6
Allow overriding pooling method
2025-02-18 17:12:23 +01:00
Louis Dureuil
11759c4be4
Support pooling
2025-02-18 16:10:51 +01:00
meili-bors[bot]
885710a07b
Merge #5341
...
5341: Embeddings stats r=ManyTheFish a=ManyTheFish
# Pull Request
## Related issue
Fixes #5321
## What does this PR do?
- Add embedding stats
- force dumpless upgrade to recompute stats
- add tests
Co-authored-by: ManyTheFish <many@meilisearch.com>
2025-02-12 15:46:37 +00:00
ManyTheFish
8419ed52a1
fix clippy
2025-02-12 14:38:51 +01:00
Louis Dureuil
8e0d8d31f9
Add back timeout from v1.11.3
2025-02-12 11:53:00 +01:00
ManyTheFish
41203f0931
Add embedders stats
2025-02-12 11:37:47 +01:00
meili-bors[bot]
796acd1aee
Merge #5288
...
5288: Improve AI logging r=dureuill a=Kerollmops
This PR fixes #5285 and brings the changes from #5233 to simplify debugging indexation and search performance issues related to AI. The following texts can be found in the logs to debug and understand performance issues:
- `embed_one: search` represents the time we spent waiting for the embedding generation, i.e., OpenAI, local HuggingFace, Ollama.
- `filtered_universe: search::universe` is the time spent filtering the documents.
- ~`next_bucket: search::vector_sort` is the time spent finding the nearest neighbors (ANNs) locally in the vector store (arroy)~ (dropped because it was being triggered too many times).
- `indexing::vectors` is the time arroy spends indexing the new vectors for a batch.
- `documents::extract vectors` and `documents::merge vectors` to see the time spent generating and writing the embeddings.
Co-authored-by: Kerollmops <clement@meilisearch.com>
2025-02-04 10:20:45 +00:00
meili-bors[bot]
ede74ccc42
Merge #5306
...
5306: Fix internal error when passing `documentTemplateMaxBytes` to a source that doesn't support it r=ManyTheFish a=dureuill
# Pull Request
## Related issue
Fixes #5305
## What does this PR do?
- add `DOCUMENT_TEMPLATE_MAX_BYTES` to `allowed_sources_for_field` and `allowed_fields_for_source` to prevent a panic
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2025-02-04 08:46:13 +00:00
Kerollmops
7a9382b115
Better document the rayon limitation condition
2025-02-03 10:24:53 +01:00
Kerollmops
62dabeba5f
Do not create too many rayon tasks when processing the settings
2025-02-03 10:24:52 +01:00
Kerollmops
48812229a9
Remove a log that would log too much
2025-02-03 10:24:52 +01:00
Louis Dureuil
96544bfa43
add DOCUMENT_TEMPLATE_MAX_BYTES to allowed_sources_for_field and allowed_fields_for_source
2025-02-03 09:59:17 +01:00
Kerollmops
aaefbfae1f
Do not create too many rayon tasks
2025-01-30 16:36:12 +01:00
Kerollmops
97e17f52a1
Add more logs to see calls to the embedders
2025-01-30 16:36:12 +01:00
Kerollmops
0f8eb3b506
Improve the logs of the search with AI
2025-01-27 14:22:22 +01:00
Louis Dureuil
4709c638ed
Swap implementations of ollama
2025-01-20 22:22:22 +01:00
Louis Dureuil
8b1fcfd7f8
Parse ollama URL to adapt configuration depending on the endpoint
2025-01-13 14:34:11 +01:00
Clément Renault
0ee4671a91
Fix after upgrading candle
2025-01-08 15:59:56 +01:00
Clément Renault
3e3695445f
Fix after upgrading thiserror
2025-01-08 15:58:32 +01:00
Tamo
0bf4157a75
try my best to make the sub-settings routes work, they don't
2025-01-07 16:26:06 +01:00
Gnosnay
44eb153619
Replace hardcoded string with constants
2024-12-28 20:35:55 +08:00
Clément Renault
cc4bd54669
Correctly construct the Embeddings struct
2024-11-28 13:53:25 +01:00
Louis Dureuil
e9d17136b2
Add deadline of 3 seconds to embedding requests made in the context of hybrid search
2024-11-18 12:15:11 +01:00
Louis Dureuil
6570da3bcb
Retry in case the JSON deserialization fails
2024-11-18 11:33:09 +01:00
Louis Dureuil
3b0cb5b487
Fix vector error messages
2024-11-12 23:26:16 +01:00
Louis Dureuil
bfdcd1cf33
Space changes
2024-11-12 22:52:45 +01:00
Louis Dureuil
c4e9f761e9
Emit better error messages when parsing vectors
2024-11-12 22:49:22 +01:00
Louis Dureuil
8a6e61c77f
InvalidVectorsEmbedderConf error takes a String rather than a deserr error
2024-11-12 22:47:57 +01:00
Louis Dureuil
980921e078
Vector fixes
2024-11-12 16:31:22 +01:00
Louis Dureuil
bef8fc6cf1
Fix hf embedder
2024-11-08 13:10:17 +01:00
Louis Dureuil
4706a0eb49
Fix vector parsing
2024-11-07 23:26:20 +01:00
Louis Dureuil
10f49f0d75
Post processing of the merge
2024-11-06 17:50:12 +01:00
Louis Dureuil
ee03743355
Merge branch 'indexer-edition-2024' into indexer-edition-2024-doc-chunks
2024-11-06 15:50:53 +01:00
ManyTheFish
10feeb88f2
Merge branch 'main' into indexer-edition-2024
2024-11-06 15:19:18 +01:00
Tamo
cf6ad1ae5e
Merge branch 'main' into tmp-release-v1.11.0
2024-11-04 16:14:44 +01:00
Clément Renault
9c1e54a2c8
Move crates under a sub folder to clean up the code
2024-10-21 08:18:43 +02:00