Louis Dureuil
5d7061682e
Add tracing to milli
2024-02-08 15:03:31 +01:00
meili-bors[bot]
72ebac1fbb
Merge #4388
4388: Cap the maximum memory of the grenad sorters r=curquiza a=Kerollmops
This PR clamps the memory usage of the grenad sorters to a reasonable maximum. Grenad sorters are opened on multiple threads at a time, which can result in higher memory usage than expected, even though the sorters should never consume more than the available memory.
Fixes #4152.
Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-02-08 13:19:28 +00:00
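A minimal sketch of the kind of clamp that #4388 describes, not the actual milli code. The names `MAX_SORTER_MEMORY`, `clamp_sorter_memory` and `worst_case_total` are hypothetical, and the 500 MiB value is borrowed from the "Try with 500MiB" commit in this log; whether that is the final value is an assumption.

```rust
/// Hypothetical per-sorter cap (assumption: 500 MiB, taken from the
/// "Try with 500MiB" commit message; the real constant may differ).
const MAX_SORTER_MEMORY: u64 = 500 * 1024 * 1024;

/// Clamp the memory requested for one sorter to the cap.
/// `None` means "no explicit budget" and is passed through unchanged.
fn clamp_sorter_memory(requested: Option<u64>) -> Option<u64> {
    requested.map(|bytes| bytes.min(MAX_SORTER_MEMORY))
}

/// Upper bound on the memory used when `sorters` sorters are open at the
/// same time, each with the same per-sorter budget.
fn worst_case_total(per_sorter: u64, sorters: u64) -> u64 {
    per_sorter.saturating_mul(sorters)
}
```

With a cap like this, a request of 2 GiB per sorter is reduced to `MAX_SORTER_MEMORY`, so the worst case across N concurrently opened sorters is bounded by N times the cap rather than N times the requested amount.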
Louis Dureuil
88d03c56ab
Don't accept dimensions of 0 (ever) or dimensions greater than the default dimensions of the model
2024-02-07 11:52:09 +01:00
Louis Dureuil
517f5332d6
Allow actually passing dimensions for OpenAI source
-> make sure the settings change is rejected or the settings task fails when the specified model doesn't support overriding `dimensions` and the passed `dimensions` differs from the model's default dimensions.
2024-02-07 11:51:44 +01:00
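The `dimensions` rules stated in the two commits above (88d03c56ab and 517f5332d6) can be sketched as a single validation function. This is an illustration under stated assumptions, not the code from those commits: `ModelInfo`, `DimensionsError` and `validate_dimensions` are hypothetical names, and the order in which the checks run is a guess.

```rust
/// Hypothetical description of an embedding model (not a milli type).
struct ModelInfo {
    default_dimensions: usize,
    supports_overriding_dimensions: bool,
}

/// Hypothetical error type covering the rejections named in the commits.
#[derive(Debug, PartialEq)]
enum DimensionsError {
    Zero,
    GreaterThanDefault { default: usize },
    OverrideUnsupported { default: usize },
}

/// Returns the dimensions to use, or the reason the setting is rejected.
fn validate_dimensions(
    model: &ModelInfo,
    requested: Option<usize>,
) -> Result<usize, DimensionsError> {
    let default = model.default_dimensions;
    match requested {
        // Nothing passed: fall back to the model's default.
        None => Ok(default),
        // Dimensions of 0 are never accepted.
        Some(0) => Err(DimensionsError::Zero),
        // Dimensions greater than the model's default are rejected.
        Some(d) if d > default => Err(DimensionsError::GreaterThanDefault { default }),
        // The model cannot override `dimensions` and the value differs.
        Some(d) if !model.supports_overriding_dimensions && d != default => {
            Err(DimensionsError::OverrideUnsupported { default })
        }
        Some(d) => Ok(d),
    }
}
```

Passing the model's default value explicitly stays valid even for a model that cannot override `dimensions`, which matches the "differs from the model's default dimensions" wording of the commit message.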
Clément Renault
053306c0e7
Try with 500MiB
2024-02-07 11:24:43 +01:00
Clément Renault
9eeb75d501
Clamp the max memory of the grenad sorters to a reasonable maximum
2024-02-06 10:47:04 +01:00
Louis Dureuil
fbf5f2a392
Don't use a runtime in extract_embedder, use it only for OpenAI
2024-02-01 10:33:27 +01:00
Tamo
9f8f3105d5
make clippy happy
2024-02-01 10:33:27 +01:00
Tamo
318843aacd
add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB
2024-02-01 10:33:27 +01:00
Tamo
c1bf33a112
Revert "Remove panic on the geosearch"
2024-01-25 18:51:19 +01:00
Tamo
0887186ecf
make clippy happy
2024-01-17 16:07:10 +01:00
Tamo
7d190d8078
add a bunch of tests and fix the error message when adding the geosearch as filterable/sortable while there are malformed documents in the DB
2024-01-17 15:51:52 +01:00
Clément Renault
01e2c3d6bb
Bump arroy to v0.2.0
2024-01-16 16:45:55 +01:00
Clément Renault
9f9ad4cc05
Fix Clippy warnings
2024-01-16 15:27:24 +01:00
Clément Renault
3ee7682fa7
Fix some integer comparisons
2024-01-16 15:22:23 +01:00
Tamo
54ae6951eb
fix warning
2024-01-02 15:19:30 +01:00
Louis Dureuil
6ff81de401
Fix tests
2023-12-20 17:16:46 +01:00
Louis Dureuil
9123370e90
Validate fused settings in settings task after fusing with existing setting
2023-12-20 17:16:46 +01:00
Louis Dureuil
e249e4db7b
Change Setting::apply function signature
2023-12-20 17:15:24 +01:00
Many the fish
9e1b458010
Merge branch 'main' into change-proximity-precision-settings
2023-12-18 09:08:47 +01:00
ManyTheFish
6425996e36
Change the naming of attributeScale and wordScale into byAttribute and byWord
2023-12-14 16:31:00 +01:00
Louis Dureuil
87bba98bd8
Various changes
- fixed seed for arroy
- check vector dimensions as soon as it is provided to search
- don't embed whitespace
2023-12-14 16:08:42 +01:00
Louis Dureuil
b8e4709dfa
Remove prompt strategy and fallback
2023-12-14 16:08:41 +01:00
Louis Dureuil
806e5b6899
Tests pass
2023-12-14 16:08:41 +01:00
Louis Dureuil
e0cc775dc4
Various changes
- DistributionShift in Search object (to be set from model in embed?)
- Fix issue where embedder index wasn't computed at search time
- Accept as default embedder either the "default" one, or the only embedder when there is only one
2023-12-14 16:08:41 +01:00
Louis Dureuil
12940d79a9
WIP
- manual embedder
- multi embedders OK
- clippy + tests OK
2023-12-14 16:08:41 +01:00
Louis Dureuil
922a640188
WIP multi embedders
fixed template bugs
2023-12-14 16:08:41 +01:00
Louis Dureuil
65e49b7092
Remove stuff, add distribution shift (WIP)
2023-12-14 16:08:38 +01:00
Louis Dureuil
e56f160032
Actually pass embedders on reindex
2023-12-14 16:07:49 +01:00
Louis Dureuil
687d92f217
prompt bifluor+
2023-12-14 16:07:49 +01:00
Louis Dureuil
fb539f61fe
WIP
2023-12-14 16:07:49 +01:00
Louis Dureuil
cb4ebe163e
WIP
2023-12-14 16:07:49 +01:00
Louis Dureuil
dde3a04679
WIP arroy integration
2023-12-14 16:07:49 +01:00
Louis Dureuil
13c2c6c16b
Small commit to add hybrid search and autoembedding
2023-12-14 16:07:48 +01:00
ManyTheFish
467b49153d
Implement proximityPrecision setting on milli side
2023-12-06 15:49:02 +01:00
ManyTheFish
bddc168d83
List TODOs
2023-12-06 14:59:23 +01:00
Clément Renault
d32eb11329
Move to the v0.20.0-alpha.9 of heed
2023-11-27 11:52:22 +01:00
Clément Renault
0dbf1a16ff
Make clippy happy
2023-11-23 14:11:38 +01:00
Clément Renault
462b4c0080
Fix the tests
2023-11-23 12:07:35 +01:00
Clément Renault
0d4482625a
Make the changes to use heed v0.20-alpha.6
2023-11-23 11:43:58 +01:00
ManyTheFish
d3575fb028
Make into_del_add_obkv parameters more human readable
2023-11-20 16:10:39 +01:00
ManyTheFish
39cbb499c2
Small fixes
2023-11-20 10:20:39 +01:00
ManyTheFish
ebef6bc24d
Simplify documents database writing
2023-11-20 10:14:57 +01:00
ManyTheFish
d59b7db8d0
remove unused code
2023-11-20 10:10:45 +01:00
ManyTheFish
263e825619
Fix typos in comments
2023-11-20 10:06:29 +01:00
Many the fish
b0adc73ce6
Merge pull request #4207 from meilisearch/diff-indexing-prefix-databases
Diff indexing prefix databases
2023-11-14 16:04:05 +01:00
Louis Dureuil
772964125d
Factor removal of document from DB
2023-11-13 13:51:22 +01:00
Louis Dureuil
264b10ec20
Fixup documentation
2023-11-09 16:23:20 +01:00
Louis Dureuil
3053e01c05
Batch::remove_documents_from_db_no_batch
2023-11-09 14:23:02 +01:00
Louis Dureuil
9cef800b2a
Enrich uses the new type
2023-11-09 14:22:05 +01:00
ManyTheFish
882ab9cc85
remove warnings
2023-11-09 11:35:33 +01:00
ManyTheFish
5a9c96e1db
Compute word integer prefix cache
2023-11-09 11:34:26 +01:00
ManyTheFish
70ce40828c
Compute word docids prefix cache
2023-11-08 17:01:00 +01:00
ManyTheFish
688266c83e
Remove word pair proximity prefix cache and compute it at search time
2023-11-08 14:16:01 +01:00
ManyTheFish
6dab826908
Reactivate prefix databases
2023-11-08 13:58:01 +01:00
ManyTheFish
1e2fbc6a42
revert "REVERT ME: ignore prefix pair databases tests"
This reverts commit 1b2ea6cf19.
2023-11-08 11:50:52 +01:00
Louis Dureuil
cbaa54cafd
Fix clippy issues
2023-11-06 11:19:31 +01:00
Louis Dureuil
1bccf2079e
Correctly mark non-tests as non-tests
2023-11-06 11:03:56 +01:00
ManyTheFish
1b2ea6cf19
REVERT ME: ignore prefix pair databases tests
2023-11-06 10:46:22 +01:00
Louis Dureuil
1ad1fcc8c8
Remove all warnings
2023-11-06 10:31:14 +01:00
ManyTheFish
87610a5f98
Don't try to delete a document that is not in the database
2023-11-02 16:49:03 +01:00
Clément Renault
ff522c919d
Fix the vector extractions for the diff indexing
2023-11-02 15:58:08 +01:00
ManyTheFish
bf0651f23c
Implement iter method on ExternalDocumentsIds
2023-11-02 15:38:00 +01:00
ManyTheFish
5b20e625f3
fix merge
2023-11-02 15:31:37 +01:00
ManyTheFish
bc51d6157a
Fix transform reindexing path
2023-11-02 15:26:20 +01:00
ManyTheFish
1b4ff991c0
update typed chunks
2023-11-02 15:26:20 +01:00
ManyTheFish
4b64c33aa2
update vector extractor
2023-11-02 15:26:20 +01:00
ManyTheFish
12323d610e
Change the original document sorter key from the internal docid to a concatenation of the internal and the external docid
2023-11-02 15:26:20 +01:00
Clément Renault
4d864f0702
Always sort internal Sorter entries in parallel
2023-11-02 14:47:43 +01:00
Clément Renault
c71b1d33ae
Sort entries using rayon in the transform sorters
2023-11-01 11:07:16 +01:00
Clément Renault
0fc446c62f
Add more timing logs to the Transform
2023-11-01 11:07:16 +01:00
Louis Dureuil
0fb6acefc3
Add snapshots for facets
2023-10-31 17:11:08 +01:00
Louis Dureuil
b1d1355b69
remove tests on soft-deleted
2023-10-31 16:36:27 +01:00
Louis Dureuil
f19332466e
Extract field value as values instead of Option<Value>
2023-10-31 16:36:27 +01:00
Louis Dureuil
03ddb4f310
use deladd in facet update tests
2023-10-31 16:36:27 +01:00
Louis Dureuil
da0503ef80
Fix document count
2023-10-31 16:36:27 +01:00
Louis Dureuil
b40253bf18
update snapshots
2023-10-31 10:30:48 +01:00
Louis Dureuil
d8bf3f3fc2
Remove unused snapshots
2023-10-31 10:12:49 +01:00
Louis Dureuil
9d59e8011a
fix some tests
2023-10-31 10:08:36 +01:00
Louis Dureuil
dad78cbf8d
Bulk facet remove deletes keys from DB when value empty
2023-10-31 09:53:55 +01:00
Louis Dureuil
4e91707a06
Rename test
2023-10-31 09:41:17 +01:00
Louis Dureuil
de10f20732
Fix field distribution again
2023-10-30 17:47:22 +01:00
Louis Dureuil
be395c7944
Change order of arguments to tokenizer_builder
2023-10-30 16:26:29 +01:00
Louis Dureuil
9fedd8101a
Fix tests
2023-10-30 15:11:07 +01:00
Louis Dureuil
54d07a8da3
Update field distribution taking into account both deletions and additions
2023-10-30 14:47:51 +01:00
Louis Dureuil
58690dfb19
Fix tests compilation after changes to ExternalDocumentsIds API
2023-10-30 13:34:07 +01:00
Louis Dureuil
abf424ebfc
Remove unused FromIterator
2023-10-30 11:41:56 +01:00
Clément Renault
dfab6293c9
Use an LMDB database to store the external documents ids
2023-10-30 11:41:23 +01:00
Louis Dureuil
fdf3f7f627
Fix facet distribution test
2023-10-30 11:41:23 +01:00
Louis Dureuil
6260cff65f
Actually delete documents from DB when the merge function says so
2023-10-30 11:41:22 +01:00
Louis Dureuil
8e0d9c9a5e
Recover delete_documents tests that were too eagerly deleted
2023-10-30 11:41:22 +01:00
Louis Dureuil
a35988550c
Fix some snapshots
2023-10-30 11:41:22 +01:00
Louis Dureuil
e78281785c
Actually execute the transform even if there are only documents to delete
2023-10-30 11:41:22 +01:00
Louis Dureuil
290e773d23
remove more warnings and fix some tests
2023-10-30 11:41:22 +01:00
Louis Dureuil
113527f466
Remove soft-deleted related methods from Index
2023-10-30 11:41:22 +01:00
Louis Dureuil
c534a1b687
Stop using delete documents pipeline in batch runner
2023-10-30 11:41:22 +01:00
Louis Dureuil
2263dff02b
Stop using removed delete pipelines almost everywhere
2023-10-30 11:41:22 +01:00
Louis Dureuil
d651b3ef01
Remove delete documents files
2023-10-30 11:41:20 +01:00
ManyTheFish
762b0b47e6
Use deladd merging function in chunks mergers
2023-10-30 11:40:20 +01:00
Louis Dureuil
01d5eedf2f
Remove some warnings
2023-10-30 11:40:20 +01:00