72 Commits

Author SHA1 Message Date
Clément Renault
5727e00374
Remove useless geo skipped 2024-11-21 16:47:08 +01:00
Clément Renault
9b60843831
Remove commented lines 2024-11-21 16:47:07 +01:00
ManyTheFish
36962b943b
First batch of PR comment 2024-11-21 16:38:11 +01:00
Louis Dureuil
32bcacefd5
Changes Document::len to Document::top_level_fields_count 2024-11-21 15:01:07 +01:00
ManyTheFish
94b260fd25
Remove orphan span 2024-11-21 12:12:07 +01:00
Clément Renault
ab2c83f868
Use the disk less when computing prefixes 2024-11-21 10:45:37 +01:00
Louis Dureuil
e0864f1b21
Separate side effect and debug asserts 2024-11-20 16:25:17 +01:00
Clément Renault
a38344acb3
Replace eprintlns by tracing 2024-11-20 15:29:51 +01:00
ManyTheFish
4d616f8794
Parse every attributes and filter before tokenization 2024-11-20 15:15:25 +01:00
Louis Dureuil
ff9c92c409
rename documents -> substep 2024-11-20 15:12:02 +01:00
Louis Dureuil
867138f166
Add SP to into_changes 2024-11-20 15:07:05 +01:00
Louis Dureuil
84600a10d1
Add MSP to document_update.into_changes() 2024-11-20 14:53:37 +01:00
Louis Dureuil
7d64e8dbd3
Fix Windows compilation 2024-11-20 14:40:38 +01:00
Louis Dureuil
cae8c89467
"fix" last warnings 2024-11-20 14:03:52 +01:00
Clément Renault
7cb8732b45
Introduce a new bincode internal error 2024-11-20 13:23:11 +01:00
ManyTheFish
fe5d50969a
Fix field selector in extractors 2024-11-20 13:16:44 +01:00
Clément Renault
56c7c5d5f0
Fix comments 2024-11-20 13:16:44 +01:00
Louis Dureuil
2afa33011a
Fix tokenize_document 2024-11-20 13:16:43 +01:00
Louis Dureuil
f893b5153e
Don't mark [""] as empty facet 2024-11-20 13:16:42 +01:00
Louis Dureuil
ca779c21f9
facets: Handle boolean and skip empty strings 2024-11-20 13:16:42 +01:00
Louis Dureuil
477077bdc2
Remove _vectors from fid map when there are no vectors in sight 2024-11-20 13:16:42 +01:00
ManyTheFish
b1f8aec348
Fix index_documents_check_exists_database 2024-11-20 13:16:41 +01:00
ManyTheFish
ba7f091db3
Use tokenizer on numbers and booleans 2024-11-20 13:16:41 +01:00
Louis Dureuil
8049df125b
Add depth to facet extraction so that null inside an array doesn't mark the entire field as null 2024-11-20 13:16:40 +01:00
Clément Renault
3957917e0b
Correctly count indexed documents 2024-11-20 13:16:36 +01:00
Louis Dureuil
651c30899e
Allow fetching embedders from inside tests 2024-11-20 13:16:36 +01:00
Clément Renault
2c7a7fe4e8
Count the number of documents correctly 2024-11-20 13:16:35 +01:00
Clément Renault
23f0c2c29b
Generate internal ids only when needed 2024-11-20 13:16:35 +01:00
Clément Renault
3cf1352ae1
Fix the benchmark tests 2024-11-20 13:16:31 +01:00
ManyTheFish
41dbdd2d18
Fix filtered_placeholder_search_should_not_return_deleted_documents and word_scale_set_and_reset 2024-11-19 16:08:25 +01:00
Louis Dureuil
c782c09208
Move step to a dedicated mod and replace it with an enum 2024-11-18 18:22:13 +01:00
Louis Dureuil
75943a5a9b
Add TODO to remember replacing steps with an enum 2024-11-18 17:40:51 +01:00
Louis Dureuil
04c38220ca
Move MostlySend, ThreadLocal, FullySend to their own commit 2024-11-18 16:43:05 +01:00
Louis Dureuil
5f93651cef
fixes 2024-11-18 16:23:11 +01:00
Louis Dureuil
0a21d9bfb3
Fix double borrow of new fields id map 2024-11-18 15:56:01 +01:00
Louis Dureuil
e736a74729
Remove infinite loop in import_vectors 2024-11-18 12:50:56 +01:00
Clément Renault
5b4c06c24c
Plug the grenad max memory parameter 2024-11-18 11:28:04 +01:00
Louis Dureuil
c202f3dbe2
fix tests and revert change in behavior when primary_key_from_op != primary_key_from_db && index.is_empty() 2024-11-18 10:59:05 +01:00
Clément Renault
677d7293f5
Fix a lot of primary key related tests 2024-11-18 10:59:05 +01:00
Clément Renault
83865d2ebd
Expose intermediate errors when processing batches 2024-11-18 10:59:05 +01:00
ManyTheFish
4ff2b3c2ee
Fix test on locales 2024-11-14 15:45:04 +01:00
ManyTheFish
91c58cfa38
Fix positional databases 2024-11-14 11:40:12 +01:00
Clément Renault
9e8367f1e6
Move the rayon thread pool outside the extract method 2024-11-14 10:40:32 +01:00
Louis Dureuil
0e3c5d91ab
Document deletion test passes 2024-11-14 08:42:56 +01:00
Louis Dureuil
695c2c6b99
Cosmetic fix 2024-11-14 08:42:39 +01:00
Louis Dureuil
40dd25d6b2
Fix issue with Replace document method when adding and deleting a document in the same batch 2024-11-13 22:10:00 +01:00
Clément Renault
8e5b1a3ec1
Compute the field distribution and convert _geo into an f64s 2024-11-13 17:44:05 +01:00
ManyTheFish
e627e182ce
Fix facet strings 2024-11-13 17:43:02 +01:00
ManyTheFish
51b6293738
Add linear facet databases 2024-11-13 17:43:02 +01:00
Clément Renault
b17896d899
Finalize the GeoExtractor 2024-11-13 17:43:02 +01:00