Clément Renault
72c6a21a30
Use raw JSON to read the payloads
2024-09-05 20:08:23 +02:00
Clément Renault
8412be4a7d
Cleanup CowStr and TopLevelMap struct
2024-09-05 18:32:55 +02:00
Louis Dureuil
10f09c531f
add some commented code to read from json with raw values
2024-09-05 18:22:16 +02:00
ManyTheFish
8fd99b111b
Add tracing timers logs
2024-09-05 18:00:22 +02:00
Clément Renault
f6b3d1f9a5
Increase some channel sizes
2024-09-05 15:12:07 +02:00
Clément Renault
73ce67862d
Use the word pair proximity and fid word count docids extractors
...
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-05 10:56:22 +02:00
Clément Renault
0fc02f7351
Move the facet extraction to dedicated modules
2024-09-05 10:32:27 +02:00
ManyTheFish
34f11e3380
Implement word count and word pair proximity extractors
2024-09-05 10:30:39 +02:00
Clément Renault
27308eaab1
Import the facet extractors
2024-09-04 17:58:15 +02:00
Clément Renault
b33ec9ba3f
Introduce the FieldIdFacetIsNullDocidsExtractor
2024-09-04 17:50:08 +02:00
Clément Renault
9c0a1cd9fd
Introduce the FieldIdFacetExistsDocidsExtractor
2024-09-04 17:48:49 +02:00
Clément Renault
0b061f1e70
Introduce the FieldIdFacetIsEmptyDocidsExtractor
2024-09-04 17:40:24 +02:00
Clément Renault
19d937ab21
Introduce the facet extractors
2024-09-04 17:03:54 +02:00
Clément Renault
1d59c19cd2
Send the WordsFst by using an Mmap
2024-09-04 14:30:09 +02:00
Clément Renault
98e48371c3
Factorize some stuff
2024-09-04 12:17:13 +02:00
Clément Renault
6d74fb0229
Introduce the WordFidWordDocids database
2024-09-04 11:40:55 +02:00
ManyTheFish
1eb75a1040
remove milli/src/update/new/extract/tokenize_document.rs
2024-09-04 11:40:26 +02:00
Clément Renault
3b82d8b5b9
Fix the cache to serialize entries correctly
2024-09-04 10:55:36 +02:00
ManyTheFish
781a186f75
remove milli/src/update/new/extract/extract_word_docids.rs
2024-09-04 10:28:31 +02:00
ManyTheFish
6a399556b5
Implement more searchable extractor
2024-09-04 10:20:18 +02:00
Clément Renault
27b4cab857
Extract and write the documents and words fst in the database
2024-09-04 09:59:19 +02:00
Clément Renault
52d32b4ee9
Move the channel sender in the closure to stop the merger thread
2024-09-03 16:08:33 +02:00
ManyTheFish
da61408e52
Remove unimplemented from document changes
2024-09-03 15:14:16 +02:00
ManyTheFish
fe69385bd7
Fix tokenizer test
2024-09-03 14:24:37 +02:00
Clément Renault
c1557734dc
Use the GlobalFieldsIdsMap everywhere and write it to disk
...
Co-authored-by: Dureuill <louis@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-03 12:01:01 +02:00
ManyTheFish
c50d3edc4a
Integrate first searchable exctrator
2024-09-03 11:02:39 +02:00
Clément Renault
5369bf4a62
Change some lifetimes
2024-09-02 19:51:22 +02:00
Clément Renault
bcb1aa3d22
Find a temporary solution to par into iter on an HashMap
...
Spoiler: Do not use an HashMap but drain it into a Vec
2024-09-02 19:39:48 +02:00
Clément Renault
9b7858fb90
Expose the new indexer
2024-09-02 15:21:59 +02:00
Clément Renault
ab01679a8f
Remove the useless option from the document changes
2024-09-02 15:21:00 +02:00
Clément Renault
521775f788
I push for Many
2024-09-02 15:10:21 +02:00
Clément Renault
72e7b7846e
Renaming the indexers
2024-09-02 14:42:27 +02:00
Clément Renault
6526ce1208
Fix the merging of documents
2024-09-02 14:41:20 +02:00
Clément Renault
e639ec79d1
Move the indexers into their own modules
2024-09-02 10:42:19 +02:00
Clément Renault
bb885a5810
Fix the merge for roaring bitmap
2024-09-01 23:20:19 +02:00
Clément Renault
b625d31c7d
Introduce the PartialDumpIndexer indexer that generates document ids in parallel
2024-08-30 15:07:21 +02:00
Clément Renault
6487a67f2b
Introduce the ConcurrentAvailableIds struct and rename the other to AvailableIds
2024-08-30 15:06:50 +02:00
Clément Renault
271ce91b3b
Add the rayon Threadpool to the index function parameter
2024-08-30 14:34:24 +02:00
Clément Renault
54f2eb4507
Remove duplication of grenad merger
2024-08-30 14:34:05 +02:00
Clément Renault
794ebcd582
Replace grenad with the new grenad various-improvement branch
2024-08-30 11:53:59 +02:00
Clément Renault
b7c77c7a39
Use the latest version of the obkv crate
2024-08-30 11:53:59 +02:00
Clément Renault
0c57cf7565
Replace obkv with the temporary new version of it
2024-08-30 11:53:58 +02:00
Clément Renault
27df9e6c73
Introduce the indexer::index function that runs the indexation
2024-08-30 11:53:58 +02:00
Clément Renault
45c060831e
Introduce typed channels and the merger loop
2024-08-30 11:53:58 +02:00
Clément Renault
874c1ac538
First channels types
2024-08-30 11:53:58 +02:00
Clément Renault
e6ffa4d454
Implement the document merge function for the replace method
2024-08-30 11:53:58 +02:00
Clément Renault
637a9c8bdd
Implement the document merge function for the update method
2024-08-30 11:53:58 +02:00
Louis Dureuil
c683fa98e6
WIP
...
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-30 11:53:57 +02:00
meili-bors[bot]
9a756cf2c5
Merge #4888
...
4888: bring back v1.10.0 into main r=Kerollmops a=ManyTheFish
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-08-27 14:02:08 +00:00
ManyTheFish
b12e997c8a
Add pinyin flag
2024-08-21 14:38:04 +02:00