Clément Renault
|
7ad037841f
|
Move the tracing info to eprintln
|
2024-09-24 18:21:58 +02:00 |
|
Clément Renault
|
e0c7067355
|
Expose an IndexedParallelIterator to the index function
|
2024-09-24 17:24:59 +02:00 |
|
ManyTheFish
|
6e87332410
|
Change the way the FST is built
|
2024-09-24 16:28:31 +02:00 |
|
Clément Renault
|
2d1caf27df
|
Use eprintln to log
|
2024-09-24 15:59:50 +02:00 |
|
Clément Renault
|
92678383d6
|
Update charabia
|
2024-09-24 15:37:56 +02:00 |
|
Clément Renault
|
7f148c127c
|
Measure the SmallVec efficacy
|
2024-09-24 15:32:15 +02:00 |
|
Clément Renault
|
4ce5d3d66d
|
Do not check before pushing in bitmaps
|
2024-09-24 09:43:16 +02:00 |
|
Clément Renault
|
ff931edb55
|
Update roaring to inline max calls
|
2024-09-23 16:53:42 +02:00 |
|
Clément Renault
|
42b093687d
|
Introduce the new PushOptimizedBitmap
|
2024-09-23 16:38:21 +02:00 |
|
Clément Renault
|
835c5f98f9
|
Remove the debug symbols
|
2024-09-23 15:49:24 +02:00 |
|
Clément Renault
|
f00664247d
|
Add more stats about the channel message sent
|
2024-09-23 15:13:52 +02:00 |
|
Clément Renault
|
3c63d4a1e5
|
Fix charabia Zho
|
2024-09-23 14:50:17 +02:00 |
|
Clément Renault
|
4551abf6d4
|
Update roaring to the latest version
|
2024-09-23 14:35:33 +02:00 |
|
Clément Renault
|
193d7f5d34
|
Add the mutualized charabia normalization
|
2024-09-23 14:24:25 +02:00 |
|
Clément Renault
|
013acb3d93
|
Measure merger writer channel contention
|
2024-09-23 11:07:59 +02:00 |
|
Clément Renault
|
f4ab1f168e
|
Prefer using Rc<str> over String when cloning a lot
|
2024-09-16 15:41:29 +02:00 |
|
ManyTheFish
|
1a0e962299
|
Replace hashmap by vectors in wpp
|
2024-09-16 15:01:20 +02:00 |
|
ManyTheFish
|
f13e076b8a
|
Use hashmap instead of Btree in wpp extractor
|
2024-09-16 14:40:40 +02:00 |
|
ManyTheFish
|
7ba49b849e
|
Extract and write facet databases
|
2024-09-16 09:35:16 +02:00 |
|
Clément Renault
|
f7652186e1
|
WIP geo fields
|
2024-09-12 18:01:02 +02:00 |
|
Clément Renault
|
b2f4e67c9a
|
Do not store useless updates
|
2024-09-12 15:38:31 +02:00 |
|
Clément Renault
|
ff5d3b59f5
|
Move the document id extraction to the primary key code
|
2024-09-12 12:01:42 +02:00 |
|
ManyTheFish
|
aa69308e45
|
Use a BufWriter to build word FSTs
|
2024-09-12 11:48:00 +02:00 |
|
ManyTheFish
|
eb9a20ff0b
|
Fix fid_word_docids extraction
|
2024-09-12 11:08:18 +02:00 |
|
Clément Renault
|
0d868f36d7
|
Make sure we always use a BufWriter to write the update files
|
2024-09-11 18:38:04 +02:00 |
|
Clément Renault
|
e7d9db078f
|
Use the right key name when converting from CSV to NDJSON
|
2024-09-11 18:27:00 +02:00 |
|
Clément Renault
|
3e9198ebaa
|
Support guessing primary key again
|
2024-09-11 17:25:40 +02:00 |
|
Clément Renault
|
2a0ad0982f
|
Fix the document counter
|
2024-09-11 15:59:36 +02:00 |
|
ManyTheFish
|
2b317c681b
|
Build mergers in parallel
|
2024-09-11 11:49:26 +02:00 |
|
ManyTheFish
|
39b5990f64
|
Mutualize tokenization
|
2024-09-11 10:22:38 +02:00 |
|
Clément Renault
|
3848adf5a2
|
Improve error management and simplify JSON read
|
2024-09-11 10:10:51 +02:00 |
|
Clément Renault
|
b4de06259e
|
Better CSV support
|
2024-09-11 10:02:00 +02:00 |
|
Clément Renault
|
8287c2644f
|
Support CSV again
|
2024-09-10 21:10:28 +01:00 |
|
Clément Renault
|
c1c44a0b81
|
Impl serialize on TopLevelMap
|
2024-09-10 19:32:03 +01:00 |
|
Clément Renault
|
04596f3616
|
Move the TopLevelMap into a dedicated module
|
2024-09-10 18:01:17 +01:00 |
|
Clément Renault
|
24cb5839ad
|
Move the document changes sorting logic to a new trait
|
2024-09-10 17:37:52 +01:00 |
|
Clément Renault
|
8d97b7b28c
|
Support JSON payloads again (not perfectly though)
|
2024-09-10 17:09:49 +01:00 |
|
ManyTheFish
|
f69688e8f7
|
Fix several warnings in extractors and remove unreachable macros
|
2024-09-09 14:52:50 +02:00 |
|
Clément Renault
|
8fd0afaaaa
|
Make sure we iterate over the payload documents in order
|
2024-09-06 08:09:08 +02:00 |
|
Clément Renault
|
72c6a21a30
|
Use raw JSON to read the payloads
|
2024-09-05 20:08:23 +02:00 |
|
Clément Renault
|
8412be4a7d
|
Cleanup CowStr and TopLevelMap struct
|
2024-09-05 18:32:55 +02:00 |
|
Louis Dureuil
|
10f09c531f
|
Add some commented code to read from JSON with raw values
|
2024-09-05 18:22:16 +02:00 |
|
ManyTheFish
|
8fd99b111b
|
Add tracing timers logs
|
2024-09-05 18:00:22 +02:00 |
|
Clément Renault
|
f6b3d1f9a5
|
Increase some channel sizes
|
2024-09-05 15:12:07 +02:00 |
|
Clément Renault
|
73ce67862d
|
Use the word pair proximity and fid word count docids extractors
Co-authored-by: ManyTheFish <many@meilisearch.com>
|
2024-09-05 10:56:22 +02:00 |
|
Clément Renault
|
0fc02f7351
|
Move the facet extraction to dedicated modules
|
2024-09-05 10:32:27 +02:00 |
|
ManyTheFish
|
34f11e3380
|
Implement word count and word pair proximity extractors
|
2024-09-05 10:30:39 +02:00 |
|
Clément Renault
|
27308eaab1
|
Import the facet extractors
|
2024-09-04 17:58:15 +02:00 |
|
Clément Renault
|
b33ec9ba3f
|
Introduce the FieldIdFacetIsNullDocidsExtractor
|
2024-09-04 17:50:08 +02:00 |
|
Clément Renault
|
9c0a1cd9fd
|
Introduce the FieldIdFacetExistsDocidsExtractor
|
2024-09-04 17:48:49 +02:00 |
|