Loïc Lecrenier
130d2061bd
Fix indexing of word_position_docid and fid
2023-04-06 17:50:39 +02:00
Loïc Lecrenier
d18ebe4f3a
Remove more warnings
2023-03-23 09:41:18 +01:00
Clément Renault
64571c8288
Improve the testing of the filters
2023-03-15 14:57:17 +01:00
Clément Renault
df48ac8803
Add one more test for the NULL operator
2023-03-09 13:53:37 +01:00
ManyTheFish
8aa808d51b
Merge branch 'main' into enhance-language-detection
2023-02-20 18:14:34 +01:00
Tamo
74dcfe9676
Fix a bug when you update a document that was already present in the db, deleted and then inserted again in the same transform
2023-02-14 19:09:40 +01:00
Tamo
93db755d57
add a test to ensure we correctly handle deleting the same document multiple times
2023-02-08 21:03:34 +01:00
Tamo
421a9cf05e
provide a new method on the transform to remove documents
2023-02-08 16:06:09 +01:00
Kerollmops
fbec48f56e
Merge remote-tracking branch 'milli/main' into bring-v1-changes
2023-02-06 16:48:10 +01:00
f3r10
7681be5367
Format code
2023-01-31 11:28:05 +01:00
f3r10
50bc156257
Fix tests
2023-01-31 11:28:05 +01:00
f3r10
a27f329e3a
Add tests for checking that detected script and language associated with document(s) were stored during indexing
2023-01-31 11:28:05 +01:00
Philipp Ahlner
f5ca421227
Superfluous test removed
2023-01-19 15:39:21 +01:00
Clément Renault
1b78231e18
Make clippy happy
2023-01-17 18:25:54 +01:00
bors[bot]
6a10e85707
Merge #736
...
736: Update charabia r=curquiza a=ManyTheFish
Update Charabia to the latest version.
> We are now Romanizing Chinese characters into Pinyin.
> Note that we keep the accent because they are in fact never typed directly by the end-user, moreover, changing an accent leads to a different Chinese character, and I don't have sufficient knowledge to forecast the impact of removing accents in this context.
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-01-03 15:44:41 +00:00
Louis Dureuil
4b166bea2b
Add primary_key_inference test
2022-12-21 15:13:38 +01:00
Louis Dureuil
5943100754
Fix existing tests
2022-12-21 15:13:38 +01:00
Louis Dureuil
fc7618d49b
Add DeletionStrategy
2022-12-19 09:49:58 +01:00
ManyTheFish
7f88c4ff2f
Fix #1714 test
2022-12-15 18:22:28 +01:00
Loïc Lecrenier
be3b00350c
Apply review suggestions: naming and documentation
2022-12-13 10:15:22 +01:00
Loïc Lecrenier
e3ee553dcc
Remove soft deleted ids from ExternalDocumentIds during document import
...
If the document import replaces a document using hard deletion
2022-12-12 14:16:09 +01:00
Loïc Lecrenier
f2cf981641
Add more tests and allow disabling of soft-deletion outside of tests
...
Also allow disabling soft-deletion in the IndexDocumentsConfig
2022-12-05 10:51:01 +01:00
Loïc Lecrenier
990a861241
Add test for indexing a document with a long facet value
2022-11-17 11:29:42 +01:00
unvalley
c7322f704c
Fix cargo clippy errors
...
Don't apply clippy for tests for now
Fix clippy warnings of filter-parser package
parent 8352febd646ec4bcf56a44161e5c4dce0e55111f
author unvalley <38400669+unvalley@users.noreply.github.com> 1666325847 +0900
committer unvalley <kirohi.code@gmail.com> 1666791316 +0900
Update .github/workflows/rust.yml
Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
Allow clippy lint too_many_arguments
Allow clippy lint needless_collect
Allow clippy lint too_many_arguments and type_complexity
Fix for clippy warnings comparison_chains
Fix for clippy warnings vec_init_then_push
Allow clippy lint should_implement_trait
Allow clippy lint drop_non_drop
Fix lifetime clippy warnings in filter-parser
Execute cargo fmt
Fix clippy remaining warnings
Fix clippy remaining warnings again and allow lint on each place
2022-10-27 01:04:23 +09:00
unvalley
811f156031
Execute cargo clippy --fix
2022-10-27 01:00:00 +09:00
Loïc Lecrenier
2741756248
Merge remote-tracking branch 'origin/main' into facet-levels-refactor
2022-10-26 14:03:23 +02:00
Loïc Lecrenier
b7f2428961
Fix formatting and warning after rebasing from main
2022-10-26 13:49:33 +02:00
Loïc Lecrenier
27454e9828
Document and refine facet indexing algorithms
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
9026867d17
Give same interface to bulk and incremental facet indexing types
...
+ cargo fmt, oops, sorry for the bad history :(
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
61252248fb
Fix some facet indexing bugs
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
85824ee203
Try to make facet indexing incremental
2022-10-26 13:47:04 +02:00
Loïc Lecrenier
e8a156d682
Reorganise facets database indexing code
2022-10-26 13:46:46 +02:00
Loïc Lecrenier
7913d6365c
Update Facets indexing to be compatible with new database structure
2022-10-26 13:46:14 +02:00
bors[bot]
c8f16530d5
Merge #616
...
616: Introduce an indexation abortion function when indexing documents r=Kerollmops a=Kerollmops
Co-authored-by: Kerollmops <clement@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-10-26 11:41:18 +00:00
Loïc Lecrenier
d76d0cb1bf
Merge branch 'main' into word-pair-proximity-docids-refactor
2022-10-24 15:23:00 +02:00
Loïc Lecrenier
a983129613
Apply suggestions from code review
2022-10-20 09:49:37 +02:00
Loïc Lecrenier
264a04922d
Add prefix_word_pair_proximity database
...
Similar to the word_prefix_pair_proximity one but instead the keys are:
(proximity, prefix, word2)
2022-10-18 10:37:34 +02:00
Kerollmops
6603437cb1
Introduce an indexation abortion function when indexing documents
2022-10-17 17:28:03 +02:00
Ewan Higgs
beb987d3d1
Fixing piles of clippy errors.
...
Most of these are calling clone when the struct supports Copy.
Many are using & and &mut on `self` when the function they are called
from already has an immutable or mutable borrow so this isn't needed.
I tried to stay away from actual changes or places where I'd have to
name fresh variables.
2022-10-13 22:02:54 +02:00
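The two clippy patterns named in the entry above (calling `clone` on a `Copy` type, and re-borrowing `self` where a borrow already exists) can be illustrated with a minimal sketch. The `DocId` and `Seen` types are hypothetical and are not taken from the codebase; the "before" forms appear only in comments.

```rust
/// Hypothetical identifier type; it derives `Copy`, so `.clone()` is redundant.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DocId(u32);

/// Hypothetical collection used only to show the borrow pattern.
struct Seen {
    ids: Vec<DocId>,
}

impl Seen {
    fn record(&mut self, id: DocId) {
        self.ids.push(id);
    }

    fn record_twice(&mut self, id: DocId) {
        // Before (flagged by clippy::clone_on_copy): self.record(id.clone());
        // Before (needless re-borrow):              (&mut *self).record(id);
        // After: `id` is copied implicitly and `self` is already `&mut Self`.
        self.record(id);
        self.record(id);
    }
}

fn main() {
    let mut seen = Seen { ids: Vec::new() };
    seen.record_twice(DocId(7));
    println!("{:?}", seen.ids);
}
```

Neither rewrite changes behavior, which matches the entry's stated aim of staying away from actual changes.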
bors[bot]
15d478cf4d
Merge #635
...
635: Use an unstable algorithm for `grenad::Sorter` when possible r=Kerollmops a=loiclec
# Pull Request
## What does this PR do?
Use an unstable algorithm to sort the internal vector used by `grenad::Sorter` whenever possible to speed up indexing.
In practice, every time the merge function creates a `RoaringBitmap`, we use an unstable sort. For every other merge function, such as `keep_first`, `keep_last`, etc., a stable sort is used.
Co-authored-by: Loïc Lecrenier <loic@meilisearch.com>
2022-09-14 12:00:52 +00:00
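The reasoning in the #635 entry above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the `grenad::Sorter` implementation: `BTreeSet<u32>` stands in for `RoaringBitmap`, and `Entry`, `merge_union`, and `merge_keep_first` are hypothetical names. A union merge gives the same result whatever order equal keys arrive in, so an unstable sort is safe; a keep-first merge depends on insertion order, so it needs a stable sort.

```rust
use std::collections::BTreeSet;

/// Hypothetical sorter entry: (key, document id).
type Entry = (&'static str, u32);

/// Union merge: order-independent among equal keys, so an unstable sort is fine.
fn merge_union(mut entries: Vec<Entry>) -> Vec<(&'static str, BTreeSet<u32>)> {
    entries.sort_unstable_by_key(|(key, _)| *key);
    let mut out: Vec<(&'static str, BTreeSet<u32>)> = Vec::new();
    for (key, id) in entries {
        // Check whether this entry continues the current run of equal keys.
        let same_key = out.last().map_or(false, |(last, _)| *last == key);
        if same_key {
            out.last_mut().unwrap().1.insert(id);
        } else {
            out.push((key, BTreeSet::from([id])));
        }
    }
    out
}

/// Keep-first merge: the winner depends on insertion order, so the sort
/// must be stable to preserve that order among equal keys.
fn merge_keep_first(mut entries: Vec<Entry>) -> Vec<Entry> {
    entries.sort_by_key(|(key, _)| *key); // `sort_by_key` is a stable sort
    entries.dedup_by_key(|(key, _)| *key); // keeps the first entry of each run
    entries
}

fn main() {
    let entries: Vec<Entry> = vec![("b", 3), ("a", 1), ("a", 2), ("b", 4), ("a", 1)];
    println!("{:?}", merge_union(entries.clone()));
    println!("{:?}", merge_keep_first(entries));
}
```

Swapping `sort_by_key` for `sort_unstable_by_key` in `merge_keep_first` would still compile, but which value survives for each key would no longer be guaranteed.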
Loïc Lecrenier
3794962330
Use an unstable algorithm for grenad::Sorter when possible
2022-09-13 14:49:53 +02:00
Kerollmops
d4d7c9d577
We avoid skipping errors in the indexing pipeline
2022-09-13 14:03:00 +02:00
Kerollmops
c83c3cd796
Add a test to make sure that long words are correctly skipped
2022-09-07 14:12:36 +02:00
ManyTheFish
5391e3842c
replace optional_words by term_matching_strategy
2022-08-22 17:47:19 +02:00
ManyTheFish
9640976c79
Rename TermMatchingPolicies
2022-08-18 17:36:08 +02:00
Irevoire
e96b852107
bump heed
2022-08-17 17:05:50 +02:00
ManyTheFish
e9e2349ce6
Fix typo in comment
2022-08-17 15:09:48 +02:00
ManyTheFish
2668f841d1
Fix update indexing
2022-08-17 15:03:37 +02:00
ManyTheFish
7384650d85
Update test to showcase the bug
2022-08-17 15:03:08 +02:00
Loïc Lecrenier
58cb1c1bda
Simplify unit tests in facet/filter.rs
2022-08-04 12:03:44 +02:00