2114 Commits

Author SHA1 Message Date
ManyTheFish
4dd7b20c32 Update benchmarks 2022-06-02 17:33:25 +02:00
ManyTheFish
4dd3675d2b Update http-ui 2022-06-02 16:59:11 +02:00
ManyTheFish
86ac8568e6 Use Charabia in milli 2022-06-02 16:59:11 +02:00
ManyTheFish
192e024ada Add Charabia in Cargo.toml 2022-06-02 16:59:07 +02:00
bors[bot]
ac6df0df57
Merge #539
539: Update version to v0.28.1 r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-06-01 16:40:12 +00:00
Clémentine Urquizar
c19c17eddb
Update version to v0.28.1 2022-06-01 18:31:02 +02:00
bors[bot]
74d1914a64
Merge #535
535: Reintroduce the max values by facet limit r=ManyTheFish a=Kerollmops

This PR reintroduces the max values by facet limit. It is related to https://github.com/meilisearch/meilisearch/issues/2349.

~I would like some help in deciding on whether I keep the default 100 max values in milli and set up the `FacetDistribution` settings in Meilisearch to use 1000 as the new value, I expose the `max_values_by_facet` for this purpose.~

I changed the default value to 1000 and the max to 10000, thank you `@ManyTheFish` for the help!
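
As a rough illustration of the limit described above, here is a minimal sketch of how a requested value could be resolved against the default (1000) and the maximum (10000). The constant and function names are hypothetical and are not taken from milli; only the two numbers come from this PR description.

```rust
/// Default number of values returned per facet (from the PR description).
pub const DEFAULT_VALUES_PER_FACET: usize = 1000;
/// Upper bound on the number of values returned per facet (from the PR description).
pub const MAX_VALUES_PER_FACET: usize = 10000;

/// Hypothetical helper: resolves the limit that is actually applied.
/// `None` falls back to the default; larger requests are capped at the maximum.
pub fn effective_max_values_by_facet(requested: Option<usize>) -> usize {
    requested
        .unwrap_or(DEFAULT_VALUES_PER_FACET)
        .min(MAX_VALUES_PER_FACET)
}

/// Hypothetical helper: keeps at most the resolved number of values for one facet.
pub fn truncate_facet_values(
    values: Vec<(String, u64)>,
    requested: Option<usize>,
) -> Vec<(String, u64)> {
    let limit = effective_max_values_by_facet(requested);
    values.into_iter().take(limit).collect()
}
```

With this sketch, passing `None` yields the default of 1000, and a request above 10000 is silently capped rather than rejected; whether milli caps or rejects is not stated here.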

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-06-01 14:30:50 +00:00
bors[bot]
582930dbbb
Merge #538
538: speedup exact words r=Kerollmops a=MarinPostma

This PR makes `exact_words` return an `Option` instead of an empty set, since set creation is costly, as noticed by `@kerollmops`.

I was not convinced that this was the cause of all of the performance drop we measured, and then realized that the methods that initialized it were called recursively, which caused initialization times to add up. While the first fix solves the issue when not using exact words, using exact words remained way more expensive than it should be. To address this issue, the exact words are cached in the `Context`, so they are only initialized once.
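
The two fixes described above can be sketched roughly as follows. This is a simplified stand-in, not milli's code: the `Index` and `Context` types here are hypothetical, a `BTreeSet<String>` replaces the real set type, and the `loads` counter exists only to make the number of initializations observable.

```rust
use std::cell::Cell;
use std::collections::BTreeSet;

/// Stand-in for the index; `loads` counts how often the set is built.
pub struct Index {
    raw_exact_words: Vec<String>,
    pub loads: Cell<usize>,
}

impl Index {
    pub fn new(words: &[&str]) -> Self {
        Index {
            raw_exact_words: words.iter().map(|w| w.to_string()).collect(),
            loads: Cell::new(0),
        }
    }

    /// First fix: return `None` instead of building an empty set.
    pub fn exact_words(&self) -> Option<BTreeSet<String>> {
        if self.raw_exact_words.is_empty() {
            return None;
        }
        self.loads.set(self.loads.get() + 1);
        Some(self.raw_exact_words.iter().cloned().collect())
    }
}

/// Second fix: the context builds the set once and hands out references.
pub struct Context {
    exact_words: Option<BTreeSet<String>>,
}

impl Context {
    pub fn new(index: &Index) -> Self {
        Context { exact_words: index.exact_words() }
    }

    /// Returns `Option<&Set>` rather than `&Option<Set>`.
    pub fn exact_words(&self) -> Option<&BTreeSet<String>> {
        self.exact_words.as_ref()
    }

    pub fn is_exact(&self, word: &str) -> bool {
        self.exact_words().map_or(false, |set| set.contains(word))
    }
}
```

However many times `is_exact` is called, including from recursive code paths, the set is built at most once per `Context`, and not at all when no exact words are configured.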


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-05-30 08:20:34 +00:00
bors[bot]
9f78e392b1
Merge #536
536: Improves ranking rules error message r=Kerollmops a=matthias-wright

This PR improves the ranking rules error message to properly reflect the case sensitivity.
The issue was highlighted in [meilisearch/issues/2407](https://github.com/meilisearch/meilisearch/issues/2407).
Cheers!

Co-authored-by: Matthias Wright <matthias.s.wright@gmail.com>
2022-05-24 16:43:52 +00:00
ad hoc
25fc576696
review changes 2022-05-24 14:15:33 +02:00
ad hoc
69dc4de80f
change &Option<Set> to Option<&Set> 2022-05-24 12:14:55 +02:00
ad hoc
ac975cc747
cache context's exact words 2022-05-24 09:43:17 +02:00
ad hoc
8993fec8a3
return optional exact words 2022-05-24 09:15:49 +02:00
Matthias Wright
754f48a4fb Improves ranking rules error message 2022-05-20 21:25:43 +02:00
Kerollmops
cd7c6e19ed
Reintroduce the max values by facet limit 2022-05-18 15:57:57 +02:00
bors[bot]
19dac01c5c
Merge #534
534: Bump milli version to v0.28.0 r=curquiza a=ManyTheFish



Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-05-18 09:04:46 +00:00
ManyTheFish
895f5d8a26 Bump milli version 2022-05-18 10:37:12 +02:00
bors[bot]
3389561f34
Merge #532
532: Add some implementation on MatchBounds r=Kerollmops a=ManyTheFish

These implementations are needed in Meilisearch.

Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-05-17 14:50:22 +00:00
ManyTheFish
137434a1c8 Add some implementation on MatchBounds 2022-05-17 15:57:09 +02:00
bors[bot]
08c6d50cd1
Merge #531
531: fix the mixed dataset geosearch indexing bug r=Kerollmops a=irevoire

port #529 to main

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-16 16:06:36 +00:00
bors[bot]
cf3e574cb4
Merge #530
530: fix the searchable fields bug when a field is nested r=Kerollmops a=irevoire

port #528 to main

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-16 15:52:30 +00:00
Tamo
0af399a6d7
fix the mixed dataset geosearch indexing bug 2022-05-16 17:37:45 +02:00
Tamo
f586028f9a
fix the searchable fields bug when a field is nested
Update milli/src/index.rs

Co-authored-by: Clément Renault <clement@meilisearch.com>
2022-05-16 17:24:36 +02:00
bors[bot]
e1e85267fd
Merge #526
526: remove useless comment r=irevoire a=MarinPostma



Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-05-16 10:01:43 +00:00
bors[bot]
51809eb260
Merge #525
525: Simplify the error creation with thiserror r=irevoire a=irevoire

I introduced [`thiserror`](https://docs.rs/thiserror/latest/thiserror/) to implement all the `Display` traits and most of the `impl From<xxx> for yyy` in way fewer lines.
Then I introduced a cute macro to implement the `impl<X, Y, Z> From<X> for Z where Y: From<X>, Z: From<X>` more easily.
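
To give an idea of the kind of boilerplate involved, here is a standard-library-only sketch. It does not use `thiserror`: the `Display` and direct `From` impls are written by hand, which is roughly what the derive would otherwise generate. The error types and the `from_via!` macro are hypothetical and are not milli's actual definitions; the macro assumes the conversion chains through an intermediate type.

```rust
use std::fmt;

/// Hypothetical low-level error.
#[derive(Debug)]
pub struct ParseFailure(pub String);

/// Hypothetical intermediate error.
#[derive(Debug)]
pub enum UserError {
    InvalidInput(String),
}

/// Hypothetical top-level error.
#[derive(Debug)]
pub enum Error {
    User(UserError),
}

impl fmt::Display for UserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            UserError::InvalidInput(msg) => write!(f, "invalid input: {}", msg),
        }
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::User(inner) => write!(f, "user error: {}", inner),
        }
    }
}

impl From<ParseFailure> for UserError {
    fn from(error: ParseFailure) -> UserError {
        UserError::InvalidInput(error.0)
    }
}

impl From<UserError> for Error {
    fn from(error: UserError) -> Error {
        Error::User(error)
    }
}

/// Generates `impl From<Source> for Error` by converting through `Via`.
macro_rules! from_via {
    ($($source:ty => $via:ty),* $(,)?) => {
        $(
            impl From<$source> for Error {
                fn from(error: $source) -> Error {
                    Error::from(<$via>::from(error))
                }
            }
        )*
    };
}

from_via! { ParseFailure => UserError }
```

With the macro in place, a `ParseFailure` converts straight into an `Error` with `.into()` or `?`, without writing the intermediate step at each call site.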

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-04 15:47:32 +00:00
Tamo
484a9ddb27
Simplify the error creation with thiserror and a smol friendly macro 2022-05-04 17:24:00 +02:00
bors[bot]
65e6aa0de2
Merge #523
523: Improve geosearch error messages r=irevoire a=irevoire

Improve the geosearch error messages (#488).
And try to parse the string as specified in https://github.com/meilisearch/meilisearch/issues/2354

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-04 13:36:11 +00:00
bors[bot]
f3b9f7b867
Merge #527
527: Remove the wip section part of the contributing file r=curquiza a=Kerollmops

Everything was good in the _Development Workflow_ section, so I removed the _WIP Section_ part. This PR fixes https://github.com/meilisearch/milli/issues/513.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-05-04 13:11:30 +00:00
Kerollmops
48cdfddebf
Remove the wip section part of the contributing file 2022-05-04 14:44:51 +02:00
Tamo
c55368ddd4
apply code suggestion
Co-authored-by: Kerollmops <kero@meilisearch.com>
2022-05-04 14:11:03 +02:00
bors[bot]
60ccb3fa4c
Merge #524
524: Add benchmark on nested fields r=irevoire a=irevoire

fixes #500

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-05-04 11:56:18 +00:00
ad hoc
5ad5d56f7e
remove useless comment 2022-05-04 10:43:54 +02:00
bors[bot]
0c2c8af44e
Merge #520
520: fix mistake in Settings initialization r=irevoire a=MarinPostma

fix settings not being correctly initialized and add a test to make sure that they are in the future.

fix https://github.com/meilisearch/meilisearch/issues/2358


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-05-03 15:32:18 +00:00
bors[bot]
2fe9a02b1c
Merge #522
522: Do not generate keys that are too long for LMDB r=Kerollmops a=Kerollmops

This PR fixes https://github.com/meilisearch/meilisearch/issues/2338 by making sure that we do not generate keys that are too long for LMDB especially when we are creating our prefix and proximity pairs keys.
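
A minimal sketch of the kind of guard described above, assuming LMDB's default maximum key size of 511 bytes. The key layout (two words separated by zero bytes, followed by a proximity byte) and the function name are illustrative assumptions, not necessarily what milli does.

```rust
/// LMDB's default maximum key size in bytes (assumed; it is a build-time setting).
pub const MAX_KEY_LENGTH: usize = 511;

/// Hypothetical helper: builds a word-pair proximity key and returns `None`
/// when the key would be too long to store in LMDB.
pub fn proximity_pair_key(word1: &str, word2: &str, proximity: u8) -> Option<Vec<u8>> {
    let mut key = Vec::with_capacity(word1.len() + word2.len() + 3);
    key.extend_from_slice(word1.as_bytes());
    key.push(0); // separator after the first word
    key.extend_from_slice(word2.as_bytes());
    key.push(0); // separator after the second word
    key.push(proximity);

    if key.len() <= MAX_KEY_LENGTH {
        Some(key)
    } else {
        None
    }
}
```

Callers can then skip any pair for which the function returns `None` instead of handing an oversized key to LMDB.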

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-05-03 11:54:10 +00:00
Kerollmops
211c8763b9
Make sure that we do not generate too long keys 2022-05-03 10:03:15 +02:00
Kerollmops
7e47031bdc
Add a test for long keys in LMDB 2022-05-03 10:03:13 +02:00
Tamo
f820c9804d
add one nested benchmark 2022-05-02 19:35:57 +02:00
Tamo
3cb1f6d0a1
improve geosearch error messages 2022-05-02 19:20:47 +02:00
ad hoc
1ee3d6ae33
fix mistake in Settings initialization 2022-04-29 16:24:25 +02:00
bors[bot]
312515dd6b
Merge #507
507: deny warnings in CI r=Kerollmops a=MarinPostma

Add `RUSTFLAGS= -D warnings` to the CI so all warnings are treated as hard errors.

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-28 15:16:35 +00:00
ad hoc
3eb3f0269e
deny warnings in CI 2022-04-28 15:35:12 +02:00
bors[bot]
9db86aac51
Merge #518
518: Return facets even when there is no value associated to it r=Kerollmops a=Kerollmops

This PR is related to https://github.com/meilisearch/meilisearch/issues/2352 and should fix the issue when Meilisearch is up-to-date with this PR.

Co-authored-by: Kerollmops <clement@meilisearch.com>
2022-04-28 09:04:36 +00:00
bors[bot]
2aae19dc52
Merge #517
517: Make nightly CI run every week r=Kerollmops a=curquiza



Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-26 16:22:25 +00:00
Kerollmops
a4d343aade
Add a test to check for the returned facet distribution 2022-04-26 18:12:58 +02:00
bors[bot]
c2bd94c871
Merge #511
511: Update version in every workspace r=curquiza a=curquiza

Checked with `@Kerollmops` 

- Update the version in every workspace (the current version is v0.27.0, but I forgot to update it for the previous release).
- Add `publish = false` everywhere except in the `milli` workspace.


Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-26 16:06:47 +00:00
Kerollmops
7d1c2d97bf
Return facets even when there is no values associated to it 2022-04-26 17:59:53 +02:00
bors[bot]
d388ea0f9d
Merge #506
506: fix cargo warnings r=Kerollmops a=MarinPostma

fix cargo warnings


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-26 15:45:20 +00:00
Clémentine Urquizar
ec89030483
Update bors toml 2022-04-26 17:36:04 +02:00
ad hoc
5c29258e8e
fix cargo warnings 2022-04-26 17:33:11 +02:00
bors[bot]
2fdf520271
Merge #514
514: Stop flattening every field r=Kerollmops a=irevoire

When we need to flatten a document:
* The primary key contains a `.`.
* Some fields need to be flattened

Instead of flattening the whole object and thus creating a lot of allocations with the `serde_json_flatten_crate`, we instead generate a minimal sub-object containing only the fields that need to be flattened.
That should create fewer allocations and thus index faster.
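
The idea can be sketched as follows, under simplifying assumptions: a tiny hypothetical `Value` enum stands in for JSON values, only top-level fields holding objects count as "needing to be flattened", and the primary key case is ignored. This is not the actual implementation.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Text(String),
    Object(BTreeMap<String, Value>),
}

/// Recursively flattens `value` into dotted keys rooted at `prefix`.
fn flatten_into(prefix: &str, value: &Value, out: &mut BTreeMap<String, Value>) {
    match value {
        Value::Object(children) => {
            for (key, child) in children {
                flatten_into(&format!("{}.{}", prefix, key), child, out);
            }
        }
        leaf => {
            out.insert(prefix.to_string(), leaf.clone());
        }
    }
}

/// Flattens only the fields that hold nested objects and leaves the rest of
/// the document untouched. Returns `None` when nothing needed flattening.
pub fn flatten_nested_fields(
    doc: &BTreeMap<String, Value>,
) -> Option<BTreeMap<String, Value>> {
    let mut flattened = BTreeMap::new();
    for (key, value) in doc {
        if matches!(value, Value::Object(_)) {
            flatten_into(key, value, &mut flattened);
        }
    }
    if flattened.is_empty() {
        None
    } else {
        Some(flattened)
    }
}
```

For a document with a plain `id` field and a nested `_geo` object, only the `_geo.*` entries are produced; a document without nested fields allocates nothing beyond the empty map.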

---------

```
group                                                             indexing_main_e1e362fa                 indexing_stop-flattening-every-field_40d1bd6b
-----                                                             ----------------------                 ---------------------------------------------
indexing/Indexing geo_point                                       1.99      23.7±0.23s        ? ?/sec    1.00      11.9±0.21s        ? ?/sec
indexing/Indexing movies in three batches                         1.00      18.2±0.24s        ? ?/sec    1.01      18.3±0.29s        ? ?/sec
indexing/Indexing movies with default settings                    1.00      17.5±0.09s        ? ?/sec    1.01      17.7±0.26s        ? ?/sec
indexing/Indexing songs in three batches with default settings    1.00      64.8±0.47s        ? ?/sec    1.00      65.1±0.49s        ? ?/sec
indexing/Indexing songs with default settings                     1.00      54.9±0.99s        ? ?/sec    1.01      55.7±1.34s        ? ?/sec
indexing/Indexing songs without any facets                        1.00      50.6±0.62s        ? ?/sec    1.01      50.9±1.05s        ? ?/sec
indexing/Indexing songs without faceted numbers                   1.00      54.0±1.14s        ? ?/sec    1.01      54.7±1.13s        ? ?/sec
indexing/Indexing wiki                                            1.00     996.2±8.54s        ? ?/sec    1.02   1021.1±30.63s        ? ?/sec
indexing/Indexing wiki in three batches                           1.00    1136.8±9.72s        ? ?/sec    1.00    1138.6±6.59s        ? ?/sec
```

So basically everything slowed down a little bit, except the dataset with a nested field, which became about twice as fast.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-26 11:50:33 +00:00