many
360c5ff3df
Remove limit of 1000 positions per attribute
...
Instead of using an arbitrary limit, we encode the absolute position in a u32,
using one strong (high) u16 for the field id and one weak (low) u16 for the relative position in the attribute.
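A minimal sketch of the packing described above, assuming "strong" means the high 16 bits and "weak" the low 16 bits. The function names `pack_position` and `unpack_position` are hypothetical and chosen for illustration only; they are not taken from the commit.

```rust
/// Packs a field id and a relative position into one absolute position.
/// Assumption: the field id occupies the high ("strong") 16 bits and the
/// relative position the low ("weak") 16 bits.
pub fn pack_position(field_id: u16, relative: u16) -> u32 {
    ((field_id as u32) << 16) | (relative as u32)
}

/// Splits an absolute position back into (field id, relative position).
pub fn unpack_position(absolute: u32) -> (u16, u16) {
    ((absolute >> 16) as u16, (absolute & 0xFFFF) as u16)
}
```

With this layout, positions sort first by field id and then by position inside the attribute, and the relative position can use the full u16 range instead of being capped at 1000.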
2021-10-12 10:10:50 +02:00
刘瀚骋
d323e35001
add a test case
2021-10-12 13:30:40 +08:00
刘瀚骋
70f576d5d3
error handling
2021-10-12 13:30:40 +08:00
刘瀚骋
28f9be8d7c
support syntax
2021-10-12 13:30:40 +08:00
刘瀚骋
469d92c569
tweak error handling
2021-10-12 13:30:40 +08:00
刘瀚骋
7a90a101ee
reorganize parser logic
2021-10-12 13:30:40 +08:00
刘瀚骋
f7796edc7e
remove everything about pest
2021-10-12 13:30:40 +08:00
刘瀚骋
ac1df9d9d7
fix typo and remove pest
2021-10-12 13:30:40 +08:00
刘瀚骋
50ad750ec1
enhance error handling
2021-10-12 13:30:40 +08:00
刘瀚骋
8748df2ca4
draft without error handling
2021-10-12 13:30:40 +08:00
bors[bot]
8f6b6c9042
Merge #385
...
385: Fix the wiki indexing benchmark r=ManyTheFish a=irevoire
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-11 15:12:24 +00:00
bors[bot]
07fb6d64e5
Merge #386
...
386: fix obkv document r=curquiza a=MarinPostma
When serializing a document, the serializer resolved the field_id of the current field and immediately added it to the obkv document under construction. The issue is that obkv expects the fields to be inserted in order, so when a document with out-of-order fields was added, obkv failed to insert the field.
The current fix first resolves each field_id, and adds all the fields to a temporary `BTreeMap`, until `end` is called on the map serializer, where all the fields are added to the obkv at once, and in order.
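The buffering idea can be sketched as follows. This is an illustration using only the standard library: `OrderedWriter` is a hypothetical stand-in for an obkv-style writer that rejects out-of-order keys, and `write_document` is an assumed helper name, not the actual serializer code.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an obkv-style writer: keys must be strictly increasing.
pub struct OrderedWriter {
    entries: Vec<(u16, Vec<u8>)>,
}

impl OrderedWriter {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Fails when `key` is not greater than the last inserted key.
    pub fn insert(&mut self, key: u16, value: &[u8]) -> Result<(), String> {
        if let Some((last, _)) = self.entries.last() {
            if key <= *last {
                return Err(format!("key {} inserted after key {}", key, last));
            }
        }
        self.entries.push((key, value.to_vec()));
        Ok(())
    }

    pub fn keys(&self) -> Vec<u16> {
        self.entries.iter().map(|(k, _)| *k).collect()
    }
}

/// Buffers every (field_id, value) pair in a BTreeMap, then flushes them to the
/// writer in one pass. BTreeMap iteration is sorted by key, so the writer only
/// ever sees increasing field ids, whatever the input order was.
pub fn write_document(fields: &[(u16, &[u8])]) -> Result<OrderedWriter, String> {
    let mut buffer = BTreeMap::new();
    for (field_id, value) in fields {
        buffer.insert(*field_id, *value);
    }
    let mut writer = OrderedWriter::new();
    for (field_id, value) in buffer {
        writer.insert(field_id, value)?;
    }
    Ok(writer)
}
```

The cost of this approach is one temporary map per document, in exchange for never depending on the field order of the incoming document.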
Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-10-11 13:45:04 +00:00
bors[bot]
e45c846af5
Merge #387
...
387: Update version for the next release (v0.17.2) r=Kerollmops a=curquiza
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-11 13:21:47 +00:00
Clémentine Urquizar
dd56e82dba
Update version for the next release (v0.17.2)
2021-10-11 15:20:35 +02:00
mpostma
99889a0ed0
add obkv document serialization test
2021-10-11 15:13:17 +02:00
mpostma
799f3d43c8
fix serialization to obkv format
2021-10-11 15:04:47 +02:00
Tamo
ed7fd855af
fix the wiki indexing benchmark
2021-10-11 14:26:36 +02:00
Tom Parker-Shemilt
2dfe24f067
memmap -> memmap2
2021-10-10 22:47:12 +01:00
bors[bot]
a2743baaa3
Merge #383
...
383: Add check on latitude and longitude r=irevoire a=irevoire
Latitudes are not supposed to go beyond 90 degrees or below -90.
The same goes for longitudes with 180 or -180.
This was badly implemented in the filters, and was not implemented for the `AscDesc` rules.
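A minimal sketch of such a range check. The function name `validate_geo_point` and the error messages are assumptions made for illustration; they are not the ones used in the PR.

```rust
/// Checks that a point lies within the valid geographic ranges:
/// latitude in [-90, 90] and longitude in [-180, 180].
/// NaN is rejected as well, because it is contained in no range.
pub fn validate_geo_point(lat: f64, lng: f64) -> Result<(), String> {
    if !(-90.0..=90.0).contains(&lat) {
        return Err(format!("latitude must be between -90 and 90 degrees, got {}", lat));
    }
    if !(-180.0..=180.0).contains(&lng) {
        return Err(format!("longitude must be between -180 and 180 degrees, got {}", lng));
    }
    Ok(())
}
```

The same helper could be called from both the filter parsing and the `AscDesc` rule parsing, so that the two code paths cannot drift apart.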
Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Irevoire <tamo@meilisearch.com>
2021-10-08 10:15:25 +00:00
Irevoire
b65aa7b5ac
Apply suggestions from code review
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-10-07 17:51:52 +02:00
Tamo
11dfe38761
Update the check on the latitude and longitude
...
Latitudes are not supposed to go beyond 90 degrees or below -90.
The same goes for longitudes with 180 or -180.
This was badly implemented in the filters, and was not implemented for the AscDesc rules.
2021-10-07 16:10:43 +02:00
bors[bot]
dde1da1c0e
Merge #382
...
382: Refactor attribute criterion r=Kerollmops a=ManyTheFish
### Re-implement set based algorithm for attribute criterion
#### Levels
Instead of doing level iteration and digging in the interesting level, we only iterate over the lowest level.
#### crossword iteration VS minimal position iteration
Instead of crossing word positions in order to iterate strictly over the positions that give the best rank, in the right order, we iterate word by word, starting with the word that increases the rank as little as possible.
This new method is a bit less precise but way simpler.
### Simplify word-level-position database
We don't use levels anymore in the attribute criterion, so we removed the level complexity from the database, turning it into a word-position-docids database.
### Benchmarks on search on big datasets
#### songs main VS refactor-attribute-criterion
```diff
group search_songsmain_31c18f09 search_songsrefactor-attribute-criterion_1bd15d84
----- ------------------------- -------------------------------------------------
- smol-songs.csv: basic filter: <=/Notstandskomitee 1.00 84.8±0.58µs ? ?/sec 1.09 92.2±8.98µs ? ?/sec
+ smol-songs.csv: basic filter: TO/Notstandskomitee 1.18 98.0±6.30µs ? ?/sec 1.00 83.2±0.97µs ? ?/sec
+ smol-songs.csv: basic with quote/"david" "bowie" 114.68 76.0±0.20ms ? ?/sec 1.00 662.5±5.03µs ? ?/sec
- smol-songs.csv: basic with quote/"john" 1.00 197.4±1.06µs ? ?/sec 1.05 208.1±1.53µs ? ?/sec
+ smol-songs.csv: basic with quote/"michael" "jackson" 2.75 2.0±0.01ms ? ?/sec 1.00 738.9±3.91µs ? ?/sec
+ smol-songs.csv: basic without quote/david bowie 297.42 1499.3±0.86ms ? ?/sec 1.00 5.0±0.02ms ? ?/sec
+ smol-songs.csv: basic without quote/michael jackson 2.55 8.9±0.02ms ? ?/sec 1.00 3.5±0.01ms ? ?/sec
+ smol-songs.csv: big filter/john 1.08 473.6±2.25µs ? ?/sec 1.00 438.1±2.59µs ? ?/sec
- smol-songs.csv: prefix search/a 1.00 446.9±1.81µs ? ?/sec 1.79 800.5±4.45µs ? ?/sec
- smol-songs.csv: prefix search/b 1.00 398.5±2.74µs ? ?/sec 1.81 723.1±5.46µs ? ?/sec
- smol-songs.csv: prefix search/i 1.00 486.3±1.99µs ? ?/sec 1.69 823.6±9.42µs ? ?/sec
- smol-songs.csv: prefix search/s 1.00 229.6±3.29µs ? ?/sec 2.59 594.4±2.22µs ? ?/sec
- smol-songs.csv: prefix search/x 1.00 150.2±0.76µs ? ?/sec 1.11 166.0±0.87µs ? ?/sec
```
On songs, the new algorithm gives a big improvement on slow queries, and is slower on one-char prefix searches (fast queries, <1ms).
#### wiki main VS refactor-attribute-criterion
```diff
group search_wikimain_31c18f09 search_wikirefactor-attribute-criterion_1bd15d84
----- ------------------------ ------------------------------------------------
- smol-wiki-articles.csv: basic with quote/"rock" "and" "roll" 1.00 3.2±0.01ms ? ?/sec 1.15 3.7±0.01ms ? ?/sec
- smol-wiki-articles.csv: basic without quote/film 1.00 351.5±2.47µs ? ?/sec 1.13 396.8±1.63µs ? ?/sec
+ smol-wiki-articles.csv: basic without quote/rock and roll 1.10 9.4±0.02ms ? ?/sec 1.00 8.6±0.04ms ? ?/sec
- smol-wiki-articles.csv: basic without quote/spain 1.00 446.0±3.23µs ? ?/sec 1.11 496.6±7.75µs ? ?/sec
- smol-wiki-articles.csv: prefix search/c 1.00 115.6±0.61µs ? ?/sec 2.22 256.7±1.24µs ? ?/sec
- smol-wiki-articles.csv: prefix search/g 1.00 189.7±2.03µs ? ?/sec 1.57 297.0±1.35µs ? ?/sec
- smol-wiki-articles.csv: prefix search/j 1.00 209.2±1.11µs ? ?/sec 1.40 293.0±2.09µs ? ?/sec
- smol-wiki-articles.csv: prefix search/q 1.00 79.0±0.44µs ? ?/sec 1.10 87.2±0.69µs ? ?/sec
- smol-wiki-articles.csv: prefix search/t 1.00 270.1±1.15µs ? ?/sec 1.55 419.9±5.16µs ? ?/sec
- smol-wiki-articles.csv: prefix search/x 1.00 244.9±1.33µs ? ?/sec 1.07 260.9±1.95µs ? ?/sec
- smol-wiki-articles.csv: words/Abraham machin 1.00 8.1±0.03ms ? ?/sec 1.17 9.4±0.02ms ? ?/sec
- smol-wiki-articles.csv: words/Idaho Bellevue pizza 1.00 19.3±0.07ms ? ?/sec 1.07 20.6±0.05ms ? ?/sec
```
On wiki, we have some regressions (`+17%` and `+15%`) on requests taking `>1ms`.
Co-authored-by: many <maxime@meilisearch.com>
2021-10-06 09:19:33 +00:00
many
085bc6440c
Apply PR comments
2021-10-06 11:12:26 +02:00
many
1bd15d849b
Reduce candidates threshold
2021-10-05 18:52:14 +02:00
many
ea4bd29d14
Apply PR comments
2021-10-05 17:35:07 +02:00
many
5ed75de0db
Update infos crate
2021-10-05 13:56:12 +02:00
many
3296bb243c
Simplify word level position DB into a word position DB
2021-10-05 12:15:02 +02:00
many
75d341d928
Re-implement set based algorithm for attribute criterion
2021-10-05 12:14:50 +02:00
bors[bot]
31c18f0953
Merge #381
...
381: Update version for the next release (v0.17.1) r=irevoire a=curquiza
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-10-03 02:12:43 +00:00
Clémentine Urquizar
05d8a33a28
Update version for the next release (v0.17.1)
2021-10-02 16:21:31 +02:00
bors[bot]
c9092c72bf
Merge #380
...
380: Reserved keyword error message r=Kerollmops a=irevoire
And I missed _another_ reserved keyword error message in the filter :(
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-10-01 07:13:31 +00:00
Tamo
d9eba9d145
improve and test the sort error message
2021-09-30 14:38:27 +02:00
Tamo
0ee67bb7d1
improve the reserved keyword error message for the filters
2021-09-30 14:38:27 +02:00
bors[bot]
22551d0941
Merge #379
...
379: Revert "Change chunk size to 4MiB to fit more the end user usage" r=curquiza a=ManyTheFish
Reverts meilisearch/milli#370
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-09-29 13:20:53 +00:00
Many
26b5dad042
Revert "Change chunk size to 4MiB to fit more the end user usage"
2021-09-29 15:08:39 +02:00
bors[bot]
6a057a3bd0
Merge #378
...
378: Hotfix meilisearch#1707 r=Kerollmops a=ManyTheFish
This PR contains an ugly quick fix of [meilisearch#1707](https://github.com/meilisearch/MeiliSearch/issues/1707).
- Remove the reversed comparison on rank, enhancing relevancy and performance.
- Iterate over level 0 only, enhancing performance.
A better fix is in development.
Co-authored-by: many <maxime@meilisearch.com>
Co-authored-by: Many <legendre.maxime.isn@gmail.com>
2021-09-29 12:57:31 +00:00
Many
2e49230ca2
Update milli/src/search/criteria/attribute.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-29 14:49:45 +02:00
Many
7ad0214089
Update milli/src/search/criteria/attribute.rs
...
Co-authored-by: Clément Renault <clement@meilisearch.com>
2021-09-29 14:49:41 +02:00
many
1df5b8712b
Hotfix meilisearch#1707
2021-09-29 14:41:56 +02:00
bors[bot]
bfedbc1b6d
Merge #374
...
374: Enhance CSV document parsing r=Kerollmops a=ManyTheFish
Benchmarks on `search_songs` were crashing because of the CSV parsing.
Co-authored-by: many <maxime@meilisearch.com>
2021-09-29 08:55:54 +00:00
bors[bot]
68c758a533
Merge #376
...
376: Stop casting integer docids to string r=Kerollmops a=irevoire
When a docid is an integer, we stop casting it to a string, and thus we don't add `"` around it.
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-29 08:32:48 +00:00
many
d2427f18e5
Enhance CSV document parsing
2021-09-29 10:25:33 +02:00
bors[bot]
00f94b1ffd
Merge #377
...
377: Update version for the next release (v0.17.0) r=Kerollmops a=curquiza
Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2021-09-28 20:43:33 +00:00
Clémentine Urquizar
0e8665bf18
Update version for the next release (v0.17.0)
2021-09-28 19:38:12 +02:00
Tamo
f65153ad64
stop casting integer docids to string
2021-09-28 18:35:54 +02:00
bors[bot]
adddf3f179
Merge #375
...
375: Fixes #365 r=Kerollmops a=vishnugt
Co-authored-by: Vishnu Ganesan <vganesan@microsoft.com>
Co-authored-by: Vishnu Gt <vishnugt@hotmail.com>
2021-09-28 14:42:48 +00:00
Vishnu Gt
785c1372f2
Change "settings" to "setting"
...
Co-authored-by: Clément Renault <renault.cle@gmail.com>
2021-09-28 20:11:32 +05:30
Vishnu Ganesan
3580b2d803
Fixes #365
2021-09-28 19:30:23 +05:30
bors[bot]
3a12f5887e
Merge #373
...
373: Improve error message for bad sort syntax with geosearch r=Kerollmops a=irevoire
`@Kerollmops` This should be the last PR for the geosearch and error handling, sorry for doing it in so many steps 😬
Co-authored-by: Tamo <tamo@meilisearch.com>
2021-09-28 12:39:32 +00:00
Tamo
a80dcfd4a3
improve error message for bad sort syntax with geosearch
2021-09-28 14:32:24 +02:00