Louis Dureuil
406b8bd248
Add new db caches
2023-04-04 17:04:46 +02:00
Loïc Lecrenier
62b9c6fbee
Add search tests
2023-04-04 16:18:22 +02:00
Loïc Lecrenier
b439d36807
Split query_term module into multiple submodules
2023-04-04 15:38:30 +02:00
Loïc Lecrenier
faceb661e3
Add note that a part of the code needs fixing
2023-04-04 15:02:01 +02:00
Loïc Lecrenier
4129d657e2
Simplify query_term module a bit
2023-04-04 15:01:42 +02:00
Filip Bachul
1e6fe71a67
fix clippy warning
2023-04-03 20:18:26 +02:00
Filip Bachul
fddfb37f1f
remove unnecessary FilterError:ReservedGeo and FilterError:ReservedGeo
2023-04-03 20:18:26 +02:00
Loïc Lecrenier
3f13608002
Fix computation of ngram derivations
2023-04-03 15:27:49 +02:00
Loïc Lecrenier
4708d9b016
Fix compiler warnings/errors
2023-04-03 10:09:27 +02:00
Clément Renault
0d2e7bcc13
Implement the previous way for the exhaustive distinct candidates
2023-04-03 10:08:10 +02:00
Loïc Lecrenier
55fbfb6124
Merge branch 'search-refactor-located-query-terms' into search-refactor
2023-04-03 10:04:36 +02:00
Loïc Lecrenier
58fe260c72
Allow removing all the terms from a query if it contains a phrase
2023-04-03 09:18:02 +02:00
Loïc Lecrenier
24e5f6f7a9
Don't remove phrases with "last" term matching strategy
2023-04-03 09:17:33 +02:00
Louis Dureuil
9b87c36200
Limit the number of derivations for a single word.
2023-03-31 09:19:18 +02:00
Loïc Lecrenier
12b26cd54e
Don't remove phrases from the query with term matching strategy Last
2023-03-30 14:54:08 +02:00
Loïc Lecrenier
061b1e6d7c
Tiny refactor of query graph remove_nodes method
2023-03-30 14:49:25 +02:00
Loïc Lecrenier
0d6e8b5c31
Fix phrase search bug when the phrase has only one word
2023-03-30 14:48:12 +02:00
Loïc Lecrenier
d48cdc67a0
Fix term matching strategy bugs
2023-03-30 14:01:52 +02:00
Loïc Lecrenier
35c16ad047
Use new term matching strategy logic in words ranking rule
2023-03-30 13:15:43 +02:00
Loïc Lecrenier
2997d1f186
Use new term matching strategy logic in resolve_maximally_reduced_...
2023-03-30 13:12:51 +02:00
Loïc Lecrenier
2a5997fb20
Avoid expensive assert! in bucket sort function
2023-03-30 13:07:17 +02:00
Loïc Lecrenier
ee8a9e0bad
Remove outdated sentence in documentation
2023-03-30 12:22:24 +02:00
Loïc Lecrenier
3b0737a092
Fix detailed logger
2023-03-30 12:20:44 +02:00
Loïc Lecrenier
fdd02105ac
Graph-based ranking rule + term matching strategy support
2023-03-30 12:19:21 +02:00
Loïc Lecrenier
aa9592455c
Refactor the paths_of_cost algorithm
...
Support conditions that require certain nodes to be skipped
2023-03-30 12:11:11 +02:00
Loïc Lecrenier
01e24dd630
Rewrite proximity ranking rule
2023-03-30 11:59:06 +02:00
Loïc Lecrenier
ae6bb1ce17
Update the ConditionDocidsCache after change to RankingRuleGraphTrait
2023-03-30 11:41:20 +02:00
Loïc Lecrenier
5fd28620cd
Build ranking rule graph correctly after changes to trait definition
2023-03-30 11:32:55 +02:00
Loïc Lecrenier
728710d63a
Update typo ranking rule to use new query term structure
2023-03-30 11:32:19 +02:00
Loïc Lecrenier
fa81381865
Update the trait requirements of ranking-rule graphs
2023-03-30 11:19:45 +02:00
Loïc Lecrenier
b96a682f16
Update resolve_graph module to work with lazy query terms
2023-03-30 11:10:38 +02:00
Loïc Lecrenier
d0f048c068
Simplify the API of the DatabaseCache
2023-03-30 11:08:17 +02:00
Loïc Lecrenier
223e82a10d
Update QueryGraph to use new lazy query terms + build from paths
2023-03-30 11:06:02 +02:00
Loïc Lecrenier
9507ff5e31
Update query term structure to allow for laziness
2023-03-30 11:06:02 +02:00
Louis Dureuil
c2b025946a
located_query_terms_from_string
: use u16 for positions, hard limit number of iterated tokens.
...
- Refactor phrase logic to reduce number of possible states
2023-03-30 11:04:14 +02:00
Loïc Lecrenier
3a818c5e87
Add more functionality to interners
2023-03-30 09:56:23 +02:00
Louis Dureuil
d74134ce3a
Check sort criteria
2023-03-29 15:21:54 +02:00
Louis Dureuil
5ac129bfa1
Mark geosearch as currently unimplemented for sort rule
2023-03-29 15:20:42 +02:00
ManyTheFish
efea1e5837
Fix facet normalization
2023-03-29 12:02:24 +02:00
Louis Dureuil
abb4522f76
Small comment on ignored rules for placeholder search
2023-03-29 09:11:06 +02:00
Louis Dureuil
ef084ef042
SmallBitmap: Consistently panic on incoherent universe lengths
2023-03-29 08:45:38 +02:00
Louis Dureuil
3524bd1257
SmallBitmap: Add documentation
2023-03-29 08:44:11 +02:00
Tamo
a50b058557
update the geoBoundingBox feature
...
Now instead of using the (top_left, bottom_right) corners of the bounding box it s using the (top_right, bottom_left) corners.
2023-03-28 18:26:18 +02:00
Louis Dureuil
d4f6216966
Resolve rule time sort criteria
2023-03-28 16:42:02 +02:00
Louis Dureuil
77acafe534
Resolve search time sort criteria for placeholder search
2023-03-28 16:41:03 +02:00
Louis Dureuil
abb19d368d
Initialize query time ranking rule for query search
2023-03-28 12:40:52 +02:00
Louis Dureuil
b4a52a622e
BoxRankingRule
2023-03-28 12:39:42 +02:00
Louis Dureuil
e9eb271499
Remove empty attribute_rule mod
2023-03-27 11:08:03 +02:00
Louis Dureuil
3281a88d08
SmallBitmap: don't expose internal items
2023-03-27 11:04:43 +02:00
Louis Dureuil
5a644054ab
Removed unused search impl
2023-03-27 11:04:27 +02:00
Louis Dureuil
16fefd364e
Add TODO notes
2023-03-27 11:04:04 +02:00
Loïc Lecrenier
00bad8c716
Add comments suggesting performance improvements
2023-03-23 10:18:24 +01:00
Loïc Lecrenier
862714a18b
Remove criterion_implementation_strategy param of Search
2023-03-23 09:44:12 +01:00
Loïc Lecrenier
d18ebe4f3a
Remove more warnings
2023-03-23 09:41:18 +01:00
Loïc Lecrenier
7169d85115
Remove old query_tree code and make clippy happy
2023-03-23 09:39:16 +01:00
Loïc Lecrenier
f5f5f03ec0
Remove old criteria code
2023-03-23 09:35:53 +01:00
Loïc Lecrenier
9b2653427d
Split position DB into fid and relative position DB
2023-03-23 09:22:01 +01:00
Loïc Lecrenier
56b7209f26
Make clippy happy
2023-03-23 09:16:17 +01:00
Loïc Lecrenier
9b1f439a91
WIP
2023-03-23 09:12:35 +01:00
Loïc Lecrenier
a86aeba411
WIP
2023-03-22 14:43:08 +01:00
Loïc Lecrenier
384fdc2df4
Fix two bugs in proximity ranking rule
2023-03-21 11:43:25 +01:00
Loïc Lecrenier
83e5b4ed0d
Compute edges of proximity graph lazily
2023-03-21 10:44:40 +01:00
Loïc Lecrenier
272cd7ebbd
Small cleanup
2023-03-20 13:39:19 +01:00
Loïc Lecrenier
c63c7377e6
Switch order of MappedInterner generic params
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
5b50e49522
cargo fmt
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
65474c8de5
Update new sort ranking rule after rebasing
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
fbb1ba3de0
Cargo fmt
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
a59ca28e2c
Add forgotten file
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
825f742000
Simplify graph-based ranking rule impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
dd491320e5
Simplify graph-based ranking rule impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c6ff97a220
Rewrite the dead-ends cache to detect more dead-ends
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
49240c367a
Fix bug in cost of typo conditions
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
1e6e624078
Fix bug in SmallBitmap
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
8b4e07e1a3
WIP
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
2853009987
Renaming Edge -> Condition
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
aa59c3bc2c
Replace EdgeCondition with an Option<..> + other code cleanup
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
7b1d8f4c6d
Make PathSet strongly typed
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
a49ddec9df
Prune the query graph after executing a ranking rule
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
05fe856e6e
Merge forward and backward proximity conditions in proximity graph
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c0cdaf9f53
Fix bug in the proximity ranking rule for queries with ngrams
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
e9cf58d584
Refactor of the Interner
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
31628c5cd4
Merge Phrase and WordDerivations into one structure
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
3004e281d7
Support ngram typos + splitwords and splitwords+synonyms in proximity
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
14e8d0aaa2
Rename lifetime
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
1c58cf8426
Intern ranking rule graph edge conditions as well
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
5155fd2bf1
Reorganise initialisation of ranking rules + rename PathsMap -> PathSet
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
9ec9c204d3
Small code cleanup
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
78b9304d52
Implement distinct attribute
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
0465ba4a05
Intern more values
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
2099991dd1
Continue documenting and cleaning up the code
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c232cdabf5
Add documentation
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
4e266211bf
Small code reorganisation
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
57fa689131
Cargo fmt
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
10626dddfc
Add a few more optimisations to new search algorithms
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
9051065c22
Apply a few optimisations for graph-based ranking rules
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
e8c76cf7bf
Intern all strings and phrases in the search logic
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
3f1729a17f
Update new search test
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
cab2b6bcda
Fix: computation of initial universe, code organisation
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c4979a2fda
Fix code visibility issue + unimplemented detail in proximity rule
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
23931f8a4f
Fix small bug in visual logger of search algo
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
aa414565bb
Fix proximity graph edge builder to include all proximities
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
1db152046e
WIP on split words and synonyms support
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
c27ea2677f
Rewrite cheapest path algorithm and empty path cache
...
It is now much simpler and has much better performance.
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
caa1e1b923
Add typo ranking rule to new search impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
71f18e4379
Add sort ranking rule to new search impl
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
600e3dd1c5
Remove warnings
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
362eb0de86
Add support for filters
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
998d46ac10
Add support for search offset and limit
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
6c85c0d95e
Fix more bugs + visual empty path cache logging
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
0e1fbbf7c6
Fix bugs in query graph's "remove word" and "cheapest paths" algos
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
6806640ef0
Fix d2 description of paths map
2023-03-20 09:41:56 +01:00
Loïc Lecrenier
173e37584c
Improve the visual/detailed search logger
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
6ba4d5e987
Add a search logger
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
dd12d44134
Support swapped word pairs in new proximity ranking rule impl
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
c8e251bf24
Remove noise in codebase
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
a938fbde4a
Use a cache when resolving the query graph
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
dcf3f1d18a
Remove EdgeIndex and NodeIndex types, prefer u32 instead
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
66d0c63694
Add some documentation and use bitmaps instead of hashmaps when possible
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
132191360b
Introduce the sort ranking rule working with the new search structures
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
345c99d5bd
Introduce the words ranking rule working with the new search structures
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
89d696c1e3
Introduce the proximity ranking rule as a graph-based ranking rule
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
c645853529
Introduce a generic graph-based ranking rule
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
a70ab8b072
Introduce a function to find the K shortest paths in a graph
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
48aae76b15
Introduce a function to find the docids of a set of paths in a graph
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
23bf572dea
Introduce cache structures used with ranking rule graphs
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
864f6410ed
Introduce a structure to represent a set of graph paths efficiently
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
c9bf6bb2fa
Introduce a structure to implement ranking rules with graph algorithms
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
46249ea901
Implement a function to find a QueryGraph's docids
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
ce0d1e0e13
Introduce a common way to manage the coordination between ranking rules
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
5065d8b0c1
Introduce a DatabaseCache to memorize the addresses of LMDB values
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
a83007c013
Introduce structure to represent search queries as graphs
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
79e0a6dd4e
Introduce a new search module, eventually meant to replace the old one
...
The code here does not compile, because I am merely splitting one giant
commit into smaller ones where each commit explains a single file.
2023-03-20 09:41:55 +01:00
Loïc Lecrenier
2d88089129
Remove unused term matching strategies
2023-03-20 09:41:55 +01:00
Clément Renault
ea016d97af
Implementing an IS EMPTY filter
2023-03-15 14:12:34 +01:00
Clément Renault
175e8a8495
Fix a diacritic issue
2023-03-09 14:57:47 +01:00
Clément Renault
7c0cd7172d
Introduce the NULL and NOT value NULL operator
2023-03-08 17:14:34 +01:00
bors[bot]
4f1ccbc495
Merge #3525
...
3525: Fix phrase search containing stop words r=ManyTheFish a=ManyTheFish
# Summary
A search with a phrase containing only stop words was returning an HTTP error 500,
this PR filters the phrase containing only stop words dropping them before the search starts, a query with a phrase containing only stop words now behaves like a placeholder search.
fixes https://github.com/meilisearch/meilisearch/issues/3521
related v1.0.2 PR on milli: https://github.com/meilisearch/milli/pull/779
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-03-02 10:55:37 +00:00
ManyTheFish
37489fd495
Return an internal error in the case of matching word is invalid
2023-03-01 19:05:16 +01:00
bors[bot]
ac5a1e4c4b
Merge #3423
...
3423: Add min and max facet stats r=dureuill a=dureuill
# Pull Request
## Related issue
Fixes #3426
## What does this PR do?
### User standpoint
- When using a `facets` parameter in search, the facets that have numeric values are displayed in a new section of the response called `facetStats` that contains, per facet, the numeric min and max value of the hits returned by the search.
<details>
<summary>
Sample request/response
</summary>
```json
❯ curl \
-X POST 'http://localhost:7700/indexes/meteorites/search?facets=mass ' \
-H 'Content-Type: application/json' \
--data-binary '{ "q": "LL6", "facets":["mass", "recclass"], "limit": 5 }' | jsonxf
{
"hits": [
{
"name": "Niger (LL6)",
"id": "16975",
"nametype": "Valid",
"recclass": "LL6",
"mass": 3.3,
"fall": "Fell"
},
{
"name": "Appley Bridge",
"id": "2318",
"nametype": "Valid",
"recclass": "LL6",
"mass": 15000,
"fall": "Fell",
"_geo": {
"lat": 53.58333,
"lng": -2.71667
}
},
{
"name": "Athens",
"id": "4885",
"nametype": "Valid",
"recclass": "LL6",
"mass": 265,
"fall": "Fell",
"_geo": {
"lat": 34.75,
"lng": -87.0
}
},
{
"name": "Bandong",
"id": "4935",
"nametype": "Valid",
"recclass": "LL6",
"mass": 11500,
"fall": "Fell",
"_geo": {
"lat": -6.91667,
"lng": 107.6
}
},
{
"name": "Benguerir",
"id": "30443",
"nametype": "Valid",
"recclass": "LL6",
"mass": 25000,
"fall": "Fell",
"_geo": {
"lat": 32.25,
"lng": -8.15
}
}
],
"query": "LL6",
"processingTimeMs": 15,
"limit": 5,
"offset": 0,
"estimatedTotalHits": 42,
"facetDistribution": {
"mass": {
"110000": 1,
"11500": 1,
"1161": 1,
"12000": 1,
"1215.5": 1,
"127000": 1,
"15000": 1,
"1676": 1,
"1700": 1,
"1710.5": 1,
"18000": 1,
"19000": 1,
"220000": 1,
"2220": 1,
"22300": 1,
"25000": 2,
"265": 1,
"271000": 1,
"2840": 1,
"3.3": 1,
"3000": 1,
"303": 1,
"32000": 1,
"34000": 1,
"36.1": 1,
"45000": 1,
"460": 1,
"478": 1,
"483": 1,
"5500": 2,
"600": 1,
"6000": 1,
"67.8": 1,
"678": 1,
"680.5": 1,
"6930": 1,
"8": 1,
"8300": 1,
"840": 1,
"8400": 1
},
"recclass": {
"L/LL6": 3,
"LL6": 39
}
},
"facetStats": {
"mass": {
"min": 3.3,
"max": 271000.0
}
}
}
```
</details>
## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [ ] Have you read the contributing guidelines?
- [ ] Have you made sure that the title is accurate and descriptive of the changes?
Thank you so much for contributing to Meilisearch!
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-02-22 13:06:43 +00:00
ManyTheFish
900bae3d9d
keep phrases that has at least one word
2023-02-21 18:16:51 +01:00
ManyTheFish
8aa808d51b
Merge branch 'main' into enhance-language-detection
2023-02-20 18:14:34 +01:00
Many the fish
119e6d8811
Update milli/src/search/mod.rs
...
Co-authored-by: Tamo <tamo@meilisearch.com>
2023-02-20 15:33:10 +01:00
Louis Dureuil
eb28d4c525
add facet test
2023-02-20 13:52:28 +01:00
Louis Dureuil
9ac981d025
Remove some clippy type complexity warns by deboxing iters
2023-02-20 13:52:27 +01:00
Louis Dureuil
74859ecd61
Add min and max facet stats
2023-02-20 13:52:27 +01:00
Louis Dureuil
8ae441a4db
Update usage of iterators
2023-02-20 13:52:27 +01:00
Louis Dureuil
042d86cbb3
facet sort ascending/descending now also return the values
2023-02-20 13:52:27 +01:00
bors[bot]
143e3cf948
Merge #3490
...
3490: Fix attributes set candidates r=curquiza a=ManyTheFish
# Pull Request
Fix attributes set candidates for v1.1.0
## details
The attribute criterion was not returning the remaining candidates when its internal algorithm was been exhausted.
We had a loss of candidates by the attribute criterion leading to the bug reported in the issue linked below.
After some investigation, it seems that it was the only criterion that had this behavior.
We are now returning the remaining candidates instead of an empty bitmap.
## Related issue
Fixes #3483
PR on milli for v1.0.1: https://github.com/meilisearch/milli/pull/777
Co-authored-by: ManyTheFish <many@meilisearch.com>
2023-02-15 17:38:07 +00:00
Filip Bachul
a53536836b
fmt
2023-02-14 17:04:22 +01:00
Filip Bachul
d7ad39ad77
fix: clippy error
2023-02-14 00:15:35 +01:00