meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-11-23 02:27:40 +08:00

Author	SHA1	Message	Date
meili-bors[bot]	101f5a20d2	Merge #3757 3757: Adjust the cost of edges in the `position` ranking rule by bucketing positions more aggressively r=loiclec a=loiclec This PR significantly improves the performance of the `position` ranking rule when: 1. a query contains many words 2. the `position` ranking rule needs to be called many times 3. the score of the documents according to `position` is high These conditions greatly increase: 1. the number of edge traversals that are needed to find a valid path from the `start` node to the `end` node 2. the number of edges that need to be deleted from the graph, and therefore the number of times that we need to recompute all the possible costs from START to END As a result, a majority of the search time is spent in `visit_condition`, `visit_node`, and `update_all_costs_before_node`. This is frustrating because it often happens when the "universe" given to the rule consists of only a handful of document ids. By limiting the number of possible edges between two nodes from `20` to `10`, we: 1. reduce the number of possible costs from START to END 2. reduce the number of edges that will be deleted 3. make it faster to update the costs after deleting an edge 4. reduce the number of buckets that need to be computed In terms of relevancy, I don't think we lose or gain much. We still prefer terms that are in lower positions, with decreasing precision as we go further. The previous choice of bucketing wasn't chosen in a principled way, and neither is this one. They both "feel" right to me. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com> Co-authored-by: meili-bors[bot] <89034592+meili-bors[bot]@users.noreply.github.com>	2023-05-17 11:43:59 +00:00
Loïc Lecrenier	ec8f685d84	Fix bug in cheapest path algorithm	2023-05-16 17:01:30 +02:00
Loïc Lecrenier	5758268866	Don't compute split_words for phrases	2023-05-16 17:01:18 +02:00
Loïc Lecrenier	3e19702de6	Update snapshot tests	2023-05-16 12:22:46 +02:00
meili-bors[bot]	1e762d151f	Merge #3755 3755: Re-add final dot r=curquiza a=ManyTheFish I removed the final dot of the error message in my last PR, this one re-adds it. related to https://github.com/meilisearch/meilisearch/pull/3749 > Oups 😬 Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-05-16 10:10:58 +00:00
Loïc Lecrenier	f6524a6858	Adjust costs of edges in position ranking rule To ensure good performance	2023-05-16 11:28:56 +02:00
meili-bors[bot]	65ad8cce36	Merge #3741 3741: Add ngram support to the highlighter r=ManyTheFish a=loiclec This PR fixes a bug introduced by the search refactor, where ngrams were not highlighted. The solution was to add the ngrams to the vector of `LocatedQueryTerm` that is given to the `MatchingWords` structure. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-05-16 09:03:31 +00:00
ManyTheFish	42650f82e8	Re-add final dot	2023-05-16 10:57:26 +02:00
Loïc Lecrenier	a37da36766	Implement `words` as a graph-based ranking rule and fix some bugs	2023-05-16 10:42:11 +02:00
Loïc Lecrenier	85d96d35a8	Highlight ngram matches as well	2023-05-16 10:39:36 +02:00
meili-bors[bot]	bf66e97b48	Merge #3749 3749: Fix back: sort error message r=ManyTheFish a=ManyTheFish This PR reintroduces the error message modified in https://github.com/meilisearch/milli/pull/375. However, this added double-quotes around `sort` in the message. I don't think another message contains double-quotes, so I have added a separate commit replacing the double-quotes with back-ticks, which seems more consistent with the other error messages, this last change can be reverted easily. ## Detailed changes #### v1.2-rc0 ``` The sort ranking rule must be specified in the ranking rules settings to use the sort parameter at search time. ``` #### [Reintroduce fix (previous and expected behavior)](`23d1c86825`) ``` You must specify where "sort" is listed in the rankingRules setting to use the sort parameter at search time ``` #### [Replace double-quotes with back-ticks (my suggestion)](`4d691d071a`) ``` You must specify where `sort` is listed in the rankingRules setting to use the sort parameter at search time ``` ## Related Fixes #3722 ## Reviewers - technical review: `@irevoire` - to validate the replacement: `@macraig` Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-05-15 14:55:51 +00:00
Kerollmops	1a79fd0c3c	Use the new heed v0.12.6	2023-05-15 11:42:30 +02:00
Kerollmops	f759ec7fad	Expose a flag to enable the MDB_WRITEMAP flag	2023-05-15 11:38:43 +02:00
ManyTheFish	4d691d071a	Change double-quotes by back-ticks in sort error message	2023-05-15 11:10:36 +02:00
ManyTheFish	23d1c86825	Re-introduce the sort error message fix	2023-05-15 11:07:23 +02:00
Kerollmops	c4a40e7110	Use the writemap flag to reduce the memory usage	2023-05-15 10:15:33 +02:00
Loïc Lecrenier	4d352a21ac	Compute split words derivations of terms that don't accept typos	2023-05-10 13:31:19 +02:00
Loïc Lecrenier	3625389057	Highlight ngram matches as well	2023-05-08 15:35:41 +02:00
meili-bors[bot]	eace6df91b	Merge #3726 3726: Fix prefix highlighting r=loiclec a=ManyTheFish The prefix queries were not properly highlighted, this PR now highlights only the start of a word when it matched with a prefix Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-05-08 07:46:46 +00:00
Loïc Lecrenier	83ab8cf4e5	Remove dbg!(..) expression in highlighter tests	2023-05-08 09:45:23 +02:00
ManyTheFish	cd2573fcc3	Fix prefix highlighting	2023-05-04 16:53:50 +02:00
meili-bors[bot]	9f7981df28	Merge #3687 3687: Allow to disable specialized tokenizations (again) r=Kerollmops a=jirutka In PR #2773, I added the `chinese`, `hebrew`, `japanese` and `thai` feature flags to allow Meilisearch to be built without huge specialized tokenizations that took up 90% of the Meilisearch binary size. Unfortunately, due to some recent changes, this doesn't work anymore. The problem lies in excessive use of the `default` feature flag, which infects the dependency graph. Instead of adding `default-features = false` here and there, it's easier and more future-proof to not declare `default` in `milli` and `meilisearch-types`. I've renamed it to `all-tokenizers`, which also makes it a bit clearer what it's about. Co-authored-by: Jakub Jirutka <jakub@jirutka.cz>	2023-05-04 14:48:01 +00:00
Jakub Jirutka	e615fa5ec6	Fix unused_imports warning in milli when japanese is not enabled	2023-05-04 15:46:11 +02:00
Jakub Jirutka	13f1277637	Allow to disable specialized tokenizations (again) In PR #2773, I added the `chinese`, `hebrew`, `japanese` and `thai` feature flags to allow Meilisearch to be built without huge specialized tokenizations that took up 90% of the Meilisearch binary size. Unfortunately, due to some recent changes, this doesn't work anymore. The problem lies in excessive use of the `default` feature flag, which infects the dependency graph. Instead of adding `default-features = false` here and there, it's easier and more future-proof to not declare `default` in `milli` and `meilisearch-types`. I've renamed it to `all-tokenizers`, which also makes it a bit clearer what it's about.	2023-05-04 15:45:40 +02:00
Louis Dureuil	a35d3fc708	Add Index::iter_documents	2023-05-04 15:31:54 +02:00
Louis Dureuil	732c52093d	Processing time without autobatching implementation	2023-05-03 17:41:48 +02:00
Louis Dureuil	f8f190cd40	Update exactness tests following charabia camelCase tokenization	2023-05-03 14:45:09 +02:00
Louis Dureuil	3a408e8287	Increase map size for tests following charabia camelCase tokenization	2023-05-03 14:44:48 +02:00
Louis Dureuil	d3e5b10e23	fix nb of dbs	2023-05-03 14:11:20 +02:00
Louis Dureuil	1aaf24ccbf	Cargo fmt	2023-05-03 12:21:58 +02:00
Louis Dureuil	90bc230820	Merge remote-tracking branch 'origin/main' into search-refactor Conflicts \| resolution ----------\|----------- Cargo.lock \| added mimalloc Cargo.toml \| took origin/main version milli/src/search/criteria/exactness.rs \| deleted after checking it was only clippy changes milli/src/search/query_tree.rs \| deleted after checking it was only clippy changes	2023-05-03 12:19:06 +02:00
Louis Dureuil	342c4ff85d	geosort: Remove rtree unwrap	2023-05-03 09:52:16 +02:00
Tamo	c85392ce40	make the descendent geosort fast	2023-05-03 09:13:12 +02:00
Tamo	8875d24a48	deserialize the rtree only when it's needed, and keep it in memory once it has been deserialized	2023-05-03 09:13:12 +02:00
Tamo	c470b67fa2	revamp the test to use execute_iterative_and_rtree_returns_the_same	2023-05-03 09:13:12 +02:00
meili-bors[bot]	c0e081cd98	Merge #3702 #3710 3702: Update charabia v0.7.2 r=curquiza a=ManyTheFish fixes #3701 fixes #3689 fixes #3285 3710: Updated messages pointing to the docs website r=curquiza a=roy9495 # Pull Request Fixes partially #3668 ## What does this PR do? - ...Any messages referencing this docs site https://docs.meilisearch.com has been changed to this docs site https://meilisearch.com/docs . Thanks. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: TATHAGATA ROY <98920199+roy9495@users.noreply.github.com>	2023-05-02 17:27:57 +00:00
Louis Dureuil	b60840ebff	Remove self.iterating from words	2023-05-02 18:54:23 +02:00
Louis Dureuil	fdc1763838	Use MultiOps for resolve_query_graph	2023-05-02 18:54:09 +02:00
Louis Dureuil	75819bc940	Remove too many arguments on resolve_maximally_reduced_query_graph	2023-05-02 18:53:40 +02:00
Louis Dureuil	7b8cc25625	rename located_query_terms_from_string -> located_query_terms_from_tokens	2023-05-02 18:53:01 +02:00
Loïc Lecrenier	aa63091752	Fix bug in exact_attribute	2023-05-02 10:48:32 +02:00
Loïc Lecrenier	58735d6d8f	Fix outdated relevancy test	2023-05-02 10:48:32 +02:00
Loïc Lecrenier	1b514517f5	Fix bug in computation of query term at a position	2023-05-02 10:48:32 +02:00
Loïc Lecrenier	11f814821d	Minor cleanup	2023-05-02 10:48:32 +02:00
Loïc Lecrenier	30fb1153cc	Speed up graph based ranking rule when a lot of different costs exist	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	3b2c8b9f25	Improve performance of position rr	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	2a7f9adf78	Build query graph more correctly from paths Update snapshots	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	608ceea440	Fix bug in position rr	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	79001b9c97	Improve performance of the cheapest path finder algorithm	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	59b12fca87	Fix errors, clippy warnings, and add review comments	2023-04-29 11:48:11 +02:00
Loïc Lecrenier	48f5bb1693	Implements the geo-sort ranking rule	2023-04-29 11:02:16 +02:00
Loïc Lecrenier	93188b3c88	Fix indexing of word_prefix_fid_docids	2023-04-29 10:56:48 +02:00
Loïc Lecrenier	bc4efca611	Add more tests for the attribute ranking rule	2023-04-29 10:56:48 +02:00
bors[bot]	414b3fae89	Merge #3571 3571: Introduce two filters to select documents with `null` and empty fields r=irevoire a=Kerollmops # Pull Request ## Related issue This PR implements the `X IS NULL`, `X IS NOT NULL`, `X IS EMPTY`, `X IS NOT EMPTY` filters that [this comment](https://github.com/meilisearch/product/discussions/539#discussioncomment-5115884) is describing in a very detailed manner. ## What does this PR do? ### `IS NULL` and `IS NOT NULL` This PR will be exposed as a prototype for now. Below is the copy/pasted version of a spec that defines this filter. - `IS NULL` matches fields that `EXISTS` AND `= IS NULL` - `IS NOT NULL` matches fields that `NOT EXISTS` OR `!= IS NULL` 1. `{"name": "A", "price": null}` 2. `{"name": "A", "price": 10}` 3. `{"name": "A"}` `price IS NULL` would match 1 `price IS NOT NULL` or `NOT price IS NULL` would match 2,3 `price EXISTS` would match 1, 2 `price NOT EXISTS` or `NOT price EXISTS` would match 3 common query : `(price EXISTS) AND (price IS NOT NULL)` would match 2 ### `IS EMPTY` and `IS NOT EMPTY` - `IS EMPTY` matches Array `[]`, Object `{}`, or String `""` fields that `EXISTS` and are empty - `IS NOT EMPTY` matches fields that `NOT EXISTS` OR are not empty. 1. `{"name": "A", "tags": null}` 2. `{"name": "A", "tags": [null]}` 3. `{"name": "A", "tags": []}` 4. `{"name": "A", "tags": ["hello","world"]}` 5. `{"name": "A", "tags": [""]}` 6. `{"name": "A"}` 7. `{"name": "A", "tags": {}}` 8. `{"name": "A", "tags": {"t1":"v1"}}` 9. `{"name": "A", "tags": {"t1":""}}` 10. `{"name": "A", "tags": ""}` `tags IS EMPTY` would match 3,7,10 `tags IS NOT EMPTY` or `NOT tags IS EMPTY` would match 1,2,4,5,6,8,9 `tags IS NULL` would match 1 `tags IS NOT NULL` or `NOT tags IS NULL` would match 2,3,4,5,6,7,8,9,10 `tags EXISTS` would match 1,2,3,4,5,7,8,9,10 `tags NOT EXISTS` or `NOT tags EXISTS` would match 6 common query : `(tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY)` would match 2,4,5,8,9 ## What should the reviewer do? 
- Check that I tested the filters - Check that I deleted the ids of the documents when deleting documents Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Kerollmops <clement@meilisearch.com>	2023-04-27 13:14:00 +00:00
Loïc Lecrenier	899baa0ea5	Update forgotten snapshot from previous commit	2023-04-27 13:43:04 +02:00
Loïc Lecrenier	374095d42c	Add tests for stop words and fix a couple of bugs	2023-04-27 13:30:09 +02:00
Louis Dureuil	b41a6cbd7a	Check sort criteria also in placeholder search	2023-04-26 16:28:17 +02:00
Louis Dureuil	c8af572697	Add tests for exact words and exact attributes	2023-04-26 16:13:01 +02:00
ManyTheFish	249053e514	Update feature flags	2023-04-26 14:59:25 +02:00
ManyTheFish	ff2cf2a5ae	Update charabia in milli	2023-04-26 14:56:54 +02:00
Loïc Lecrenier	b448aca49c	Add more tests for exactness rr	2023-04-26 11:04:18 +02:00
Loïc Lecrenier	55bad07c16	Fix bug in exact_attribute rr implementation	2023-04-26 10:40:05 +02:00
Loïc Lecrenier	3421125a55	Prevent the `exactness` ranking rule from removing random words Make it strictly follow the term matching strategy	2023-04-26 09:09:19 +02:00
Clément Renault	14293f6c8f	Make rustfmt happy	2023-04-25 16:55:39 +02:00
Loïc Lecrenier	d3a94e8b25	Fix bugs and add tests to exactness ranking rule	2023-04-25 16:49:08 +02:00
Clément Renault	cfd1b2cc97	Fix the clippy warnings	2023-04-25 16:40:32 +02:00
Kerollmops	a109802d45	Upgrade the incompatible versions of the dependencies	2023-04-24 17:50:57 +02:00
Kerollmops	47b66e49b8	Upgrade the compatible versions of the dependencies	2023-04-24 17:50:52 +02:00
Loïc Lecrenier	8f2e971879	Add tests for "exactness" rr, make correct universe computation	2023-04-24 16:57:34 +02:00
Loïc Lecrenier	d1fdbb63da	Make all search tests pass, fix distinctAttribute bug	2023-04-24 12:12:08 +02:00
Loïc Lecrenier	a7a0891210	Update examples	2023-04-24 10:07:49 +02:00
Loïc Lecrenier	84d9c731f8	Fix bug in encoding of word_position_docids and word_fid_docids	2023-04-24 09:59:30 +02:00
Loïc Lecrenier	bd9aba4d77	Add "position" part of the attribute ranking rule	2023-04-13 10:46:09 +02:00
Loïc Lecrenier	8edad8291b	Add logger to attribute rr, fix a bug	2023-04-13 10:25:00 +02:00
Kerollmops	d9cebff61c	Add a simple test to check that attributes are ranking correctly	2023-04-13 08:27:09 +02:00
Loïc Lecrenier	30f7bd03f6	Fix compiler warning/errors caused by previous merge	2023-04-13 08:27:09 +02:00
Kerollmops	df0d9bb878	Introduce the attribute ranking rule in the list of ranking rules	2023-04-13 08:27:09 +02:00
Kerollmops	5230ddb3ea	Resolve the attribute ranking rule conditions	2023-04-13 08:27:09 +02:00
Kerollmops	d6a7c28e4d	Implement the attribute ranking rule edge computation	2023-04-13 08:27:09 +02:00
Kerollmops	e55efc419e	Introduce a new cache for the words fids	2023-04-13 08:27:09 +02:00
Loïc Lecrenier	644e136aee	Merge branch 'search-refactor-typo-attributes' into search-refactor	2023-04-13 08:26:56 +02:00
Louis Dureuil	38b7b31beb	Decide to use prefix DB if the word is not an ngram	2023-04-12 16:45:38 +02:00
Louis Dureuil	7a01f20df7	Use word_prefix_docids, make get_word_prefix_docids private	2023-04-12 16:45:38 +02:00
Louis Dureuil	c20c38a7fa	Add SearchContext::word_prefix_docids() method	2023-04-12 16:44:43 +02:00
Louis Dureuil	5ab46324c4	Everyone uses the SearchContext::word_docids instead of get_db_word_docids make get_db_word_docids private	2023-04-12 16:44:43 +02:00
Louis Dureuil	325f17488a	Add SearchContext::word_docids() method	2023-04-12 16:37:05 +02:00
Louis Dureuil	e7ff987c46	Update call sites	2023-04-12 16:36:38 +02:00
Louis Dureuil	244003e36f	Refactor DB cache to return Roaring Bitmaps directly instead of byte slices	2023-04-12 16:35:48 +02:00
Loïc Lecrenier	1f813a6f3b	Simplify implementation of the detailed (=visual) logger	2023-04-12 16:32:53 +02:00
Loïc Lecrenier	96183e804a	Simplify the logger	2023-04-12 16:32:53 +02:00
Loïc Lecrenier	7ab48ed8c7	Matching words fixes	2023-04-12 16:21:43 +02:00
Loïc Lecrenier	e7bb8c940f	Merge branch 'search-refactor-highlighter' into search-refactor-highlighter-merged	2023-04-11 12:22:34 +02:00
Loïc Lecrenier	8cb85294ef	Remove unused import warning	2023-04-07 11:09:30 +02:00
Loïc Lecrenier	d0e9d65025	Fix distinct attribute bugs	2023-04-07 11:09:01 +02:00
Loïc Lecrenier	540a396e49	Fix indexing bug in words_prefix_position	2023-04-07 11:08:39 +02:00
Loïc Lecrenier	a81165f0d8	Merge remote-tracking branch 'origin/main' into search-refactor	2023-04-07 10:15:55 +02:00
Loïc Lecrenier	d6585eb10b	Avoid splitting ngrams into their original component words	2023-04-07 10:13:49 +02:00
Loïc Lecrenier	f7d90ad19f	Merge remote-tracking branch 'origin/search-refactor-tests-doc' into search-refactor	2023-04-07 10:13:18 +02:00
Louis Dureuil	31630c85d0	exactness graph rr: Add important TODO/FIXME after review	2023-04-06 17:50:39 +02:00
Louis Dureuil	ab09dc0167	exact_attributes: Add TODOs and additional check after review	2023-04-06 17:50:39 +02:00
Louis Dureuil	618c54915d	exact_attribute: dedup nodes after sorting them	2023-04-06 17:50:39 +02:00
Loïc Lecrenier	130d2061bd	Fix indexing of word_position_docid and fid	2023-04-06 17:50:39 +02:00
Louis Dureuil	66ddee4390	Fix word_position_docids indexing	2023-04-06 17:50:39 +02:00
Louis Dureuil	90a6c01495	Use correct codec in proximity	2023-04-06 17:50:39 +02:00
Louis Dureuil	e58426109a	Fix panics and issues in exactness graph ranking rule	2023-04-06 17:50:39 +02:00
Louis Dureuil	f513cf930a	Exact attribute with state	2023-04-06 17:50:39 +02:00
Louis Dureuil	8a13ed7e3f	Add exactness ranking rules	2023-04-06 17:50:39 +02:00
Louis Dureuil	1b8e4d0301	Add ExactTerm and helper method	2023-04-06 17:50:39 +02:00
Louis Dureuil	996619b22a	Increase position by 8 on hard separator when building query terms	2023-04-06 17:50:39 +02:00
Louis Dureuil	2c9822a337	Rename `is_multiple_words` to `is_ngram` and `zero_typo` to `exact`	2023-04-06 17:50:39 +02:00
Louis Dureuil	7276deee0a	Add new db caches	2023-04-06 17:50:39 +02:00
ManyTheFish	f7e7f438f8	Patch prefix match	2023-04-06 17:22:31 +02:00
ManyTheFish	ba8dcc2d78	Fix clippy	2023-04-06 15:50:47 +02:00
Loïc Lecrenier	7ca91ebb71	Merge branch 'search-refactor-exactness' into search-refactor-tests-doc	2023-04-06 15:16:35 +02:00
ManyTheFish	47f6a3ad3d	Take into account that a logger need the search context	2023-04-06 15:02:23 +02:00
bors[bot]	b4c01581cd	Merge #3641 3641: Bring back changes from `release v1.1.0` into `main` after v1.1.0 release r=curquiza a=curquiza Replace https://github.com/meilisearch/meilisearch/pull/3637 since we don't want to pull commits from `main` into `release-v1.1.0` when fixing git conflicts Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com> Co-authored-by: Charlotte Vermandel <charlottevermandel@gmail.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: curquiza <clementine@meilisearch.com> Co-authored-by: Clément Renault <clement@meilisearch.com> Co-authored-by: Many the fish <many@meilisearch.com>	2023-04-06 12:37:54 +00:00
ManyTheFish	ae17c62e24	Remove warnings	2023-04-06 14:07:18 +02:00
ManyTheFish	a1148c09c2	remove old matcher	2023-04-06 14:00:21 +02:00
ManyTheFish	9c5f64769a	Integrate the new Highlighter in the search	2023-04-06 13:58:56 +02:00
ManyTheFish	ebe23b04c9	Make the matcher consume the search context	2023-04-06 12:28:28 +02:00
ManyTheFish	13b7c826c1	add new highlighter	2023-04-06 12:15:37 +02:00
Loïc Lecrenier	5440f43fd3	Fix indexing of word_position_docid and fid	2023-04-05 18:14:00 +02:00
Louis Dureuil	d9460a76f4	Fix word_position_docids indexing	2023-04-05 18:14:00 +02:00
Louis Dureuil	d1ddaa223d	Use correct codec in proximity	2023-04-05 18:14:00 +02:00
Louis Dureuil	f7ecea142e	Fix panics and issues in exactness graph ranking rule	2023-04-05 18:13:46 +02:00
Louis Dureuil	337e75b0e4	Exact attribute with state	2023-04-05 18:12:46 +02:00
Loïc Lecrenier	b5691802a3	Add new tests and fix construction of query graph from paths	2023-04-05 16:31:10 +02:00
Loïc Lecrenier	6e50f23896	Add more search tests	2023-04-05 13:33:23 +02:00
Tamo	597d57bf1d	Merge branch 'main' into bring-back-changes-v1.1.0	2023-04-05 11:32:14 +02:00
Loïc Lecrenier	4c8a0179ba	Add more search tests	2023-04-05 11:30:49 +02:00
Loïc Lecrenier	c69cbec64a	Add more search tests	2023-04-05 11:20:04 +02:00
Loïc Lecrenier	ce328c329d	Move bucket sort function to its own module and fix a bug	2023-04-04 18:03:08 +02:00
Loïc Lecrenier	959e4607bb	Add more search tests	2023-04-04 18:02:46 +02:00
Louis Dureuil	4b4ffb8ec9	Add exactness ranking rules	2023-04-04 17:12:07 +02:00
Louis Dureuil	3951fe22ab	Add ExactTerm and helper method	2023-04-04 17:09:32 +02:00
Louis Dureuil	4d5bc9df4c	Increase position by 8 on hard separator when building query terms	2023-04-04 17:07:26 +02:00
Louis Dureuil	ec2f8e8040	Rename `is_multiple_words` to `is_ngram` and `zero_typo` to `exact`	2023-04-04 17:06:07 +02:00
Louis Dureuil	406b8bd248	Add new db caches	2023-04-04 17:04:46 +02:00
Loïc Lecrenier	62b9c6fbee	Add search tests	2023-04-04 16:18:22 +02:00
Loïc Lecrenier	b439d36807	Split query_term module into multiple submodules	2023-04-04 15:38:30 +02:00
Loïc Lecrenier	faceb661e3	Add note that a part of the code needs fixing	2023-04-04 15:02:01 +02:00
Loïc Lecrenier	4129d657e2	Simplify query_term module a bit	2023-04-04 15:01:42 +02:00
Filip Bachul	1e6fe71a67	fix clippy warning	2023-04-03 20:18:26 +02:00
Filip Bachul	fddfb37f1f	remove unnecessary FilterError:ReservedGeo and FilterError:ReservedGeo	2023-04-03 20:18:26 +02:00
Loïc Lecrenier	3f13608002	Fix computation of ngram derivations	2023-04-03 15:27:49 +02:00
Loïc Lecrenier	4708d9b016	Fix compiler warnings/errors	2023-04-03 10:09:27 +02:00
Clément Renault	0d2e7bcc13	Implement the previous way for the exhaustive distinct candidates	2023-04-03 10:08:10 +02:00
Loïc Lecrenier	55fbfb6124	Merge branch 'search-refactor-located-query-terms' into search-refactor	2023-04-03 10:04:36 +02:00
Loïc Lecrenier	58fe260c72	Allow removing all the terms from a query if it contains a phrase	2023-04-03 09:18:02 +02:00
Loïc Lecrenier	24e5f6f7a9	Don't remove phrases with "last" term matching strategy	2023-04-03 09:17:33 +02:00
Louis Dureuil	9b87c36200	Limit the number of derivations for a single word.	2023-03-31 09:19:18 +02:00
Filip Bachul	1861c69964	fmt	2023-03-30 23:37:26 +02:00
Filip Bachul	cb2b5eb38e	handle _geoDistance(x,x) sort error	2023-03-30 23:21:23 +02:00
Filip Bachul	53aa0a1b54	handle _geo(x,x) sort error	2023-03-30 23:17:34 +02:00
Loïc Lecrenier	12b26cd54e	Don't remove phrases from the query with term matching strategy Last	2023-03-30 14:54:08 +02:00
Loïc Lecrenier	061b1e6d7c	Tiny refactor of query graph remove_nodes method	2023-03-30 14:49:25 +02:00
Loïc Lecrenier	0d6e8b5c31	Fix phrase search bug when the phrase has only one word	2023-03-30 14:48:12 +02:00
Loïc Lecrenier	d48cdc67a0	Fix term matching strategy bugs	2023-03-30 14:01:52 +02:00
Loïc Lecrenier	35c16ad047	Use new term matching strategy logic in words ranking rule	2023-03-30 13:15:43 +02:00
Loïc Lecrenier	2997d1f186	Use new term matching strategy logic in resolve_maximally_reduced_...	2023-03-30 13:12:51 +02:00
Loïc Lecrenier	2a5997fb20	Avoid expensive assert! in bucket sort function	2023-03-30 13:07:17 +02:00
Loïc Lecrenier	ee8a9e0bad	Remove outdated sentence in documentation	2023-03-30 12:22:24 +02:00
Loïc Lecrenier	3b0737a092	Fix detailed logger	2023-03-30 12:20:44 +02:00
Loïc Lecrenier	fdd02105ac	Graph-based ranking rule + term matching strategy support	2023-03-30 12:19:21 +02:00
Loïc Lecrenier	aa9592455c	Refactor the paths_of_cost algorithm Support conditions that require certain nodes to be skipped	2023-03-30 12:11:11 +02:00
Loïc Lecrenier	01e24dd630	Rewrite proximity ranking rule	2023-03-30 11:59:06 +02:00
Loïc Lecrenier	ae6bb1ce17	Update the ConditionDocidsCache after change to RankingRuleGraphTrait	2023-03-30 11:41:20 +02:00
Loïc Lecrenier	5fd28620cd	Build ranking rule graph correctly after changes to trait definition	2023-03-30 11:32:55 +02:00
Loïc Lecrenier	728710d63a	Update typo ranking rule to use new query term structure	2023-03-30 11:32:19 +02:00
Loïc Lecrenier	fa81381865	Update the trait requirements of ranking-rule graphs	2023-03-30 11:19:45 +02:00
Loïc Lecrenier	b96a682f16	Update resolve_graph module to work with lazy query terms	2023-03-30 11:10:38 +02:00
Loïc Lecrenier	d0f048c068	Simplify the API of the DatabaseCache	2023-03-30 11:08:17 +02:00
Loïc Lecrenier	223e82a10d	Update QueryGraph to use new lazy query terms + build from paths	2023-03-30 11:06:02 +02:00
Loïc Lecrenier	9507ff5e31	Update query term structure to allow for laziness	2023-03-30 11:06:02 +02:00
Louis Dureuil	c2b025946a	`located_query_terms_from_string`: use u16 for positions, hard limit number of iterated tokens. - Refactor phrase logic to reduce number of possible states	2023-03-30 11:04:14 +02:00
Loïc Lecrenier	3a818c5e87	Add more functionality to interners	2023-03-30 09:56:23 +02:00
Louis Dureuil	d74134ce3a	Check sort criteria	2023-03-29 15:21:54 +02:00
Louis Dureuil	5ac129bfa1	Mark geosearch as currently unimplemented for sort rule	2023-03-29 15:20:42 +02:00
ManyTheFish	efea1e5837	Fix facet normalization	2023-03-29 12:02:24 +02:00
Louis Dureuil	abb4522f76	Small comment on ignored rules for placeholder search	2023-03-29 09:11:06 +02:00
Louis Dureuil	ef084ef042	SmallBitmap: Consistently panic on incoherent universe lengths	2023-03-29 08:45:38 +02:00
Louis Dureuil	3524bd1257	SmallBitmap: Add documentation	2023-03-29 08:44:11 +02:00
Tamo	a50b058557	update the geoBoundingBox feature Now instead of using the (top_left, bottom_right) corners of the bounding box it's using the (top_right, bottom_left) corners.	2023-03-28 18:26:18 +02:00
Louis Dureuil	d4f6216966	Resolve rule time sort criteria	2023-03-28 16:42:02 +02:00
Louis Dureuil	77acafe534	Resolve search time sort criteria for placeholder search	2023-03-28 16:41:03 +02:00
Louis Dureuil	53afda3237	Update search usage in example	2023-03-28 16:35:46 +02:00
Louis Dureuil	abb19d368d	Initialize query time ranking rule for query search	2023-03-28 12:40:52 +02:00
Louis Dureuil	b4a52a622e	BoxRankingRule	2023-03-28 12:39:42 +02:00
Louis Dureuil	8d7d8cdc2f	Clean-up index example	2023-03-27 18:34:10 +02:00
Louis Dureuil	626a93b348	Search example: panic when missing the index path	2023-03-27 18:18:01 +02:00
Louis Dureuil	af65fe201a	Clean-up search example	2023-03-27 17:49:43 +02:00
Louis Dureuil	9b83b1deb0	Expose SearchLogger trait	2023-03-27 17:49:18 +02:00
Louis Dureuil	e9eb271499	Remove empty attribute_rule mod	2023-03-27 11:08:03 +02:00
Louis Dureuil	3281a88d08	SmallBitmap: don't expose internal items	2023-03-27 11:04:43 +02:00
Louis Dureuil	5a644054ab	Removed unused search impl	2023-03-27 11:04:27 +02:00
Louis Dureuil	16fefd364e	Add TODO notes	2023-03-27 11:04:04 +02:00
Gregory Conrad	e7994cdeb3	feat: check to see if the PK changed before erroring out Previously, if the primary key was set and a Settings update contained a primary key, an error would be returned. However, this error is not needed if the new PK == the current PK. This commit just checks to see if the PK actually changes before raising an error.	2023-03-26 12:18:39 -04:00
Loïc Lecrenier	00bad8c716	Add comments suggesting performance improvements	2023-03-23 10:18:24 +01:00
Loïc Lecrenier	862714a18b	Remove criterion_implementation_strategy param of Search	2023-03-23 09:44:12 +01:00
Loïc Lecrenier	d18ebe4f3a	Remove more warnings	2023-03-23 09:41:18 +01:00
Loïc Lecrenier	7169d85115	Remove old query_tree code and make clippy happy	2023-03-23 09:39:16 +01:00
Loïc Lecrenier	f5f5f03ec0	Remove old criteria code	2023-03-23 09:35:53 +01:00
Loïc Lecrenier	9b2653427d	Split position DB into fid and relative position DB	2023-03-23 09:22:01 +01:00
Loïc Lecrenier	56b7209f26	Make clippy happy	2023-03-23 09:16:17 +01:00
Loïc Lecrenier	9b1f439a91	WIP	2023-03-23 09:12:35 +01:00
Loïc Lecrenier	01c7d2de8f	Add example targets to the milli crate	2023-03-22 14:50:41 +01:00
Loïc Lecrenier	a86aeba411	WIP	2023-03-22 14:43:08 +01:00
Loïc Lecrenier	384fdc2df4	Fix two bugs in proximity ranking rule	2023-03-21 11:43:25 +01:00
Loïc Lecrenier	83e5b4ed0d	Compute edges of proximity graph lazily	2023-03-21 10:44:40 +01:00
Loïc Lecrenier	272cd7ebbd	Small cleanup	2023-03-20 13:39:19 +01:00
Loïc Lecrenier	c63c7377e6	Switch order of MappedInterner generic params	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	5b50e49522	cargo fmt	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	65474c8de5	Update new sort ranking rule after rebasing	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	fbb1ba3de0	Cargo fmt	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	a59ca28e2c	Add forgotten file	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	825f742000	Simplify graph-based ranking rule impl	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	dd491320e5	Simplify graph-based ranking rule impl	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	c6ff97a220	Rewrite the dead-ends cache to detect more dead-ends	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	49240c367a	Fix bug in cost of typo conditions	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	1e6e624078	Fix bug in SmallBitmap	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	8b4e07e1a3	WIP	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	2853009987	Renaming Edge -> Condition	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	aa59c3bc2c	Replace EdgeCondition with an Option<..> + other code cleanup	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	7b1d8f4c6d	Make PathSet strongly typed	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	a49ddec9df	Prune the query graph after executing a ranking rule	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	05fe856e6e	Merge forward and backward proximity conditions in proximity graph	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	c0cdaf9f53	Fix bug in the proximity ranking rule for queries with ngrams	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	e9cf58d584	Refactor of the Interner	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	31628c5cd4	Merge Phrase and WordDerivations into one structure	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	3004e281d7	Support ngram typos + splitwords and splitwords+synonyms in proximity	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	14e8d0aaa2	Rename lifetime	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	1c58cf8426	Intern ranking rule graph edge conditions as well	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	5155fd2bf1	Reorganise initialisation of ranking rules + rename PathsMap -> PathSet	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	9ec9c204d3	Small code cleanup	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	78b9304d52	Implement distinct attribute	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	0465ba4a05	Intern more values	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	2099991dd1	Continue documenting and cleaning up the code	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	c232cdabf5	Add documentation	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	4e266211bf	Small code reorganisation	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	57fa689131	Cargo fmt	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	10626dddfc	Add a few more optimisations to new search algorithms	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	9051065c22	Apply a few optimisations for graph-based ranking rules	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	e8c76cf7bf	Intern all strings and phrases in the search logic	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	3f1729a17f	Update new search test	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	cab2b6bcda	Fix: computation of initial universe, code organisation	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	c4979a2fda	Fix code visibility issue + unimplemented detail in proximity rule	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	23931f8a4f	Fix small bug in visual logger of search algo	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	aa414565bb	Fix proximity graph edge builder to include all proximities	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	1db152046e	WIP on split words and synonyms support	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	c27ea2677f	Rewrite cheapest path algorithm and empty path cache It is now much simpler and has much better performance.	2023-03-20 09:41:56 +01:00
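
The `IS NULL` / `IS EMPTY` semantics spelled out in the description of merge #3571 (commit 414b3fae89) can be checked against its own example documents with a small model. The Python sketch below is not Meilisearch code: the helper names (`exists`, `is_null`, `is_empty`, `matching`) are hypothetical, and plain dictionaries stand in for indexed documents. It only encodes the rules as that commit message states them.

```python
# Hypothetical model of the filter semantics described in merge #3571.
# This is an illustration only, not the Meilisearch implementation.

_MISSING = object()  # sentinel: distinguishes "field absent" from "field is null"


def exists(doc, field):
    """`field EXISTS`: the key is present, whatever its value (null included)."""
    return field in doc


def is_null(doc, field):
    """`field IS NULL`: the key is present and its value is JSON null."""
    return field in doc and doc[field] is None


def is_empty(doc, field):
    """`field IS EMPTY`: the key is present and holds [], {} or ""."""
    value = doc.get(field, _MISSING)
    return isinstance(value, (list, dict, str)) and len(value) == 0


def matching(docs, predicate):
    """Return the 1-based numbers of the documents accepted by `predicate`."""
    return [n for n, doc in enumerate(docs, start=1) if predicate(doc)]


# Example documents copied from the commit message, in the same order.
PRICE_DOCS = [
    {"name": "A", "price": None},
    {"name": "A", "price": 10},
    {"name": "A"},
]

TAGS_DOCS = [
    {"name": "A", "tags": None},
    {"name": "A", "tags": [None]},
    {"name": "A", "tags": []},
    {"name": "A", "tags": ["hello", "world"]},
    {"name": "A", "tags": [""]},
    {"name": "A"},
    {"name": "A", "tags": {}},
    {"name": "A", "tags": {"t1": "v1"}},
    {"name": "A", "tags": {"t1": ""}},
    {"name": "A", "tags": ""},
]

if __name__ == "__main__":
    print("tags IS EMPTY:", matching(TAGS_DOCS, lambda d: is_empty(d, "tags")))
    print("tags IS NOT NULL:", matching(TAGS_DOCS, lambda d: not is_null(d, "tags")))
    print(
        "common query:",
        matching(
            TAGS_DOCS,
            lambda d: exists(d, "tags")
            and not is_null(d, "tags")
            and not is_empty(d, "tags"),
        ),
    )
```

In this model the negated forms are plain boolean negation, which is why a document without the field satisfies both `IS NOT NULL` and `IS NOT EMPTY`, as in the commit's examples.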

1888 Commits