meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-12-02 18:15:38 +08:00

Author	SHA1	Message	Date
Louis Dureuil	48409c9183	Add missing exactness.matchingWords, exactness.maxMatchingWords	2023-07-04 16:31:01 +02:00
Louis Dureuil	324d448236	Format let-else ❤️ 🎉	2023-07-03 10:20:28 +02:00
Louis Dureuil	8939e85f60	Add rank_to_score for graph based ranking rules	2023-06-22 12:39:14 +02:00
Louis Dureuil	f050634b1e	add virtual conditions to fid and position to always have the max cost	2023-06-20 10:07:18 +02:00
Louis Dureuil	becf1f066a	Change how the cost of removing words is computed	2023-06-20 09:45:43 +02:00
Louis Dureuil	a20e4d447c	Position now takes into account the distance to the position of the word in the query; it used to be based on the distance to position 0	2023-06-20 09:45:42 +02:00
Louis Dureuil	af57c3c577	Proximity costs 0 for documents that are perfectly matching	2023-06-20 09:45:42 +02:00
Loïc Lecrenier	2da86b31a6	Remove comments and add documentation	2023-06-14 12:39:42 +02:00
meili-bors[bot]	2e49d6aec1	Merge #3768	2023-05-23 13:28:08 +00:00

3768: Fix bugs in graph-based ranking rules + make `words` a graph-based ranking rule r=dureuill a=loiclec

This PR contains three changes:

## 1. Don't call the `words` ranking rule if the term matching strategy is `All`

This is because the purpose of `words` is only to remove nodes from the query graph. It would never do any useful work when the matching strategy was `All`. Remember that the universe was already computed before by computing all the docids corresponding to the "maximally reduced" query graph, which, in the case of `All`, is equal to the original graph.

## 2. The `words` ranking rule is replaced by a graph-based ranking rule.

This is for three reasons:

1. performance: graph-based ranking rules benefit from a lot of optimisations by default, which ensures that they are never too slow. The previous implementation of `words` could call `compute_query_graph_docids` many times if some words had to be removed from the query, which would be quite expensive. I was especially worried about its performance in cases where it is placed right after the `sort` ranking rule. Furthermore, `compute_query_graph_docids` would clone a lot of bitmaps many times unnecessarily.
2. consistency: every other ranking rule (except `sort`) is graph-based. It makes sense to implement `words` like that as well. It will automatically benefit from all the features, optimisations, and bug fixes that all the other ranking rules get.
3. surfacing bugs: as the first ranking rule to be called (most of the time), I'd like `words` to behave the same as the other ranking rules so that we can quickly detect bugs in our graph algorithms. This actually already happened, which is why this PR also contains a bug fix.

## 3. Fix the `update_all_costs_before_nodes` function

It is a bit difficult to explain what was wrong, but I'll try. The bug happened when we had graphs like:

<img width="730" alt="Screenshot 2023-05-16 at 10 58 57" src="https://github.com/meilisearch/meilisearch/assets/6040237/40db1a68-d852-4e89-99d5-0d65757242a7">

and we gave the node `is` as argument. Then, we'd walk backwards from the node breadth-first. We'd update the costs of:

1. `sun`
2. `thesun`
3. `start`
4. `the`

which is an incorrect order. The correct order is:

1. `sun`
2. `thesun`
3. `the`
4. `start`

That is, we can only update the cost of a node when all of its successors have either already been visited or were not affected by the update to the node passed as argument.

To solve this bug, I factored out the graph-traversal logic into a `traverse_breadth_first_backward` function.

Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
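The ordering constraint described in the commit message above (a node's cost may only be updated once all of its affected successors have been handled) can be sketched as follows. This is a minimal illustration, not the code from the PR: the function name `traverse_backward_in_order`, the `example_predecessors` helper, the string node labels, and the edge set are all assumptions made for the example, since the real graph is only available as a screenshot.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical predecessor map (node -> nodes with an edge into it).
/// The edges are an assumption, chosen so that `start` precedes both
/// `the` and `thesun`, and `the` precedes `sun`.
fn example_predecessors() -> HashMap<&'static str, Vec<&'static str>> {
    HashMap::from([
        ("is", vec!["thesun", "sun"]),
        ("sun", vec!["the"]),
        ("thesun", vec!["start"]),
        ("the", vec!["start"]),
    ])
}

/// Walks backward from `from` and returns its ancestors in an order where
/// every node appears only after all of its affected successors.
fn traverse_backward_in_order(
    predecessors: &HashMap<&'static str, Vec<&'static str>>,
    from: &'static str,
) -> Vec<&'static str> {
    // Step 1: the affected set is `from` plus all of its ancestors.
    let mut affected: HashSet<&'static str> = HashSet::new();
    let mut stack = vec![from];
    while let Some(node) = stack.pop() {
        if affected.insert(node) {
            if let Some(preds) = predecessors.get(node) {
                stack.extend(preds.iter().copied());
            }
        }
    }

    // Step 2: for each affected node, count its affected successors.
    let mut pending: HashMap<&'static str, usize> = HashMap::new();
    for &node in &affected {
        for &pred in predecessors.get(node).into_iter().flatten() {
            *pending.entry(pred).or_insert(0) += 1;
        }
    }

    // Step 3: breadth-first walk, releasing a predecessor only when its
    // last affected successor has been visited.
    let mut order = Vec::new();
    let mut queue = VecDeque::from([from]);
    while let Some(node) = queue.pop_front() {
        if node != from {
            order.push(node);
        }
        for &pred in predecessors.get(node).into_iter().flatten() {
            let count = pending.get_mut(pred).expect("every predecessor was counted");
            *count -= 1;
            if *count == 0 {
                queue.push_back(pred);
            }
        }
    }
    order
}
```

Calling `traverse_backward_in_order(&example_predecessors(), "is")` on this assumed graph visits `thesun`, `sun`, `the`, then `start`. A plain breadth-first walk with a visited set over the same map would reach `start` (through `thesun`) before `the`, which is the kind of ordering bug the commit message describes.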
Louis Dureuil	51043f78f0	Remove trailing whitespace	2023-05-23 15:27:25 +02:00
Louis Dureuil	a490a11325	Add explanatory comment on the way we're recomputing costs	2023-05-23 15:24:24 +02:00
Loïc Lecrenier	ec8f685d84	Fix bug in cheapest path algorithm	2023-05-16 17:01:30 +02:00
Loïc Lecrenier	5758268866	Don't compute split_words for phrases	2023-05-16 17:01:18 +02:00
Loïc Lecrenier	f6524a6858	Adjust costs of edges in position ranking rule to ensure good performance	2023-05-16 11:28:56 +02:00
Loïc Lecrenier	a37da36766	Implement `words` as a graph-based ranking rule and fix some bugs	2023-05-16 10:42:11 +02:00
Louis Dureuil	1aaf24ccbf	Cargo fmt	2023-05-03 12:21:58 +02:00
Loïc Lecrenier	1b514517f5	Fix bug in computation of query term at a position	2023-05-02 10:48:32 +02:00
Loïc Lecrenier	11f814821d	Minor cleanup	2023-05-02 10:48:32 +02:00
Loïc Lecrenier	30fb1153cc	Speed up graph based ranking rule when a lot of different costs exist	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	3b2c8b9f25	Improve performance of position rr	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	608ceea440	Fix bug in position rr	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	79001b9c97	Improve performance of the cheapest path finder algorithm	2023-05-02 09:59:42 +02:00
Loïc Lecrenier	bc4efca611	Add more tests for the attribute ranking rule	2023-04-29 10:56:48 +02:00
Loïc Lecrenier	3421125a55	Prevent the `exactness` ranking rule from removing random words; make it strictly follow the term matching strategy	2023-04-26 09:09:19 +02:00
Loïc Lecrenier	d3a94e8b25	Fix bugs and add tests to exactness ranking rule	2023-04-25 16:49:08 +02:00
Loïc Lecrenier	d1fdbb63da	Make all search tests pass, fix distinctAttribute bug	2023-04-24 12:12:08 +02:00
Loïc Lecrenier	bd9aba4d77	Add "position" part of the attribute ranking rule	2023-04-13 10:46:09 +02:00
Kerollmops	d9cebff61c	Add a simple test to check that attributes are ranking correctly	2023-04-13 08:27:09 +02:00
Loïc Lecrenier	30f7bd03f6	Fix compiler warning/errors caused by previous merge	2023-04-13 08:27:09 +02:00
Kerollmops	df0d9bb878	Introduce the attribute ranking rule in the list of ranking rules	2023-04-13 08:27:09 +02:00
Kerollmops	5230ddb3ea	Resolve the attribute ranking rule conditions	2023-04-13 08:27:09 +02:00
Kerollmops	d6a7c28e4d	Implement the attribute ranking rule edge computation	2023-04-13 08:27:09 +02:00
Kerollmops	e55efc419e	Introduce a new cache for the words fids	2023-04-13 08:27:09 +02:00
Louis Dureuil	7a01f20df7	Use word_prefix_docids, make get_word_prefix_docids private	2023-04-12 16:45:38 +02:00
Louis Dureuil	5ab46324c4	Everyone uses SearchContext::word_docids instead of get_db_word_docids; make get_db_word_docids private	2023-04-12 16:44:43 +02:00
Louis Dureuil	e7ff987c46	Update call sites	2023-04-12 16:36:38 +02:00
Loïc Lecrenier	1f813a6f3b	Simplify implementation of the detailed (=visual) logger	2023-04-12 16:32:53 +02:00
Loïc Lecrenier	96183e804a	Simplify the logger	2023-04-12 16:32:53 +02:00
Loïc Lecrenier	f7d90ad19f	Merge remote-tracking branch 'origin/search-refactor-tests-doc' into search-refactor	2023-04-07 10:13:18 +02:00
Louis Dureuil	31630c85d0	exactness graph rr: Add important TODO/FIXME after review	2023-04-06 17:50:39 +02:00
Louis Dureuil	90a6c01495	Use correct codec in proximity	2023-04-06 17:50:39 +02:00
Louis Dureuil	e58426109a	Fix panics and issues in exactness graph ranking rule	2023-04-06 17:50:39 +02:00
Louis Dureuil	8a13ed7e3f	Add exactness ranking rules	2023-04-06 17:50:39 +02:00
Loïc Lecrenier	7ca91ebb71	Merge branch 'search-refactor-exactness' into search-refactor-tests-doc	2023-04-06 15:16:35 +02:00
Louis Dureuil	d1ddaa223d	Use correct codec in proximity	2023-04-05 18:14:00 +02:00
Louis Dureuil	f7ecea142e	Fix panics and issues in exactness graph ranking rule	2023-04-05 18:13:46 +02:00
Louis Dureuil	4b4ffb8ec9	Add exactness ranking rules	2023-04-04 17:12:07 +02:00
Loïc Lecrenier	b439d36807	Split query_term module into multiple submodules	2023-04-04 15:38:30 +02:00
Loïc Lecrenier	aa9592455c	Refactor the paths_of_cost algorithm; support conditions that require certain nodes to be skipped	2023-03-30 12:11:11 +02:00
Loïc Lecrenier	01e24dd630	Rewrite proximity ranking rule	2023-03-30 11:59:06 +02:00

107 Commits