meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-11-23 02:27:40 +08:00

Author	SHA1	Message	Date
Louis Dureuil	00746b32c0	Add Index::map_size	2023-01-10 11:16:51 +01:00
bors[bot]	e27bb8ab3e	Merge #3246 3246: Implement most of the error handling enhancement planned for v1.0 r=irevoire a=irevoire Fix #3095 and #2325 Close https://github.com/meilisearch/meilisearch/pull/2540 Implements most of https://github.com/meilisearch/specifications/pull/212 ## Generic error message we re-implements (in deserr): - [x] Json - [x] Incorrect value kind - [x] Missing field - [x] Unknown key - [x] Unexpected - [x] Reimplement the way we show the location - [x] Query parameter - [x] Incorrect value kind - [x] Missing field - [x] Unknown key - [x] Unexpected ## Routes to implements: - [x] Get search - [x] Post search - [x] Settings - [x] Swap indexes - [x] Task API - [x] Documents ressource Error codes to implements; ## Swap API - [x] `duplicate_index_found` → `invalid_swap_duplicate_index_found` ## Search API - [x] `invalid_search_q` - [x] `invalid_search_offset` - [x] `invalid_search_limit` - [x] `invalid_search_page` - [x] `invalid_search_hits_per_page` - [x] `invalid_search_attributes_to_retrieve` - [x] `invalid_search_attributes_to_crop` - [x] `invalid_search_crop_length` - [x] `invalid_search_attributes_to_highlight` - [x] `invalid_search_show_matches_position` - [x] `invalid_search_filter` - [x] `invalid_search_sort` - [x] `invalid_search_facets` - [x] `invalid_search_highlight_pre_tag` - [x] `invalid_search_highlight_post_tag` - [x] `invalid_search_crop_marker` - [x] `invalid_search_matching_strategy` ## Settings API - [x] invalid_settings_displayed_attributes - [x] invalid_settings_searchable_attributes - [x] invalid_settings_filterable_attributes - [x] invalid_settings_sortable_attributes - [x] invalid_settings_ranking_rules - [x] invalid_settings_stop_words - [x] invalid_settings_synonyms - [x] invalid_settings_distinct_attribute - [x] Add invalid_settings_typo_tolerance - [x] ~~invalid_settings_typo_tolerance_min_word_size_for_typos~~ (Merge in invalid_settings_typo_tolerance) - [x] invalid_settings_faceting - [x] 
invalid_settings_pagination ## Task API - [x] invalid_task_date_filer → invalid_task_before_enqueued_at_filter (for all date filter) ? ## Document Resource - [x] ~~`primary_key_inference_failed` → `index_primary_key_`~~ This doesn't exists anymore after `@dureuill` PR's on the primary key inference ------------------ # Changes # `code` property ## Swap API - [x] `invalid_swap_duplicate_index_found` ✅ [RENAME] - [x] `invalid_swap_indexes` ✅ [NEW] ## Index API ### POST - [x] `missing_index_uid` ✅ [NEW] ### POST/PATCH - [x] `invalid_index_primary_key` ✅ [NEW] ### GET - [x] `invalid_index_limit` ✅ [NEW] - [x] `invalid_index_offset` ✅ [NEW] ## Documents API ### GET - [x] `fields` parameter error `bad_request` → `invalid_document_fields` ✅ [NEW] - [x] `limit` parameter error `bad_request` → `invalid_document_limit` ✅ [NEW] - [x] `offset` parameter error `bad_request` → `invalid_document_offset` ✅ [NEW] ### POST/PUT - [x] `?primaryKey` parameter error `bad_request` → `invalid_index_primary_key` ✅ [NEW] ## Keys API ### POST - ~~`missing_parameter`~~ - [x] `missing_api_key_actions` ✅ [NEW] - [x] `missing_api_key_indexes` ✅ [NEW] - [x] `missing_api_key_expires_at` ✅ [NEW] ### GET - [x] `limit` parameter `bad_request` → `invalid_api_key_limit` ✅ [NEW] - [x] `offset` parameter `bad_request` → `invalid_api_key_offset` ✅ [NEW] ## Misc - [x] ~~`invalid_geo_field`~~ → `invalid_document_geo_field` ✅ [RENAME] # `type` property ## `system` ✅ [NEW] - [x] `no_space_left_on_device` error code - [x] `io_error` error code (does not exist in the current spec, need a catch-up) - [x] `too_many_open_files` error code (does not exist in the current spec, need a catch-up) Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-09 16:25:48 +00:00
Tamo	ff843881c5	remove the documentation of the query parameter extractor module	2023-01-09 15:14:48 +01:00
Loïc Lecrenier	ae08fba76e	Remove forgotten comment	2023-01-09 13:45:03 +01:00
Loïc Lecrenier	af6d4b3031	Remove unused deserr extractor	2023-01-09 13:43:16 +01:00
Tamo	b03ee54fe0	makes clippy turbo-happy	2023-01-09 13:04:31 +01:00
Tamo	d17efb9ed6	use the published version of deserr	2023-01-09 12:51:10 +01:00
Loïc Lecrenier	9ab791bedc	Update error codes on the api key routes	2023-01-09 12:30:25 +01:00
Loïc Lecrenier	96105a5e8d	Update error codes on the documents/ routes	2023-01-09 12:30:25 +01:00
Tamo	e706628bb1	fix the error code of the swap index route	2023-01-06 14:48:25 +01:00
Tamo	3c630891bb	fix the error code for the swap index	2023-01-05 21:25:20 +01:00
Tamo	97854274b4	rename the invalid_geo_field error code to invalid_document_geo_field	2023-01-05 21:08:19 +01:00
Tamo	0646f63404	implement the new type property for the system error	2023-01-05 21:06:50 +01:00
Tamo	ce3e8794a2	fix the tests after the rebase	2023-01-05 20:52:26 +01:00
Tamo	50ce0409bc	Integrate deserr on the most important routes	2023-01-05 20:48:29 +01:00
bors[bot]	839b05c43d	Merge #3305 3305: Remove hidden but usable CLI arguments r=Kerollmops a=Kerollmops `@curquiza` found out that we were exposing some internal CLI arguments: `nb-max-chunks` and `log-every-n`. In this PR I removed those two, the only ones I found. These options should not be accessible, since they appear neither in the documentation nor in the `--help` message. Fixes https://github.com/meilisearch/meilisearch/issues/3307 Co-authored-by: Clément Renault <clement@meilisearch.com>	2023-01-05 17:11:58 +00:00
bors[bot]	cc699fae40	Merge #3308 3308: Remove `--generate-master-key` option r=Kerollmops a=dureuill # Pull Request ## Related issue Related to https://github.com/meilisearch/specifications/pull/210#issuecomment-1372035525 ## What does this PR do? - Remove the short-lived `--generate-master-key` flag that was too beautiful for this world :D. Removal of this option proceeds of the following reasoning: 1. It is the only option that starts meilisearch and then immediately exits 2. We are unsure if we want to keep it under this form in the future or switch to a subcommand. 3. Releasing this option in v1 would make it insta-stable. 5. The option is only marginally useful, as users will be presented with freshly generated key directly in the error messages if their master key is absent/too short. 6. If we remove this option now, we can still add it back in a future v1 release. If we add it now, we won't be able to remove it in any future v1 version. ## PR checklist Please check if your PR fulfills the following requirements: - [ ] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [ ] Have you read the contributing guidelines? - [ ] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! ### Impacts this impacts the docs team as they would previously have had to document this option, and they may have wanted to use it in the user workflow. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-01-05 16:19:40 +00:00
Clément Renault	aa4b813237	Derive Default on IndexerOpts	2023-01-05 16:00:45 +01:00
Louis Dureuil	eb08a0fb0b	Remove --generate-master-key option	2023-01-05 14:55:24 +01:00
Clément Renault	cda529c07b	Remove hidden but usable CLI arguments	2023-01-05 14:25:41 +01:00
bors[bot]	1f8ddb366c	Merge #3302 3302: Update insta snap tests for index dates of dump v5 r=curquiza a=loiclec This PR simply updates the content of the insta snapshot test following https://github.com/meilisearch/meilisearch/pull/3013 . I manually verified that the dates in the snaps are indeed correct. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-05 12:58:10 +00:00
bors[bot]	8a3da0c2a7	Merge #3304 3304: Fix update cargo.toml workflow r=Kerollmops a=curquiza Following https://github.com/meilisearch/meilisearch/pull/3224 Fixes #3219 Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>	2023-01-05 12:16:57 +00:00
Clémentine Urquizar - curqui	c840d55e89	Fix update cargo.toml workflow	2023-01-05 12:56:02 +01:00
bors[bot]	c7a3992510	Merge #3303 3303: Update version for the next release (v1.0.0) in Cargo.toml files r=curquiza a=meili-bot ⚠️ This PR is automatically generated. Check the new version is the expected one before merging. Co-authored-by: curquiza <curquiza@users.noreply.github.com>	2023-01-05 11:53:09 +00:00
curquiza	28408816ef	Update version for the next release (v1.0.0) in Cargo.toml files	2023-01-05 11:45:15 +00:00
bors[bot]	0eaa8ca255	Merge #3266 3266: Improve the way we receive the documents payload- serde multiple ndjson fix r=curquiza a=jiangbo212 # Pull Request ## Related issue Fixes #3037 ## Related PR #3164 ## What does this PR do? Sorry, this PR mainly fixes the problems caused by my earlier PR #3164, which makes deserialization of multiple ndjson documents fail. - Fix the deserialization failures on multiple ndjson documents and add a test for them - Fix the JSON array deserialization error by deserializing again with `from_slice`. `from_slice` is only used when the serde error category is `data`, which indicates that the JSON payload is a single JSON object. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: jiangbo212 <peiyaoliukuan@126.com>	2023-01-05 11:30:29 +00:00
bors[bot]	201bc633d2	Merge #3288 3288: Replace underscores with hyphens in documentation link to error code r=dureuill a=loiclec # Pull Request ## Related issue Fixes #3097 ## Implementation Add a new dependency to `convert_case` (already used transitively by `deserr`) so that the link can be generated using: ```rust /// return the doc url associated with the error fn url(&self) -> String { format!( "https://docs.meilisearch.com/errors#{}", self.name().to_case(convert_case::Case::Kebab) ) } ``` ## Review I'd like the reviewer to check whether it is expected that the content of some `dump` snapshot tests changed :-) Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com> Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-05 11:08:57 +00:00
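The link generation described in #3288 can be illustrated with a small standalone sketch. The PR itself uses `convert_case`; the version below is only an approximation that relies on the standard library and assumes the error code names are plain `snake_case`, in which case replacing underscores with hyphens gives the same result. The function name `doc_url` is hypothetical.

```rust
/// Hypothetical helper: builds the documentation link for an error code.
/// Assumption: `code_name` is plain snake_case (e.g. "invalid_search_q"),
/// so swapping '_' for '-' is equivalent to a kebab-case conversion.
fn doc_url(code_name: &str) -> String {
    format!(
        "https://docs.meilisearch.com/errors#{}",
        code_name.replace('_', "-")
    )
}
```

Calling `doc_url("invalid_search_q")` yields a link whose fragment is `invalid-search-q`.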
Loïc Lecrenier	ba839852f5	Update insta snap tests for index dates of dump v5	2023-01-05 11:45:40 +01:00
Louis Dureuil	be9786bed9	Change primary key inference error messages	2023-01-05 10:40:09 +01:00
Loïc Lecrenier	f9aa897ab5	Update insta tests	2023-01-05 10:19:19 +01:00
Loïc Lecrenier	2d74678b51	Replace underscores with hyphens in doc link to error code	2023-01-05 10:09:02 +01:00
bors[bot]	db7eaf23f4	Merge #3251 3251: Add a specific test on finite pagination placeholder search with disti… r=curquiza a=ManyTheFish Add a specific test on finite pagination placeholder search with distinct attributes related to https://github.com/meilisearch/milli/pull/743 related to https://github.com/meilisearch/meilisearch/issues/3200 poke `@curquiza` > note that the destination branch should be changed Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-01-05 09:06:53 +00:00
bors[bot]	32f7cfa5cb	Merge #3295 3295: Adjust Master Key-related messages r=dureuill a=dureuill # Pull Request ## Related issue Follow up for #3272 ## What does this PR do? - Consistently capitalize "master key" (instead of "Master Key" sometimes) (see https://github.com/meilisearch/specifications/pull/209#discussion_r1060081094) - Clarify that the counted unit for master key length is bytes, not characters (see https://github.com/meilisearch/documentation/issues/2069#issuecomment-1368873167) ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-01-05 08:43:23 +00:00
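The bytes-versus-characters distinction clarified in #3295 matters for any key containing non-ASCII text. A minimal sketch, assuming a hypothetical helper `master_key_is_long_enough` and a caller-supplied minimum (the actual threshold is not stated in this log):

```rust
/// Hypothetical check: the minimum length is counted in bytes,
/// not in characters. In Rust, `str::len` returns the byte length of the
/// UTF-8 encoding, while `chars().count()` counts Unicode scalar values.
fn master_key_is_long_enough(key: &str, min_bytes: usize) -> bool {
    key.len() >= min_bytes
}
```

For example, `"clé"` has three characters but four bytes, so it would pass a four-byte minimum while `"cle"` would not.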
bors[bot]	a402fc4486	Merge #3013 3013: Extract the dates out of the dumpv5. r=loiclec a=funilrys Hi there, please review this PR that tries to fix #2986. I'm still learning Rust and I found that #2986 is an excellent way for me to read and learn what others do with Rust. So please excuse my semantics ... Stay safe and healthy. --- # Pull Request This patch possibly fixes #2986. This patch introduces a way to fill the IndexMetadata.created_at and IndexMetadata.updated_at keys from the tasks events. This is done by reading the creation date of the first event (created_at) and the creation date of the last event (updated_at). ## Related issue Fixes #2986 ## What does this PR do? - Extract the dates out of the dumpv5. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: funilrys <contact@funilrys.com>	2023-01-05 08:23:52 +00:00
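The approach described in #3013 (take `created_at` from the first task event and `updated_at` from the last one) can be sketched as follows. The types here are hypothetical stand-ins: timestamps are plain `u64` values, and the slice is assumed to be in chronological order.

```rust
/// Hypothetical pair of dates, mirroring the `created_at` / `updated_at`
/// fields mentioned in the PR description.
#[derive(Debug, PartialEq)]
struct IndexDates {
    created_at: u64,
    updated_at: u64,
}

/// Derives the index dates from the creation dates of its task events.
/// Assumption: `event_dates` is sorted from oldest to newest.
/// Returns `None` when there are no events to read dates from.
fn dates_from_events(event_dates: &[u64]) -> Option<IndexDates> {
    let created_at = *event_dates.first()?;
    let updated_at = *event_dates.last()?;
    Some(IndexDates { created_at, updated_at })
}
```

With a single event, both dates are equal; with no events, nothing can be inferred.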
bors[bot]	502d9e4b24	Merge #3278 3278: Remove `--max-index-size` and `--max-task-db-size` flags r=Kerollmops a=dureuill # Pull Request ## Related issue Fixes #3231 ## What does this PR do? - Remove `--max-index-size` and `--max-task-db-size` flags from the CLI, config file and environment variable - Set the size of all indexes to 500GiB and the size of the task DB to 10GiB. Reviewers might want to review these values carefully. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-01-04 16:44:27 +00:00
Louis Dureuil	a85ff1f690	Fix documentation Co-authored-by: Clément Renault <clement@meilisearch.com>	2023-01-04 17:20:03 +01:00
Louis Dureuil	233372abea	Remove `--max-index-size` and `--max-task-db-size`	2023-01-04 17:20:01 +01:00
bors[bot]	13d4ae264a	Merge #3269 3269: Simplify primary key inference r=dureuill a=dureuill # Pull Request ## Related issue Related to https://github.com/meilisearch/meilisearch/issues/3233 ## What does this PR do? - Integrates https://github.com/meilisearch/milli/pull/752 in meilisearch - Remove `Serialize` and `Deserialize` from `error::Code` as it is unused. - No longer filter on `milli` logs when `--log-level` is "info". - `milli` only has the newly-added inference log at the `info` level (from greping `info` in the codebase) - the default value for `--log-level` is "INFO" and not "info" since `v0.30` so the filter is not active by default. - updates milli to v0.38.0 ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-01-04 16:14:36 +00:00
bors[bot]	c766e06003	Merge #3281 3281: Merge `--schedule-snapshot` and `--snapshot-interval-sec` options r=dureuill a=dureuill # Pull Request ## Related issue Fixes #3131 ## What does this PR do? - Removes `--snapshot-interval-sec` - `--schedule-snapshot` now accepts an optional integer value specifying the interval in seconds - The config file no longer has a snapshot_interval_sec key. Instead, the schedule_snapshot key now additionally accepts an integer value specifying the interval in seconds - The env variable MEILI_SNAPSHOT_INTERVAL no longer exists - The env variable MEILI_SCHEDULE_SNAPSHOT is always specified to the interval of the snapshot in seconds when defined. If snapshots are disabled the variable is undefined. --- Relevant part of the `--help` <img width="885" alt="Capture d’écran 2022-12-27 à 18 22 32" src="https://user-images.githubusercontent.com/41078892/209700626-1a1292c1-14e3-45b6-8265-e0adbd76ecf1.png"> --- ### Tests \| `schedule_snapshot` in config.toml \| `--schedule-snapshot` flag on CLI \| `MEILI_SCHEDULE_SNAPSHOT` \| `opt.schedule_snapshot` \| \|--\|--\|--\|--\| \| missing \| missing \| missing \| `Disabled` \| `false` \| missing \| missing \| `Disabled` \| `true` \| missing \| missing \| `Enabled(86400)` \| `1234` \| missing \| missing \| `Enabled(1234)` \| missing \| `--schedule-snapshot` \| missing \| `Enabled(86400)` \| `false` \| `--schedule-snapshot` \| missing \| `Enabled(86400)` \| missing \| `--schedule-snapshot 2345` \| missing \| `Enabled(2345)` \| `false` \| `--schedule-snapshot 2345` \| missing \| `Enabled(2345)` \| `true` \| `--schedule-snapshot 2345` \| missing \| `Enabled(2345)` \| `1234` \| `--schedule-snapshot 2345` \| missing \| `Enabled(2345)` \| `false` \| `--schedule-snapshot 2345` \| 3456 \| `Enabled(2345)` \| `false` \| `--schedule-snapshot` \| 3456 \| `Enabled(86400)` \| `1234` \| missing \| 3456 \| `Enabled(3456)` \| `false` \| missing \| 3456 \| `Enabled(3456)` ## PR checklist Please check if your PR fulfills 
the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2023-01-04 14:25:47 +00:00
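The test table in #3281 implies a precedence order: the CLI flag wins over the environment variable, which wins over the config file, and a bare flag or a `true` config value enables snapshots with an interval of 86400 seconds. The sketch below reproduces that table only; it is not the option-parsing code from the PR, and every type and function name in it is hypothetical.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum ScheduleSnapshot {
    Disabled,
    Enabled(u64),
}

/// Interval used when snapshots are enabled without an explicit value,
/// as shown by the `Enabled(86400)` rows of the table.
const DEFAULT_INTERVAL_SEC: u64 = 86_400;

/// Hypothetical model of the `schedule_snapshot` key in config.toml.
enum ConfigValue {
    Bool(bool),
    Interval(u64),
}

/// Hypothetical model of the `--schedule-snapshot` flag.
enum CliFlag {
    Bare,
    WithInterval(u64),
}

/// Resolves the effective setting: CLI first, then env, then config file.
fn resolve(
    config: Option<ConfigValue>,
    cli: Option<CliFlag>,
    env: Option<u64>,
) -> ScheduleSnapshot {
    if let Some(flag) = cli {
        return match flag {
            CliFlag::Bare => ScheduleSnapshot::Enabled(DEFAULT_INTERVAL_SEC),
            CliFlag::WithInterval(secs) => ScheduleSnapshot::Enabled(secs),
        };
    }
    if let Some(secs) = env {
        return ScheduleSnapshot::Enabled(secs);
    }
    match config {
        Some(ConfigValue::Bool(true)) => ScheduleSnapshot::Enabled(DEFAULT_INTERVAL_SEC),
        Some(ConfigValue::Interval(secs)) => ScheduleSnapshot::Enabled(secs),
        Some(ConfigValue::Bool(false)) | None => ScheduleSnapshot::Disabled,
    }
}
```

Each row of the table maps to one call, for example `resolve(Some(ConfigValue::Bool(false)), Some(CliFlag::Bare), Some(3456))` for the row with `false`, a bare `--schedule-snapshot` and `3456`.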
Louis Dureuil	fcbd47281b	Fix tests	2023-01-04 14:24:20 +01:00
Louis Dureuil	b6d80293f7	Propagate new error codes from milli	2023-01-04 14:24:20 +01:00
Louis Dureuil	0e98a71a24	Update milli to v0.38	2023-01-04 14:24:20 +01:00
Louis Dureuil	5cb566b165	No longer filter out milli logs when --log-level is "info"	2023-01-04 14:24:20 +01:00
Louis Dureuil	9d46caba29	Code doesn't need to be serializable/deserializable	2023-01-04 14:16:22 +01:00
Louis Dureuil	c4aa5cc7d0	Merge --schedule-snapshot and --snapshot-interval-sec options	2023-01-04 14:13:54 +01:00
bors[bot]	12c3d432f9	Merge #3293 3293: Explicitly restrict log level options to those that are documented r=loiclec a=loiclec Fixes https://github.com/meilisearch/meilisearch/issues/3292 Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-04 10:30:35 +00:00
bors[bot]	c3f4835e8e	Merge #733 733: Avoid a prefix-related worst-case scenario in the proximity criterion r=loiclec a=loiclec # Pull Request ## Related issue Somewhat fixes (until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3118 ## What does this PR do? When a query ends with a word and a prefix, such as: ``` word pr ``` Then we first determine whether `pre` could possibly be in the proximity prefix database before querying it. There are then three possibilities: 1. `pr` is not in any prefix cache because it is not the prefix of many words. We don't query the proximity prefix database. Instead, we list all the word derivations of `pre` through the FST and query the regular proximity databases. 2. `pr` is in the prefix cache but cannot be found in the proximity prefix databases. In this case, we partially disable the proximity ranking rule for the pair `word pre`. This is done as follows: 1. Only find the documents where `word` is in proximity to `pre` exactly (no derivations) 2. Otherwise, assume that their proximity in all the documents in which they coexist is >= 8 3. `pr` is in the prefix cache and can be found in the proximity prefix databases. In this case we simply query the proximity prefix databases. Note that if a prefix is longer than 2 bytes, then it cannot be in the proximity prefix databases. Also, proximities larger than 4 are not present in these databases either. Therefore, the impact on relevancy is: 1. For common prefixes of one or two letters: we no longer distinguish between proximities from 4 to 8 2. For common prefixes of more than two letters: we no longer distinguish between any proximities 3. 
For uncommon prefixes: nothing changes Regarding (1), it means that these two documents would be considered equally relevant according to the proximity rule for the query `heard pr` (if `pr` is the prefix of more than 200 words in the dataset): ```json [ { "text": "I heard there is a faster proximity criterion" }, { "text": "I heard there is a faster but less relevant proximity criterion" } ] ``` Regarding (2), it means that two documents would be considered equally relevant according to the proximity rule for the query "faster pro": ```json [ { "text": "I heard there is a faster but less relevant proximity criterion" }, { "text": "I heard there is a faster proximity criterion" } ] ``` But the following document would be considered more relevant than the two documents above: ```json { "text": "I heard there is a faster swimmer who is competing in the pro section of the competition " } ``` Note, however, that this change of behaviour only occurs when using the set-based version of the proximity criterion. In cases where there are fewer than 1000 candidate documents when the proximity criterion is called, this PR does not change anything. --- ## Performance I couldn't use the existing search benchmarks to measure the impact of the PR, but I did some manual tests with the `songs` benchmark dataset. ``` 1. 10x 'a': - 640ms ⟹ 630ms = no significant difference 2. 10x 'b': - set-based: 4.47s ⟹ 7.42s = bad, ~2x regression - dynamic: 1s ⟹ 870ms = no significant difference 3. 'Someone I l': - set-based: 250ms ⟹ 12ms = very good, x20 speedup - dynamic: 21ms ⟹ 11ms = good, x2 speedup 4. 'billie e': - set-based: 623ms ⟹ 2ms = very good, x300 speedup - dynamic: ~4ms ⟹ 4ms = no difference 5. 'billie ei': - set-based: 57ms ⟹ 20ms = good, ~2x speedup - dynamic: ~4ms ⟹ ~2ms = no significant difference 6. 'i am getting o': - set-based: 300ms ⟹ 60ms = very good, 5x speedup - dynamic: 30ms ⟹ 6ms = very good, 5x speedup 7. 
'prologue 1 a 1': - set-based: 3.36s ⟹ 120ms = very good, 30x speedup - dynamic: 200ms ⟹ 30ms = very good, 6x speedup 8. 'prologue 1 a 10': - set-based: 590ms ⟹ 18ms = very good, 30x speedup - dynamic: 82ms ⟹ 35ms = good, ~2x speedup ``` Performance is often significantly better, but there is also one regression in the set-based implementation with the query `b b b b b b b b b b`. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-04 09:00:50 +00:00
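The three cases listed in #733 for a query ending in a word and a prefix reduce to a small decision table. The sketch below only models that decision; the real lookups in the prefix cache and the proximity prefix databases are replaced by two booleans, and all names are hypothetical.

```rust
/// Hypothetical labels for the three possibilities described in the PR.
#[derive(Debug, PartialEq)]
enum PrefixPairStrategy {
    /// Case 1: list the word derivations of the prefix through the FST
    /// and query the regular proximity databases.
    WordDerivations,
    /// Case 2: partially disable the proximity rule for the pair
    /// (exact prefix match only, otherwise assume proximity >= 8).
    ExactPrefixOnly,
    /// Case 3: query the proximity prefix databases directly.
    ProximityPrefixDb,
}

/// Chooses a strategy from the two membership checks.
fn choose_strategy(in_prefix_cache: bool, in_proximity_prefix_db: bool) -> PrefixPairStrategy {
    match (in_prefix_cache, in_proximity_prefix_db) {
        (false, _) => PrefixPairStrategy::WordDerivations,
        (true, false) => PrefixPairStrategy::ExactPrefixOnly,
        (true, true) => PrefixPairStrategy::ProximityPrefixDb,
    }
}

/// The PR notes that a prefix longer than 2 bytes cannot be in the
/// proximity prefix databases; this is a necessary condition only.
fn may_be_in_proximity_prefix_db(prefix: &str) -> bool {
    prefix.len() <= 2
}
```

A common prefix of three bytes or more, such as `pro`, therefore always falls into the second case.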
Loïc Lecrenier	d082ded7ad	Explicitly restrict log level options to those that are documented Fixes https://github.com/meilisearch/meilisearch/issues/3292	2023-01-04 09:40:24 +01:00
bors[bot]	49f58b2c47	Merge #732 732: Interpret synonyms as phrases r=loiclec a=loiclec # Pull Request ## Related issue Fixes (when merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/3125 ## What does this PR do? We now map multi-word synonyms to phrases instead of loose words, so that the request: ``` btw I am going to nyc soon ``` is interpreted as (when the synonym interpretation is chosen for both `btw` and `nyc`): ``` "by the way" I am going to "New York City" soon ``` instead of: ``` by the way I am going to New York City soon ``` This prevents queries containing multi-word synonyms from exceeding the word length limit and degrading the search performance. In terms of relevancy, there is a debate to have. I personally think this could be considered an improvement, since it would be strange for a user to search for: ``` good DIY project ``` and have a result such as: ``` { "text": "whether it is a good project to do, you'll have to decide for yourself" } ``` However, for synonyms such as `NYC -> New York City`, we will stop matching documents where `New York` is separated from `City`. This is solvable, however, by adding an additional mapping: `NYC -> New York`. ## Performance With the old behaviour, some long search requests making heavy use of synonyms could take minutes to execute. This is no longer the case: these search requests now take an average amount of time to be resolved. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-04 08:34:18 +00:00
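The rewriting described in #732 can be approximated with a few lines of string handling. This is only a sketch of the idea, not the query-tree code from milli: it always picks the synonym interpretation, matches words case-sensitively, and uses a hypothetical function name and a plain `HashMap` as the synonym table.

```rust
use std::collections::HashMap;

/// Hypothetical illustration: replaces each word that has a synonym,
/// wrapping multi-word synonyms in double quotes so that they are
/// treated as a phrase rather than as loose words.
fn apply_synonyms(query: &str, synonyms: &HashMap<&str, &str>) -> String {
    query
        .split_whitespace()
        .map(|word| match synonyms.get(word) {
            // Multi-word synonym: emit it as a quoted phrase.
            Some(syn) if syn.contains(' ') => format!("\"{}\"", syn),
            // Single-word synonym: plain substitution.
            Some(syn) => syn.to_string(),
            // No synonym: keep the original word.
            None => word.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

With `btw -> by the way` and `nyc -> New York City` in the table, the request from the PR description becomes the quoted form shown there.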
bors[bot]	947f08793a	Merge #3296 3296: Remove `--disable-auto-batching` CLI option r=gmourier a=loiclec Fixes #3294 The `index-scheduler` code is not modified, only the CLI options have changed. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2023-01-03 16:57:14 +00:00

7141 Commits