meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-11-24 02:55:06 +08:00

Author	SHA1	Message	Date
bors[bot]	97fb64e40e	Merge #747 747: Soft-deletion computation no longer depends on the mapsize r=irevoire a=dureuill # Pull Request ## Related issue Related to https://github.com/meilisearch/meilisearch/issues/3231: After removing `--max-index-size`, the `mapsize` will always be unrelated to the actual max size the user wants for their DB, so it doesn't make sense to use these values any longer. This implements solution 2.3 from https://github.com/meilisearch/meilisearch/issues/3231#issuecomment-1348628824 ## What does this PR do? ### User-visible - Soft-deleted documents are no longer deleted when there is less than 10% of the mapsize available or when they take more than 10% of the mapsize - Instead, they are deleted when there are more soft-deleted documents than regular documents, or when they take more than 1GiB disk space (estimated). ### Implementation standpoint 1. Adds a `DeletionStrategy` enum to replace the boolean `disable_soft_deletion` that we had up until now. This enum allows us to specify that we want "always hard", "always soft", or to use the dynamic soft-deletion strategy (default). 2. Uses the current strategy when deleting documents, with the new heuristics being used in the `DeletionStrategy::Dynamic` variant. 3. Updates the tests to use the appropriate DeletionStrategy whenever needed (one of `AlwaysHard` or `AlwaysSoft` depending on the test) Note to reviewers: this PR is optimized for a commit-by-commit review. ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com>	2022-12-19 17:46:18 +00:00
curquiza	b3fce7c366	Remove useless continue-on-error	2022-12-19 18:39:35 +01:00
curquiza	5099a40484	Use ubuntu-18.04 container in publish CIs	2022-12-19 18:35:33 +01:00
Tamo	69edbf9f6d	Update milli/src/update/delete_documents.rs	2022-12-19 18:23:50 +01:00
bors[bot]	8957251eed	Merge #751 751: Update version for the next release (v0.38.0) in Cargo.toml files r=curquiza a=meili-bot ⚠️ This PR is automatically generated. Check the new version is the expected one before merging. Co-authored-by: curquiza <curquiza@users.noreply.github.com>	2022-12-19 17:02:39 +00:00
curquiza	c72535531b	Update version for the next release (v0.38.0) in Cargo.toml files	2022-12-19 16:35:38 +00:00
bors[bot]	19ee9a828f	Merge #3262 3262: Clippy fixes after updating Rust to v1.66 r=curquiza a=dureuill Ran `cargo clippy --fix` Fixes the CI. Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2022-12-19 14:05:59 +00:00
Louis Dureuil	869d331680	Clippy fixes after updating Rust to v1.66	2022-12-19 14:17:12 +01:00
curquiza	913eff5b2f	Use ubuntu-18.04 container in rust tests	2022-12-19 10:46:29 +01:00
Louis Dureuil	916c23e7be	Tests: rename snapshots	2022-12-19 10:07:17 +01:00
Louis Dureuil	ad9937c755	Fix tests after adding DeletionStrategy	2022-12-19 10:07:17 +01:00
Louis Dureuil	171c942282	Soft-deletion computation no longer takes into account the mapsize Implemented solution 2.3 from https://github.com/meilisearch/meilisearch/issues/3231#issuecomment-1348628824	2022-12-19 10:07:17 +01:00
Louis Dureuil	e2ae3b24aa	Hard or soft delete according to the deletion strategy	2022-12-19 10:00:13 +01:00
Louis Dureuil	fc7618d49b	Add DeletionStrategy	2022-12-19 09:49:58 +01:00
amab8901	b4a73f2d74	Remove redundant date-setting	2022-12-16 08:32:44 +01:00
amab8901	4e175ae882	Replace Index::new_with_creation_dates(...) with Index::new(...)	2022-12-16 08:20:13 +01:00
amab8901	5a0a0468df	Combine created and added into date	2022-12-16 08:11:12 +01:00
ManyTheFish	7f88c4ff2f	Fix #1714 test	2022-12-15 18:22:28 +01:00
ManyTheFish	96d4242b93	Update charabia	2022-12-15 18:22:22 +01:00
ManyTheFish	60ebf0ea0b	Add a specific test on finite pagination placeholder search with distinct attributes	2022-12-15 17:28:20 +01:00
bors[bot]	867279f2a4	Merge #3249 3249: Bring back changes from release-v0.30.3 to main r=curquiza a=curquiza ⚠️ ⚠️ I had to fix git conflicts, ensure I did not lose anything ⚠️ ⚠️ Co-authored-by: Kerollmops <clement@meilisearch.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2022-12-15 14:13:30 +00:00
bors[bot]	5114686394	Merge #743 743: Fix finite pagination with placeholder search r=Kerollmops a=ManyTheFish this bug is reproducible on real datasets and is hard to isolate in a simple test. related to: https://github.com/meilisearch/meilisearch/issues/3200 poke `@curquiza` Co-authored-by: ManyTheFish <many@meilisearch.com>	2022-12-15 09:31:47 +00:00
ManyTheFish	3322018c06	Fix placeholder search	2022-12-14 20:09:47 +01:00
Louis Dureuil	ce84a59873	Re-apply some changes from #3132	2022-12-14 20:02:39 +01:00
Tamo	d66bb3a53f	rename the two new functions	2022-12-14 17:27:43 +01:00
Tamo	6c0b8edab5	Fix typos Co-authored-by: Louis Dureuil <louis@meilisearch.com>	2022-12-14 17:27:37 +01:00
Tamo	fbbc6eaeca	Fix the import of dumps and snapshots. Some flags were badly applied, and the database was wrongly deleted when it shouldn't have been	2022-12-14 17:27:28 +01:00
Kerollmops	60c3bac108	Bump milli to v0.37.3	2022-12-14 17:25:40 +01:00
bors[bot]	9491fe0704	Merge #3247 3247: Re-add push in docker CI r=curquiza a=curquiza I made a mistake here https://github.com/meilisearch/meilisearch/pull/3229, `push` is not `true` by default, see https://github.com/docker/build-push-action#customizing Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>	2022-12-14 13:15:41 +00:00
Clémentine Urquizar - curqui	240c73d292	Re-add push	2022-12-14 14:05:25 +01:00
amab8901	d3eb8d2d5c	Enable create_raw_index(...) to specify time	2022-12-14 10:44:25 +01:00
bors[bot]	0276d5212a	Merge #728 728: Add some integration tests on the sort criterion r=ManyTheFish a=loiclec This is simply an integration test ensuring that the sort criterion works properly. However, only one version of the algorithm is tested here (the iterative one). To test the version that uses the facet DB, one has to manually set the `CANDIDATES_THRESHOLD` constant to `0`. I have done that and ensured that the test still succeeds. However, in the future, we will probably want to have an option to force which algorithm is used at runtime, for testing purposes. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2022-12-14 09:27:12 +00:00
bors[bot]	660be071b5	Merge #3236 3236: Improves clarity of the code that receives payloads r=Kerollmops a=Kerollmops This PR makes small changes to #3164. It improves the clarity and simplicity of some parts of the code. Co-authored-by: Kerollmops <clement@meilisearch.com>	2022-12-13 18:20:24 +00:00
bors[bot]	89542d7d8b	Merge #3241 3241: Remove core mention r=curquiza a=curquiza No impact for the users or the team Co-authored-by: curquiza <clementine@meilisearch.com>	2022-12-13 17:35:50 +00:00
curquiza	f62e7a3501	Remove core mention	2022-12-13 17:34:43 +01:00
Kerollmops	a08cc82983	Revert "Simplify the code when array_each failed" This reverts commit `271685cceb`.	2022-12-13 16:29:49 +01:00
bors[bot]	e2ffc3d69a	Merge #741 741: Add test reproducing the bug fixed by #737 r=Kerollmops a=ManyTheFish related to #737 Co-authored-by: ManyTheFish <many@meilisearch.com>	2022-12-13 15:02:19 +00:00
ManyTheFish	739da9fd4d	Add test	2022-12-13 15:54:43 +01:00
bors[bot]	2af93966e0	Merge #740 740: Fix two nightly errors r=Kerollmops a=irevoire Currently, we have these two errors on rust nightly. It would be nice to help rustc understand what's going on ``` error[E0658]: anonymous lifetimes in `impl Trait` are unstable --> filter-parser/src/lib.rs:173:53 \| 173 \| fn ws<'a, O>(inner: impl FnMut(Span<'a>) -> IResult<O>) -> impl FnMut(Span<'a>) -> IResult<O> { \| ^ expected named lifetime parameter \| = help: add `#![feature(anonymous_lifetime_in_impl_trait)]` to the crate attributes to enable help: consider introducing a named lifetime parameter \| 173 \| fn ws<'a, 'a, O>(inner: impl FnMut(Span<'a>) -> IResult<'a, O>) -> impl FnMut(Span<'a>) -> IResult<O> { \| +++ +++ error[E0658]: anonymous lifetimes in `impl Trait` are unstable --> filter-parser/src/error.rs:36:49 \| 36 \| mut parser: impl FnMut(Span<'a>) -> IResult<O>, \| ^ expected named lifetime parameter \| = help: add `#![feature(anonymous_lifetime_in_impl_trait)]` to the crate attributes to enable help: consider introducing a named lifetime parameter \| 35 ~ pub fn cut_with_err<'a, 'a, O>( 36 ~ mut parser: impl FnMut(Span<'a>) -> IResult<'a, O>, \| For more information about this error, try `rustc --explain E0658`. error: could not compile `filter-parser` due to 2 previous errors ``` Co-authored-by: Tamo <tamo@meilisearch.com>	2022-12-13 14:33:40 +00:00
Kerollmops	7b2f2a4f9c	Do only one conversion to u64	2022-12-13 15:31:55 +01:00
Tamo	2c47500bc3	fix two nightly errors	2022-12-13 15:29:52 +01:00
Kerollmops	5d5615ef45	Rename the ReceivePayload error variant	2022-12-13 15:07:35 +01:00
Kerollmops	526793b5b2	Handle empty arrays the same way we handle other arrays	2022-12-13 14:58:40 +01:00
Kerollmops	271685cceb	Simplify the code when array_each failed	2022-12-13 14:58:05 +01:00
bors[bot]	1af590d3bc	Merge #3234 3234: Update README.md r=curquiza a=tpayet Change Slack link to Discord link Co-authored-by: Thomas Payet <thomas@meilisearch.com>	2022-12-13 11:41:10 +00:00
bors[bot]	dab2634ca8	Merge #3164 3164: Improve the way we receive the documents payload r=Kerollmops a=jiangbo212 # Pull Request ## Related issue Fixes #3037 ## What does this PR do? - writing the payload to a temporary file via BufWriter - deserialising the json temporary file to an array of Objects by means of a memory map - deserialising the csv temporary file by means of a memory map - Adapted some read_json tests ## PR checklist Please check if your PR fulfills the following requirements: - [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)? - [x] Have you read the contributing guidelines? - [x] Have you made sure that the title is accurate and descriptive of the changes? Thank you so much for contributing to Meilisearch! Co-authored-by: jiangbo212 <peiyaoliukuan@gmail.com> Co-authored-by: jiangbo212 <peiyaoliukuan@126.com>	2022-12-13 10:58:24 +00:00
bors[bot]	406ee31d1a	Merge #737 737: Fix typo initial candidates computation r=Kerollmops a=ManyTheFish When the `Typo` criterion was after a different criterion than `Words` and the previous criterion wasn't returning any candidates at the first iteration of the bucket sort, then the `initial_candidates` were lost. Now, `Typo` ensures that the `initial_candidates` are kept between iterations. related to https://github.com/meilisearch/meilisearch/issues/3200#issuecomment-1345179578 related to https://github.com/meilisearch/meilisearch/issues/3228 Co-authored-by: ManyTheFish <many@meilisearch.com>	2022-12-13 10:29:28 +00:00
ManyTheFish	2d8d0af1a6	Rename short name bc by ic for initial_candidates	2022-12-13 10:56:38 +01:00
Thomas Payet	8a7f90250c	Update README.md Change Slack link to Discord link	2022-12-13 10:46:05 +01:00
bors[bot]	e0a8f8cb5a	Merge #734 734: Fix bug 2945/3021 (missing key in documents database) r=Kerollmops a=loiclec # Pull Request ## Related issue Fixes (partially, until merged into meilisearch) https://github.com/meilisearch/meilisearch/issues/2945 (until we integrate the new milli bump into meilisearch). Note that a dump will not be sufficient to upgrade from meilisearch v0.30.2 to meilisearch v0.30.3 due to this fix because the bug could have caused the `documents` database to be corrupted. Instead, a full manual reimport of the documents will be necessary. ## What does this PR do? There was a bug happening when: 1. A few documents are added to the index 2. Some of these documents are soft-deleted 3. New documents are added, replacing existing ones and triggering a hard-deletion The `IndexDocuments::execute` method would then perform the hard-deletion but forget to change the `external_document_ids` structure appropriately. As a result, the `external_document_ids` would contain keys corresponding to documents that do not exist anymore. To fix this bug, I split the `DeleteDocuments::execute` method into two: `execute_inner` and `execute`. - `execute_inner` returns a `DetailedDocumentDeletionResult` which says whether soft-deletion was used or not - `execute` keeps the exact same signature and behaviour Then, when deleting replaced documents inside `IndexDocuments::execute`, we call `DeleteDocuments::execute_inner` instead of `DeleteDocuments::execute`. If soft-deletion was used, nothing more is done. But if hard-deletion was used, we remove every reference to soft-deleted documents in the new `external_documents_ids` structure. ## Correctness - Every other test still passes - The reproduction test case now passes - In a different branch ([`update-fuzz-test`](https://github.com/meilisearch/milli/pull/735)), I created a fuzz-test that reproduces the past two bugs. This fuzz test cannot find this bug through any combination of some hand-selected `DocumentAddition / DocumentDeletion / DocumentClear / SettingsUpdate` operations. In that test, each relevant operation can be executed with or without soft-deletion, and document additions can be done in batches, replacing or updating existing documents. Co-authored-by: Loïc Lecrenier <loic.lecrenier@me.com>	2022-12-13 09:45:57 +00:00
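
The dynamic soft-deletion heuristic described in the Merge #747 entry above can be illustrated with a minimal sketch. This is not the milli implementation: only the names `DeletionStrategy`, `AlwaysHard`, `AlwaysSoft`, and `Dynamic` come from the commit message, while `IndexStats`, its fields, `should_hard_delete`, and `SOFT_DELETED_SIZE_LIMIT_BYTES` are hypothetical names introduced here for illustration.

```rust
/// Sketch of the strategy named in the commit message. The variant names
/// come from the PR description; everything else here is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum DeletionStrategy {
    /// Decide between soft and hard deletion from the heuristics below (default).
    #[default]
    Dynamic,
    /// Never hard-delete.
    AlwaysSoft,
    /// Always hard-delete.
    AlwaysHard,
}

/// Hypothetical threshold: 1 GiB of (estimated) disk space.
pub const SOFT_DELETED_SIZE_LIMIT_BYTES: u64 = 1 << 30;

/// Hypothetical container for the numbers the heuristic needs.
pub struct IndexStats {
    pub regular_documents: u64,
    pub soft_deleted_documents: u64,
    pub estimated_soft_deleted_bytes: u64,
}

impl DeletionStrategy {
    /// Returns true when soft-deleted documents should be hard-deleted.
    pub fn should_hard_delete(self, stats: &IndexStats) -> bool {
        match self {
            DeletionStrategy::AlwaysHard => true,
            DeletionStrategy::AlwaysSoft => false,
            // More soft-deleted than regular documents, or more than
            // 1 GiB of estimated disk space taken by soft-deleted documents.
            DeletionStrategy::Dynamic => {
                stats.soft_deleted_documents > stats.regular_documents
                    || stats.estimated_soft_deleted_bytes > SOFT_DELETED_SIZE_LIMIT_BYTES
            }
        }
    }
}
```

Unlike the previous behaviour, neither condition in the `Dynamic` arm refers to the mapsize, which is the point of the change.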

7675 Commits