meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2024-11-29 16:45:30 +08:00

Author	SHA1	Message	Date
meili-bors[bot]	6b8b6062bc	Merge #4069 4069: Revert the fix on the filters about escaping sequences r=irevoire a=Kerollmops This PR reverts #4038 as it introduces a breaking change. However, the fix will be kept for the v1.4.0 release. Co-authored-by: Clément Renault <renault.cle@gmail.com>	2023-09-19 16:51:54 +00:00
Clément Renault	70efb0df78	Revert "Fix filter escaping issues"	2023-09-19 17:23:03 +02:00
curquiza	0084aebf40	Update version for the next release (v1.3.5) in Cargo.toml	2023-09-19 12:50:31 +00:00
curquiza	8e2bb29cf1	Update version for the next release (v1.3.4) in Cargo.toml	2023-09-11 16:20:03 +00:00
meili-bors[bot]	9945cbf9db	Merge #4038 4038: Fix filter escaping issues r=ManyTheFish a=Kerollmops This PR fixes #4034 by always escaping the sequences. Users must always put quotes (simple or double) to escape the filter values. Co-authored-by: Kerollmops <clement@meilisearch.com>	2023-09-06 12:29:29 +00:00
Kerollmops	03d0f628bd	Use the unescaper crate to unescape any char sequence	2023-09-06 13:59:45 +02:00
curquiza	93285041a9	Update version for the next release (v1.3.3) in Cargo.toml	2023-09-06 09:23:20 +00:00
Kerollmops	717b069907	Bump charabia to 0.8.3	2023-08-22 16:25:00 +02:00
irevoire	b947f3bb9d	Update version for the next release (v1.3.2) in Cargo.toml	2023-08-16 08:20:36 +00:00
irevoire	75c87d5391	Update version for the next release (v1.3.1) in Cargo.toml	2023-08-08 10:30:06 +00:00
Clément Renault	d8b47b689e	Use the new read-txn-no-tls heed feature	2023-07-26 15:45:15 +02:00
Kerollmops	29ab54b259	Replace the hnsw crate by the instant-distance one	2023-07-25 12:37:35 +02:00
ManyTheFish	0497f93494	Update Charabia to the last version	2023-07-19 15:19:32 +02:00
ManyTheFish	c106906f8f	deactivate camelCase segmentation	2023-07-13 12:06:27 +02:00
Kerollmops	e7f8daaf86	Update criterion to 0.5.1 to remove the atty dependency	2023-07-03 18:51:42 +02:00
Kerollmops	d1ff631df8	Replace the atty dependency with the is-terminal one	2023-07-03 18:51:42 +02:00
gillian-meilisearch	1d40452057	Update version for the next release (v1.3.0) in Cargo.toml	2023-07-03 08:32:21 +00:00
meili-bors[bot]	661d1f90dc	Merge #3866 3866: Update charabia v0.8.0 r=dureuill a=ManyTheFish # Pull Request Update Charabia: - enhance Japanese segmentation - enhance Latin Tokenization - words containing `_` are now properly segmented into several words - brackets `{([])}` are no more considered as context separators so word separated by brackets are now considered near together for the proximity ranking rule - fixes #3815 - fixes #3778 - fixes [product#151](https://github.com/meilisearch/product/discussions/151) > Important note: now the float numbers are segmented around the `.` so `3.22` is segmented as [`3`, `.`, `22`] but the middle dot isn't considered as a hard separator, which means that if we search `3.22` we find documents containing `3.22` Co-authored-by: ManyTheFish <many@meilisearch.com>	2023-06-29 15:24:36 +00:00
ManyTheFish	e8dee3ca65	Update lock file	2023-06-29 17:02:24 +02:00
ManyTheFish	84845de9ef	Update Charabia	2023-06-29 15:56:32 +02:00
Kerollmops	a385642ec3	Replace the BTreeMap by an IndexMap to return values in order	2023-06-29 14:33:31 +02:00
Kerollmops	737aec1705	Expose an _semanticSimilarity as a dot product in the documents	2023-06-27 12:32:41 +02:00
Kerollmops	c79e82c62a	Move back to the hnsw crate This reverts commit 7a4b6c065482f988b01298642f4c18775503f92f.	2023-06-27 12:32:39 +02:00
Kerollmops	268a9ef416	Move to the hgg crate	2023-06-27 12:32:38 +02:00
Clément Renault	4571e512d2	Store the vectors in an HNSW in LMDB	2023-06-27 12:32:38 +02:00
Clément Renault	34349faeae	Create a new _vector extractor	2023-06-27 12:32:37 +02:00
meili-bors[bot]	45636d315c	Merge #3670 3670: Fix addition deletion bug r=irevoire a=irevoire The first commit of this PR is a revert of https://github.com/meilisearch/meilisearch/pull/3667. It re-enable the auto-batching of addition and deletion of tasks. No new changes have been introduced outside of `milli`. So all the changes you see on the autobatcher have actually already been reviewed. It fixes https://github.com/meilisearch/meilisearch/issues/3440. ### What was happening? The issue was that the `external_documents_ids` generated in the `transform` were used in a very strange way that wasn’t compatible with the deletion of documents. Instead of doing a clear merge between the external document IDs of the DB and the one returned by the transform + writing it on disk, we were doing some weird tricks with the soft-deleted to avoid writing the fst on disk as much as possible. The new algorithm may be a bit slower but is way more straightforward and doesn’t change depending on if the soft deletion was used or not. Here is a list of the changes introduced: 1. We now do a clear distinction between the `new_external_documents_ids` coming from the transform and only held on RAM and the `external_documents_ids` coming from the DB. 2. The `new_external_documents_ids` (coming out of the transform) are now represented as an `fst`. We don't need to struggle with the hard, soft distinction + the soft_deleted => That's easier to understand 3. When indexing documents, we merge the `external_documents_ids` coming from the DB and the `new_external_documents_ids` coming from the transform. ### Other things introduced in this PR Since we constantly have to write small, very specialized fuzzers for this kind of bug, we decided to push the one used to reproduce this bug. It's not perfect, but it's easy to improve in the future. It'll also run for as long as possible on every merge on the main branch. Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: Loïc Lecrenier <loic.lecrenier@icloud.com>	2023-06-19 09:09:30 +00:00
Tamo	6c6387d05e	move the fuzzer to its own crate	2023-05-29 12:27:39 +02:00
Tamo	4391cba6ca	fix the addition + deletion bug	2023-05-17 18:28:57 +02:00
Kerollmops	1a79fd0c3c	Use the new heed v0.12.6	2023-05-15 11:42:30 +02:00
Kerollmops	c4a40e7110	Use the writemap flag to reduce the memory usage	2023-05-15 10:15:33 +02:00
curquiza	3533d4f2bb	Update version for the next release (v1.2.0) in Cargo.toml	2023-05-08 17:52:33 +00:00
Louis Dureuil	90bc230820	Merge remote-tracking branch 'origin/main' into search-refactor. Conflicts and resolutions: Cargo.lock (added mimalloc); Cargo.toml (took origin/main version); milli/src/search/criteria/exactness.rs (deleted after checking it was only clippy changes); milli/src/search/query_tree.rs (deleted after checking it was only clippy changes)	2023-05-03 12:19:06 +02:00
ManyTheFish	1bf2694604	Update cargo lock	2023-04-26 17:41:29 +02:00
Kerollmops	a109802d45	Upgrade the incompatible versions of the dependencies	2023-04-24 17:50:57 +02:00
Kerollmops	47b66e49b8	Upgrade the compatible versions of the dependencies	2023-04-24 17:50:52 +02:00
bors[bot]	654a3a9e19	Merge #3688 3688: Following release v1.1.1: bring back changes into `main` r=curquiza a=curquiza `@meilisearch/engine-team` ensure the changes we bring to `main` are the ones you want Co-authored-by: Louis Dureuil <louis@meilisearch.com> Co-authored-by: bors[bot] <26634292+bors[bot]@users.noreply.github.com> Co-authored-by: Tamo <tamo@meilisearch.com> Co-authored-by: dureuill <dureuill@users.noreply.github.com>	2023-04-24 11:38:23 +00:00
dependabot[bot]	f0b4046c43	Bump h2 from 0.3.15 to 0.3.17 Bumps [h2](https://github.com/hyperium/h2) from 0.3.15 to 0.3.17. - [Release notes](https://github.com/hyperium/h2/releases) - [Changelog](https://github.com/hyperium/h2/blob/master/CHANGELOG.md) - [Commits](https://github.com/hyperium/h2/compare/v0.3.15...v0.3.17) --- updated-dependencies: - dependency-name: h2 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-04-13 17:03:48 +00:00
dureuill	cd45d21d6e	Update version for the next release (v1.1.1) in Cargo.toml	2023-04-13 13:25:10 +00:00
Loïc Lecrenier	9259cdb12e	Update Cargo.lock (was mistakenly changed during rebase)	2023-03-20 09:41:56 +01:00
Loïc Lecrenier	6c659dc12f	Use MiMalloc in milli tests	2023-03-20 09:41:37 +01:00
curquiza	577e7126f9	Update version for the next release (v1.1.0) in Cargo.toml	2023-03-06 13:52:54 +00:00
Louis Dureuil	42577403d8	Authentication: Directly pass the authfilter to the index scheduler	2023-02-22 16:35:52 +01:00
bors[bot]	39407885c2	Merge #3347 3347: Enhance language detection r=irevoire a=ManyTheFish ## Summary Some completely unrelated Languages can share the same characters, in Meilisearch we detect the Languages using `whatlang`, which works well on large texts but fails on small search queries leading to a bad segmentation and normalization of the query. This PR now stores the Languages detected during the indexing in order to reduce the Languages list that can be detected during the search. ## Detail - Create a 19th database mapping the scripts and the Languages detected with the documents where the Language is detected - Fill the newly created database during indexing - Create an allow-list with this database and pass it to Charabia - Add a test ensuring that a Japanese request containing kanjis only is detected as Japanese and not Chinese ## Related issues Fixes #2403 Fixes #3513 Co-authored-by: f3r10 <frledesma@outlook.com> Co-authored-by: ManyTheFish <many@meilisearch.com> Co-authored-by: Many the fish <many@meilisearch.com>	2023-02-21 10:52:13 +00:00
ManyTheFish	8aa808d51b	Merge branch 'main' into enhance-language-detection	2023-02-20 18:14:34 +01:00
bors[bot]	1e9ac00800	Merge #3505 3505: Csv delimiter r=irevoire a=irevoire Fixes https://github.com/meilisearch/meilisearch/issues/3442 Closes https://github.com/meilisearch/meilisearch/pull/2803 Specified in https://github.com/meilisearch/specifications/pull/221 This PR is a reimplementation of https://github.com/meilisearch/meilisearch/pull/2803, on the new engine. Thanks for your idea and initial PR `@MixusMinimax;` sorry I couldn’t update/merge your PR. Way too many changes happened on the engine in the meantime. Attention to reviewer; I had to update deserr to implement the support of deserializing `char`s ------- It introduces four new error messages; - Invalid value in parameter csvDelimiter: expected a string of one character, but found an empty string - Invalid value in parameter csvDelimiter: expected a string of one character, but found the following string of 5 characters: doggo - csv delimiter must be an ascii character. Found: 🍰 - The Content-Type application/json does not support the use of a csv delimiter. The csv delimiter can only be used with the Content-Type text/csv. And one error code; - `invalid_index_csv_delimiter` The `invalid_content_type` error code is now also used when we encounter the `csvDelimiter` query parameter with a non-csv content type. Co-authored-by: Tamo <tamo@meilisearch.com>	2023-02-20 17:01:36 +00:00
ManyTheFish	cb8d5f2d4b	Update Charabia to 0.7.1	2023-02-20 14:00:31 +01:00
Tamo	8c074f5028	implements the csv delimiter without tests Co-authored-by: Maxi Barmetler <maxi.barmetler@gmail.com>	2023-02-16 17:35:36 +01:00
Tamo	74d1a67a99	Use the workspace inheritance feature of rust 1.64	2023-02-15 13:51:07 +01:00
Tamo	a43765d454	use the pre-defined deserr extractors	2023-02-14 20:05:30 +01:00

548 Commits