Commit Graph

2512 Commits

Author SHA1 Message Date
F. Levi
c3de3a9ab7 Refactor 2024-10-04 11:30:31 +03:00
F. Levi
8221c94e7f Split into multiple files, refactor 2024-10-03 15:37:51 +03:00
F. Levi
c427d9e2ad Merge branch 'main' into change-matches-position-phrase-search 2024-10-03 10:42:34 +03:00
F. Levi
40336ce87d Fix and refactor crop_bounds 2024-10-03 10:40:14 +03:00
F. Levi
37a9d64c44 Fix failing test, refactor 2024-10-01 22:52:01 +03:00
F. Levi
d9e4db9983 Refactor 2024-10-01 17:50:59 +03:00
F. Levi
6d16230f17 Refactor 2024-10-01 17:19:15 +03:00
F. Levi
eabc14c268 Refactor, handle more cases for phrases 2024-09-30 21:24:41 +03:00
meili-bors[bot]
e78da35287
Merge #4930
4930: Return `UserError::InvalidDocumentId` for primary keys with a length greater than 512 bytes r=curquiza a=flevi29

# Pull Request

## Related issue
Fixes #4843
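
The check described in the PR title can be sketched as below. This is only an illustration under stated assumptions: `MAX_DOCUMENT_ID_BYTES`, `DocumentIdError`, and `check_document_id_len` are hypothetical names, and the real error variant is `UserError::InvalidDocumentId`. The only facts taken from the title are the 512 limit and that it is measured in bytes.

```rust
/// Limit taken from the PR title; the constant name is hypothetical.
const MAX_DOCUMENT_ID_BYTES: usize = 512;

/// Illustrative stand-in for `UserError::InvalidDocumentId`.
#[derive(Debug, PartialEq)]
enum DocumentIdError {
    TooLong { len: usize },
}

/// Rejects primary key values longer than 512 bytes.
/// `str::len` returns the length in bytes, not in characters,
/// so multi-byte UTF-8 characters count for more than one.
fn check_document_id_len(id: &str) -> Result<(), DocumentIdError> {
    let len = id.len();
    if len > MAX_DOCUMENT_ID_BYTES {
        Err(DocumentIdError::TooLong { len })
    } else {
        Ok(())
    }
}
```

Because the limit is in bytes, a 257-character id made only of `é` (2 bytes each) is already too long.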

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: F. Levi <55688616+flevi29@users.noreply.github.com>
2024-09-30 15:55:05 +00:00
F. Levi
00ccf53ffa Merge branch 'main' into change-matches-position-phrase-search 2024-09-27 15:52:05 +03:00
F. Levi
d20a39b959 Refactor find_best_match_interval 2024-09-27 15:44:30 +03:00
meili-bors[bot]
462a2329f1
Merge #4941
4941: Implement the binary quantization in meilisearch r=irevoire a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4873

## What does this PR do?
- Add a setting for binary quantization
- Once enabled, binary quantization (bq) cannot be disabled
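
For readers unfamiliar with the idea, here is a rough sketch of one common binary quantization scheme (one sign bit per dimension, compared with Hamming distance) plus the one-way toggle from the list above. It is not taken from arroy or milli: every function name here is hypothetical, and the actual bit layout and distance used by the PR may differ.

```rust
/// Packs one bit per dimension: the bit is set when the component is > 0.
/// This is an assumed scheme for illustration, not arroy's actual code.
fn binary_quantize(v: &[f32]) -> Vec<u8> {
    let mut out = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            out[i / 8] |= 1 << (i % 8);
        }
    }
    out
}

/// Number of differing bits between two quantized vectors of equal length.
fn hamming_distance(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// One-way toggle: the setting can go from off to on, never back.
/// The original floats are gone once quantized, which is presumably why.
fn update_binary_quantized(current: bool, requested: bool) -> Result<bool, String> {
    if current && !requested {
        Err("binary quantization cannot be disabled once enabled".to_string())
    } else {
        Ok(requested)
    }
}
```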

TODO:
- [ ] Missing a bunch of tests

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-09-19 15:50:24 +00:00
Tamo
f6483cf15d apply review comment 2024-09-19 16:47:06 +02:00
meili-bors[bot]
bd34ed01d9
Merge #4945
4945: Add swedish in default pipelines r=dureuill a=ManyTheFish

# Summary
## Fix Swedish support

In Swedish, the characters `å`/`ä`/`ö` are completely different from `a` or `o` and should not be normalized to the same character.
Because the Swedish specialized pipeline was not activated by default, these characters were normalized even with the settings:
```json
{
  "localizedAttributes": [ { "locales": ["swe"], "attributePatterns": ["*"] } ]
}
```

## Update Charabia to add German support

German segmentation will now be activated using the setting:
```json
{
  "localizedAttributes": [ { "locales": ["deu"], "attributePatterns": ["*"] } ]
}
```

# TODO

- [x] Activate Swedish Pipeline
- [x] Add a test to avoid future regressions
- [x] Update Charabia


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-09-19 14:42:03 +00:00
Tamo
74199f328d Make clippy happy 2024-09-19 16:27:34 +02:00
Tamo
1113c42de0 fix broken comments 2024-09-19 16:18:36 +02:00
ManyTheFish
7d6768e4c4 Add german tokenization pipeline 2024-09-19 16:09:01 +02:00
ManyTheFish
f77661ec44 Update Charabia v0.9.1 2024-09-19 16:08:59 +02:00
Tamo
b8fd85a46d Get rids of useless collect before an iteration on the readers 2024-09-19 15:57:38 +02:00
Tamo
fd43c6c404 Improve the error message explaining you can't un-bq an embedder 2024-09-19 15:51:29 +02:00
Tamo
2564ec1496
Update milli/src/index.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 15:41:44 +02:00
Tamo
b6b73fe41c
Update milli/src/update/settings.rs
Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 15:41:14 +02:00
Tamo
6dde41cc46 stop using a local version of arroy and instead point to the git repo with the rev 2024-09-19 15:25:38 +02:00
Tamo
163f8023a1 remove debug println 2024-09-19 12:13:25 +02:00
Tamo
633537ccd7 fix updating documents without updating the settings 2024-09-19 12:00:58 +02:00
Tamo
3f6301dbc9 fix the missing embedder name in the error message when trying to disable the binary quantization 2024-09-19 12:00:58 +02:00
Tamo
2b6952eda1 rename the ArroyReader to an ArroyWrapper since it can read and write 2024-09-19 12:00:58 +02:00
Tamo
79f29eed3c fix the tests and the arroy_readers method 2024-09-19 12:00:58 +02:00
Tamo
cc45e264ca implement the binary quantization in meilisearch 2024-09-19 12:00:56 +02:00
meili-bors[bot]
5f474a640d
Merge #4938
4938: Remove default embedder r=ManyTheFish a=dureuill

# Pull Request

## Related issue
Fixes #4738

## What does this PR do?

[See public usage](https://meilisearch.notion.site/v1-11-AI-search-changes-0e37727193884a70999f254fa953ce6e#1044b06b651f80edb9d4ef6dc367bad0)

- Remove `hybrid.embedder` boolean from analytics because embedder is now mandatory and so the boolean would always be `true`
- Rework search kind so that a search without query but with vector is a vector search regardless of (non-zero) semantic ratio
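
The reworked rule in the last bullet could look roughly like the sketch below. Only the first match arm comes from the PR description. The enum, the function name, and the handling of ratios `0.0` and `1.0` are assumptions added to make the example complete, not the actual milli code.

```rust
/// Hypothetical enum; the names do not come from the PR.
#[derive(Debug, PartialEq)]
enum SearchKind {
    Keyword,
    Vector,
    Hybrid { semantic_ratio: f32 },
}

fn search_kind(query: Option<&str>, vector: Option<&[f32]>, semantic_ratio: f32) -> SearchKind {
    match (query, vector) {
        // From the PR description: no query + a vector is a vector search
        // for any non-zero semantic ratio.
        (None, Some(_)) if semantic_ratio > 0.0 => SearchKind::Vector,
        // Assumed fallbacks: the ratio alone decides in all other cases.
        _ if semantic_ratio <= 0.0 => SearchKind::Keyword,
        _ if semantic_ratio >= 1.0 => SearchKind::Vector,
        _ => SearchKind::Hybrid { semantic_ratio },
    }
}
```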


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-09-19 09:17:14 +00:00
ManyTheFish
bbaee3dbc6 Add Swedish pipeline in all-tokenization feature 2024-09-19 08:34:51 +02:00
F. Levi
0ffeea5a52 Remove wrong comments 2024-09-19 09:06:40 +03:00
meili-bors[bot]
ff523a2357
Merge #4939
4939: Introduce the `STARTS WITH` filter operator r=irevoire a=Kerollmops

This PR fixes #4872 by introducing the `STARTS WITH` filter operator and gating it under the _contains filter_ experimental feature along with the `CONTAINS` one. I also updated [the experimental feature discussion page](https://github.com/orgs/meilisearch/discussions/763).
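
A minimal sketch of the gating described above, assuming that both operators are rejected while the experimental feature is off and that `STARTS WITH` is a plain prefix test. The types and function are hypothetical, and any normalization the real filter applies to values (for example case folding) is ignored here.

```rust
/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
enum FilterError {
    FeatureDisabled(&'static str),
}

/// The two operators gated under the same experimental feature.
enum Condition<'a> {
    Contains(&'a str),
    StartsWith(&'a str),
}

/// Evaluates a condition against one attribute value, refusing both
/// operators unless the experimental feature flag is enabled.
fn evaluate(
    cond: &Condition<'_>,
    value: &str,
    contains_filter_enabled: bool,
) -> Result<bool, FilterError> {
    if !contains_filter_enabled {
        let op = match cond {
            Condition::Contains(_) => "CONTAINS",
            Condition::StartsWith(_) => "STARTS WITH",
        };
        return Err(FilterError::FeatureDisabled(op));
    }
    Ok(match cond {
        Condition::Contains(needle) => value.contains(*needle),
        Condition::StartsWith(prefix) => value.starts_with(*prefix),
    })
}
```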

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-09-18 10:19:48 +00:00
F. Levi
83113998f9 Add more test assertions 2024-09-18 10:35:23 +03:00
Clément Renault
9f1fb4b425
Introduce the STARTS WITH filter operator gated under an experimental feature 2024-09-17 16:44:11 +02:00
F. Levi
f7337affd6 Adjust tests to changes 2024-09-17 17:31:09 +03:00
Louis Dureuil
3c5e363554
Remove default embedders 2024-09-17 16:30:43 +02:00
F. Levi
e098cc8320 Make comparison simpler, add IndexUid error details similarly 2024-09-17 00:16:15 +03:00
F. Levi
993408d3ba Change closure to fn 2024-09-15 16:15:09 +03:00
F. Levi
dcb61f8b3a Return error for primary keys with a length greater than 512 bytes 2024-09-14 11:34:13 +03:00
F. Levi
51085206cc Misc adjustments 2024-09-14 10:14:07 +03:00
F. Levi
a2a16bf846 Move MatchPosition impl to Match, adjust counting score for phrases 2024-09-13 21:20:06 +03:00
F. Levi
cab63abc84 Improve MatchesPosition enum with an impl 2024-09-13 14:35:28 +03:00
F. Levi
65e3d61a95 Make use of helper function in one more place 2024-09-13 13:35:58 +03:00
F. Levi
cc6a2aec06 Improve changes to Matcher 2024-09-13 13:31:07 +03:00
Louis Dureuil
23e14138bb
facet distribution: implement Display for OrderBy 2024-09-12 17:43:50 +02:00
Louis Dureuil
e44325683a
Facet distribution: fix issue where truncated facet distribution would have a wrong order 2024-09-12 17:43:49 +02:00
F. Levi
e7af499314 Improve changes to Matcher 2024-09-12 16:58:13 +03:00
F. Levi
edcb4c60ba Change Matcher so that phrases are counted as one instead of word by word 2024-09-12 09:46:08 +03:00
Louis Dureuil
f18e9cb7b3
Change openai default model 2024-09-09 13:09:35 +02:00