Commit Graph

9678 Commits

Author SHA1 Message Date
ManyTheFish
5f5a486895 Reduce formatting time 2024-01-11 11:36:41 +01:00
ManyTheFish
5f4fc6c955 Add timer logs 2024-01-11 09:44:16 +01:00
meili-bors[bot]
1f5e8fc072
Merge #4311
4311: Limit the number of values returned by the facet search r=dureuill a=Kerollmops

This PR fixes a bug where the number of values per facet returned by the `indexes/{index}/facet-search` route was not tacking the `faceting.maxValuePerFacet` setting. It also adds a test.

Co-authored-by: Clément Renault <clement@meilisearch.com>
2024-01-10 16:04:06 +00:00
Clément Renault
3f3462ab62
Limit the number of values returned by the facet search 2024-01-10 16:54:08 +01:00
meili-bors[bot]
93363b0201
Merge #4308
4308: Fix hang on `/indexes` and `/stats` routes r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #4218 

## Context

- A previous fix added a field to the `IndexScheduler` to memorize the `currently_updating_index`, so that accessing it through the search would return the handle without trying to open it. This resolved a hang on the search, but #4218 reported further hangs on the `/indexes` and `/stats` routes
- These routes were shunting the `IndexScheduler` and using internal `IndexMapper` logic to access the indexes, again trying to reopen the updating index.

## What does this PR do?

- Moves the logic relative to the `currently_updating_index` from the `IndexScheduler` to the `IndexMapper`, so that any index request to the `IndexMapper` can benefit from it.

## Test

1. Follow reproducer from #4218 
2. Before this PR, notice a hang on `/stats` and `/indexes`, but not on `/indexes/<updating_index>/search`
3. After this PR, notice no hang on either of `/stats`, `/indexes` or `/indexes/<updating_index>/search`



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-10 10:46:20 +00:00
Louis Dureuil
97bb1ff9e2
Move currently_updating_index to IndexMapper 2024-01-09 15:37:27 +01:00
meili-bors[bot]
5ee1378856
Merge #4303
4303: Display default value when proximityPrecision is not set r=dureuill a=ManyTheFish

# Pull Request

## Related
Issue: #4187
Spec change requests: https://github.com/meilisearch/specifications/pull/261#discussion_r1441725272

## What does this PR do?
- Display default value when proximityPrecision is not set instead of Null


Co-authored-by: ManyTheFish <many@meilisearch.com>
2024-01-08 14:29:57 +00:00
ManyTheFish
e27b850b09 move the default display strategy on setting getter function 2024-01-08 14:03:47 +01:00
ManyTheFish
f75f22e026 Display default value when proximityPrecision is not set 2024-01-08 11:09:37 +01:00
meili-bors[bot]
6203f4acef
Merge #4296
4296: Fix single element search r=irevoire a=dureuill

# Pull Request

Before this PR, indexing a single vector in a single document would result in the vector not being found by the vector search.

This PR adds a test case for this condition, and resolves it by bumping arroy to a version containing the fix.

# Test case

Output of the test before and after this PR:

```diff
diff --git a/meilisearch/tests/search/hybrid.rs b/meilisearch/tests/search/hybrid.rs
index 2cd4b83e7..79819cab2 100644
--- a/meilisearch/tests/search/hybrid.rs on release-v1.6.0
+++ b/meilisearch/tests/search/hybrid.rs on fix-single-element-search
`@@` -171,5 +171,5 `@@` async fn single_document() {
     .await;

     snapshot!(code, `@"200` OK");
-    snapshot!(response["hits"][0], `@r###"{"title":"Shazam!","desc":"a` Captain Marvel ersatz","id":"1","_vectors":{"default":[1.0,3.0]},"_rankingScore":0.0}"###);
+    snapshot!(response["hits"][0], `@r###"{"title":"Shazam!","desc":"a` Captain Marvel ersatz","id":"1","_vectors":{"default":[1.0,3.0]},"_rankingScore":1.0,"_semanticScore":1.0}"###);
 }
```



Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2024-01-03 15:01:43 +00:00
Louis Dureuil
12edc2c20a
Update arroy to a fixed version 2024-01-03 15:59:37 +01:00
Louis Dureuil
94b9f3b310
Add test 2024-01-03 15:56:20 +01:00
meili-bors[bot]
5204c0b60b
Merge #4297
4297: Update license for 2024 r=curquiza a=meili-bot

_This PR is auto-generated._


Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com>
2024-01-03 13:54:19 +00:00
meili-bot
e73cd692db Update LICENSE 2024-01-03 14:32:41 +01:00
meili-bors[bot]
29b453346b
Merge #4293
4293: Update SDK test dependencies r=curquiza a=curquiza

Replace dependabot updates

The changes are really un-impactful for the engine team velocity because is about a CI
- that does not run during release deployment
- that does not run to merge a PR

It's only a weekly scheduled CI to check the breaking we introduced in the integrations.

I updated the dependencies based on what we do on the integration CIs
For example for dart, I looked at what we have in the [Dart CI](63fd758882/.github/workflows/tests.yml (L16-L54)) and I updated our CI in this repo accordingly. I did the same for each repository. This ensures we test the same things.


Co-authored-by: curquiza <clementine@meilisearch.com>
2024-01-03 13:26:50 +00:00
meili-bors[bot]
c4bb435374
Merge #4295
4295: fix compilation warnings on main r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4292

## What does this PR do?
- Removed unused imports

#4294 fixes the issue for the release v1.6

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-02 15:33:06 +00:00
meili-bors[bot]
da99a04eb3
Merge #4294
4294: fix compilation warnings for release v1.6 r=curquiza a=irevoire

# Pull Request

## Related issue
Fixes #4292

## What does this PR do?
- Removed unused imports

#4295 fixes the issue no main

Co-authored-by: Tamo <tamo@meilisearch.com>
2024-01-02 15:00:40 +00:00
Tamo
54ae6951eb
fix warning 2024-01-02 15:19:30 +01:00
Tamo
2bcff2ea46
fix warning 2024-01-02 15:19:00 +01:00
curquiza
1275e72e0b Update SDK test dependencies 2024-01-02 09:59:46 +01:00
meili-bors[bot]
658ec6e0a4
Merge #4279
4279: Check experimental feature on setting update query rather than in the task. r=ManyTheFish a=dureuill

Improve the UX by checking for the vector store feature and returning an error synchronously when sending a setting update, rather than in the indexing task.

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-22 11:36:12 +00:00
meili-bors[bot]
43e822e802
Merge #4238
4238: Task queue webhook r=dureuill a=irevoire

# Prototype `prototype-task-queue-webhook-1`

The prototype is available through Docker by using the following command:

```bash
docker run -p 7700:7700 -v $(pwd)/meili_data:/meili_data getmeili/meilisearch:prototype-task-queue-webhook-1
```

# Pull Request

Implements the task queue webhook.

## Related issue
Fixes https://github.com/meilisearch/meilisearch/issues/4236

## What does this PR do?
- Provide a new cli and env var for the webhook, respectively called `--task-webhook-url` and `MEILI_TASK_WEBHOOK_URL`
- Also supports sending the requests with a custom `Authorization` header by specifying the optional `--task-webhook-authorization-header` CLI parameter or `MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER` env variable.
- Throw an error if the specified URL is invalid
- Every time a batch is processed, send all the finished tasks into the webhook with our public `TaskView` type as a JSON Line GZIPed body.
- Add one test.

## PR checklist

### Before becoming ready to review
- [x] Add a test
- [x] Compress the data we send
- [x] Chunk and stream the data we send
- [x] Remove the unwrap in the index-scheduler when sending the data fails
- [x] The analytics are missing

### Before merging
- [x] Release a prototype



Co-authored-by: Tamo <tamo@meilisearch.com>
Co-authored-by: Clément Renault <clement@meilisearch.com>
2023-12-21 14:43:46 +00:00
Louis Dureuil
ee54d3171e
Check experimental feature at query time 2023-12-21 15:26:12 +01:00
meili-bors[bot]
a0e713c4e7
Merge #4277
4277: Update mini-dashboard to v0.2.12 r=curquiza a=mdubus

# Pull Request

## Related issue
Fixes #4276

## What does this PR do?
Upgrade mini-dashboard to version 0.2.12 ([see changes](https://github.com/meilisearch/mini-dashboard/releases/tag/v0.2.12))

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Morgane Dubus <30866152+mdubus@users.noreply.github.com>
2023-12-21 11:03:46 +00:00
meili-bors[bot]
d4cb0a885b
Merge #4275
4275: Flatten settings r=dureuill a=dureuill

# Pull Request

## Related issue
Initial internal feedback seems to indicate that the current shape of the `embedders` setting is undesirable: it has too much depth.

This PR changes this by flattening the structure of the embedders to the following:

```json5
// NEW structure
"embedders": {
  // still starts with the embedder name
  "default": {
    "source": "huggingFace", // now a string
    // properties of the source are all at the same level as the source
    "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd",
    "documentTemplate": "A product titled '{{doc.title}}'" // now a string
  }
}
```

By comparison, the old structure was:

```json5
// PREVIOUS version, no longer working with this PR
"embedders": {
  // still starts with the embedder name
  "default": {
    "source": {
      "huggingFace": {
        "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
        "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd"
      },
    "documentTemplate": { 
      "template": "A product titled '{{doc.title}}'" // now a string
    }
  }
}
```

The fields that are accepted in the new version of the `embedders` setting are depending on the value of the `source` field:

```json5
// huggingFace
"embedders": {
   "default": {
    "source": "huggingFace",
    "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "revision": "a9c555277f9bcf24f28fa5e092e665fc6f7c49cd",
    "documentTemplate": "A product titled '{{doc.title}}'"
  }
}

// openAi
"embedders": {
   "default": {
    "source": "openAi",
    "model": "text-embedding-ada-002",
    "apiKey": "open_ai_api_key",
    "documentTemplate": "A product titled '{{doc.title}}'"
  }
}

// userProvided
"embedders": {
   "default": {
    "source": "userProvided",
    "dimensions": 42, // mandatory
  }
}
```

## What does this PR do?
- Flatten the settings structure
- Validate the prompt earlier to return a synchronous error on setting change rather than in the failing task
- Make it an error to pass a field for the wrong source (see above for allowed fields for each source)
- Not changed: It is still an error not to pass `dimensions` to the `userProvided` embedder
- If `source` was specified in the settings, validate the setting early to return a synchronous error in case of a missing mandatory field for the userProvided source (dimensions) or a forbidden field for the specified source.
- If `source` was not specified in the settings, still validate the setting, but only at indexing time, by using the source stored in the DB.
- Resets all values if the source changes, even if the user did not reset them explicitly.

## PR checklist
Please check if your PR fulfills the following requirements:
- [ ] Change the public facing guide for using the API
- [ ] Change examples of use in the changelog


Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-21 09:58:01 +00:00
Morgane Dubus
f52dee2b3b
Update Cargo.toml
Update mini-dashboard with v0.2.12
2023-12-21 09:53:13 +01:00
Louis Dureuil
0bf879fb88
Fix warning on rust stable 2023-12-20 17:48:09 +01:00
Louis Dureuil
6ff81de401
Fix tests 2023-12-20 17:16:46 +01:00
Louis Dureuil
2e4c9651df
Validate settings in route 2023-12-20 17:16:46 +01:00
Louis Dureuil
ec9649c922
Add function to validate settings in Meilisearch, to be used in the routes 2023-12-20 17:16:46 +01:00
Louis Dureuil
9123370e90
Validate fused settings in settings task after fusing with existing setting 2023-12-20 17:16:46 +01:00
Louis Dureuil
14b396d302
Add new errors 2023-12-20 17:16:45 +01:00
Louis Dureuil
393216bf30
Flatten embedders settings 2023-12-20 17:16:43 +01:00
Louis Dureuil
e249e4db7b
Change Setting::apply function signature 2023-12-20 17:15:24 +01:00
meili-bors[bot]
de2ca7006e
Merge #4272
4272: Don't pass default revision when the model is explicitly set in config r=Kerollmops a=dureuill

# Pull Request

## Related issue
Fixes #4271 

## What does this PR do?

- When the `model` is explicitly set in the `embedders` setting, we reset the `revision` to `None`, such that if the user doesn't specify a revision, the head of the model repository is chosen. 
- Not changed: If the user specifies a revision, it applies, like previously. 
- Not changed: If the user doesn't specify a model, the default model with the default revision applies, like previously.

## Manual testing on a fresh DB

1. Enable experimental feature:
```sh
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' -H 'Authorization: Bearer foo' \
--data-binary '{ "vectorStore": true
  }'
```
2. Send settings with a specified model but no specified revision:
```sh
curl \
-X PATCH 'http://localhost:7700/indexes/products/settings' \
-H 'Content-Type: application/json' --data-binary \
'{ "embedders": { "default": { "source": { "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" } }, "documentTemplate": { "template": "A product titled '{{doc.title}}'"} } } }'
```
3. Check that the task was successful:
```sh
curl 'http://localhost:7700/tasks/0'

{"uid":0,"indexUid":"products","status":"succeeded","type":"settingsUpdate","canceledBy":null,"details":{"embedders":{"default":{"source":{"huggingFace":{"model":"sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"}},"documentTemplate":{"template":"A product titled {{doc.title}}"}}}},"error":null,"duration":"PT0.001892S","enqueuedAt":"2023-12-20T09:17:01.73789Z","startedAt":"2023-12-20T09:17:01.73854Z","finishedAt":"2023-12-20T09:17:01.740432Z"}
```
4. Send documents to index:
```sh
curl 'https://localhost:7700/indexes/products/documents' -H 'Content-Type: application/json' --data-binary '{"id": 0, "title": "Best product"}'
```

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
2023-12-20 14:27:51 +00:00
Louis Dureuil
333ce12eb2
Fixed issue where the default revision is always the one we picked for the default model 2023-12-20 10:17:49 +01:00
meili-bors[bot]
fb9db1eba6
Merge #4269
4269: Remove dependency that requires libstdc++ r=dureuill a=dureuill

Removes the dependency that caused the additional runtime dependency on libstdc++ by disabling the default features of the hf tokenizer.

## Discussion

- This removes a feature that is using a C++ dependency and is supposed to accelerate the tokenizer. As the tokenizer is likely to be a significant bottleneck for embedding texts using a HF model, this is an issue.
- We should at least rerun the movies vector indexing and check that it still works correctly and that it has a runtime in the ballpark of what it used to be.

Co-authored-by: Louis Dureuil <louis.dureuil@xinra.net>
2023-12-19 12:26:48 +00:00
Clément Renault
fa2b96b9a5
Add an Authorization Header along with the webhook calls 2023-12-19 12:18:45 +01:00
Tamo
19736cefe8
add the analytics 2023-12-19 10:36:04 +01:00
Tamo
4fb25b8782
fix clippy 2023-12-19 10:35:51 +01:00
Tamo
c83a33017e
stream and chunk the data 2023-12-19 10:35:51 +01:00
Tamo
be72326c0a
gzip the tasks 2023-12-19 10:35:51 +01:00
Tamo
547379abb0
parse the url correctly 2023-12-19 10:35:51 +01:00
Tamo
0b2fff27f2
update and fix the test 2023-12-19 10:35:51 +01:00
Tamo
3adbc2b942
return a task view instead of a task 2023-12-19 10:35:51 +01:00
Tamo
fbea721378
add a first working test with actixweb 2023-12-19 10:35:51 +01:00
Tamo
391eb72137
start writing a test with actix but it doesn't works 2023-12-19 10:35:50 +01:00
Tamo
d78ad51082
Implement the webhook 2023-12-19 10:35:50 +01:00
Tamo
1956045a06
add the option 2023-12-19 10:23:56 +01:00
Louis Dureuil
b2193e612f
Revert "Add libstdc++ in Dockerfile" as it is no longer needed
This reverts commit 9df8cfc013.
2023-12-18 22:17:29 +01:00