Commit Graph

9692 Commits

Author SHA1 Message Date
Kerollmops
436d2032c4
Add benchmarks to the flatten-serde-json subcrate 2022-04-13 13:12:57 +02:00
bors[bot]
3828635fb2
Merge #489
489: fix distinct count bug r=curquiza a=MarinPostma

fix https://github.com/meilisearch/meilisearch/issues/2152

I think the issue was that we didn't take off the excluded candidates from the initial candidates when returning the candidates with the search result.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-13 10:15:30 +00:00
ad hoc
dda28d7415
exclude excluded canditates from search result candidates 2022-04-13 12:10:35 +02:00
ad hoc
cd83014fff
add test for disctinct nb hits 2022-04-13 12:10:35 +02:00
ad hoc
bbb6728d2f
add distinct attributes to cli 2022-04-13 12:10:35 +02:00
bors[bot]
49fbbacafc
Merge #492
492: Add the new `Specify breaking` check to bors.toml r=curquiza a=curquiza

Should prevent this problem: https://github.com/meilisearch/milli/pull/489#issuecomment-1094988060

Co-authored-by: Clémentine Urquizar - curqui <clementine@meilisearch.com>
2022-04-13 08:59:40 +00:00
Clémentine Urquizar - curqui
7ad582f39f
Update bors.toml 2022-04-13 10:56:56 +02:00
Clémentine Urquizar - curqui
aa896f0e7a
Update bors.toml 2022-04-13 10:56:56 +02:00
Clémentine Urquizar - curqui
0261a0e3cf
Add the new Specify breaking check to bors.toml 2022-04-13 10:56:55 +02:00
Paul Sanders
41249be274 Add version flag 2022-04-12 15:22:36 -04:00
ManyTheFish
5809d3ae0d Add first benchmarks on formatting 2022-04-12 16:31:58 +02:00
bors[bot]
049cf0fcee
Merge #2313
2313: fix(search): remove the back and forth between the IndexMap and the serde_json::Map r=irevoire a=irevoire

This is ok because we're using the preserve_order feature in serde_json which is already internally using an IndexMap.

See https://github.com/meilisearch/meilisearch/pull/2298#discussion_r845228412_


Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-12 14:17:26 +00:00
Tamo
2ee210483f
fix(search): remove the back and forth between the IndexMap and the serde_json::Map
This is ok because we're using the preserve_order feature in serde_json which is already internally using an IndexMap.
2022-04-12 16:12:52 +02:00
ManyTheFish
827cedcd15 Add format option structure 2022-04-12 13:42:14 +02:00
ManyTheFish
011f8210ed Make compute_matches more rust idiomatic 2022-04-12 10:19:02 +02:00
bors[bot]
6b0737384b
Merge #491
491: remove the unused key warning r=curquiza a=irevoire

When I copy-pasted my flatten crate I forgot to remove the key used to publish the package and that throw a warning.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 16:55:25 +00:00
bors[bot]
13205066f3
Merge #2311
2311: Change version for the next release (v0.27.0) r=irevoire a=curquiza

Fixes #2310 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-11 14:49:33 +00:00
Clémentine Urquizar
b3661bf8ec
Change version for the next release (v0.27.0) 2022-04-11 16:25:15 +02:00
ManyTheFish
0990e95830 Feat(Analytics): Add analytics for search format options 2022-04-11 14:53:15 +02:00
Tamo
e153418b8a
remove the unused key warning 2022-04-11 14:52:41 +02:00
bors[bot]
f67167fa9f
Merge #2178
2178: Refacto docker r=irevoire a=irevoire

closes #2166 and #2085

-----------

I noticed many people had issues with the default configuration of our Dockerfile.
Some examples:
- #2166: If you use ubuntu and mount your `data.ms` in a volume (as shown in the [doc](https://docs.meilisearch.com/learn/getting_started/installation.html#download-and-launch)), you can't run meilisearch
- #2085: Here, meilisearch was not able to erase the `data.ms` when loading a dump because it's the mount point
Currently, we don't show how to use the snapshot and dumps with docker in the documentation. And it's quite hard to do:
  - You either send a big command to meilisearch to change the dump-path, snapshot-path and db-path a single directory and then mount that one
  - Or you mount three volumes
- And there were other issues on the slack community

I think this PR solve the problem.
Now the image contains the `meilisearch` binary in the `/bin` directory, so it's easy to find and always in the `PATH`.
It creates a `data` directory and moves the working-dir in it.
So now you can find the `dumps`, `snapshots` and `data.ms` directory in `/data`.

Here is the new command to run meilisearch with a volume:
```
docker run -it --rm -v $PWD/meili_data:/data -p 7700:7700 getmeili/meilisearch:latest
```

And if you need to import a dump or a snapshot, you don't need to restart your container and mount another volume. You can directly hit the `POST /dumps` route and then run:
```
docker run -it --rm -v $PWD/meili_data:/data -p 7700:7700 getmeili/meilisearch:latest meilisearch --import-dump dumps/20220217-152115159.dump
```

-------

You can already try this PR with the following docker image:
```
getmeili/meilisearch:test-docker-v0.26.0
```

If you want to use the v0.25.2 I created another image;
```
getmeili/meilisearch:test-docker-v0.25.2
```

------

If you're using helm I created a branch [here](https://github.com/meilisearch/meilisearch-kubernetes/tree/test-docker-v0.26.0) that use the v0.26.0 image with the good volume 👍 
If you use this conf with the v0.25.2, it should also work.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 12:01:25 +00:00
bors[bot]
31584f34e8
Merge #2298
2298: Nested fields r=irevoire a=irevoire

There are a few things that I want to fix _AFTER_ merging this PR.
For the following RCs.

## Stop the useless conversion
In the `search.rs` I convert a `Document` to a `Value`, and then the `Value` to a `Document` and then back to a `Value` etc. I should stop doing all these conversion and stick to one format.
Probably by merging my `permissive-json-pointer` crate into meilisearch.
That would also give me the opportunity to work directly with obkvs and stops deserializing fields I don't need.

## Add more test specific to the nested
Everything seems to works but I should write tests to double check that the nested works well with the `formatted` field.

## See how I could stop iterating on hashmap and instead fill them correctly
This is related to milli. I really often needs to iterate over hashmap to see if a field is a subset of another field. I could probably generate a structure containing all the possible key values.
ie. the user say `doggo` is an attribute to retrieve. Instead of iterating on all the attributes to retrieve to check if `doggo.name` is a subset of `doggo`. I should insert `doggo.name` in the attributes to retrieve map.

Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 11:45:37 +00:00
bors[bot]
a70e0a6422
Merge #2304
2304: chore(bors): comments clippy out r=curquiza a=irevoire

There is currently an issue with clippy that stops us from merging PRs.
https://github.com/rust-lang/rust-clippy/issues/8662#issuecomment-1093899755

We can't use clippy in the CI while that's not merged


Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-11 11:20:18 +00:00
Tamo
348345f555
chore(bors): comments clippy out
There is currently an issue with clippy that stops us from merging PRs.
https://github.com/rust-lang/rust-clippy/issues/8662#issuecomment-1093899755

We can't use clippy in the CI while that's not merged
2022-04-11 13:19:00 +02:00
Tamo
683206e140
feat(docker): refactoring the dockerfile
- Move the meilisearch binary to `/bin/meilisearch` so it's always in scope.
- Create a `meili_data` directory used as the default working directory
2022-04-11 13:14:44 +02:00
bors[bot]
c8306616e0
Merge #490
490: Enforce labelling for the PRs r=curquiza a=curquiza

- Enforce one of the following labels to make the CI pass: `no breaking`, `DB breaking`, `API breaking` (milli API, not the Meilisearch API of course), or `skip changelog`. This new CI is now `Required` in the GitHub settings for merging a PR.
- Adapt the release drafter to these new labels
- rename `skip-changelog` into `skip changelog` according to the new label name

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-11 08:24:23 +00:00
Clémentine Urquizar
9383629d13
Enforce labelling for the PRs 2022-04-09 23:47:06 +02:00
ManyTheFish
a16de5de84 Symplify format and remove intermediate function 2022-04-08 11:20:41 +02:00
ManyTheFish
a769e09dfa Make token_crop_bounds more rust idiomatic 2022-04-07 20:15:14 +02:00
Tamo
69d312209e
feat(search): Implements the nested fields
See https://github.com/meilisearch/specifications/pull/121
2022-04-07 19:47:20 +02:00
bors[bot]
9ac2fd1c37
Merge #487
487: Update version (v0.26.0) r=Kerollmops a=curquiza

breaking because of #458 

Co-authored-by: Clémentine Urquizar <clementine@meilisearch.com>
2022-04-07 17:10:24 +00:00
bors[bot]
80ae020bee
Merge #458
458: Nested fields r=Kerollmops a=irevoire

For the following document:
```json
{
  "id": 1,
  "person": {
    "name": "tamo",
    "age": 25,
  }
}
```
Suppose the user sets `person` as a filterable attribute. We need to store `person` in the filterable _obviously_. But we also need to keep track of `person.name` and `person.age` somewhere.
That’s where I changed a little bit the logic of the engine.

Currently, we have a function called `faceted_field` that returns the union of the filterable and sortable.
I renamed this function in `user_defined_faceted_field`. And now, when we finish indexing documents, we look at all the fields and see if they « match » a `user_defined_faceted_field`.
So in our case:
- does `id` match `person`: 🔴 
- does `person.name` match `person`: 🟢 
- does `person.age` match `person`: 🟢 

And thus, we insert in the database the following faceted fields: `person, person.name, person.age`.

The good thing about that solution is that we generate everything during the indexing phase, and then during the search, we can access our field without recomputing too much globbing.

-----

Now the bad thing is that I had to create a new db.

And if that was only one db, that would be ok, but actually, I need to do the same for the:
- Displayed attributes
- Attributes to retrieve
- Attributes to highlight
- Attribute to crop

`@Kerollmops` 
Do you think there is a better way to do it?
Apart from all the code, can we have a problem because we have too many dbs?

Co-authored-by: Irevoire <tamo@meilisearch.com>
Co-authored-by: Tamo <tamo@meilisearch.com>
2022-04-07 16:26:09 +00:00
Tamo
bab898ce86
move the flatten-serde-json crate inside of milli 2022-04-07 18:20:44 +02:00
ManyTheFish
c8ed1675a7 Add some documentation 2022-04-07 17:32:13 +02:00
ManyTheFish
b1905dfa24 Make split_best_frequency returns references instead of owned data 2022-04-07 17:05:44 +02:00
Tamo
ab458d8840
fix tests after rebase 2022-04-07 17:00:00 +02:00
Irevoire
4f3ce6d9cd
nested fields 2022-04-07 16:58:46 +02:00
Clémentine Urquizar
ee1d627803
Update version (v0.26.0) 2022-04-07 15:56:10 +02:00
bors[bot]
013fe4cbc9
Merge #2297
2297: Feat(Search): Enhance formating search results r=ManyTheFish a=ManyTheFish

Add new settings and change crop_len behavior to count words instead of characters.

- [x] `highlightPreTag`
- [x] `highlightPostTag`
- [x] `cropMarker`
- [x] `cropLength` count word instead of chars
- [x] `cropLength` 0 is now considered as no `cropLength`
- [ ] ~smart crop finding the best matches interval~ (postponed)

Partially fixes  #2214. (no smart crop)


Co-authored-by: ManyTheFish <many@meilisearch.com>
2022-04-07 13:29:56 +00:00
ManyTheFish
dc2cc1ee89 Feat(Search): Enhance formating search results 2022-04-07 15:04:08 +02:00
bors[bot]
bb5f0e1485
Merge #2271
2271: Simplify Dockerfile r=ManyTheFish a=Thearas

# Pull Request

## What does this PR do?

1. Fixes #2234
2. Replace `$TARGETPLATFORM` with `apk --print-arch` to make Dockerfile available for `docker build` as well, not just `docker buildx` (inspired by [rust-lang/docker-rust](https://github.com/rust-lang/docker-rust/blob/master/1.59.0/alpine3.14/Dockerfile#L13))

PTAL `@curquiza` 

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Thearas <thearas850@gmail.com>
2022-04-07 11:50:21 +00:00
bors[bot]
d5e33637b7
Merge #2296
2296: disable typo for attributes r=curquiza a=MarinPostma

Introduce the disable typos on attribute feature as per https://github.com/meilisearch/specifications/pull/117.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 18:10:45 +00:00
bors[bot]
c321ac61b5
Merge #2259
2259: disable typos on words r=MarinPostma a=MarinPostma

Introduce the disable typo setting as per https://github.com/meilisearch/specifications/pull/117.

waiting for https://github.com/meilisearch/milli/pull/474.


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 17:40:08 +00:00
ad hoc
67dea08a0a
feat(http, lib): enable disable typos on attributes 2022-04-06 19:25:12 +02:00
ad hoc
e9f66b8766
feat(all): introduce disable typo on words 2022-04-06 19:16:36 +02:00
ad hoc
dd43ba6234
feat(all): introduce disable typos 2022-04-06 19:10:12 +02:00
ad hoc
27a88bcd47
feat(all): introduce minWordLengthForTypo
fix typo in settting

skip serializing not set typo settings
2022-04-06 19:03:24 +02:00
bors[bot]
065fe19452
Merge #2249
2249: feat(all): introduce disable typos r=irevoire a=MarinPostma

Introduce the disable typo setting, that allows disabling typos for an index.

waiting on https://github.com/meilisearch/milli/pull/469

Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 16:55:46 +00:00
ad hoc
981fba5b44
feat(all): introduce disable typos 2022-04-06 15:47:48 +02:00
bors[bot]
09734f0732
Merge #2294
2294: chore(lib): bump milli to 0.25.0 r=MarinPostma a=MarinPostma

bump milli


Co-authored-by: ad hoc <postma.marin@protonmail.com>
2022-04-06 13:19:20 +00:00