217: Improve the benchmarks readme r=Kerollmops a=irevoire

- Move the Datasets part to the end of the readme so that people who just want to run the benchmarks are not tempted to download the datasets by hand (they are downloaded anyway by the `build.rs` script)
- Fix the links in the Wiki part of the Datasets section


Co-authored-by: Irevoire <tamo@meilisearch.com>
bors[bot] 2021-06-08 08:44:16 +00:00 committed by GitHub
commit fd032165d7


@@ -7,36 +7,6 @@ Benchmarks
- [Comparison between benchmarks](#comparison-between-benchmarks)
- [Datasets](#datasets)
## Datasets
The benchmarks are available for the following datasets:
- `songs`
- `wiki`
### Songs
`songs` is a subset of the [`songs.csv` dataset](https://meili-datasets.s3.fr-par.scw.cloud/songs.csv.gz).
It was generated with this command:
```bash
xsv sample --seed 42 1000000 songs.csv -o smol-songs.csv
```
_[Download the generated `songs` dataset](https://meili-datasets.s3.fr-par.scw.cloud/benchmarks/smol-songs.csv.gz)._
### Wiki
`wiki` is a subset of the [`wikipedia-articles.csv` dataset](https://meili-datasets.s3.fr-par.scw.cloud/wikipedia-articles.csv.gz).
It was generated with the following command:
```bash
xsv sample --seed 42 500000 wikipedia-articles.csv -o smol-wikipedia-articles.csv
```
_[Download the generated `wiki` dataset](https://meili-datasets.s3.fr-par.scw.cloud/benchmarks/smol-wikipedia-articles.csv.gz)._
## Run the benchmarks
### On our private server
@@ -108,3 +78,34 @@ Run the comparison script:
```bash
./benchmarks/scripts/compare.sh songs_main_09a4321.json songs_geosearch_24ec456.json
```
## Datasets
The benchmarks are available for the following datasets:
- `songs`
- `wiki`
### Songs
`songs` is a subset of the [`songs.csv` dataset](https://meili-datasets.s3.fr-par.scw.cloud/songs.csv.gz).
It was generated with this command:
```bash
xsv sample --seed 42 1000000 songs.csv -o smol-songs.csv
```
_[Download the generated `songs` dataset](https://meili-datasets.s3.fr-par.scw.cloud/benchmarks/smol-songs.csv.gz)._
### Wiki
`wiki` is a subset of the [`wikipedia-articles.csv` dataset](https://meili-datasets.s3.fr-par.scw.cloud/wiki-articles.csv.gz).
It was generated with the following command:
```bash
xsv sample --seed 42 500000 wiki-articles.csv -o smol-wiki-articles.csv
```
_[Download the generated `wiki` dataset](https://meili-datasets.s3.fr-par.scw.cloud/benchmarks/smol-wiki-articles.csv.gz)._