doc: Update the README

This commit is contained in:
Clément Renault 2018-12-11 16:17:22 +01:00
parent 2cbb943cbe
commit f97f7f93f3
GPG Key ID: 0151CDAB43460DAE


# MeiliDB

A _full-text search database_ using a key-value store internally.

It uses [RocksDB](https://github.com/facebook/rocksdb) like a classic database, to store documents and internal data. The power of the key-value store allows us to handle updates and queries with small memory and CPU overheads.

You can [read the deep dive](deep-dive.md) if you want more information on the engine; it describes the whole process of generating updates and handling queries.
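The core idea of a full-text index in a key-value store can be pictured with a toy inverted index. This is only an illustrative sketch, not MeiliDB's actual layout: a plain Python dict stands in for the key-value store, and all names here are made up for the example.

```python
# Toy sketch of a full-text index in a key-value store:
# each word maps to the sorted list of document ids containing it.
# A plain dict stands in for the real key-value store (RocksDB).

def index_documents(docs):
    """Build word -> sorted list of doc ids (an inverted index)."""
    store = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            store.setdefault(word, []).append(doc_id)
    for ids in store.values():
        ids.sort()
    return store

def query(store, word):
    """Look up the doc ids stored under a word, empty list if absent."""
    return store.get(word.lower(), [])

docs = {1: "new red shoes", 2: "red hat", 3: "blue shoes"}
store = index_documents(docs)
print(query(store, "red"))    # [1, 2]
print(query(store, "shoes"))  # [1, 3]
```

A real engine stores these postings as serialized values under word keys, which is what keeps the memory and CPU overheads small: a query is mostly key lookups.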
We will be proud if you send pull requests to help us grow this project; you can start with [issues tagged "good-first-issue"](https://github.com/Kerollmops/MeiliDB/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22).

At the moment this is a library only, meaning that binaries are not part of this repository, but since I'm still nice I have made some examples for you in the `examples/` folder that work with the data located in the `misc/` folder.

In the near future MeiliDB will be a binary like any database: updated and queried using some kind of protocol. That is the final goal, [see the milestones](https://github.com/Kerollmops/MeiliDB/milestones). MeiliDB will just be a bunch of network and protocol functions wrapping the library, which itself will be published to https://crates.io, following the same update cycle.
## Performances
_These numbers were measured with a version dated October 2018; we must update them._

We made some tests on remote machines and found that we can handle, with a dataset of nearly 280k products, on a server that costs $5/month with 1 vCPU and 1GB of RAM, on the same index and with a simple query:
- near 190 users with an average response time of 90ms
- 150 users with an average response time of 70ms
MeiliDB works with an index like most search engines.
So to test the library you can create one by indexing a simple csv file.
```bash
cargo run --release --example create-database -- test.mdb misc/kaggle.csv
```
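As a quick sanity check after indexing, the index is just a folder of ordinary files on disk, so standard shell tools can confirm it exists and show its size. This is a hypothetical follow-up, not one of the repository's examples; the `mkdir -p` line is only there so the sketch runs on its own even before indexing.

```shell
# Hedged sketch: inspect the test.mdb folder produced by create-database.
mkdir -p test.mdb        # stand-in so this sketch runs even before indexing
if [ -d test.mdb ]; then
    du -sh test.mdb      # total size of the index on disk
    ls test.mdb | head   # first few files written into the folder
fi
```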
Once the command finishes indexing, the database should have been saved under the `test.mdb` folder.

Now you can easily run the `query-database` example to check what is stored in it.
```bash
cargo run --release --example query-database -- test.mdb
```