mirror of https://github.com/meilisearch/meilisearch.git synced 2025-01-18 08:48:32 +08:00

A lightning-fast search API that fits effortlessly into your apps, websites, and workflow

Clément Renault 9a48091b21 Merge pull request #353 from meilisearch/bump-version Bump meilisearch crates to v0.8.3		2019-11-29 14:13:37 +01:00
.github/workflows	Make sure the lock file is up to date	2019-11-27 12:06:14 +01:00
datasets/movies	Rename MeiliDB into MeiliSearch	2019-11-26 11:12:30 +01:00
meilisearch-core	Bump meilisearch crates to v0.8.3	2019-11-29 14:06:17 +01:00
meilisearch-http	Bump meilisearch crates to v0.8.3	2019-11-29 14:06:17 +01:00
meilisearch-schema	Bump meilisearch crates to v0.8.3	2019-11-29 14:06:17 +01:00
meilisearch-tokenizer	Bump meilisearch crates to v0.8.3	2019-11-29 14:06:17 +01:00
meilisearch-types	Bump meilisearch crates to v0.8.3	2019-11-29 14:06:17 +01:00
misc	Add a gif to show a demo using crates.io	2019-11-09 12:59:39 +01:00
.dockerignore	Remove Azure CI	2019-11-25 13:20:54 +01:00
.gitignore	Make the repository be a binary and version the Cargo.lock	2019-11-09 12:13:28 +01:00
Cargo.lock	Bump meilisearch crates to v0.8.3	2019-11-29 14:06:17 +01:00
Cargo.toml	Rename MeiliDB into MeiliSearch	2019-11-26 11:12:30 +01:00
deep-dive.md	Rename MeiliDB into MeiliSearch	2019-11-26 11:12:30 +01:00
Dockerfile	Remove many dependencies from the Dockerfile	2019-11-28 17:04:01 +01:00
download-latest.sh	Add script for binary installation	2019-11-28 18:34:12 +01:00
LICENSE	Update README license badge	2019-11-28 14:28:30 +01:00
README.md	Clarification of readme file	2019-11-28 16:28:25 +01:00
typos-ranking-rules.md	Rename MeiliDB into MeiliSearch	2019-11-26 11:12:30 +01:00

README.md

MeiliSearch

⚡ Ultra relevant and instant full-text search API 🔍

MeiliSearch is a powerful, fast, open-source search engine that is easy to use and deploy. Both search and indexing are fully customizable, and the engine handles features such as typo tolerance, filters, and synonyms.

What MeiliSearch has to offer

Search as-you-type experience (answers < 50ms)
Full-text search
Typo tolerant (understands typos and spelling mistakes)
Supports Kanji
Supports synonyms
Easy to install, deploy, and maintain
Whole documents returned
Highly customizable
RESTful API

For more details about those features, go to our documentation.

Meili helps the Rust community find crates on crates.meilisearch.com

In-depth features

Provides 6 default ranking criteria used to bucket sort documents
Accepts custom criteria and can apply them in any custom order
Supports ranged queries, useful for paginating results
Can deduplicate (distinct) and filter returned documents based on context-defined rules
Searches for concatenated and split query words to improve the search quality
Can store complete documents or only the fields specified in the user schema
The default tokenizer can index Latin- and kanji-based languages
Returns the matching text areas, useful to highlight matched words in results
Accepts query-time search configuration, such as the searchable attributes
Supports runtime incremental indexing
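
The bucket-sort ranking mentioned in the list above can be sketched as follows. This is an illustration only, not MeiliSearch's implementation: the `Candidate` struct, its fields, and the three criteria are hypothetical stand-ins, and the six default criteria are not reproduced here. Each criterion is applied in turn, and a later criterion only breaks ties left by the earlier ones, which gives the same final order as refining buckets step by step.

```rust
use std::cmp::Ordering;

/// Hypothetical per-document match statistics (illustrative names only).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    id: u32,
    typos: u32,         // fewer typos ranks first
    matched_words: u32, // more matched query words ranks first
    proximity: u32,     // smaller distance between query words ranks first
}

/// A criterion compares two candidates; `Ordering::Less` means "ranks first".
type Criterion = fn(&Candidate, &Candidate) -> Ordering;

fn by_typos(a: &Candidate, b: &Candidate) -> Ordering {
    a.typos.cmp(&b.typos)
}

fn by_matched_words(a: &Candidate, b: &Candidate) -> Ordering {
    // Reversed on purpose: a higher count is better.
    b.matched_words.cmp(&a.matched_words)
}

fn by_proximity(a: &Candidate, b: &Candidate) -> Ordering {
    a.proximity.cmp(&b.proximity)
}

/// Sorts candidates by the criteria in the given order and returns their ids.
/// The first criterion that does not report a tie decides the comparison.
fn rank(mut candidates: Vec<Candidate>, criteria: &[Criterion]) -> Vec<u32> {
    candidates.sort_by(|a, b| {
        criteria
            .iter()
            .map(|criterion| criterion(a, b))
            .find(|ordering| *ordering != Ordering::Equal)
            .unwrap_or(Ordering::Equal)
    });
    candidates.into_iter().map(|c| c.id).collect()
}

/// Small made-up data set used to try the two orders below.
fn sample() -> Vec<Candidate> {
    vec![
        Candidate { id: 1, typos: 1, matched_words: 2, proximity: 1 },
        Candidate { id: 2, typos: 0, matched_words: 1, proximity: 5 },
        Candidate { id: 3, typos: 0, matched_words: 2, proximity: 3 },
        Candidate { id: 4, typos: 0, matched_words: 2, proximity: 1 },
    ]
}

/// One possible criteria order: typos, then matched words, then proximity.
fn typos_first() -> Vec<Criterion> {
    vec![by_typos as Criterion, by_matched_words, by_proximity]
}

/// A custom order that puts proximity ahead of everything else.
fn proximity_first() -> Vec<Criterion> {
    vec![by_proximity as Criterion, by_typos, by_matched_words]
}
```

Calling `rank(sample(), &typos_first())` and `rank(sample(), &proximity_first())` on the same candidates yields different orders, which is the point of accepting criteria in any custom order.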

Quick Start

You can also deploy your own instant, relevant, and typo-tolerant MeiliSearch search engine. Something similar to the demo above can be achieved by following these three short steps. You still need to build your own front end to make it pretty, though.

Deploy the Server

If you have not yet installed Rust and its package manager cargo, go to the installation page.
You can deploy the server on your machine; it listens for HTTP requests on port 8080 by default.

cargo run --release

For more logs during the execution, run:

RUST_LOG=info cargo run --release

Create an Index and Upload Some Documents

MeiliSearch can serve multiple indexes, each holding a different kind of document. You must therefore create an index before sending documents to it.

curl -i -X POST 'http://127.0.0.1:8080/indexes' --data '{ "name": "Movies", "uid": "movies" }'

Now that the server knows about our brand new index, we can send it data. We provide a small dataset in the datasets/ directory.

curl -i -X POST 'http://127.0.0.1:8080/indexes/movies/documents' \
  --header 'content-type: application/json' \
  --data @datasets/movies/movies.json

Search for Documents

The search engine is now aware of our documents and can serve them through the same HTTP server. The jq command-line tool can make the server responses much easier to read.

curl 'http://127.0.0.1:8080/indexes/movies/search?q=botman'

{
  "hits": [
    {
      "id": "29751",
      "title": "Batman Unmasked: The Psychology of the Dark Knight",
      "poster": "https://image.tmdb.org/t/p/w1280/jjHu128XLARc2k4cJrblAvZe0HE.jpg",
      "overview": "Delve into the world of Batman and the vigilante justice tha",
      "release_date": "2008-07-15"
    },
    {
      "id": "471474",
      "title": "Batman: Gotham by Gaslight",
      "poster": "https://image.tmdb.org/t/p/w1280/7souLi5zqQCnpZVghaXv0Wowi0y.jpg",
      "overview": "ve Victorian Age Gotham City, Batman begins his war on crime",
      "release_date": "2018-01-12"
    }
  ],
  "offset": 0,
  "limit": 2,
  "processingTimeMs": 1,
  "query": "botman"
}
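
The query above is "botman", yet both hits contain "Batman". One common way to reason about this kind of typo tolerance is edit distance. The sketch below computes the Levenshtein distance between two words; it is an illustration under that assumption, not a description of how MeiliSearch matches words internally, and the `matches_with_typos` helper and its threshold are hypothetical.

```rust
/// Levenshtein distance: the minimum number of single-character insertions,
/// deletions, and substitutions needed to turn `a` into `b`.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // `prev[j]` holds the distance between the processed prefix of `a`
    // and the first `j` characters of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, ca) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        curr.push(i + 1); // distance from `a[..=i]` to the empty string
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Hypothetical helper: case-insensitive match allowing up to `max_typos` edits.
fn matches_with_typos(query: &str, word: &str, max_typos: usize) -> bool {
    levenshtein(&query.to_lowercase(), &word.to_lowercase()) <= max_typos
}
```

With these definitions, "botman" is one substitution away from "batman", so `matches_with_typos("botman", "Batman", 1)` holds while an exact match (`max_typos` of 0) does not.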

Performance

With a dataset of 100 353 documents, each with 352 attributes of which 3 are indexed (more than 300 000 indexed fields out of 35 million stored), MeiliSearch handles more than 2.8k req/sec with an average response time of 9 ms on an Intel i7-7700 (8) @ 4.2GHz.

Requests are made using wrk and scripted to simulate real users' queries.

Running 10s test @ http://localhost:2230
  2 threads and 25 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.52ms    7.61ms  99.25ms   84.58%
    Req/Sec     1.41k   119.11     1.78k    64.50%
  28080 requests in 10.01s, 7.42MB read
Requests/sec:   2806.46
Transfer/sec:    759.17KB
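
The figures quoted in this section can be checked with a few lines of arithmetic. The helper names below are made up for this sketch; the inputs are the numbers stated above. Dividing the request count by the rounded 10.01 s gives roughly 2805 req/sec, slightly below the 2806.46 that wrk prints, presumably because wrk divides by the unrounded test duration.

```rust
/// Indexed fields: one per indexed attribute per document.
fn indexed_fields(documents: u64, indexed_attributes: u64) -> u64 {
    documents * indexed_attributes
}

/// Stored fields: one per attribute per document.
fn stored_fields(documents: u64, attributes: u64) -> u64 {
    documents * attributes
}

/// Average throughput over a test run.
fn requests_per_second(requests: u64, seconds: f64) -> f64 {
    requests as f64 / seconds
}

/// Prints the three derived figures for the benchmark described above.
fn print_summary() {
    println!("indexed fields: {}", indexed_fields(100_353, 3));
    println!("stored fields:  {}", stored_fields(100_353, 352));
    println!("req/sec:        {:.2}", requests_per_second(28_080, 10.01));
}
```

The first two results confirm the "more than 300 000" and "35 million" claims: 301 059 indexed fields and 35 324 256 stored fields.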

We also indexed a dataset containing about 12 million city names in 24 minutes on a machine with 8 cores, 64 GB of RAM, and a 300 GB NVMe SSD.
The resulting database was 16 GB and search results were between 30 ms and 4 seconds for short prefix queries.

Notes

Since Rust 1.32, the default global allocator is the system allocator. We have seen much better performance when using jemalloc as the global allocator.
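
For readers unfamiliar with the mechanism, a Rust binary selects its allocator with the `#[global_allocator]` attribute. The sketch below uses only the standard library: it wraps the system allocator in a hypothetical `CountingAllocator` to show where the choice is made. Using jemalloc instead would mean placing an allocator type from an external crate (for example `Jemalloc` from the `jemallocator` crate) in that static; this is an assumption about setup, not a copy of MeiliSearch's code.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocations served so far (for demonstration only).
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical allocator that forwards to the system allocator and counts calls.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Delegate the actual work to the system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through `CountingAllocator`.
// Swapping allocators means changing the type of this one static.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Returns how many allocations have been made since program start.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

Because the attribute applies to the whole binary, the allocator is a build-time decision and needs no changes at call sites.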

How it works

MeiliSearch uses LMDB as its internal key-value store. The key-value store allows us to handle updates and queries with small memory and CPU overheads. The whole ranking system is data-oriented and provides excellent performance.

You can read the deep dive if you want more information on the engine; it describes the whole process of generating updates and handling queries. Also, you can take a look at the typos and ranking rules if you want to know the default rules used to sort the documents.

Contributing

We welcome issues and pull requests. You can help grow this project by checking the issues tagged "good-first-issue". They are a good place to start!

Analytic Events

We send events to our Amplitude instance to know how many people use MeiliSearch.
We send only the platform on which the server runs, once a day. No other information is sent.
If you do not want us to send events, you can disable these analytics by using the MEILI_NO_ANALYTICS environment variable.