meilisearch

mirror of https://github.com/meilisearch/meilisearch.git synced 2025-03-15 21:32:55 +08:00

meili-bors[bot] 81a38099ec

5336: Meilitool Hair Dryer r=dureuill a=Kerollmops

This pull request introduces a new subcommand to hair dry a specific part of specific indexes. It is useful when [the memory-mapped pages are not hot in the cache](https://arc.net/l/quote/ixhcdwcq) but need to be. Hair drying the pages of interest makes search requests that use the vector store much faster.

The previous technique was the "cat method," which consists of reading the whole LMDB data file and piping it into the null file descriptor. Doing so makes the whole LMDB data file hot in the cache. However, when the database is large, at least 30% of it consists of free, unused pages, and many other pages don't need to be hot, e.g., raw JSON documents or uninteresting parts of the inverted index.
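For illustration, the "cat method" can be sketched in Rust as a plain sequential read that discards every byte. This is a minimal sketch, not code from meilitool: the function name `warm_whole_file` and the 1 MiB buffer size are assumptions made for the example.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: reads `path` from start to end and throws the bytes
/// away, which is what `cat path > /dev/null` does. The reads pull the file
/// through the OS page cache as a side effect.
/// Returns the number of bytes read.
fn warm_whole_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    // Buffer size is an arbitrary choice for this sketch.
    let mut buf = vec![0u8; 1 << 20];
    let mut total = 0u64;
    loop {
        match file.read(&mut buf) {
            Ok(0) => break, // end of file
            Ok(n) => total += n as u64,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(total)
}
```

Because the loop touches every page of the file, it also loads the free pages and the raw documents mentioned above, which is the cost the new subcommand avoids.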

This new subcommand reads all the Arroy pages of a given index to make them hot, and only those. More coming...

The current algorithm is single-threaded and takes a lot of time. I am in the process of multithreading it. This is the time it takes to hair dry a 305GiB database with a single thread.

```
real    21m51.054s
user    0m3.155s
sys     0m19.393s
```

## To Do
- [ ] (optional) Do the reads in parallel.
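
As a rough idea of what reading in parallel could look like, the sketch below splits a file into contiguous byte ranges and reads each range on its own thread. It is not the meilitool implementation, which walks Arroy pages inside LMDB; it only shows the general technique on a plain file. The names `read_range` and `warm_file_parallel` and the `threads` parameter are hypothetical.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;
use std::thread;

/// Hypothetical helper: reads `len` bytes starting at `start` and discards them.
fn read_range(path: &Path, start: u64, len: u64) -> io::Result<u64> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(start))?;
    // `take` stops the reader after `len` bytes.
    let mut limited = file.take(len);
    let mut buf = vec![0u8; 1 << 20];
    let mut total = 0u64;
    loop {
        match limited.read(&mut buf) {
            Ok(0) => break,
            Ok(n) => total += n as u64,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(total)
}

/// Hypothetical helper: splits the file into `threads` contiguous ranges and
/// reads them concurrently. Returns the total number of bytes read.
fn warm_file_parallel(path: &Path, threads: u64) -> io::Result<u64> {
    let len = std::fs::metadata(path)?.len();
    let threads = threads.max(1);
    // Ceiling division so the ranges cover the whole file.
    let chunk = ((len + threads - 1) / threads).max(1);
    thread::scope(|s| -> io::Result<u64> {
        let handles: Vec<_> = (0..threads)
            .map(|i| {
                let start = i * chunk;
                s.spawn(move || {
                    if start < len {
                        read_range(path, start, chunk.min(len - start))
                    } else {
                        Ok(0) // nothing left for this thread
                    }
                })
            })
            .collect();
        let mut total = 0u64;
        for handle in handles {
            total += handle.join().expect("reader thread panicked")?;
        }
        Ok(total)
    })
}
```

Each thread opens its own file handle so that seeks do not interfere with one another. Whether this helps in practice depends on the storage device and how many concurrent reads it can serve.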

Co-authored-by: Kerollmops <clement@meilisearch.com>

2025-02-12 10:45:16 +00:00

benchmarks

fix the bad index version on opening

2025-01-23 16:51:24 +01:00

build-info

Upgrade compatible dependencies

2025-01-08 13:52:14 +01:00

dump

use serde_json::to_writer instead of serializing + writing

2025-02-11 11:14:49 +01:00

file-store

Upgrade incompatible dependencies