meilisearch/CONTRIBUTING.md
2024-07-18 17:28:01 +02:00

Contributing

First, thank you for contributing to Meilisearch! The goal of this document is to provide everything you need to start contributing to Meilisearch.

Remember that there are many ways to contribute other than writing code: writing tutorials or blog posts, improving the documentation, submitting bug reports and feature requests...

Meilisearch can manage multiple indexes, handle the update store, and expose an HTTP API. Search and indexing are the domain of our core engine, milli, while tokenization is handled by our charabia library.

If Meilisearch does not offer optimized support for your language, please consider contributing to charabia by following the CONTRIBUTING.md file and integrating your intended normalizer/segmenter.

Assumptions

  1. You're familiar with GitHub and the Pull Requests (PR) workflow.
  2. You've read the Meilisearch documentation.
  3. You know about the Meilisearch community on Discord. Please use this for help.

How to Contribute

  1. Ensure your change has an issue! Find an existing issue or open a new issue.
    • The issue is where you can get a feel for whether the change will be accepted.
  2. Once approved, fork the Meilisearch repository in your own GitHub account.
  3. Create a new Git branch.
  4. Review the Development Workflow section that describes the steps to maintain the repository.
  5. Make your changes on your branch.
  6. Submit the branch as a Pull Request pointing to the main branch of the Meilisearch repository. A maintainer should comment on and/or review your Pull Request within a few days, although it may take longer depending on the circumstances.

Development Workflow

Set up and run Meilisearch

cargo run --release

We recommend using the --release flag to test the full performance of Meilisearch.

Test

cargo test

This command is run on each PR and must pass before the PR can be merged.

Faster build

You can set the LINDERA_CACHE environment variable to speed up your successive builds by up to 2 minutes. It'll store some built artifacts in the directory of your choice.

We recommend using the standard $HOME/.cache/lindera directory:

export LINDERA_CACHE=$HOME/.cache/lindera

Furthermore, you can improve incremental compilation by setting the MEILI_NO_VERGEN environment variable. Setting this variable will prevent the Meilisearch binary from being rebuilt each time the directory that hosts the Meilisearch repository changes. Do not enable this environment variable for production builds (as it will break the version route, among other things).
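
As an illustration, both build-related variables can be set together in a local development shell. This is a minimal sketch: the value `1` for MEILI_NO_VERGEN is an assumption, since this guide only says that the variable should be set.

```shell
# Development shell only: do not use these settings for production builds.

# Reuse the Lindera build artifacts across builds (keeps an existing value if set).
export LINDERA_CACHE="${LINDERA_CACHE:-$HOME/.cache/lindera}"

# Assumption: any non-empty value is enough; the guide does not specify one.
export MEILI_NO_VERGEN=1
```

Unset MEILI_NO_VERGEN (for example with `unset MEILI_NO_VERGEN`) before producing a build where the version route matters.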

Snapshot-based tests

We are using insta to perform snapshot-based testing. We recommend using the insta tooling (such as cargo-insta) to update the snapshots if they change following a PR.

New tests should use insta where possible rather than manual assert statements.

Furthermore, we provide some macros on top of insta, notably a way to use snapshot hashes instead of inline snapshots, saving a lot of space in the repository.

To effectively debug snapshot-based hashes, we recommend you export the MEILI_TEST_FULL_SNAPS environment variable so that snapshots are fully created locally:

export MEILI_TEST_FULL_SNAPS=true # add this to your .bashrc, .zshrc, ...

Test troubleshooting

If you get a "Too many open files" error, you might want to increase the open file limit using this command:

ulimit -Sn 3000
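
If you prefer not to lower a limit that is already higher, the command above can be wrapped in a check. This is a minimal sketch, assuming a POSIX-style shell whose `ulimit` builtin accepts `-S`, `-H` and `-n` (bash and dash do); the variable names are only illustrative.

```shell
# Raise the soft open-file limit for this shell only if it is below the target.
target=3000
current=$(ulimit -Sn)

if [ "$current" != "unlimited" ] && [ "$current" -lt "$target" ]; then
    # This fails when the hard limit (ulimit -Hn) is lower than the target.
    ulimit -Sn "$target" || echo "could not raise the limit; check 'ulimit -Hn'" >&2
fi

echo "open file limit: $(ulimit -Sn)"
```

The change only applies to the current shell and the processes started from it, so run `cargo test` from that same shell.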

Build tools

Meilisearch follows the cargo xtask workflow to provide some build tools.

Run cargo xtask --help from the root of the repository to find out what is available.

Logging

Meilisearch uses tracing for logging purposes. Tracing logs are structured and can be displayed as JSON to the end user, so prefer passing arguments as fields rather than interpolating them in the message.

Refer to the documentation for the syntax of the spans and events.

Logging spans are used for 3 distinct purposes:

  1. Regular logging
  2. Profiling
  3. Benchmarking

As a result, the spans should follow some rules:

  • They should not be put on functions that are called too often, because opening and closing a span has some overhead. For regular logging, avoid putting spans on functions that take less than a few hundred nanoseconds. For profiling or benchmarking, avoid putting spans on functions that take less than a few microseconds.
  • For profiling and benchmarking, use the TRACE level.
  • For profiling and benchmarking, use the following target prefixes:
    • indexing:: for spans meant for profiling the indexing operations.
    • search:: for spans meant for profiling the search operations.

Benchmarking

See BENCHMARKS.md

Git Guidelines

Git Branches

All changes must be made in a branch and submitted as a PR.

We do not enforce any branch naming style, but please use something descriptive of your changes.

Git Commits

As minimal requirements, your commit message should:

  • be capitalized
  • not end with a period or any other punctuation character (!,?)
  • start with a verb so that we can read your commit message this way: "This commit will ...", where "..." is the commit message. e.g.: "Fix the home page button" or "Add more tests for create_index method"
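
The first two rules above are mechanical and can be checked locally before pushing. The function below is a hypothetical helper, not part of the Meilisearch tooling. It only looks at the capital letter and at a trailing `.`, `?` or `!`; whether the message starts with a verb still needs a human reader.

```shell
# Hypothetical helper: checks a commit subject line against the
# capitalization and trailing-punctuation rules.
check_commit_subject() {
    subject=$1
    case $subject in
        [[:upper:]]*) ;;    # starts with a capital letter
        *) echo "not capitalized"; return 1 ;;
    esac
    case $subject in
        *[.?!]) echo "ends with punctuation"; return 1 ;;
    esac
    echo "ok"
}

check_commit_subject "Fix the home page button"   # prints "ok"
```

To check the most recent commit, the subject can be obtained with `git log -1 --format=%s` and passed to the function.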

We don't follow any other convention, but if you want to use one, we recommend the Chris Beams one.

GitHub Pull Requests

Some notes on GitHub PRs:

  • All PRs must be reviewed and approved by at least one maintainer.
  • The PR title should be accurate and descriptive of the changes.
  • Convert your PR to a draft if your changes are a work in progress: no one will review it until you mark it as ready for review.
    Draft PRs are recommended when you want to show that you are working on something and make your work visible.
  • The branch related to the PR must be up-to-date with main before merging. Fortunately, this project uses Bors to automatically enforce this requirement without the PR author having to rebase manually.

Release Process (for internal team only)

Meilisearch tools follow the Semantic Versioning Convention.

Automation to rebase and Merge the PRs

This project integrates a bot that helps us manage the merging of pull requests.
Read more about this.

How to Publish a new Release

The full Meilisearch release process is described in this guide. Please follow it carefully before doing any release.

How to publish a prototype

Depending on the feature being developed, you might need to provide a prototype version of Meilisearch so that users can test it more easily.

This happens in two steps:

Release assets

For each release, the following assets are created:

  • Binaries for different platforms (Linux, macOS, Windows and ARM architectures) are attached to the GitHub release
  • Binaries are pushed to Homebrew and APT (not published for RC)
  • Docker tags are created/updated:
    • vX.Y.Z
    • vX.Y (not published for RC)
    • latest (not published for RC)
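
The Docker tag rules above can be summarized as a small function. This is a hypothetical sketch, not the actual release automation, and it assumes that release candidates carry a `-rc` suffix in their version (for example `v1.9.0-rc.1`).

```shell
# Hypothetical helper: lists the Docker tags described above for a version.
# Assumption: release candidates are named like v1.9.0-rc.1 (contain "-rc").
docker_tags_for() {
    version=$1                  # e.g. v1.9.0 or v1.9.0-rc.1
    echo "$version"             # vX.Y.Z is always published
    case $version in
        *-rc*) return 0 ;;      # RC: neither vX.Y nor latest is published
    esac
    echo "${version%.*}"        # vX.Y (strips the last ".Z" component)
    echo "latest"
}

docker_tags_for v1.9.0
```

Called with `v1.9.0`, the function lists `v1.9.0`, `v1.9` and `latest`; called with `v1.9.0-rc.1`, it lists only `v1.9.0-rc.1`.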

Thank you again for reading this through. We cannot wait to begin working with you if you made your way through this contributing guide ❤️