meilisearch/milli/src/search
Samyak S Sarnayak 30247d70cd
Fix search highlight for non-unicode chars
The `matching_bytes` function takes a `&Token` now and:
- gets the number of bytes to highlight (unchanged).
- uses `Token.num_graphemes_from_bytes` to get the number of grapheme
  clusters to highlight.

In essence, the `matching_bytes` function returns the number of matching
grapheme clusters instead of bytes. Should this function be renamed
then?
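
To illustrate the bytes-versus-clusters distinction described above, here is a minimal sketch using only the standard library. The function name `num_chars_from_bytes` is hypothetical and is not the actual `Token.num_graphemes_from_bytes`. It counts `char`s (Unicode scalar values) as a simplification, whereas the commit counts grapheme clusters, which can span several `char`s.

```rust
/// Hypothetical helper (not part of milli): returns how many characters of
/// `word` begin within its first `num_bytes` bytes.
///
/// Simplification: this counts `char`s, not grapheme clusters. The two differ
/// for sequences such as a base letter followed by a combining accent.
///
/// Example: "é" takes 2 bytes in UTF-8, so the first 3 bytes of "élan"
/// cover the 2 characters "él".
fn num_chars_from_bytes(word: &str, num_bytes: usize) -> usize {
    word.char_indices()
        // Keep every character whose first byte lies inside the matched range.
        .take_while(|(start, _)| *start < num_bytes)
        .count()
}
```

For pure ASCII input the two counts are equal, which is presumably why the mismatch only showed up with non-ASCII text.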

Added proper highlighting in the HTTP UI:
- requires dependency on `unicode-segmentation` to extract grapheme
  clusters from tokens
- `<mark>` tag is put around only the matched part
    - before this change, the entire word was highlighted even if only a
      part of it matched
2022-01-17 11:37:44 +05:30
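
The partial-word highlighting described in the commit message can be sketched roughly as follows. This is not the code from the HTTP UI: `highlight_prefix` is a hypothetical name, the sketch assumes the match always starts at the beginning of the word, and it again counts `char`s instead of the grapheme clusters that `unicode-segmentation` would provide. It also performs no HTML escaping.

```rust
/// Hypothetical helper: wraps the first `matched_chars` characters of `word`
/// in a `<mark>` tag and leaves the rest of the word untouched.
fn highlight_prefix(word: &str, matched_chars: usize) -> String {
    // Byte offset at which the matched part ends. If the word has fewer
    // characters than `matched_chars`, the whole word is treated as matched.
    let split = word
        .char_indices()
        .nth(matched_chars)
        .map_or(word.len(), |(idx, _)| idx);

    // `split` always falls on a character boundary, so this cannot panic.
    let (matched, rest) = word.split_at(split);
    if matched.is_empty() {
        // Nothing matched: return the word without any tag.
        return word.to_string();
    }
    format!("<mark>{}</mark>{}", matched, rest)
}
```

With this sketch, `highlight_prefix("élan", 2)` marks only `él`. Splitting on a raw byte count instead could land in the middle of the two-byte `é`.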
criteria store the geopoint in three dimensions 2021-12-14 12:21:24 +01:00
distinct remove update_id in UpdateBuilder 2021-11-16 13:07:04 +01:00
facet store the geopoint in three dimensions 2021-12-14 12:21:24 +01:00
matching_words.rs Fix search highlight for non-unicode chars 2022-01-17 11:37:44 +05:30
mod.rs merge with main 2021-11-06 16:34:30 +01:00
query_tree.rs Count the number of char instead of counting bytes to assign the typo tolerance 2021-09-28 12:10:43 +02:00