Commit Graph

9840 Commits

Author SHA1 Message Date
Clément Renault
1bb9348a90
Remove the chinese-words.txt previous tokenizer related file 2021-01-13 11:01:57 +01:00
Clément Renault
9141f5ef94
Remove the custom build.rs file 2021-01-13 11:01:38 +01:00
mpostma
e5c220b82c
fix authentication cors bug 2021-01-12 18:08:16 +01:00
mpostma
60c636738b
fix cors error 2021-01-12 16:46:53 +01:00
many
06b2a587af
normalize synonyms during indexation 2021-01-12 13:53:32 +01:00
Clément Renault
51d1785576
Merge pull request #63 from meilisearch/meilisearch-tokenizer
Meilisearch tokenizer
2021-01-12 13:26:24 +01:00
mpostma
4f7f7538f7
highlight with new tokenizer 2021-01-11 21:59:37 +01:00
marin
26b1e5a51b
Merge pull request #1171 from meilisearch/fix-changelog-typo
fix changelog typo
2021-01-11 14:13:30 +01:00
mpostma
81f343a46a
add word limit to search queries 2021-01-08 16:23:23 +01:00
KARASZI István
956adfc90a
Replace in-place compression
Compress gzip files to a temporary file first and then do an atomic rename.
2021-01-07 17:36:42 +01:00
mpostma
1ae761311e
integrate with meilisearch tokenizer 2021-01-07 16:14:27 +01:00
Clément Renault
7e1c94ab9c
Merge pull request #65 from meilisearch/improve-facet-value-display
Improve the facet value displaying
2021-01-07 16:12:32 +01:00
Clément Renault
0a1beb688c
Improve the facet value displaying, extracting the facet level 2021-01-07 16:05:09 +01:00
mpostma
c7c8ca63b6
fix changelog typo 2021-01-07 12:38:24 +01:00
bors[bot]
fa40c6e3d4
Merge #1168
1168: Bump meilisearch r=LegendreM a=MarinPostma

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-01-06 11:02:16 +00:00
mpostma
7ccbbb7a75
update changelog 2021-01-06 11:54:06 +01:00
mpostma
948c89c26f
bump meilisearch 2021-01-06 11:41:44 +01:00
bors[bot]
768791440a
Merge #1167
1167: Update dumps ci r=LegendreM a=MarinPostma

Now that the dump tests are re-entrant, they can be run from a multithreaded context, whereas they used to be run from a single-threaded context, in a separate CI task.

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-01-06 09:42:59 +00:00
bors[bot]
08a8dc0d0d
Merge #1091
1091: New tokenizer r=LegendreM a=MarinPostma

Integration of the new tokenizer to meilisearch.

- Tokenizes and normalizes the query string for better search results
- Language-sensitive tokenization and normalization during indexation
- Better support for Chinese thanks to jieba (when Chinese characters are detected)

To do in a later PR:
- Use a common tokenization instance
- Use tokenization for synonyms

close #624

Co-authored-by: mpostma <postma.marin@protonmail.com>
Co-authored-by: many <maxime@meilisearch.com>
2021-01-06 08:47:53 +00:00
mpostma
0675ecdd73
remove specific task for dump in ci 2021-01-05 21:55:14 +01:00
mpostma
08c160c178
un-ignore dump tests 2021-01-05 21:54:14 +01:00
many
677627586c
fix test set
fix dump tests
2021-01-05 21:37:05 +01:00
mpostma
0731971300
fix style 2021-01-05 15:21:06 +01:00
mpostma
c290719984
remove byte offset in index_seq 2021-01-05 15:21:06 +01:00
mpostma
2a145e288c
fix style 2021-01-05 15:21:06 +01:00
many
aeb676e757
skip indexation while token is not a word 2021-01-05 15:21:06 +01:00
many
2852349e68
update tokenizer version 2021-01-05 15:21:06 +01:00
many
0447594e02
add search test on chinese scripts 2021-01-05 15:21:05 +01:00
many
748a8240dd
fix highlight shifting bug 2021-01-05 15:21:05 +01:00
mpostma
808be4678a
fix style 2021-01-05 15:21:05 +01:00
mpostma
398577f116
bump tokenizer 2021-01-05 15:21:05 +01:00
mpostma
8e64a24d19
fix suggestions 2021-01-05 15:21:05 +01:00
mpostma
8b149c9aa3
update tokenizer dep to release 2021-01-05 15:21:05 +01:00
mpostma
a7c88c7951
restore synonyms tests 2021-01-05 15:21:05 +01:00
mpostma
db64e19b8d
all tests pass 2021-01-05 15:21:05 +01:00
mpostma
b574960755
fix split_query_string 2021-01-05 15:21:05 +01:00
mpostma
c6434f609c
fix indexing length 2021-01-05 15:21:05 +01:00
mpostma
206308c1aa
replace hashset with fst::Set 2021-01-05 15:21:05 +01:00
mpostma
6527d3e492
better separator handling 2021-01-05 15:21:05 +01:00
mpostma
e616b1e356
hard separator offset 2021-01-05 15:21:05 +01:00
mpostma
8843062604
fix indexer tests 2021-01-05 15:21:05 +01:00
mpostma
5e00842087
integration with new tokenizer wip 2021-01-05 15:21:05 +01:00
mpostma
8a4d05b7bb
remove meilisearch tokenizer 2021-01-05 15:21:05 +01:00
bors[bot]
061832af7f
Merge #1163
1163: remove benches r=LegendreM a=MarinPostma

Remove unused benches, which did not compile either.

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-01-05 13:27:42 +00:00
bors[bot]
9dd818ed7b
Merge #1165
1165: Bumps r=MarinPostma a=MarinPostma

Co-authored-by: mpostma <postma.marin@protonmail.com>
2021-01-05 12:55:50 +00:00
mpostma
0e04c90abe
remove benches 2021-01-05 10:54:19 +01:00
mpostma
b07e21ab3c
temp 2021-01-05 00:21:42 +01:00
mpostma
83ea088bf7
fix incompatible deps 2021-01-04 18:33:22 +01:00
mpostma
48eb78b14d
bump deps 2021-01-04 16:56:28 +01:00
bors[bot]
e3d1314bd8
Merge #1147
1147: Increasing payload default size r=LegendreM a=sanders41

References issue #1137

Increasing the default payload size from 10 MB to 100 MB.

Co-authored-by: Paul Sanders <psanders1@gmail.com>
2021-01-04 12:47:06 +00:00