Mirror of https://github.com/meilisearch/meilisearch.git (synced 2024-11-27 04:25:06 +08:00)
Merge #400

400: Rewrite the filter parser and add a lot of tests r=irevoire a=irevoire

This PR is a complete rewrite of #358, which was reverted in #403. You can already try this PR in Meilisearch here: https://github.com/meilisearch/MeiliSearch/pull/1880.

Since writing a parser is quite complicated, I moved all the logic to another workspace called `filter_parser`. In this workspace, we don't know anything about milli, the filterable fields / field IDs, or anything else. As you can see in its `Cargo.toml`, it has only two dependencies, entirely focused on the parsing part:
```
nom = "7.0.0"
nom_locate = "4.0.0"
```
But introducing this new workspace made some changes necessary on the "AST". Now the parser only returns `Token`s (a simple `&str` with a bit of context). Everything is interpreted later, when we execute the filter in milli. This crate also provides a new error type for all filter-related errors.

---------

## Errors

Currently, we have multiple kinds of errors. Sometimes we generate errors looking like this (for `name = truc`):
```
Attribute `name` is not filterable. Available filterable attributes are: ``.
```
While sometimes pest was generating errors looking like this:
```
Invalid syntax for the filter parameter: ` --> 1:7
  |
1 | name =
  |       ^---
  |
  = expected word`.
```
Which most people were seeing like this (for `name =`):
```
Invalid syntax for the filter parameter: ` --> 1:7\n |\n1 | name =\n | ^---\n |\n = expected word`.
```

-----------

With this PR, the error format is unified between all errors. All errors follow this more straightforward format:
```
The error message.
[from char]:[to char] filter
```
This should be way easier for a human to read when embedded in the JSON, and it should also allow us to parse the errors easily and provide highlighting or something with a frontend playground.

Here is an example of the two previous errors with the new format. For `name = truc`:
```
Attribute `name` is not filterable. Available filterable attributes are: ``.
1:4 name = truc
```
Or in one line:
```
Attribute `name` is not filterable. Available filterable attributes are: ``.\n1:4 name = truc
```
And for `name =`:
```
Was expecting a value but instead got nothing.
7:7 name =
```
Or in one line:
```
Was expecting a value but instead got nothing.\n7:7 name =
```

Also, since we now have control over the parser, we can generate more explicit error messages, so a lot of new errors have been created. I tried to be as helpful as possible for the user; here is a little overview of the new error messages you can get when misusing a filter:
```
Expression `"truc` is missing the following closing delimiter: `"`.
8:13 name = "truc
```
```
The `_geoRadius` filter is an operation and can't be used as a value.
8:30 name = _geoRadius(12, 13, 14)
```
etc.

## Tests

A lot of tests have been written in the `filter_parser` crate. I think there is a unit test for every part of the syntax. But since we can never be sure we covered all the cases, I also fuzzed the new parser A LOT (for ±8 hours on 20 threads). The code to fuzz the parser is included in the workspace, so if one day we need to change something in the syntax, we'll be able to re-use it by simply running:
```
cargo fuzz run --release parse
```

## Milli

I renamed the type and module `filter_condition.rs` / `FilterCondition` to `filter.rs` / `Filter`.

Co-authored-by: Tamo <tamo@meilisearch.com>
This commit is contained in: commit 8dff08d772
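To make the description above concrete, here is a minimal sketch (assuming the `filter-parser` crate added in this diff) of the new entry point: parsing yields an AST of `Token`s over the input string, nothing is interpreted yet, and errors print in the unified `message` + `from:to filter` format.

```rust
use filter_parser::FilterCondition;

fn main() {
    // A valid filter parses into an AST of Tokens (plain spans over the input);
    // interpretation against the index happens later, in milli.
    let ast = FilterCondition::parse("channel = Ponce AND subscribers > 1000").unwrap();
    println!("{:#?}", ast);

    // An invalid filter returns an error whose Display impl prints the
    // diagnostic on one line and the `from:to filter` locator on the next.
    let err = FilterCondition::parse("channel = ").unwrap_err();
    println!("{}", err);
}
```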
@@ -1,5 +1,5 @@
 [workspace]
-members = ["milli", "http-ui", "benchmarks", "infos", "helpers", "cli"]
+members = ["milli", "filter-parser", "http-ui", "benchmarks", "infos", "helpers", "cli"]
 default-members = ["milli"]

 [profile.dev]
@@ -9,7 +9,7 @@ use criterion::BenchmarkId;
 use heed::EnvOpenOptions;
 use milli::documents::DocumentBatchReader;
 use milli::update::{IndexDocumentsMethod, Settings, UpdateBuilder};
-use milli::{FilterCondition, Index};
+use milli::{Filter, Index};
 use serde_json::{Map, Value};

 pub struct Conf<'a> {
@@ -117,7 +117,7 @@ pub fn run_benches(c: &mut criterion::Criterion, confs: &[Conf]) {
     let mut search = index.search(&rtxn);
     search.query(query).optional_words(conf.optional_words);
     if let Some(filter) = conf.filter {
-        let filter = FilterCondition::from_str(&rtxn, &index, filter).unwrap();
+        let filter = Filter::from_str(filter).unwrap();
         search.filter(filter);
     }
     if let Some(sort) = &conf.sort {
@@ -250,7 +250,7 @@ impl Search {
         }

         if let Some(ref filter) = self.filter {
-            let condition = milli::FilterCondition::from_str(&txn, &index, filter)?;
+            let condition = milli::Filter::from_str(filter)?;
             search.filter(condition);
         }

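For context on these two hunks, a minimal sketch of the new milli-side call pattern (the index and transaction setup are assumed to exist in the caller, and `heed` is assumed to be available as in the benchmarks above): the filter is now parsed without `&rtxn` / `&index`, and the filterable fields are only checked when the filter is evaluated.

```rust
use milli::{Filter, Index};

// Hypothetical helper: `index` and `rtxn` are created elsewhere by the caller.
fn apply_filter(index: &Index, rtxn: &heed::RoTxn) {
    // Parsing no longer needs the transaction or the index; it only produces Tokens.
    let filter = Filter::from_str("subscribers > 1000").unwrap();
    // Filterable-field checks happen when the search executes the filter.
    let mut search = index.search(rtxn);
    search.filter(filter);
}
```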
filter-parser/Cargo.toml (new file, 10 lines)

[package]
name = "filter-parser"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
nom = "7.0.0"
nom_locate = "4.0.0"
filter-parser/README.md (new file, 36 lines)

# Filter parser

This workspace is dedicated to the parsing of the MeiliSearch filters.

Most of the code and explanations are in [`lib.rs`](./src/lib.rs), especially the BNF of the filters at the top of that file.

The parser uses [nom](https://docs.rs/nom/) to do most of its work and [nom-locate](https://docs.rs/nom_locate/) to keep track of what we were doing when we encountered an error.

## CLI

A simple main is provided to quickly test whether a filter can be parsed, without bringing in milli.
It takes one argument and tries to parse it.
```
cargo run -- 'field = value' # success
cargo run -- 'field = "doggo' # error => missing closing delimiter "
```

## Fuzz

The workspace has been fuzzed with [cargo-fuzz](https://rust-fuzz.github.io/book/cargo-fuzz.html).

### Setup

You'll need rust-nightly to execute the fuzzer.

```
cargo install cargo-fuzz
```

### Run

When the filter parser is executed by the fuzzer it triggers a stack overflow very quickly. We can avoid this problem by limiting the `max_len` of [libfuzzer](https://llvm.org/docs/LibFuzzer.html) to 500 characters.

```
cargo fuzz run parse -- -max_len=500
```

## What to do if you find a bug in the parser

- Write a test at the end of [`lib.rs`](./src/lib.rs) to ensure it never happens again.
- Add a file in [the corpus directory](./fuzz/corpus/parse/) with your filter to help the fuzzer find new bugs. Since this directory is going to be heavily polluted by the execution of the fuzzer, it's in the gitignore and you'll need to force-push your new test.
filter-parser/fuzz/Cargo.toml (new file, 25 lines)

[package]
name = "filter-parser-fuzz"
version = "0.0.0"
authors = ["Automatically generated"]
publish = false
edition = "2018"

[package.metadata]
cargo-fuzz = true

[dependencies]
libfuzzer-sys = "0.4"

[dependencies.filter-parser]
path = ".."

# Prevent this from interfering with workspaces
[workspace]
members = ["."]

[[bin]]
name = "parse"
path = "fuzz_targets/parse.rs"
test = false
doc = false
filter-parser/fuzz/corpus/parse/ (43 new seed files, one filter per file, listed in the order they appear in the diff):

- test_1: `channel = Ponce`
- test_10: `channel != ponce`
- test_11: `NOT channel = ponce`
- test_12: `subscribers < 1000`
- test_13: `subscribers > 1000`
- test_14: `subscribers <= 1000`
- test_15: `subscribers >= 1000`
- test_16: `NOT subscribers < 1000`
- test_17: `NOT subscribers > 1000`
- test_18: `NOT subscribers <= 1000`
- test_19: `NOT subscribers >= 1000`
- test_2: `subscribers = 12`
- test_20: `subscribers 100 TO 1000`
- test_21: `NOT subscribers 100 TO 1000`
- test_22: `_geoRadius(12, 13, 14)`
- test_23: `NOT _geoRadius(12, 13, 14)`
- test_24: `channel = ponce AND 'dog race' != 'bernese mountain'`
- test_25: `channel = ponce OR 'dog race' != 'bernese mountain'`
- test_26: `channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000`
- test_27: `channel = ponce AND ( 'dog race' != 'bernese mountain' OR subscribers > 1000 )`
- test_28: `(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(12, 13, 14)`
- test_29: `channel = Ponce = 12`
- test_3: `channel = 'Mister Mv'`
- test_30: `channel =`
- test_31: `channel = 🐻`
- test_32: `OR`
- test_33: `AND`
- test_34: `channel Ponce`
- test_35: `channel = Ponce OR`
- test_36: `_geoRadius`
- test_37: `_geoRadius = 12`
- test_38: `_geoPoint(12, 13, 14)`
- test_39: `position <= _geoPoint(12, 13, 14)`
- test_4: `channel = "Mister Mv"`
- test_40: `position <= _geoRadius(12, 13, 14)`
- test_41: `channel = 'ponce`
- test_42: `channel = "ponce`
- test_43: `channel = mv OR (followers >= 1000`
- test_5: `'dog race' = Borzoi`
- test_6: `"dog race" = Chusky`
- test_7: `"dog race" = "Bernese Mountain"`
- test_8: `'dog race' = 'Bernese Mountain'`
- test_9: `"dog race" = 'Bernese Mountain'`
filter-parser/fuzz/fuzz_targets/parse.rs (new file, 18 lines)

#![no_main]
use filter_parser::{ErrorKind, FilterCondition};
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    if let Ok(s) = std::str::from_utf8(data) {
        // When we are fuzzing the parser we can get a stack overflow very easily.
        // But since this doesn't happen with a normal build, we simply limit the fuzzer to 500 characters.
        if s.len() < 500 {
            match FilterCondition::parse(s) {
                Err(e) if matches!(e.kind(), ErrorKind::InternalError(_)) => {
                    panic!("Found an internal error: `{:?}`", e)
                }
                _ => (),
            }
        }
    }
});
filter-parser/src/condition.rs (new file, 73 lines)

//! BNF grammar:
//!
//! ```text
//! condition      = value ("==" | ">" ...) value
//! to             = value value TO value
//! ```

use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::combinator::cut;
use nom::sequence::tuple;
use Condition::*;

use crate::{parse_value, FilterCondition, IResult, Span, Token};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Condition<'a> {
    GreaterThan(Token<'a>),
    GreaterThanOrEqual(Token<'a>),
    Equal(Token<'a>),
    NotEqual(Token<'a>),
    LowerThan(Token<'a>),
    LowerThanOrEqual(Token<'a>),
    Between { from: Token<'a>, to: Token<'a> },
}

impl<'a> Condition<'a> {
    /// This method can return two operations in case it must express
    /// an OR operation for the between case (i.e. `TO`).
    pub fn negate(self) -> (Self, Option<Self>) {
        match self {
            GreaterThan(n) => (LowerThanOrEqual(n), None),
            GreaterThanOrEqual(n) => (LowerThan(n), None),
            Equal(s) => (NotEqual(s), None),
            NotEqual(s) => (Equal(s), None),
            LowerThan(n) => (GreaterThanOrEqual(n), None),
            LowerThanOrEqual(n) => (GreaterThan(n), None),
            Between { from, to } => (LowerThan(from), Some(GreaterThan(to))),
        }
    }
}

/// condition      = value ("==" | ">" ...) value
pub fn parse_condition(input: Span) -> IResult<FilterCondition> {
    let operator = alt((tag("<="), tag(">="), tag("!="), tag("<"), tag(">"), tag("=")));
    let (input, (fid, op, value)) = tuple((parse_value, operator, cut(parse_value)))(input)?;

    let condition = match *op.fragment() {
        "<=" => FilterCondition::Condition { fid, op: LowerThanOrEqual(value) },
        ">=" => FilterCondition::Condition { fid, op: GreaterThanOrEqual(value) },
        "!=" => FilterCondition::Condition { fid, op: NotEqual(value) },
        "<" => FilterCondition::Condition { fid, op: LowerThan(value) },
        ">" => FilterCondition::Condition { fid, op: GreaterThan(value) },
        "=" => FilterCondition::Condition { fid, op: Equal(value) },
        _ => unreachable!(),
    };

    Ok((input, condition))
}

/// to             = value value TO value
pub fn parse_to(input: Span) -> IResult<FilterCondition> {
    let (input, (key, from, _, to)) =
        tuple((parse_value, parse_value, tag("TO"), cut(parse_value)))(input)?;

    Ok((
        input,
        FilterCondition::Condition {
            fid: key.into(),
            op: Between { from: from.into(), to: to.into() },
        },
    ))
}
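As a small illustration of the `negate` comment above, a sketch reusing the crate's public types (the behaviour matches the `NOT ... TO ...` test case in `lib.rs` further down): negating a `Between` expands into two conditions joined by an OR.

```rust
use filter_parser::{Condition, FilterCondition};

fn main() {
    // `NOT subscribers 100 TO 1000` parses as a Between condition which, once
    // negated, expands into `subscribers < 100 OR subscribers > 1000`.
    let ast = FilterCondition::parse("NOT subscribers 100 TO 1000").unwrap();
    match ast {
        FilterCondition::Or(left, right) => {
            assert!(matches!(*left, FilterCondition::Condition { op: Condition::LowerThan(_), .. }));
            assert!(matches!(*right, FilterCondition::Condition { op: Condition::GreaterThan(_), .. }));
        }
        other => panic!("unexpected AST: {:?}", other),
    }
}
```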
158
filter-parser/src/error.rs
Normal file
158
filter-parser/src/error.rs
Normal file
@ -0,0 +1,158 @@
|
||||
use std::fmt::Display;
|
||||
|
||||
use nom::error::{self, ParseError};
|
||||
use nom::Parser;
|
||||
|
||||
use crate::{IResult, Span};
|
||||
|
||||
pub trait NomErrorExt<E> {
|
||||
fn is_failure(&self) -> bool;
|
||||
fn map_err<O: FnOnce(E) -> E>(self, op: O) -> nom::Err<E>;
|
||||
fn map_fail<O: FnOnce(E) -> E>(self, op: O) -> nom::Err<E>;
|
||||
}
|
||||
|
||||
impl<E> NomErrorExt<E> for nom::Err<E> {
|
||||
fn is_failure(&self) -> bool {
|
||||
matches!(self, Self::Failure(_))
|
||||
}
|
||||
|
||||
fn map_err<O: FnOnce(E) -> E>(self, op: O) -> nom::Err<E> {
|
||||
match self {
|
||||
e @ Self::Failure(_) => e,
|
||||
e => e.map(|e| op(e)),
|
||||
}
|
||||
}
|
||||
|
||||
fn map_fail<O: FnOnce(E) -> E>(self, op: O) -> nom::Err<E> {
|
||||
match self {
|
||||
e @ Self::Error(_) => e,
|
||||
e => e.map(|e| op(e)),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// cut a parser and map the error
|
||||
pub fn cut_with_err<'a, O>(
|
||||
mut parser: impl FnMut(Span<'a>) -> IResult<O>,
|
||||
mut with: impl FnMut(Error<'a>) -> Error<'a>,
|
||||
) -> impl FnMut(Span<'a>) -> IResult<O> {
|
||||
move |input| match parser.parse(input) {
|
||||
Err(nom::Err::Error(e)) => Err(nom::Err::Failure(with(e))),
|
||||
rest => rest,
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct Error<'a> {
|
||||
context: Span<'a>,
|
||||
kind: ErrorKind<'a>,
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub enum ErrorKind<'a> {
|
||||
ReservedGeo(&'a str),
|
||||
Geo,
|
||||
MisusedGeo,
|
||||
InvalidPrimary,
|
||||
ExpectedEof,
|
||||
ExpectedValue,
|
||||
MissingClosingDelimiter(char),
|
||||
Char(char),
|
||||
InternalError(error::ErrorKind),
|
||||
External(String),
|
||||
}
|
||||
|
||||
impl<'a> Error<'a> {
|
||||
pub fn kind(&self) -> &ErrorKind<'a> {
|
||||
&self.kind
|
||||
}
|
||||
|
||||
pub fn context(&self) -> &Span<'a> {
|
||||
&self.context
|
||||
}
|
||||
|
||||
pub fn new_from_kind(context: Span<'a>, kind: ErrorKind<'a>) -> Self {
|
||||
Self { context, kind }
|
||||
}
|
||||
|
||||
pub fn new_from_external(context: Span<'a>, error: impl std::error::Error) -> Self {
|
||||
Self::new_from_kind(context, ErrorKind::External(error.to_string()))
|
||||
}
|
||||
|
||||
pub fn char(self) -> char {
|
||||
match self.kind {
|
||||
ErrorKind::Char(c) => c,
|
||||
_ => panic!("Internal filter parser error"),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> ParseError<Span<'a>> for Error<'a> {
|
||||
fn from_error_kind(input: Span<'a>, kind: error::ErrorKind) -> Self {
|
||||
let kind = match kind {
|
||||
error::ErrorKind::Eof => ErrorKind::ExpectedEof,
|
||||
kind => ErrorKind::InternalError(kind),
|
||||
};
|
||||
Self { context: input, kind }
|
||||
}
|
||||
|
||||
fn append(_input: Span<'a>, _kind: error::ErrorKind, other: Self) -> Self {
|
||||
other
|
||||
}
|
||||
|
||||
fn from_char(input: Span<'a>, c: char) -> Self {
|
||||
Self { context: input, kind: ErrorKind::Char(c) }
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Display for Error<'a> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
let input = self.context.fragment();
|
||||
|
||||
// When printing our error message we want to escape all `\n` to be sure we keep our format with the
|
||||
// first line being the diagnostic and the second line being the incriminated filter.
|
||||
let escaped_input = input.escape_debug();
|
||||
|
||||
match self.kind {
|
||||
ErrorKind::ExpectedValue if input.trim().is_empty() => {
|
||||
writeln!(f, "Was expecting a value but instead got nothing.")?
|
||||
}
|
||||
ErrorKind::MissingClosingDelimiter(c) => {
|
||||
writeln!(f, "Expression `{}` is missing the following closing delimiter: `{}`.", escaped_input, c)?
|
||||
}
|
||||
ErrorKind::ExpectedValue => {
|
||||
writeln!(f, "Was expecting a value but instead got `{}`.", escaped_input)?
|
||||
}
|
||||
ErrorKind::InvalidPrimary if input.trim().is_empty() => {
|
||||
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `TO` or `_geoRadius` but instead got nothing.")?
|
||||
}
|
||||
ErrorKind::InvalidPrimary => {
|
||||
writeln!(f, "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `TO` or `_geoRadius` at `{}`.", escaped_input)?
|
||||
}
|
||||
ErrorKind::ExpectedEof => {
|
||||
writeln!(f, "Found unexpected characters at the end of the filter: `{}`. You probably forgot an `OR` or an `AND` rule.", escaped_input)?
|
||||
}
|
||||
ErrorKind::Geo => {
|
||||
writeln!(f, "The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`.")?
|
||||
}
|
||||
ErrorKind::ReservedGeo(name) => {
|
||||
writeln!(f, "`{}` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance) built-in rule to filter on `_geo` coordinates.", name.escape_debug())?
|
||||
}
|
||||
ErrorKind::MisusedGeo => {
|
||||
writeln!(f, "The `_geoRadius` filter is an operation and can't be used as a value.")?
|
||||
}
|
||||
ErrorKind::Char(c) => {
|
||||
panic!("Tried to display a char error with `{}`", c)
|
||||
}
|
||||
ErrorKind::InternalError(kind) => writeln!(
|
||||
f,
|
||||
"Encountered an internal `{:?}` error while parsing your filter. Please fill an issue", kind
|
||||
)?,
|
||||
ErrorKind::External(ref error) => writeln!(f, "{}", error)?,
|
||||
}
|
||||
let base_column = self.context.get_utf8_column();
|
||||
let size = self.context.fragment().chars().count();
|
||||
|
||||
write!(f, "{}:{} {}", base_column, base_column + size, self.context.extra)
|
||||
}
|
||||
}
|
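To connect the `Display` impl above with the `8:13 name = "truc` example from the PR description, here is a minimal sketch; the exact wording follows the code above, and the column numbers in the comment are illustrative of the `base_column .. base_column + size` arithmetic rather than asserted.

```rust
use filter_parser::FilterCondition;

fn main() {
    // `name = "truc` has an unclosed double quote. The error span covers the
    // value starting at the `"` (column 8) and is five characters long, so the
    // locator printed after the message is `8:13 name = "truc`.
    let err = FilterCondition::parse(r#"name = "truc"#).unwrap_err();
    println!("{}", err);
}
```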
587
filter-parser/src/lib.rs
Normal file
587
filter-parser/src/lib.rs
Normal file
@ -0,0 +1,587 @@
|
||||
//! BNF grammar:
|
||||
//!
|
||||
//! ```text
|
||||
//! filter = expression ~ EOF
|
||||
//! expression = or
|
||||
//! or = and (~ "OR" ~ and)
|
||||
//! and = not (~ "AND" not)*
|
||||
//! not = ("NOT" ~ not) | primary
|
||||
//! primary = (WS* ~ "(" expression ")" ~ WS*) | geoRadius | condition | to
|
||||
//! condition = value ("==" | ">" ...) value
|
||||
//! to = value value TO value
|
||||
//! value = WS* ~ ( word | singleQuoted | doubleQuoted) ~ WS*
|
||||
//! singleQuoted = "'" .* all but quotes "'"
|
||||
//! doubleQuoted = "\"" .* all but double quotes "\""
|
||||
//! word = (alphanumeric | _ | - | .)+
|
||||
//! geoRadius = WS* ~ "_geoRadius(" ~ WS* ~ float ~ WS* ~ "," ~ WS* ~ float ~ WS* ~ "," float ~ WS* ~ ")"
|
||||
//! ```
|
||||
//!
|
||||
//! Other BNF grammar used to handle some specific errors:
|
||||
//! ```text
|
||||
//! geoPoint = WS* ~ "_geoPoint(" ~ (float ~ ",")* ~ ")"
|
||||
//! ```
|
||||
//!
|
||||
//! Specific errors:
|
||||
//! ================
|
||||
//! - If a user tries to use a geoPoint, as a primary OR as a value, we must throw an error.
|
||||
//! ```text
|
||||
//! field = _geoPoint(12, 13, 14)
|
||||
//! field < 12 AND _geoPoint(1, 2)
|
||||
//! ```
|
||||
//!
|
||||
//! - If a user tries to use a geoRadius as a value, we must throw an error.
|
||||
//! ```text
|
||||
//! field = _geoRadius(12, 13, 14)
|
||||
//! ```
|
||||
//!
|
||||
|
||||
mod condition;
|
||||
mod error;
|
||||
mod value;
|
||||
|
||||
use std::fmt::Debug;
|
||||
use std::ops::Deref;
|
||||
use std::str::FromStr;
|
||||
|
||||
pub use condition::{parse_condition, parse_to, Condition};
|
||||
use error::{cut_with_err, NomErrorExt};
|
||||
pub use error::{Error, ErrorKind};
|
||||
use nom::branch::alt;
|
||||
use nom::bytes::complete::tag;
|
||||
use nom::character::complete::{char, multispace0};
|
||||
use nom::combinator::{cut, eof, map};
|
||||
use nom::multi::{many0, separated_list1};
|
||||
use nom::number::complete::recognize_float;
|
||||
use nom::sequence::{delimited, preceded, terminated, tuple};
|
||||
use nom::Finish;
|
||||
use nom_locate::LocatedSpan;
|
||||
pub(crate) use value::parse_value;
|
||||
|
||||
pub type Span<'a> = LocatedSpan<&'a str, &'a str>;
|
||||
|
||||
type IResult<'a, Ret> = nom::IResult<Span<'a>, Ret, Error<'a>>;
|
||||
|
||||
#[derive(Debug, Clone, Eq)]
|
||||
pub struct Token<'a>(Span<'a>);
|
||||
|
||||
impl<'a> Deref for Token<'a> {
|
||||
type Target = &'a str;
|
||||
|
||||
fn deref(&self) -> &Self::Target {
|
||||
&self.0
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> PartialEq for Token<'a> {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.0.fragment() == other.0.fragment()
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Token<'a> {
|
||||
pub fn new(position: Span<'a>) -> Self {
|
||||
Self(position)
|
||||
}
|
||||
|
||||
pub fn as_external_error(&self, error: impl std::error::Error) -> Error<'a> {
|
||||
Error::new_from_external(self.0, error)
|
||||
}
|
||||
|
||||
pub fn parse<T>(&self) -> Result<T, Error>
|
||||
where
|
||||
T: FromStr,
|
||||
T::Err: std::error::Error,
|
||||
{
|
||||
self.0.parse().map_err(|e| self.as_external_error(e))
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> From<Span<'a>> for Token<'a> {
|
||||
fn from(span: Span<'a>) -> Self {
|
||||
Self(span)
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub enum FilterCondition<'a> {
|
||||
Condition { fid: Token<'a>, op: Condition<'a> },
|
||||
Or(Box<Self>, Box<Self>),
|
||||
And(Box<Self>, Box<Self>),
|
||||
GeoLowerThan { point: [Token<'a>; 2], radius: Token<'a> },
|
||||
GeoGreaterThan { point: [Token<'a>; 2], radius: Token<'a> },
|
||||
Empty,
|
||||
}
|
||||
|
||||
impl<'a> FilterCondition<'a> {
|
||||
pub fn negate(self) -> FilterCondition<'a> {
|
||||
use FilterCondition::*;
|
||||
|
||||
match self {
|
||||
Condition { fid, op } => match op.negate() {
|
||||
(op, None) => Condition { fid, op },
|
||||
(a, Some(b)) => Or(
|
||||
Condition { fid: fid.clone(), op: a }.into(),
|
||||
Condition { fid, op: b }.into(),
|
||||
),
|
||||
},
|
||||
Or(a, b) => And(a.negate().into(), b.negate().into()),
|
||||
And(a, b) => Or(a.negate().into(), b.negate().into()),
|
||||
Empty => Empty,
|
||||
GeoLowerThan { point, radius } => GeoGreaterThan { point, radius },
|
||||
GeoGreaterThan { point, radius } => GeoLowerThan { point, radius },
|
||||
}
|
||||
}
|
||||
|
||||
pub fn parse(input: &'a str) -> Result<Self, Error> {
|
||||
if input.trim().is_empty() {
|
||||
return Ok(Self::Empty);
|
||||
}
|
||||
let span = Span::new_extra(input, input);
|
||||
parse_filter(span).finish().map(|(_rem, output)| output)
|
||||
}
|
||||
}
|
||||
|
||||
/// remove OPTIONAL whitespaces before AND after the provided parser.
|
||||
fn ws<'a, O>(inner: impl FnMut(Span<'a>) -> IResult<O>) -> impl FnMut(Span<'a>) -> IResult<O> {
|
||||
delimited(multispace0, inner, multispace0)
|
||||
}
|
||||
|
||||
/// or = and (~ "OR" ~ and)
|
||||
fn parse_or(input: Span) -> IResult<FilterCondition> {
|
||||
let (input, lhs) = parse_and(input)?;
|
||||
// if we found a `OR` then we MUST find something next
|
||||
let (input, ors) = many0(preceded(ws(tag("OR")), cut(parse_and)))(input)?;
|
||||
|
||||
let expr = ors
|
||||
.into_iter()
|
||||
.fold(lhs, |acc, branch| FilterCondition::Or(Box::new(acc), Box::new(branch)));
|
||||
Ok((input, expr))
|
||||
}
|
||||
|
||||
/// and = not (~ "AND" not)*
|
||||
fn parse_and(input: Span) -> IResult<FilterCondition> {
|
||||
let (input, lhs) = parse_not(input)?;
|
||||
// if we found a `AND` then we MUST find something next
|
||||
let (input, ors) = many0(preceded(ws(tag("AND")), cut(parse_not)))(input)?;
|
||||
let expr = ors
|
||||
.into_iter()
|
||||
.fold(lhs, |acc, branch| FilterCondition::And(Box::new(acc), Box::new(branch)));
|
||||
Ok((input, expr))
|
||||
}
|
||||
|
||||
/// not = ("NOT" ~ not) | primary
|
||||
/// We can have multiple consecutive not, eg: `NOT NOT channel = mv`.
|
||||
/// If we parse a `NOT` we MUST parse something behind.
|
||||
fn parse_not(input: Span) -> IResult<FilterCondition> {
|
||||
alt((map(preceded(tag("NOT"), cut(parse_not)), |e| e.negate()), parse_primary))(input)
|
||||
}
|
||||
|
||||
/// geoRadius = WS* ~ "_geoRadius(float ~ "," ~ float ~ "," float)
|
||||
/// If we parse `_geoRadius` we MUST parse the rest of the expression.
|
||||
fn parse_geo_radius(input: Span) -> IResult<FilterCondition> {
|
||||
// we want to forbid space BEFORE the _geoRadius but not after
|
||||
let parsed = preceded(
|
||||
tuple((multispace0, tag("_geoRadius"))),
|
||||
// if we were able to parse `_geoRadius` and can't parse the rest of the input we return a failure
|
||||
cut(delimited(char('('), separated_list1(tag(","), ws(recognize_float)), char(')'))),
|
||||
)(input)
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::Geo)));
|
||||
|
||||
let (input, args) = parsed?;
|
||||
|
||||
if args.len() != 3 {
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::Geo)));
|
||||
}
|
||||
|
||||
let res = FilterCondition::GeoLowerThan {
|
||||
point: [args[0].into(), args[1].into()],
|
||||
radius: args[2].into(),
|
||||
};
|
||||
Ok((input, res))
|
||||
}
|
||||
|
||||
/// geoPoint = WS* ~ "_geoPoint(float ~ "," ~ float ~ "," float)
|
||||
fn parse_geo_point(input: Span) -> IResult<FilterCondition> {
|
||||
// we want to forbid space BEFORE the _geoPoint but not after
|
||||
tuple((
|
||||
multispace0,
|
||||
tag("_geoPoint"),
|
||||
// if we were able to parse `_geoPoint` we are going to return a Failure whatever happens next.
|
||||
cut(delimited(char('('), separated_list1(tag(","), ws(|c| recognize_float(c))), char(')'))),
|
||||
))(input)
|
||||
.map_err(|e| e.map(|_| Error::new_from_kind(input, ErrorKind::ReservedGeo("_geoPoint"))))?;
|
||||
// if we succeeded we still return a `Failure` because geoPoints are not allowed
|
||||
Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::ReservedGeo("_geoPoint"))))
|
||||
}
|
||||
|
||||
/// primary = (WS* ~ "(" expression ")" ~ WS*) | geoRadius | condition | to
|
||||
fn parse_primary(input: Span) -> IResult<FilterCondition> {
|
||||
alt((
|
||||
// if we find a first parenthesis, then we must parse an expression and find the closing parenthesis
|
||||
delimited(
|
||||
ws(char('(')),
|
||||
cut(parse_expression),
|
||||
cut_with_err(ws(char(')')), |c| {
|
||||
Error::new_from_kind(input, ErrorKind::MissingClosingDelimiter(c.char()))
|
||||
}),
|
||||
),
|
||||
parse_geo_radius,
|
||||
parse_condition,
|
||||
parse_to,
|
||||
// the next lines are only for error handling and are written at the end to have the least possible performance impact
|
||||
parse_geo_point,
|
||||
))(input)
|
||||
// if the inner parsers did not match enough information to return an accurate error
|
||||
.map_err(|e| e.map_err(|_| Error::new_from_kind(input, ErrorKind::InvalidPrimary)))
|
||||
}
|
||||
|
||||
/// expression = or
|
||||
pub fn parse_expression(input: Span) -> IResult<FilterCondition> {
|
||||
parse_or(input)
|
||||
}
|
||||
|
||||
/// filter = expression ~ EOF
|
||||
pub fn parse_filter(input: Span) -> IResult<FilterCondition> {
|
||||
terminated(parse_expression, eof)(input)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
pub mod tests {
|
||||
use super::*;
|
||||
|
||||
/// Create a raw [Token]. You must specify the string that appears BEFORE your element, followed by your element.
|
||||
pub fn rtok<'a>(before: &'a str, value: &'a str) -> Token<'a> {
|
||||
// if the string is empty we still need to return 1 for the line number
|
||||
let lines = before.is_empty().then(|| 1).unwrap_or_else(|| before.lines().count());
|
||||
let offset = before.chars().count();
|
||||
// the extra field is not checked in the tests so we can set it to nothing
|
||||
unsafe { Span::new_from_raw_offset(offset, lines as u32, value, "") }.into()
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn parse() {
|
||||
use FilterCondition as Fc;
|
||||
|
||||
let test_case = [
|
||||
// simple test
|
||||
(
|
||||
"channel = Ponce",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::Equal(rtok("channel = ", "Ponce")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"subscribers = 12",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "subscribers"),
|
||||
op: Condition::Equal(rtok("subscribers = ", "12")),
|
||||
},
|
||||
),
|
||||
// test all the quotes and simple quotes
|
||||
(
|
||||
"channel = 'Mister Mv'",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::Equal(rtok("channel = '", "Mister Mv")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"channel = \"Mister Mv\"",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::Equal(rtok("channel = \"", "Mister Mv")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"'dog race' = Borzoi",
|
||||
Fc::Condition {
|
||||
fid: rtok("'", "dog race"),
|
||||
op: Condition::Equal(rtok("'dog race' = ", "Borzoi")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"\"dog race\" = Chusky",
|
||||
Fc::Condition {
|
||||
fid: rtok("\"", "dog race"),
|
||||
op: Condition::Equal(rtok("\"dog race\" = ", "Chusky")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"\"dog race\" = \"Bernese Mountain\"",
|
||||
Fc::Condition {
|
||||
fid: rtok("\"", "dog race"),
|
||||
op: Condition::Equal(rtok("\"dog race\" = \"", "Bernese Mountain")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"'dog race' = 'Bernese Mountain'",
|
||||
Fc::Condition {
|
||||
fid: rtok("'", "dog race"),
|
||||
op: Condition::Equal(rtok("'dog race' = '", "Bernese Mountain")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"\"dog race\" = 'Bernese Mountain'",
|
||||
Fc::Condition {
|
||||
fid: rtok("\"", "dog race"),
|
||||
op: Condition::Equal(rtok("\"dog race\" = \"", "Bernese Mountain")),
|
||||
},
|
||||
),
|
||||
// test all the operators
|
||||
(
|
||||
"channel != ponce",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::NotEqual(rtok("channel != ", "ponce")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT channel = ponce",
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "channel"),
|
||||
op: Condition::NotEqual(rtok("NOT channel = ", "ponce")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"subscribers < 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "subscribers"),
|
||||
op: Condition::LowerThan(rtok("subscribers < ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"subscribers > 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "subscribers"),
|
||||
op: Condition::GreaterThan(rtok("subscribers > ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"subscribers <= 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "subscribers"),
|
||||
op: Condition::LowerThanOrEqual(rtok("subscribers <= ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"subscribers >= 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "subscribers"),
|
||||
op: Condition::GreaterThanOrEqual(rtok("subscribers >= ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT subscribers < 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "subscribers"),
|
||||
op: Condition::GreaterThanOrEqual(rtok("NOT subscribers < ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT subscribers > 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "subscribers"),
|
||||
op: Condition::LowerThanOrEqual(rtok("NOT subscribers > ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT subscribers <= 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "subscribers"),
|
||||
op: Condition::GreaterThan(rtok("NOT subscribers <= ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT subscribers >= 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "subscribers"),
|
||||
op: Condition::LowerThan(rtok("NOT subscribers >= ", "1000")),
|
||||
},
|
||||
),
|
||||
(
|
||||
"subscribers 100 TO 1000",
|
||||
Fc::Condition {
|
||||
fid: rtok("", "subscribers"),
|
||||
op: Condition::Between {
|
||||
from: rtok("subscribers ", "100"),
|
||||
to: rtok("subscribers 100 TO ", "1000"),
|
||||
},
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT subscribers 100 TO 1000",
|
||||
Fc::Or(
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "subscribers"),
|
||||
op: Condition::LowerThan(rtok("NOT subscribers ", "100")),
|
||||
}
|
||||
.into(),
|
||||
Fc::Condition {
|
||||
fid: rtok("NOT ", "subscribers"),
|
||||
op: Condition::GreaterThan(rtok("NOT subscribers 100 TO ", "1000")),
|
||||
}
|
||||
.into(),
|
||||
),
|
||||
),
|
||||
(
|
||||
"_geoRadius(12, 13, 14)",
|
||||
Fc::GeoLowerThan {
|
||||
point: [rtok("_geoRadius(", "12"), rtok("_geoRadius(12, ", "13")],
|
||||
radius: rtok("_geoRadius(12, 13, ", "14"),
|
||||
},
|
||||
),
|
||||
(
|
||||
"NOT _geoRadius(12, 13, 14)",
|
||||
Fc::GeoGreaterThan {
|
||||
point: [rtok("NOT _geoRadius(", "12"), rtok("NOT _geoRadius(12, ", "13")],
|
||||
radius: rtok("NOT _geoRadius(12, 13, ", "14"),
|
||||
},
|
||||
),
|
||||
// test simple `or` and `and`
|
||||
(
|
||||
"channel = ponce AND 'dog race' != 'bernese mountain'",
|
||||
Fc::And(
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::Equal(rtok("channel = ", "ponce")),
|
||||
}
|
||||
.into(),
|
||||
Fc::Condition {
|
||||
fid: rtok("channel = ponce AND '", "dog race"),
|
||||
op: Condition::NotEqual(rtok(
|
||||
"channel = ponce AND 'dog race' != '",
|
||||
"bernese mountain",
|
||||
)),
|
||||
}
|
||||
.into(),
|
||||
),
|
||||
),
|
||||
(
|
||||
"channel = ponce OR 'dog race' != 'bernese mountain'",
|
||||
Fc::Or(
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::Equal(rtok("channel = ", "ponce")),
|
||||
}
|
||||
.into(),
|
||||
Fc::Condition {
|
||||
fid: rtok("channel = ponce OR '", "dog race"),
|
||||
op: Condition::NotEqual(rtok(
|
||||
"channel = ponce OR 'dog race' != '",
|
||||
"bernese mountain",
|
||||
)),
|
||||
}
|
||||
.into(),
|
||||
),
|
||||
),
|
||||
(
|
||||
"channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000",
|
||||
Fc::Or(
|
||||
Fc::And(
|
||||
Fc::Condition {
|
||||
fid: rtok("", "channel"),
|
||||
op: Condition::Equal(rtok("channel = ", "ponce")),
|
||||
}
|
||||
.into(),
|
||||
Fc::Condition {
|
||||
fid: rtok("channel = ponce AND '", "dog race"),
|
||||
op: Condition::NotEqual(rtok(
|
||||
"channel = ponce AND 'dog race' != '",
|
||||
"bernese mountain",
|
||||
)),
|
||||
}
|
||||
.into(),
|
||||
)
|
||||
.into(),
|
||||
Fc::Condition {
|
||||
fid: rtok(
|
||||
"channel = ponce AND 'dog race' != 'bernese mountain' OR ",
|
||||
"subscribers",
|
||||
),
|
||||
op: Condition::GreaterThan(rtok(
|
||||
"channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > ",
|
||||
"1000",
|
||||
)),
|
||||
}
|
||||
.into(),
|
||||
),
|
||||
),
|
||||
// test parenthesis
|
||||
(
|
||||
"channel = ponce AND ( 'dog race' != 'bernese mountain' OR subscribers > 1000 )",
|
||||
Fc::And(
|
||||
Fc::Condition { fid: rtok("", "channel"), op: Condition::Equal(rtok("channel = ", "ponce")) }.into(),
|
||||
Fc::Or(
|
||||
Fc::Condition { fid: rtok("channel = ponce AND ( '", "dog race"), op: Condition::NotEqual(rtok("channel = ponce AND ( 'dog race' != '", "bernese mountain"))}.into(),
|
||||
Fc::Condition { fid: rtok("channel = ponce AND ( 'dog race' != 'bernese mountain' OR ", "subscribers"), op: Condition::GreaterThan(rtok("channel = ponce AND ( 'dog race' != 'bernese mountain' OR subscribers > ", "1000")) }.into(),
|
||||
).into()),
|
||||
),
|
||||
(
|
||||
"(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(12, 13, 14)",
|
||||
Fc::And(
|
||||
Fc::Or(
|
||||
Fc::And(
|
||||
Fc::Condition { fid: rtok("(", "channel"), op: Condition::Equal(rtok("(channel = ", "ponce")) }.into(),
|
||||
Fc::Condition { fid: rtok("(channel = ponce AND '", "dog race"), op: Condition::NotEqual(rtok("(channel = ponce AND 'dog race' != '", "bernese mountain")) }.into(),
|
||||
).into(),
|
||||
Fc::Condition { fid: rtok("(channel = ponce AND 'dog race' != 'bernese mountain' OR ", "subscribers"), op: Condition::GreaterThan(rtok("(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > ", "1000")) }.into(),
|
||||
).into(),
|
||||
Fc::GeoLowerThan { point: [rtok("(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(", "12"), rtok("(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(12, ", "13")], radius: rtok("(channel = ponce AND 'dog race' != 'bernese mountain' OR subscribers > 1000) AND _geoRadius(12, 13, ", "14") }.into()
|
||||
)
|
||||
)
|
||||
];
|
||||
|
||||
for (input, expected) in test_case {
|
||||
let result = Fc::parse(input);
|
||||
|
||||
assert!(
|
||||
result.is_ok(),
|
||||
"Filter `{:?}` was supposed to be parsed but failed with the following error: `{}`",
|
||||
expected,
|
||||
result.unwrap_err()
|
||||
);
|
||||
let filter = result.unwrap();
|
||||
assert_eq!(filter, expected, "Filter `{}` failed.", input);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn error() {
|
||||
use FilterCondition as Fc;
|
||||
|
||||
let test_case = [
|
||||
// simple test
|
||||
("channel = Ponce = 12", "Found unexpected characters at the end of the filter: `= 12`. You probably forgot an `OR` or an `AND` rule."),
|
||||
("channel = ", "Was expecting a value but instead got nothing."),
|
||||
("channel = 🐻", "Was expecting a value but instead got `🐻`."),
|
||||
("channel = 🐻 AND followers < 100", "Was expecting a value but instead got `🐻`."),
|
||||
("OR", "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `TO` or `_geoRadius` at `OR`."),
|
||||
("AND", "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `TO` or `_geoRadius` at `AND`."),
|
||||
("channel Ponce", "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `TO` or `_geoRadius` at `channel Ponce`."),
|
||||
("channel = Ponce OR", "Was expecting an operation `=`, `!=`, `>=`, `>`, `<=`, `<`, `TO` or `_geoRadius` but instead got nothing."),
|
||||
("_geoRadius", "The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`."),
|
||||
("_geoRadius = 12", "The `_geoRadius` filter expects three arguments: `_geoRadius(latitude, longitude, radius)`."),
|
||||
("_geoPoint(12, 13, 14)", "`_geoPoint` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance) built-in rule to filter on `_geo` coordinates."),
|
||||
("position <= _geoPoint(12, 13, 14)", "`_geoPoint` is a reserved keyword and thus can't be used as a filter expression. Use the `_geoRadius(latitude, longitude, distance) built-in rule to filter on `_geo` coordinates."),
|
||||
("position <= _geoRadius(12, 13, 14)", "The `_geoRadius` filter is an operation and can't be used as a value."),
|
||||
("channel = 'ponce", "Expression `\\'ponce` is missing the following closing delimiter: `'`."),
|
||||
("channel = \"ponce", "Expression `\\\"ponce` is missing the following closing delimiter: `\"`."),
|
||||
("channel = mv OR (followers >= 1000", "Expression `(followers >= 1000` is missing the following closing delimiter: `)`."),
|
||||
("channel = mv OR followers >= 1000)", "Found unexpected characters at the end of the filter: `)`. You probably forgot an `OR` or an `AND` rule."),
|
||||
];
|
||||
|
||||
for (input, expected) in test_case {
|
||||
let result = Fc::parse(input);
|
||||
|
||||
assert!(
|
||||
result.is_err(),
|
||||
"Filter `{}` wasn't supposed to be parsed but it did with the following result: `{:?}`",
|
||||
input,
|
||||
result.unwrap()
|
||||
);
|
||||
let filter = result.unwrap_err().to_string();
|
||||
assert!(filter.starts_with(expected), "Filter `{:?}` was supposed to return the following error:\n{}\n, but instead returned\n{}\n.", input, expected, filter);
|
||||
}
|
||||
}
|
||||
}
|
filter-parser/src/main.rs (new file, 16 lines)

fn main() {
    let input = std::env::args().nth(1).expect("You must provide a filter to test");

    println!("Trying to execute the following filter:\n{}\n", input);

    match filter_parser::FilterCondition::parse(&input) {
        Ok(filter) => {
            println!("✅ Valid filter");
            println!("{:#?}", filter);
        }
        Err(e) => {
            println!("❎ Invalid filter");
            println!("{}", e.to_string());
        }
    }
}
147
filter-parser/src/value.rs
Normal file
147
filter-parser/src/value.rs
Normal file
@ -0,0 +1,147 @@
|
||||
use nom::branch::alt;
|
||||
use nom::bytes::complete::{take_till, take_while, take_while1};
|
||||
use nom::character::complete::{char, multispace0};
|
||||
use nom::combinator::cut;
|
||||
use nom::sequence::{delimited, terminated};
|
||||
|
||||
use crate::error::NomErrorExt;
|
||||
use crate::{parse_geo_point, parse_geo_radius, Error, ErrorKind, IResult, Span, Token};
|
||||
|
||||
/// value = WS* ~ ( word | singleQuoted | doubleQuoted) ~ WS*
|
||||
pub fn parse_value(input: Span) -> IResult<Token> {
|
||||
// to get a better diagnostic message, we strip the leading whitespace from the input right now
|
||||
let (input, _) = take_while(char::is_whitespace)(input)?;
|
||||
|
||||
// then, we want to check if the user is misusing a geo expression
|
||||
// This expression can’t finish without error.
|
||||
// We want to return an error in case of failure.
|
||||
if let Err(err) = parse_geo_point(input) {
|
||||
if err.is_failure() {
|
||||
return Err(err);
|
||||
}
|
||||
}
|
||||
match parse_geo_radius(input) {
|
||||
Ok(_) => return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::MisusedGeo))),
|
||||
// if we encountered a failure it means the user badly wrote a _geoRadius filter.
|
||||
// But instead of showing him how to fix his syntax we are going to tell him he should not use this filter as a value.
|
||||
Err(e) if e.is_failure() => {
|
||||
return Err(nom::Err::Failure(Error::new_from_kind(input, ErrorKind::MisusedGeo)))
|
||||
}
|
||||
_ => (),
|
||||
}
|
||||
|
||||
// singleQuoted = "'" .* all but quotes "'"
|
||||
let simple_quoted = take_till(|c: char| c == '\'');
|
||||
// doubleQuoted = "\"" (word | spaces)* "\""
|
||||
let double_quoted = take_till(|c: char| c == '"');
|
||||
// word = (alphanumeric | _ | - | .)+
|
||||
let word = take_while1(is_value_component);
|
||||
|
||||
// this parser is only used when an error is encountered and it parses the
// largest string possible that does not contain any “language” syntax.
// If we try to parse `name = 🦀 AND language = rust` we want to return an
// error saying we could not parse `🦀`, not that no value was found or that
// we could not parse `🦀 AND language = rust`.
|
||||
// we want to remove the space before entering the alt because if we don't,
|
||||
// when we create the errors from the output of the alt we have spaces everywhere
|
||||
let error_word = take_till::<_, _, Error>(is_syntax_component);
|
||||
|
||||
terminated(
|
||||
alt((
|
||||
delimited(char('\''), cut(simple_quoted), cut(char('\''))),
|
||||
delimited(char('"'), cut(double_quoted), cut(char('"'))),
|
||||
word,
|
||||
)),
|
||||
multispace0,
|
||||
)(input)
|
||||
.map(|(s, t)| (s, t.into()))
|
||||
// if we found nothing in the alt it means the user specified something that was not recognized as a value
|
||||
.map_err(|e: nom::Err<Error>| {
|
||||
e.map_err(|_| Error::new_from_kind(error_word(input).unwrap().1, ErrorKind::ExpectedValue))
|
||||
})
|
||||
// if we encountered a failure it means the user really tried to input a value, but had an unmatched quote
|
||||
.map_err(|e| {
|
||||
e.map_fail(|c| Error::new_from_kind(input, ErrorKind::MissingClosingDelimiter(c.char())))
|
||||
})
|
||||
}
|
||||
|
||||
fn is_value_component(c: char) -> bool {
|
||||
c.is_alphanumeric() || ['_', '-', '.'].contains(&c)
|
||||
}
|
||||
|
||||
fn is_syntax_component(c: char) -> bool {
|
||||
c.is_whitespace() || ['(', ')', '=', '<', '>', '!'].contains(&c)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
pub mod test {
|
||||
use nom::Finish;
|
||||
|
||||
use super::*;
|
||||
use crate::tests::rtok;
|
||||
|
||||
#[test]
|
||||
fn name() {
|
||||
let test_case = [
|
||||
("channel", rtok("", "channel")),
|
||||
(".private", rtok("", ".private")),
|
||||
("I-love-kebab", rtok("", "I-love-kebab")),
|
||||
("but_snakes_is_also_good", rtok("", "but_snakes_is_also_good")),
|
||||
("parens(", rtok("", "parens")),
|
||||
("parens)", rtok("", "parens")),
|
||||
("not!", rtok("", "not")),
|
||||
(" channel", rtok(" ", "channel")),
|
||||
("channel ", rtok("", "channel")),
|
||||
(" channel ", rtok(" ", "channel")),
|
||||
("'channel'", rtok("'", "channel")),
|
||||
("\"channel\"", rtok("\"", "channel")),
|
||||
("'cha)nnel'", rtok("'", "cha)nnel")),
|
||||
("'cha\"nnel'", rtok("'", "cha\"nnel")),
|
||||
("\"cha'nnel\"", rtok("\"", "cha'nnel")),
|
||||
("\" some spaces \"", rtok("\"", " some spaces ")),
|
||||
("\"cha'nnel\"", rtok("'", "cha'nnel")),
|
||||
("\"cha'nnel\"", rtok("'", "cha'nnel")),
|
||||
("I'm tamo", rtok("'m tamo", "I")),
|
||||
];
|
||||
|
||||
for (input, expected) in test_case {
|
||||
let input = Span::new_extra(input, input);
|
||||
let result = parse_value(input);
|
||||
|
||||
assert!(
|
||||
result.is_ok(),
|
||||
"Filter `{:?}` was supposed to be parsed but failed with the following error: `{}`",
|
||||
expected,
|
||||
result.unwrap_err()
|
||||
);
|
||||
let value = result.unwrap().1;
|
||||
assert_eq!(value, expected, "Filter `{}` failed.", input);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn diagnostic() {
|
||||
let test_case = [
|
||||
("🦀", "🦀"),
|
||||
(" 🦀", "🦀"),
|
||||
("🦀 AND crab = truc", "🦀"),
|
||||
("🦀_in_name", "🦀_in_name"),
|
||||
(" (name = ...", ""),
|
||||
];
|
||||
|
||||
for (input, expected) in test_case {
|
||||
let input = Span::new_extra(input, input);
|
||||
let result = parse_value(input);
|
||||
|
||||
assert!(
|
||||
result.is_err(),
|
||||
"Filter `{}` wasn’t supposed to be parsed but it did with the following result: `{:?}`",
|
||||
expected,
|
||||
result.unwrap()
|
||||
);
|
||||
// get the inner string referenced in the error
|
||||
let value = *result.finish().unwrap_err().context().fragment();
|
||||
assert_eq!(value, expected, "Filter `{}` was supposed to fail with the following value: `{}`, but it failed with: `{}`.", input, expected, value);
|
||||
}
|
||||
}
|
||||
}
|
@ -23,7 +23,8 @@ use milli::documents::DocumentBatchReader;
|
||||
use milli::update::UpdateIndexingStep::*;
|
||||
use milli::update::{IndexDocumentsMethod, Setting, UpdateBuilder};
|
||||
use milli::{
|
||||
obkv_to_json, CompressionType, FilterCondition, Index, MatchingWords, SearchResult, SortError,
|
||||
obkv_to_json, CompressionType, Filter as MilliFilter, FilterCondition, Index, MatchingWords,
|
||||
SearchResult, SortError,
|
||||
};
|
||||
use once_cell::sync::OnceCell;
|
||||
use rayon::ThreadPool;
|
||||
@ -735,31 +736,37 @@ async fn main() -> anyhow::Result<()> {
|
||||
search.query(query);
|
||||
}
|
||||
|
||||
let filters = match query.filters {
|
||||
let filters = match query.filters.as_ref() {
|
||||
Some(condition) if !condition.trim().is_empty() => {
|
||||
Some(FilterCondition::from_str(&rtxn, &index, &condition).unwrap())
|
||||
Some(MilliFilter::from_str(condition).unwrap())
|
||||
}
|
||||
_otherwise => None,
|
||||
};
|
||||
|
||||
let facet_filters = match query.facet_filters {
|
||||
let facet_filters = match query.facet_filters.as_ref() {
|
||||
Some(array) => {
|
||||
let eithers = array.into_iter().map(Into::into);
|
||||
FilterCondition::from_array(&rtxn, &index, eithers).unwrap()
|
||||
let eithers = array.iter().map(|either| match either {
|
||||
UntaggedEither::Left(l) => {
|
||||
Either::Left(l.iter().map(|s| s.as_str()).collect::<Vec<&str>>())
|
||||
}
|
||||
UntaggedEither::Right(r) => Either::Right(r.as_str()),
|
||||
});
|
||||
MilliFilter::from_array(eithers).unwrap()
|
||||
}
|
||||
_otherwise => None,
|
||||
};
|
||||
|
||||
let condition = match (filters, facet_filters) {
|
||||
(Some(filters), Some(facet_filters)) => {
|
||||
Some(FilterCondition::And(Box::new(filters), Box::new(facet_filters)))
|
||||
}
|
||||
(Some(condition), None) | (None, Some(condition)) => Some(condition),
|
||||
(Some(filters), Some(facet_filters)) => Some(FilterCondition::And(
|
||||
Box::new(filters.into()),
|
||||
Box::new(facet_filters.into()),
|
||||
)),
|
||||
(Some(condition), None) | (None, Some(condition)) => Some(condition.into()),
|
||||
_otherwise => None,
|
||||
};
|
||||
|
||||
if let Some(condition) = condition {
|
||||
search.filter(condition);
|
||||
search.filter(condition.into());
|
||||
}
|
||||
|
||||
if let Some(limit) = query.limit {
|
||||
|
@ -38,9 +38,7 @@ smallvec = "1.6.1"
|
||||
tempfile = "3.2.0"
|
||||
uuid = { version = "0.8.2", features = ["v4"] }
|
||||
|
||||
# facet filter parser
|
||||
pest = { git = "https://github.com/pest-parser/pest.git", rev = "51fd1d49f1041f7839975664ef71fe15c7dcaf67" }
|
||||
pest_derive = "2.1.0"
|
||||
filter-parser = { path = "../filter-parser" }
|
||||
|
||||
# documents words self-join
|
||||
itertools = "0.10.0"
|
||||
|
@ -7,7 +7,6 @@ use heed::{Error as HeedError, MdbError};
|
||||
use rayon::ThreadPoolBuildError;
|
||||
use serde_json::{Map, Value};
|
||||
|
||||
use crate::search::ParserRule;
|
||||
use crate::{CriterionError, DocumentId, FieldId, SortError};
|
||||
|
||||
pub type Object = Map<String, Value>;
|
||||
@ -59,8 +58,8 @@ pub enum UserError {
|
||||
DocumentLimitReached,
|
||||
InvalidDocumentId { document_id: Value },
|
||||
InvalidFacetsDistribution { invalid_facets_name: BTreeSet<String> },
|
||||
InvalidFilter(FilterError),
|
||||
InvalidGeoField { document_id: Value, object: Value },
|
||||
InvalidFilter(String),
|
||||
InvalidSortableAttribute { field: String, valid_fields: BTreeSet<String> },
|
||||
SortRankingRuleMissing,
|
||||
InvalidStoreFile,
|
||||
@ -74,13 +73,6 @@ pub enum UserError {
|
||||
UnknownInternalDocumentId { document_id: DocumentId },
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub enum FilterError {
|
||||
InvalidAttribute { field: String, valid_fields: BTreeSet<String> },
|
||||
ReservedKeyword { field: String, context: Option<String> },
|
||||
Syntax(pest::error::Error<ParserRule>),
|
||||
}
|
||||
|
||||
impl From<io::Error> for Error {
|
||||
fn from(error: io::Error) -> Error {
|
||||
// TODO must be improved and more precise
|
||||
@ -165,12 +157,6 @@ impl From<UserError> for Error {
|
||||
}
|
||||
}
|
||||
|
||||
impl From<FilterError> for Error {
|
||||
fn from(error: FilterError) -> Error {
|
||||
Error::UserError(UserError::InvalidFilter(error))
|
||||
}
|
||||
}
|
||||
|
||||
impl From<SerializationError> for Error {
|
||||
fn from(error: SerializationError) -> Error {
|
||||
Error::InternalError(InternalError::Serialization(error))
|
||||
@ -219,6 +205,7 @@ impl StdError for InternalError {}
|
||||
impl fmt::Display for UserError {
|
||||
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
|
||||
match self {
|
||||
Self::InvalidFilter(error) => f.write_str(error),
|
||||
Self::AttributeLimitReached => f.write_str("A document cannot contain more than 65,535 fields."),
|
||||
Self::CriterionError(error) => write!(f, "{}", error),
|
||||
Self::DocumentLimitReached => f.write_str("Maximum number of documents reached."),
|
||||
@@ -231,7 +218,6 @@ impl fmt::Display for UserError {
|
||||
name_list
|
||||
)
|
||||
}
|
||||
Self::InvalidFilter(error) => error.fmt(f),
|
||||
Self::InvalidGeoField { document_id, object } => {
|
||||
let document_id = match document_id {
|
||||
Value::String(id) => id.clone(),
|
||||
@@ -293,40 +279,6 @@ ranking rules settings to use the sort parameter at search time.",
|
||||
}
|
||||
}
|
||||
|
||||
impl fmt::Display for FilterError {
|
||||
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
|
||||
match self {
|
||||
Self::InvalidAttribute { field, valid_fields } => write!(
|
||||
f,
|
||||
"Attribute `{}` is not filterable. Available filterable attributes are: `{}`.",
|
||||
field,
|
||||
valid_fields
|
||||
.clone()
|
||||
.into_iter()
|
||||
.reduce(|left, right| left + "`, `" + &right)
|
||||
.unwrap_or_default()
|
||||
),
|
||||
Self::ReservedKeyword { field, context: Some(context) } => {
|
||||
write!(
|
||||
f,
|
||||
"`{}` is a reserved keyword and thus can't be used as a filter expression. {}",
|
||||
field, context
|
||||
)
|
||||
}
|
||||
Self::ReservedKeyword { field, context: None } => {
|
||||
write!(
|
||||
f,
|
||||
"`{}` is a reserved keyword and thus can't be used as a filter expression.",
|
||||
field
|
||||
)
|
||||
}
|
||||
Self::Syntax(syntax_helper) => {
|
||||
write!(f, "Invalid syntax for the filter parameter: `{}`.", syntax_helper)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl StdError for UserError {}
|
||||
|
||||
impl fmt::Display for FieldIdMapMissingEntry {
|
||||
|
@@ -1,6 +1,3 @@
|
||||
#[macro_use]
|
||||
extern crate pest_derive;
|
||||
|
||||
#[macro_use]
|
||||
pub mod documents;
|
||||
|
||||
@@ -20,6 +17,7 @@ use std::collections::{BTreeMap, HashMap};
|
||||
use std::convert::{TryFrom, TryInto};
|
||||
use std::hash::BuildHasherDefault;
|
||||
|
||||
pub use filter_parser::{Condition, FilterCondition};
|
||||
use fxhash::{FxHasher32, FxHasher64};
|
||||
pub use grenad::CompressionType;
|
||||
use serde_json::{Map, Value};
|
||||
@@ -37,7 +35,7 @@ pub use self::heed_codec::{
|
||||
RoaringBitmapLenCodec, StrBEU32Codec, StrStrU8Codec,
|
||||
};
|
||||
pub use self::index::Index;
|
||||
pub use self::search::{FacetDistribution, FilterCondition, MatchingWords, Search, SearchResult};
|
||||
pub use self::search::{FacetDistribution, Filter, MatchingWords, Search, SearchResult};
|
||||
|
||||
pub type Result<T> = std::result::Result<T, error::Error>;
|
||||
|
||||
|
589 milli/src/search/facet/filter.rs (new file)
@@ -0,0 +1,589 @@
|
||||
use std::fmt::{Debug, Display};
|
||||
use std::ops::Bound::{self, Excluded, Included};
|
||||
use std::ops::Deref;
|
||||
|
||||
use either::Either;
|
||||
pub use filter_parser::{Condition, Error as FPError, FilterCondition, Span, Token};
|
||||
use heed::types::DecodeIgnore;
|
||||
use log::debug;
|
||||
use roaring::RoaringBitmap;
|
||||
|
||||
use super::FacetNumberRange;
|
||||
use crate::error::{Error, UserError};
|
||||
use crate::heed_codec::facet::{
|
||||
FacetLevelValueF64Codec, FacetStringLevelZeroCodec, FacetStringLevelZeroValueCodec,
|
||||
};
|
||||
use crate::{distance_between_two_points, CboRoaringBitmapCodec, FieldId, Index, Result};
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub struct Filter<'a> {
|
||||
condition: FilterCondition<'a>,
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
enum FilterError<'a> {
|
||||
AttributeNotFilterable { attribute: &'a str, filterable: String },
|
||||
BadGeo(&'a str),
|
||||
BadGeoLat(f64),
|
||||
BadGeoLng(f64),
|
||||
Reserved(&'a str),
|
||||
InternalError,
|
||||
}
|
||||
impl<'a> std::error::Error for FilterError<'a> {}
|
||||
|
||||
impl<'a> Display for FilterError<'a> {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
Self::AttributeNotFilterable { attribute, filterable } => write!(
|
||||
f,
|
||||
"Attribute `{}` is not filterable. Available filterable attributes are: `{}`.",
|
||||
attribute,
|
||||
filterable,
|
||||
),
|
||||
Self::Reserved(keyword) => write!(
|
||||
f,
|
||||
"`{}` is a reserved keyword and thus can't be used as a filter expression.",
|
||||
keyword
|
||||
),
|
||||
Self::BadGeo(keyword) => write!(f, "`{}` is a reserved keyword and thus can't be used as a filter expression. Use the _geoRadius(latitude, longitude, distance) built-in rule to filter on _geo field coordinates.", keyword),
|
||||
Self::BadGeoLat(lat) => write!(f, "Bad latitude `{}`. Latitude must be contained between -90 and 90 degrees. ", lat),
|
||||
Self::BadGeoLng(lng) => write!(f, "Bad longitude `{}`. Longitude must be contained between -180 and 180 degrees. ", lng),
|
||||
Self::InternalError => write!(f, "Internal error while executing this filter."),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> From<FPError<'a>> for Error {
|
||||
fn from(error: FPError<'a>) -> Self {
|
||||
Self::UserError(UserError::InvalidFilter(error.to_string()))
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> From<Filter<'a>> for FilterCondition<'a> {
|
||||
fn from(f: Filter<'a>) -> Self {
|
||||
f.condition
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Filter<'a> {
|
||||
pub fn from_array<I, J>(array: I) -> Result<Option<Self>>
|
||||
where
|
||||
I: IntoIterator<Item = Either<J, &'a str>>,
|
||||
J: IntoIterator<Item = &'a str>,
|
||||
{
|
||||
let mut ands: Option<FilterCondition> = None;
|
||||
|
||||
for either in array {
|
||||
match either {
|
||||
Either::Left(array) => {
|
||||
let mut ors = None;
|
||||
for rule in array {
|
||||
let condition = Self::from_str(rule.as_ref())?.condition;
|
||||
ors = match ors.take() {
|
||||
Some(ors) => {
|
||||
Some(FilterCondition::Or(Box::new(ors), Box::new(condition)))
|
||||
}
|
||||
None => Some(condition),
|
||||
};
|
||||
}
|
||||
|
||||
if let Some(rule) = ors {
|
||||
ands = match ands.take() {
|
||||
Some(ands) => {
|
||||
Some(FilterCondition::And(Box::new(ands), Box::new(rule)))
|
||||
}
|
||||
None => Some(rule),
|
||||
};
|
||||
}
|
||||
}
|
||||
Either::Right(rule) => {
|
||||
let condition = Self::from_str(rule.as_ref())?.condition;
|
||||
ands = match ands.take() {
|
||||
Some(ands) => {
|
||||
Some(FilterCondition::And(Box::new(ands), Box::new(condition)))
|
||||
}
|
||||
None => Some(condition),
|
||||
};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Ok(ands.map(|ands| Self { condition: ands }))
|
||||
}
|
||||
|
||||
pub fn from_str(expression: &'a str) -> Result<Self> {
|
||||
let condition = match FilterCondition::parse(expression) {
|
||||
Ok(fc) => Ok(fc),
|
||||
Err(e) => Err(Error::UserError(UserError::InvalidFilter(e.to_string()))),
|
||||
}?;
|
||||
Ok(Self { condition })
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> Filter<'a> {
|
||||
/// Aggregates the document ids that are part of the specified range, automatically
/// going deeper through the levels.
|
||||
fn explore_facet_number_levels(
|
||||
rtxn: &heed::RoTxn,
|
||||
db: heed::Database<FacetLevelValueF64Codec, CboRoaringBitmapCodec>,
|
||||
field_id: FieldId,
|
||||
level: u8,
|
||||
left: Bound<f64>,
|
||||
right: Bound<f64>,
|
||||
output: &mut RoaringBitmap,
|
||||
) -> Result<()> {
|
||||
match (left, right) {
|
||||
// If the request is an exact value we must go directly to the deepest level.
|
||||
(Included(l), Included(r)) if l == r && level > 0 => {
|
||||
return Self::explore_facet_number_levels(
|
||||
rtxn, db, field_id, 0, left, right, output,
|
||||
);
|
||||
}
|
||||
// A `lower TO upper` range where lower > upper must return no result
|
||||
(Included(l), Included(r)) if l > r => return Ok(()),
|
||||
(Included(l), Excluded(r)) if l >= r => return Ok(()),
|
||||
(Excluded(l), Excluded(r)) if l >= r => return Ok(()),
|
||||
(Excluded(l), Included(r)) if l >= r => return Ok(()),
|
||||
(_, _) => (),
|
||||
}
|
||||
|
||||
let mut left_found = None;
|
||||
let mut right_found = None;
|
||||
|
||||
// We must create a custom iterator to be able to iterate over the
|
||||
// requested range as the range iterator cannot express some conditions.
|
||||
let iter = FacetNumberRange::new(rtxn, db, field_id, level, left, right)?;
|
||||
|
||||
debug!("Iterating between {:?} and {:?} (level {})", left, right, level);
|
||||
|
||||
for (i, result) in iter.enumerate() {
|
||||
let ((_fid, level, l, r), docids) = result?;
|
||||
debug!("{:?} to {:?} (level {}) found {} documents", l, r, level, docids.len());
|
||||
*output |= docids;
|
||||
// We save the leftmost and rightmost bounds we actually found at this level.
|
||||
if i == 0 {
|
||||
left_found = Some(l);
|
||||
}
|
||||
right_found = Some(r);
|
||||
}
|
||||
|
||||
// Can we go deeper?
|
||||
let deeper_level = match level.checked_sub(1) {
|
||||
Some(level) => level,
|
||||
None => return Ok(()),
|
||||
};
|
||||
|
||||
// We must refine the left and right bounds of this range by retrieving the
|
||||
// missing part in a deeper level.
|
||||
match left_found.zip(right_found) {
|
||||
Some((left_found, right_found)) => {
|
||||
// If the bound is satisfied we avoid calling this function again.
|
||||
if !matches!(left, Included(l) if l == left_found) {
|
||||
let sub_right = Excluded(left_found);
|
||||
debug!(
|
||||
"calling left with {:?} to {:?} (level {})",
|
||||
left, sub_right, deeper_level
|
||||
);
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
db,
|
||||
field_id,
|
||||
deeper_level,
|
||||
left,
|
||||
sub_right,
|
||||
output,
|
||||
)?;
|
||||
}
|
||||
if !matches!(right, Included(r) if r == right_found) {
|
||||
let sub_left = Excluded(right_found);
|
||||
debug!(
|
||||
"calling right with {:?} to {:?} (level {})",
|
||||
sub_left, right, deeper_level
|
||||
);
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
db,
|
||||
field_id,
|
||||
deeper_level,
|
||||
sub_left,
|
||||
right,
|
||||
output,
|
||||
)?;
|
||||
}
|
||||
}
|
||||
None => {
|
||||
// If we found nothing at this level it means that we must find
|
||||
// the same bounds but at a deeper, more precise level.
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
db,
|
||||
field_id,
|
||||
deeper_level,
|
||||
left,
|
||||
right,
|
||||
output,
|
||||
)?;
|
||||
}
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn evaluate_operator(
|
||||
rtxn: &heed::RoTxn,
|
||||
index: &Index,
|
||||
numbers_db: heed::Database<FacetLevelValueF64Codec, CboRoaringBitmapCodec>,
|
||||
strings_db: heed::Database<FacetStringLevelZeroCodec, FacetStringLevelZeroValueCodec>,
|
||||
field_id: FieldId,
|
||||
operator: &Condition<'a>,
|
||||
) -> Result<RoaringBitmap> {
|
||||
// Make sure we always bound the ranges with the field id and the level,
// as the facet values are all in the same database and prefixed by the
// field id and the level.
|
||||
|
||||
let (left, right) = match operator {
|
||||
Condition::GreaterThan(val) => (Excluded(val.parse()?), Included(f64::MAX)),
|
||||
Condition::GreaterThanOrEqual(val) => (Included(val.parse()?), Included(f64::MAX)),
|
||||
Condition::LowerThan(val) => (Included(f64::MIN), Excluded(val.parse()?)),
|
||||
Condition::LowerThanOrEqual(val) => (Included(f64::MIN), Included(val.parse()?)),
|
||||
Condition::Between { from, to } => (Included(from.parse()?), Included(to.parse()?)),
|
||||
Condition::Equal(val) => {
|
||||
let (_original_value, string_docids) =
|
||||
strings_db.get(rtxn, &(field_id, &val.to_lowercase()))?.unwrap_or_default();
|
||||
let number = val.parse::<f64>().ok();
|
||||
let number_docids = match number {
|
||||
Some(n) => {
|
||||
let n = Included(n);
|
||||
let mut output = RoaringBitmap::new();
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
numbers_db,
|
||||
field_id,
|
||||
0,
|
||||
n,
|
||||
n,
|
||||
&mut output,
|
||||
)?;
|
||||
output
|
||||
}
|
||||
None => RoaringBitmap::new(),
|
||||
};
|
||||
return Ok(string_docids | number_docids);
|
||||
}
|
||||
Condition::NotEqual(val) => {
|
||||
let number = val.parse::<f64>().ok();
|
||||
let all_numbers_ids = if number.is_some() {
|
||||
index.number_faceted_documents_ids(rtxn, field_id)?
|
||||
} else {
|
||||
RoaringBitmap::new()
|
||||
};
|
||||
let all_strings_ids = index.string_faceted_documents_ids(rtxn, field_id)?;
|
||||
let operator = Condition::Equal(val.clone());
|
||||
let docids = Self::evaluate_operator(
|
||||
rtxn, index, numbers_db, strings_db, field_id, &operator,
|
||||
)?;
|
||||
return Ok((all_numbers_ids | all_strings_ids) - docids);
|
||||
}
|
||||
};
|
||||
|
||||
// Ask for the biggest value that can exist for this specific field, if it exists.
// If it doesn't, the value just before will be returned instead.
|
||||
let biggest_level = numbers_db
|
||||
.remap_data_type::<DecodeIgnore>()
|
||||
.get_lower_than_or_equal_to(rtxn, &(field_id, u8::MAX, f64::MAX, f64::MAX))?
|
||||
.and_then(|((id, level, _, _), _)| if id == field_id { Some(level) } else { None });
|
||||
|
||||
match biggest_level {
|
||||
Some(level) => {
|
||||
let mut output = RoaringBitmap::new();
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
numbers_db,
|
||||
field_id,
|
||||
level,
|
||||
left,
|
||||
right,
|
||||
&mut output,
|
||||
)?;
|
||||
Ok(output)
|
||||
}
|
||||
None => Ok(RoaringBitmap::new()),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn evaluate(&self, rtxn: &heed::RoTxn, index: &Index) -> Result<RoaringBitmap> {
|
||||
let numbers_db = index.facet_id_f64_docids;
|
||||
let strings_db = index.facet_id_string_docids;
|
||||
|
||||
match &self.condition {
|
||||
FilterCondition::Condition { fid, op } => {
|
||||
let filterable_fields = index.filterable_fields(rtxn)?;
|
||||
if filterable_fields.contains(&fid.to_lowercase()) {
|
||||
let field_ids_map = index.fields_ids_map(rtxn)?;
|
||||
if let Some(fid) = field_ids_map.id(&fid) {
|
||||
Self::evaluate_operator(rtxn, index, numbers_db, strings_db, fid, &op)
|
||||
} else {
|
||||
return Err(fid.as_external_error(FilterError::InternalError))?;
|
||||
}
|
||||
} else {
|
||||
match *fid.deref() {
|
||||
attribute @ "_geo" => {
|
||||
return Err(fid.as_external_error(FilterError::BadGeo(attribute)))?;
|
||||
}
|
||||
attribute if attribute.starts_with("_geoPoint(") => {
|
||||
return Err(fid.as_external_error(FilterError::BadGeo("_geoPoint")))?;
|
||||
}
|
||||
attribute @ "_geoDistance" => {
|
||||
return Err(fid.as_external_error(FilterError::Reserved(attribute)))?;
|
||||
}
|
||||
attribute => {
|
||||
return Err(fid.as_external_error(
|
||||
FilterError::AttributeNotFilterable {
|
||||
attribute,
|
||||
filterable: filterable_fields
|
||||
.into_iter()
|
||||
.collect::<Vec<_>>()
|
||||
.join(" "),
|
||||
},
|
||||
))?;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
FilterCondition::Or(lhs, rhs) => {
|
||||
let lhs = Self::evaluate(&(lhs.as_ref().clone()).into(), rtxn, index)?;
|
||||
let rhs = Self::evaluate(&(rhs.as_ref().clone()).into(), rtxn, index)?;
|
||||
Ok(lhs | rhs)
|
||||
}
|
||||
FilterCondition::And(lhs, rhs) => {
|
||||
let lhs = Self::evaluate(&(lhs.as_ref().clone()).into(), rtxn, index)?;
|
||||
let rhs = Self::evaluate(&(rhs.as_ref().clone()).into(), rtxn, index)?;
|
||||
Ok(lhs & rhs)
|
||||
}
|
||||
FilterCondition::Empty => Ok(RoaringBitmap::new()),
|
||||
FilterCondition::GeoLowerThan { point, radius } => {
|
||||
let filterable_fields = index.filterable_fields(rtxn)?;
|
||||
if filterable_fields.contains("_geo") {
|
||||
let base_point: [f64; 2] = [point[0].parse()?, point[1].parse()?];
|
||||
if !(-90.0..=90.0).contains(&base_point[0]) {
|
||||
return Err(
|
||||
point[0].as_external_error(FilterError::BadGeoLat(base_point[0]))
|
||||
)?;
|
||||
}
|
||||
if !(-180.0..=180.0).contains(&base_point[1]) {
|
||||
return Err(
|
||||
point[1].as_external_error(FilterError::BadGeoLng(base_point[1]))
|
||||
)?;
|
||||
}
|
||||
let radius = radius.parse()?;
|
||||
let rtree = match index.geo_rtree(rtxn)? {
|
||||
Some(rtree) => rtree,
|
||||
None => return Ok(RoaringBitmap::new()),
|
||||
};
|
||||
|
||||
let result = rtree
|
||||
.nearest_neighbor_iter(&base_point)
|
||||
.take_while(|point| {
|
||||
distance_between_two_points(&base_point, point.geom()) < radius
|
||||
})
|
||||
.map(|point| point.data)
|
||||
.collect();
|
||||
|
||||
Ok(result)
|
||||
} else {
|
||||
return Err(point[0].as_external_error(FilterError::AttributeNotFilterable {
|
||||
attribute: "_geo",
|
||||
filterable: filterable_fields.into_iter().collect::<Vec<_>>().join(" "),
|
||||
}))?;
|
||||
}
|
||||
}
|
||||
FilterCondition::GeoGreaterThan { point, radius } => {
|
||||
let result = Self::evaluate(
|
||||
&FilterCondition::GeoLowerThan { point: point.clone(), radius: radius.clone() }
|
||||
.into(),
|
||||
rtxn,
|
||||
index,
|
||||
)?;
|
||||
let geo_faceted_doc_ids = index.geo_faceted_documents_ids(rtxn)?;
|
||||
Ok(geo_faceted_doc_ids - result)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl<'a> From<FilterCondition<'a>> for Filter<'a> {
|
||||
fn from(fc: FilterCondition<'a>) -> Self {
|
||||
Self { condition: fc }
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use big_s::S;
|
||||
use either::Either;
|
||||
use heed::EnvOpenOptions;
|
||||
use maplit::hashset;
|
||||
|
||||
use super::*;
|
||||
use crate::update::Settings;
|
||||
use crate::Index;
|
||||
|
||||
#[test]
|
||||
fn from_array() {
|
||||
// Simple array with Left
|
||||
let condition = Filter::from_array(vec![Either::Left(["channel = mv"])]).unwrap().unwrap();
|
||||
let expected = Filter::from_str("channel = mv").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Simple array with Right
|
||||
let condition = Filter::from_array::<_, Option<&str>>(vec![Either::Right("channel = mv")])
|
||||
.unwrap()
|
||||
.unwrap();
|
||||
let expected = Filter::from_str("channel = mv").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Array with Left and escaped quote
|
||||
let condition =
|
||||
Filter::from_array(vec![Either::Left(["channel = \"Mister Mv\""])]).unwrap().unwrap();
|
||||
let expected = Filter::from_str("channel = \"Mister Mv\"").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Array with Right and escaped quote
|
||||
let condition =
|
||||
Filter::from_array::<_, Option<&str>>(vec![Either::Right("channel = \"Mister Mv\"")])
|
||||
.unwrap()
|
||||
.unwrap();
|
||||
let expected = Filter::from_str("channel = \"Mister Mv\"").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Array with Left and escaped single quote
|
||||
let condition =
|
||||
Filter::from_array(vec![Either::Left(["channel = 'Mister Mv'"])]).unwrap().unwrap();
|
||||
let expected = Filter::from_str("channel = 'Mister Mv'").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Array with Right and escaped single quote
|
||||
let condition =
|
||||
Filter::from_array::<_, Option<&str>>(vec![Either::Right("channel = 'Mister Mv'")])
|
||||
.unwrap()
|
||||
.unwrap();
|
||||
let expected = Filter::from_str("channel = 'Mister Mv'").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Simple with parentheses
|
||||
let condition =
|
||||
Filter::from_array(vec![Either::Left(["(channel = mv)"])]).unwrap().unwrap();
|
||||
let expected = Filter::from_str("(channel = mv)").unwrap();
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// Test that the facet condition is correctly generated.
|
||||
let condition = Filter::from_array(vec![
|
||||
Either::Right("channel = gotaga"),
|
||||
Either::Left(vec!["timestamp = 44", "channel != ponce"]),
|
||||
])
|
||||
.unwrap()
|
||||
.unwrap();
|
||||
let expected =
|
||||
Filter::from_str("channel = gotaga AND (timestamp = 44 OR channel != ponce)").unwrap();
|
||||
println!("\nExpecting: {:#?}\nGot: {:#?}\n", expected, condition);
|
||||
assert_eq!(condition, expected);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn not_filterable() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
let filter = Filter::from_str("_geoRadius(42, 150, 10)").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(error.to_string().starts_with(
|
||||
"Attribute `_geo` is not filterable. Available filterable attributes are: ``."
|
||||
));
|
||||
|
||||
let filter = Filter::from_str("dog = \"bernese mountain\"").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(error.to_string().starts_with(
|
||||
"Attribute `dog` is not filterable. Available filterable attributes are: ``."
|
||||
));
|
||||
drop(rtxn);
|
||||
|
||||
// Set the filterable fields to be the title.
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_searchable_fields(vec![S("title")]);
|
||||
builder.set_filterable_fields(hashset! { S("title") });
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
|
||||
let filter = Filter::from_str("_geoRadius(-100, 150, 10)").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(error.to_string().starts_with(
|
||||
"Attribute `_geo` is not filterable. Available filterable attributes are: `title`."
|
||||
));
|
||||
|
||||
let filter = Filter::from_str("name = 12").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(error.to_string().starts_with(
|
||||
"Attribute `name` is not filterable. Available filterable attributes are: `title`."
|
||||
));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn geo_radius_error() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
|
||||
// Set the filterable fields to be `_geo` and `price`.
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_searchable_fields(vec![S("_geo"), S("price")]); // to keep the fields order
|
||||
builder.set_filterable_fields(hashset! { S("_geo"), S("price") });
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
|
||||
// georadius has a bad latitude
|
||||
let filter = Filter::from_str("_geoRadius(-100, 150, 10)").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(
|
||||
error.to_string().starts_with(
|
||||
"Bad latitude `-100`. Latitude must be contained between -90 and 90 degrees."
|
||||
),
|
||||
"{}",
|
||||
error.to_string()
|
||||
);
|
||||
|
||||
// georadius has a bad latitude
|
||||
let filter = Filter::from_str("_geoRadius(-90.0000001, 150, 10)").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(error.to_string().contains(
|
||||
"Bad latitude `-90.0000001`. Latitude must be contained between -90 and 90 degrees."
|
||||
));
|
||||
|
||||
// georadius has a bad longitude
|
||||
let filter = Filter::from_str("_geoRadius(-10, 250, 10)").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(
|
||||
error.to_string().contains(
|
||||
"Bad longitude `250`. Longitude must be contained between -180 and 180 degrees."
|
||||
),
|
||||
"{}",
|
||||
error.to_string(),
|
||||
);
|
||||
|
||||
// georadius has a bad longitude
|
||||
let filter = Filter::from_str("_geoRadius(-10, 180.000001, 10)").unwrap();
|
||||
let error = filter.evaluate(&rtxn, &index).unwrap_err();
|
||||
assert!(error.to_string().contains(
|
||||
"Bad longitude `180.000001`. Longitude must be contained between -180 and 180 degrees."
|
||||
));
|
||||
}
|
||||
}
|
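For readers skimming the new `milli/src/search/facet/filter.rs` above, here is a small, hedged usage sketch (not part of the diff) of the public API it introduces. The `index` and `rtxn` values are assumed to already exist and to have `channel` and `timestamp` declared as filterable attributes.

```
use either::Either;
use milli::{Filter, Index};

// Illustration only: `index` and `rtxn` are assumed, not provided by this PR.
fn filter_documents(index: &Index, rtxn: &heed::RoTxn) -> milli::Result<()> {
    // Parsing no longer needs the index: it only builds a `FilterCondition` (tokens).
    let filter = Filter::from_str("channel = ponce AND timestamp > 44")?;
    // Filterable fields and field ids are only resolved at evaluation time.
    let docids = filter.evaluate(rtxn, index)?;
    println!("{} documents match", docids.len());

    // The array form ANDs the outer entries and ORs the entries of each inner array.
    if let Some(filter) = Filter::from_array(vec![
        Either::Right("channel = gotaga"),
        Either::Left(vec!["timestamp = 44", "channel != ponce"]),
    ])? {
        let _docids = filter.evaluate(rtxn, index)?;
    }
    Ok(())
}
```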
@@ -1,929 +0,0 @@
|
||||
use std::collections::HashSet;
|
||||
use std::fmt::Debug;
|
||||
use std::ops::Bound::{self, Excluded, Included};
|
||||
use std::result::Result as StdResult;
|
||||
use std::str::FromStr;
|
||||
|
||||
use either::Either;
|
||||
use heed::types::DecodeIgnore;
|
||||
use log::debug;
|
||||
use pest::error::{Error as PestError, ErrorVariant};
|
||||
use pest::iterators::{Pair, Pairs};
|
||||
use pest::Parser;
|
||||
use roaring::RoaringBitmap;
|
||||
|
||||
use self::FilterCondition::*;
|
||||
use self::Operator::*;
|
||||
use super::parser::{FilterParser, Rule, PREC_CLIMBER};
|
||||
use super::FacetNumberRange;
|
||||
use crate::error::FilterError;
|
||||
use crate::heed_codec::facet::{
|
||||
FacetLevelValueF64Codec, FacetStringLevelZeroCodec, FacetStringLevelZeroValueCodec,
|
||||
};
|
||||
use crate::{
|
||||
distance_between_two_points, CboRoaringBitmapCodec, FieldId, FieldsIdsMap, Index, Result,
|
||||
};
|
||||
|
||||
#[derive(Debug, Clone, PartialEq)]
|
||||
pub enum Operator {
|
||||
GreaterThan(f64),
|
||||
GreaterThanOrEqual(f64),
|
||||
Equal(Option<f64>, String),
|
||||
NotEqual(Option<f64>, String),
|
||||
LowerThan(f64),
|
||||
LowerThanOrEqual(f64),
|
||||
Between(f64, f64),
|
||||
GeoLowerThan([f64; 2], f64),
|
||||
GeoGreaterThan([f64; 2], f64),
|
||||
}
|
||||
|
||||
impl Operator {
|
||||
/// This method can return two operations in case it must express
|
||||
/// an OR operation for the between case (i.e. `TO`).
|
||||
fn negate(self) -> (Self, Option<Self>) {
|
||||
match self {
|
||||
GreaterThan(n) => (LowerThanOrEqual(n), None),
|
||||
GreaterThanOrEqual(n) => (LowerThan(n), None),
|
||||
Equal(n, s) => (NotEqual(n, s), None),
|
||||
NotEqual(n, s) => (Equal(n, s), None),
|
||||
LowerThan(n) => (GreaterThanOrEqual(n), None),
|
||||
LowerThanOrEqual(n) => (GreaterThan(n), None),
|
||||
Between(n, m) => (LowerThan(n), Some(GreaterThan(m))),
|
||||
GeoLowerThan(point, distance) => (GeoGreaterThan(point, distance), None),
|
||||
GeoGreaterThan(point, distance) => (GeoLowerThan(point, distance), None),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq)]
|
||||
pub enum FilterCondition {
|
||||
Operator(FieldId, Operator),
|
||||
Or(Box<Self>, Box<Self>),
|
||||
And(Box<Self>, Box<Self>),
|
||||
Empty,
|
||||
}
|
||||
|
||||
impl FilterCondition {
|
||||
pub fn from_array<I, J, A, B>(
|
||||
rtxn: &heed::RoTxn,
|
||||
index: &Index,
|
||||
array: I,
|
||||
) -> Result<Option<FilterCondition>>
|
||||
where
|
||||
I: IntoIterator<Item = Either<J, B>>,
|
||||
J: IntoIterator<Item = A>,
|
||||
A: AsRef<str>,
|
||||
B: AsRef<str>,
|
||||
{
|
||||
let mut ands = None;
|
||||
|
||||
for either in array {
|
||||
match either {
|
||||
Either::Left(array) => {
|
||||
let mut ors = None;
|
||||
for rule in array {
|
||||
let condition = FilterCondition::from_str(rtxn, index, rule.as_ref())?;
|
||||
ors = match ors.take() {
|
||||
Some(ors) => Some(Or(Box::new(ors), Box::new(condition))),
|
||||
None => Some(condition),
|
||||
};
|
||||
}
|
||||
|
||||
if let Some(rule) = ors {
|
||||
ands = match ands.take() {
|
||||
Some(ands) => Some(And(Box::new(ands), Box::new(rule))),
|
||||
None => Some(rule),
|
||||
};
|
||||
}
|
||||
}
|
||||
Either::Right(rule) => {
|
||||
let condition = FilterCondition::from_str(rtxn, index, rule.as_ref())?;
|
||||
ands = match ands.take() {
|
||||
Some(ands) => Some(And(Box::new(ands), Box::new(condition))),
|
||||
None => Some(condition),
|
||||
};
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Ok(ands)
|
||||
}
|
||||
|
||||
pub fn from_str(
|
||||
rtxn: &heed::RoTxn,
|
||||
index: &Index,
|
||||
expression: &str,
|
||||
) -> Result<FilterCondition> {
|
||||
let fields_ids_map = index.fields_ids_map(rtxn)?;
|
||||
let filterable_fields = index.filterable_fields(rtxn)?;
|
||||
let lexed = FilterParser::parse(Rule::prgm, expression).map_err(FilterError::Syntax)?;
|
||||
FilterCondition::from_pairs(&fields_ids_map, &filterable_fields, lexed)
|
||||
}
|
||||
|
||||
fn from_pairs(
|
||||
fim: &FieldsIdsMap,
|
||||
ff: &HashSet<String>,
|
||||
expression: Pairs<Rule>,
|
||||
) -> Result<Self> {
|
||||
PREC_CLIMBER.climb(
|
||||
expression,
|
||||
|pair: Pair<Rule>| match pair.as_rule() {
|
||||
Rule::greater => Ok(Self::greater_than(fim, ff, pair)?),
|
||||
Rule::geq => Ok(Self::greater_than_or_equal(fim, ff, pair)?),
|
||||
Rule::eq => Ok(Self::equal(fim, ff, pair)?),
|
||||
Rule::neq => Ok(Self::equal(fim, ff, pair)?.negate()),
|
||||
Rule::leq => Ok(Self::lower_than_or_equal(fim, ff, pair)?),
|
||||
Rule::less => Ok(Self::lower_than(fim, ff, pair)?),
|
||||
Rule::between => Ok(Self::between(fim, ff, pair)?),
|
||||
Rule::geo_radius => Ok(Self::geo_radius(fim, ff, pair)?),
|
||||
Rule::not => Ok(Self::from_pairs(fim, ff, pair.into_inner())?.negate()),
|
||||
Rule::prgm => Self::from_pairs(fim, ff, pair.into_inner()),
|
||||
Rule::term => Self::from_pairs(fim, ff, pair.into_inner()),
|
||||
_ => unreachable!(),
|
||||
},
|
||||
|lhs: Result<Self>, op: Pair<Rule>, rhs: Result<Self>| match op.as_rule() {
|
||||
Rule::or => Ok(Or(Box::new(lhs?), Box::new(rhs?))),
|
||||
Rule::and => Ok(And(Box::new(lhs?), Box::new(rhs?))),
|
||||
_ => unreachable!(),
|
||||
},
|
||||
)
|
||||
}
|
||||
|
||||
fn negate(self) -> FilterCondition {
|
||||
match self {
|
||||
Operator(fid, op) => match op.negate() {
|
||||
(op, None) => Operator(fid, op),
|
||||
(a, Some(b)) => Or(Box::new(Operator(fid, a)), Box::new(Operator(fid, b))),
|
||||
},
|
||||
Or(a, b) => And(Box::new(a.negate()), Box::new(b.negate())),
|
||||
And(a, b) => Or(Box::new(a.negate()), Box::new(b.negate())),
|
||||
Empty => Empty,
|
||||
}
|
||||
}
|
||||
|
||||
fn geo_radius(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
if !filterable_fields.contains("_geo") {
|
||||
return Err(FilterError::InvalidAttribute {
|
||||
field: "_geo".to_string(),
|
||||
valid_fields: filterable_fields.into_iter().cloned().collect(),
|
||||
}
|
||||
.into());
|
||||
}
|
||||
let mut items = item.into_inner();
|
||||
let fid = match fields_ids_map.id("_geo") {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
let parameters_item = items.next().unwrap();
|
||||
// We don't need more than 3 parameters, but to handle errors correctly we are still going
|
||||
// to extract the first 4 parameters
|
||||
let param_span = parameters_item.as_span();
|
||||
let parameters = parameters_item
|
||||
.into_inner()
|
||||
.take(4)
|
||||
.map(|param| (param.clone(), param.as_span()))
|
||||
.map(|(param, span)| pest_parse(param).0.map(|arg| (arg, span)))
|
||||
.collect::<StdResult<Vec<(f64, _)>, _>>()
|
||||
.map_err(FilterError::Syntax)?;
|
||||
if parameters.len() != 3 {
|
||||
return Err(FilterError::Syntax(PestError::new_from_span(
|
||||
ErrorVariant::CustomError {
|
||||
message: format!("The _geoRadius filter expect three arguments: _geoRadius(latitude, longitude, radius)"),
|
||||
},
|
||||
// we want to point to the last parameters and if there was no parameters we
|
||||
// point to the parenthesis
|
||||
parameters.last().map(|param| param.1.clone()).unwrap_or(param_span),
|
||||
)).into());
|
||||
}
|
||||
let (lat, lng, distance) = (&parameters[0], &parameters[1], parameters[2].0);
|
||||
if !(-90.0..=90.0).contains(&lat.0) {
|
||||
return Err(FilterError::Syntax(PestError::new_from_span(
|
||||
ErrorVariant::CustomError {
|
||||
message: format!("Latitude must be contained between -90 and 90 degrees."),
|
||||
},
|
||||
lat.1.clone(),
|
||||
)))?;
|
||||
} else if !(-180.0..=180.0).contains(&lng.0) {
|
||||
return Err(FilterError::Syntax(PestError::new_from_span(
|
||||
ErrorVariant::CustomError {
|
||||
message: format!("Longitude must be contained between -180 and 180 degrees."),
|
||||
},
|
||||
lng.1.clone(),
|
||||
)))?;
|
||||
}
|
||||
Ok(Operator(fid, GeoLowerThan([lat.0, lng.0], distance)))
|
||||
}
|
||||
|
||||
fn between(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
let mut items = item.into_inner();
|
||||
let fid = match field_id(fields_ids_map, filterable_fields, &mut items)? {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
|
||||
let (lresult, _) = pest_parse(items.next().unwrap());
|
||||
let (rresult, _) = pest_parse(items.next().unwrap());
|
||||
|
||||
let lvalue = lresult.map_err(FilterError::Syntax)?;
|
||||
let rvalue = rresult.map_err(FilterError::Syntax)?;
|
||||
|
||||
Ok(Operator(fid, Between(lvalue, rvalue)))
|
||||
}
|
||||
|
||||
fn equal(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
let mut items = item.into_inner();
|
||||
let fid = match field_id(fields_ids_map, filterable_fields, &mut items)? {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
|
||||
let value = items.next().unwrap();
|
||||
let (result, svalue) = pest_parse(value);
|
||||
|
||||
let svalue = svalue.to_lowercase();
|
||||
Ok(Operator(fid, Equal(result.ok(), svalue)))
|
||||
}
|
||||
|
||||
fn greater_than(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
let mut items = item.into_inner();
|
||||
let fid = match field_id(fields_ids_map, filterable_fields, &mut items)? {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
|
||||
let value = items.next().unwrap();
|
||||
let (result, _svalue) = pest_parse(value);
|
||||
let value = result.map_err(FilterError::Syntax)?;
|
||||
|
||||
Ok(Operator(fid, GreaterThan(value)))
|
||||
}
|
||||
|
||||
fn greater_than_or_equal(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
let mut items = item.into_inner();
|
||||
let fid = match field_id(fields_ids_map, filterable_fields, &mut items)? {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
|
||||
let value = items.next().unwrap();
|
||||
let (result, _svalue) = pest_parse(value);
|
||||
let value = result.map_err(FilterError::Syntax)?;
|
||||
|
||||
Ok(Operator(fid, GreaterThanOrEqual(value)))
|
||||
}
|
||||
|
||||
fn lower_than(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
let mut items = item.into_inner();
|
||||
let fid = match field_id(fields_ids_map, filterable_fields, &mut items)? {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
|
||||
let value = items.next().unwrap();
|
||||
let (result, _svalue) = pest_parse(value);
|
||||
let value = result.map_err(FilterError::Syntax)?;
|
||||
|
||||
Ok(Operator(fid, LowerThan(value)))
|
||||
}
|
||||
|
||||
fn lower_than_or_equal(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
item: Pair<Rule>,
|
||||
) -> Result<FilterCondition> {
|
||||
let mut items = item.into_inner();
|
||||
let fid = match field_id(fields_ids_map, filterable_fields, &mut items)? {
|
||||
Some(fid) => fid,
|
||||
None => return Ok(Empty),
|
||||
};
|
||||
|
||||
let value = items.next().unwrap();
|
||||
let (result, _svalue) = pest_parse(value);
|
||||
let value = result.map_err(FilterError::Syntax)?;
|
||||
|
||||
Ok(Operator(fid, LowerThanOrEqual(value)))
|
||||
}
|
||||
}
|
||||
|
||||
impl FilterCondition {
|
||||
/// Aggregates the documents ids that are part of the specified range automatically
|
||||
/// going deeper through the levels.
|
||||
fn explore_facet_number_levels(
|
||||
rtxn: &heed::RoTxn,
|
||||
db: heed::Database<FacetLevelValueF64Codec, CboRoaringBitmapCodec>,
|
||||
field_id: FieldId,
|
||||
level: u8,
|
||||
left: Bound<f64>,
|
||||
right: Bound<f64>,
|
||||
output: &mut RoaringBitmap,
|
||||
) -> Result<()> {
|
||||
match (left, right) {
|
||||
// If the request is an exact value we must go directly to the deepest level.
|
||||
(Included(l), Included(r)) if l == r && level > 0 => {
|
||||
return Self::explore_facet_number_levels(
|
||||
rtxn, db, field_id, 0, left, right, output,
|
||||
);
|
||||
}
|
||||
// lower TO upper when lower > upper must return no result
|
||||
(Included(l), Included(r)) if l > r => return Ok(()),
|
||||
(Included(l), Excluded(r)) if l >= r => return Ok(()),
|
||||
(Excluded(l), Excluded(r)) if l >= r => return Ok(()),
|
||||
(Excluded(l), Included(r)) if l >= r => return Ok(()),
|
||||
(_, _) => (),
|
||||
}
|
||||
|
||||
let mut left_found = None;
|
||||
let mut right_found = None;
|
||||
|
||||
// We must create a custom iterator to be able to iterate over the
|
||||
// requested range as the range iterator cannot express some conditions.
|
||||
let iter = FacetNumberRange::new(rtxn, db, field_id, level, left, right)?;
|
||||
|
||||
debug!("Iterating between {:?} and {:?} (level {})", left, right, level);
|
||||
|
||||
for (i, result) in iter.enumerate() {
|
||||
let ((_fid, level, l, r), docids) = result?;
|
||||
debug!("{:?} to {:?} (level {}) found {} documents", l, r, level, docids.len());
|
||||
*output |= docids;
|
||||
// We save the leftest and rightest bounds we actually found at this level.
|
||||
if i == 0 {
|
||||
left_found = Some(l);
|
||||
}
|
||||
right_found = Some(r);
|
||||
}
|
||||
|
||||
// Can we go deeper?
|
||||
let deeper_level = match level.checked_sub(1) {
|
||||
Some(level) => level,
|
||||
None => return Ok(()),
|
||||
};
|
||||
|
||||
// We must refine the left and right bounds of this range by retrieving the
|
||||
// missing part in a deeper level.
|
||||
match left_found.zip(right_found) {
|
||||
Some((left_found, right_found)) => {
|
||||
// If the bound is satisfied we avoid calling this function again.
|
||||
if !matches!(left, Included(l) if l == left_found) {
|
||||
let sub_right = Excluded(left_found);
|
||||
debug!(
|
||||
"calling left with {:?} to {:?} (level {})",
|
||||
left, sub_right, deeper_level
|
||||
);
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
db,
|
||||
field_id,
|
||||
deeper_level,
|
||||
left,
|
||||
sub_right,
|
||||
output,
|
||||
)?;
|
||||
}
|
||||
if !matches!(right, Included(r) if r == right_found) {
|
||||
let sub_left = Excluded(right_found);
|
||||
debug!(
|
||||
"calling right with {:?} to {:?} (level {})",
|
||||
sub_left, right, deeper_level
|
||||
);
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
db,
|
||||
field_id,
|
||||
deeper_level,
|
||||
sub_left,
|
||||
right,
|
||||
output,
|
||||
)?;
|
||||
}
|
||||
}
|
||||
None => {
|
||||
// If we found nothing at this level it means that we must find
|
||||
// the same bounds but at a deeper, more precise level.
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
db,
|
||||
field_id,
|
||||
deeper_level,
|
||||
left,
|
||||
right,
|
||||
output,
|
||||
)?;
|
||||
}
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn evaluate_operator(
|
||||
rtxn: &heed::RoTxn,
|
||||
index: &Index,
|
||||
numbers_db: heed::Database<FacetLevelValueF64Codec, CboRoaringBitmapCodec>,
|
||||
strings_db: heed::Database<FacetStringLevelZeroCodec, FacetStringLevelZeroValueCodec>,
|
||||
field_id: FieldId,
|
||||
operator: &Operator,
|
||||
) -> Result<RoaringBitmap> {
|
||||
// Make sure we always bound the ranges with the field id and the level,
|
||||
// as the facets values are all in the same database and prefixed by the
|
||||
// field id and the level.
|
||||
let (left, right) = match operator {
|
||||
GreaterThan(val) => (Excluded(*val), Included(f64::MAX)),
|
||||
GreaterThanOrEqual(val) => (Included(*val), Included(f64::MAX)),
|
||||
Equal(number, string) => {
|
||||
let (_original_value, string_docids) =
|
||||
strings_db.get(rtxn, &(field_id, &string))?.unwrap_or_default();
|
||||
let number_docids = match number {
|
||||
Some(n) => {
|
||||
let n = Included(*n);
|
||||
let mut output = RoaringBitmap::new();
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
numbers_db,
|
||||
field_id,
|
||||
0,
|
||||
n,
|
||||
n,
|
||||
&mut output,
|
||||
)?;
|
||||
output
|
||||
}
|
||||
None => RoaringBitmap::new(),
|
||||
};
|
||||
return Ok(string_docids | number_docids);
|
||||
}
|
||||
NotEqual(number, string) => {
|
||||
let all_numbers_ids = if number.is_some() {
|
||||
index.number_faceted_documents_ids(rtxn, field_id)?
|
||||
} else {
|
||||
RoaringBitmap::new()
|
||||
};
|
||||
let all_strings_ids = index.string_faceted_documents_ids(rtxn, field_id)?;
|
||||
let operator = Equal(*number, string.clone());
|
||||
let docids = Self::evaluate_operator(
|
||||
rtxn, index, numbers_db, strings_db, field_id, &operator,
|
||||
)?;
|
||||
return Ok((all_numbers_ids | all_strings_ids) - docids);
|
||||
}
|
||||
LowerThan(val) => (Included(f64::MIN), Excluded(*val)),
|
||||
LowerThanOrEqual(val) => (Included(f64::MIN), Included(*val)),
|
||||
Between(left, right) => (Included(*left), Included(*right)),
|
||||
GeoLowerThan(base_point, distance) => {
|
||||
let rtree = match index.geo_rtree(rtxn)? {
|
||||
Some(rtree) => rtree,
|
||||
None => return Ok(RoaringBitmap::new()),
|
||||
};
|
||||
|
||||
let result = rtree
|
||||
.nearest_neighbor_iter(base_point)
|
||||
.take_while(|point| {
|
||||
distance_between_two_points(base_point, point.geom()) < *distance
|
||||
})
|
||||
.map(|point| point.data)
|
||||
.collect();
|
||||
|
||||
return Ok(result);
|
||||
}
|
||||
GeoGreaterThan(point, distance) => {
|
||||
let result = Self::evaluate_operator(
|
||||
rtxn,
|
||||
index,
|
||||
numbers_db,
|
||||
strings_db,
|
||||
field_id,
|
||||
&GeoLowerThan(point.clone(), *distance),
|
||||
)?;
|
||||
let geo_faceted_doc_ids = index.geo_faceted_documents_ids(rtxn)?;
|
||||
return Ok(geo_faceted_doc_ids - result);
|
||||
}
|
||||
};
|
||||
|
||||
// Ask for the biggest value that can exist for this specific field, if it exists
|
||||
// that's fine if it don't, the value just before will be returned instead.
|
||||
let biggest_level = numbers_db
|
||||
.remap_data_type::<DecodeIgnore>()
|
||||
.get_lower_than_or_equal_to(rtxn, &(field_id, u8::MAX, f64::MAX, f64::MAX))?
|
||||
.and_then(|((id, level, _, _), _)| if id == field_id { Some(level) } else { None });
|
||||
|
||||
match biggest_level {
|
||||
Some(level) => {
|
||||
let mut output = RoaringBitmap::new();
|
||||
Self::explore_facet_number_levels(
|
||||
rtxn,
|
||||
numbers_db,
|
||||
field_id,
|
||||
level,
|
||||
left,
|
||||
right,
|
||||
&mut output,
|
||||
)?;
|
||||
Ok(output)
|
||||
}
|
||||
None => Ok(RoaringBitmap::new()),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn evaluate(&self, rtxn: &heed::RoTxn, index: &Index) -> Result<RoaringBitmap> {
|
||||
let numbers_db = index.facet_id_f64_docids;
|
||||
let strings_db = index.facet_id_string_docids;
|
||||
|
||||
match self {
|
||||
Operator(fid, op) => {
|
||||
Self::evaluate_operator(rtxn, index, numbers_db, strings_db, *fid, op)
|
||||
}
|
||||
Or(lhs, rhs) => {
|
||||
let lhs = lhs.evaluate(rtxn, index)?;
|
||||
let rhs = rhs.evaluate(rtxn, index)?;
|
||||
Ok(lhs | rhs)
|
||||
}
|
||||
And(lhs, rhs) => {
|
||||
let lhs = lhs.evaluate(rtxn, index)?;
|
||||
let rhs = rhs.evaluate(rtxn, index)?;
|
||||
Ok(lhs & rhs)
|
||||
}
|
||||
Empty => Ok(RoaringBitmap::new()),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Retrieve the field id base on the pest value.
|
||||
///
|
||||
/// Returns an error if the given value is not filterable.
|
||||
///
|
||||
/// Returns Ok(None) if the given value is filterable, but is not yet ascociated to a field_id.
|
||||
///
|
||||
/// The pest pair is simply a string associated with a span, a location to highlight in
|
||||
/// the error message.
|
||||
fn field_id(
|
||||
fields_ids_map: &FieldsIdsMap,
|
||||
filterable_fields: &HashSet<String>,
|
||||
items: &mut Pairs<Rule>,
|
||||
) -> StdResult<Option<FieldId>, FilterError> {
|
||||
// lexing ensures that we at least have a key
|
||||
let key = items.next().unwrap();
|
||||
if key.as_rule() == Rule::reserved {
|
||||
return match key.as_str() {
|
||||
key if key.starts_with("_geoPoint") => {
|
||||
Err(FilterError::ReservedKeyword { field: "_geoPoint".to_string(), context: Some("Use the _geoRadius(latitude, longitude, distance) built-in rule to filter on _geo field coordinates.".to_string()) })
|
||||
}
|
||||
"_geo" => {
|
||||
Err(FilterError::ReservedKeyword { field: "_geo".to_string(), context: Some("Use the _geoRadius(latitude, longitude, distance) built-in rule to filter on _geo field coordinates.".to_string()) })
|
||||
}
|
||||
key =>
|
||||
Err(FilterError::ReservedKeyword { field: key.to_string(), context: None }),
|
||||
};
|
||||
}
|
||||
|
||||
if !filterable_fields.contains(key.as_str()) {
|
||||
return Err(FilterError::InvalidAttribute {
|
||||
field: key.as_str().to_string(),
|
||||
valid_fields: filterable_fields.into_iter().cloned().collect(),
|
||||
});
|
||||
}
|
||||
|
||||
Ok(fields_ids_map.id(key.as_str()))
|
||||
}
|
||||
|
||||
/// Tries to parse the pest pair into the type `T` specified, always returns
|
||||
/// the original string that we tried to parse.
|
||||
///
|
||||
/// Returns the parsing error associated with the span if the conversion fails.
|
||||
fn pest_parse<T>(pair: Pair<Rule>) -> (StdResult<T, pest::error::Error<Rule>>, String)
|
||||
where
|
||||
T: FromStr,
|
||||
T::Err: ToString,
|
||||
{
|
||||
let result = match pair.as_str().parse::<T>() {
|
||||
Ok(value) => Ok(value),
|
||||
Err(e) => Err(PestError::<Rule>::new_from_span(
|
||||
ErrorVariant::CustomError { message: e.to_string() },
|
||||
pair.as_span(),
|
||||
)),
|
||||
};
|
||||
|
||||
(result, pair.as_str().to_string())
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use big_s::S;
|
||||
use heed::EnvOpenOptions;
|
||||
use maplit::hashset;
|
||||
|
||||
use super::*;
|
||||
use crate::update::Settings;
|
||||
|
||||
#[test]
|
||||
fn string() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
|
||||
// Set the filterable fields to be the channel.
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut map = index.fields_ids_map(&wtxn).unwrap();
|
||||
map.insert("channel");
|
||||
index.put_fields_ids_map(&mut wtxn, &map).unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_filterable_fields(hashset! { S("channel") });
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
// Test that the facet condition is correctly generated.
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
let condition = FilterCondition::from_str(&rtxn, &index, "channel = Ponce").unwrap();
|
||||
let expected = Operator(0, Operator::Equal(None, S("ponce")));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
let condition = FilterCondition::from_str(&rtxn, &index, "channel != ponce").unwrap();
|
||||
let expected = Operator(0, Operator::NotEqual(None, S("ponce")));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
let condition = FilterCondition::from_str(&rtxn, &index, "NOT channel = ponce").unwrap();
|
||||
let expected = Operator(0, Operator::NotEqual(None, S("ponce")));
|
||||
assert_eq!(condition, expected);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn number() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
|
||||
// Set the filterable fields to be the channel.
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut map = index.fields_ids_map(&wtxn).unwrap();
|
||||
map.insert("timestamp");
|
||||
index.put_fields_ids_map(&mut wtxn, &map).unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_filterable_fields(hashset! { "timestamp".into() });
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
// Test that the facet condition is correctly generated.
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
let condition = FilterCondition::from_str(&rtxn, &index, "timestamp 22 TO 44").unwrap();
|
||||
let expected = Operator(0, Between(22.0, 44.0));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
let condition = FilterCondition::from_str(&rtxn, &index, "NOT timestamp 22 TO 44").unwrap();
|
||||
let expected =
|
||||
Or(Box::new(Operator(0, LowerThan(22.0))), Box::new(Operator(0, GreaterThan(44.0))));
|
||||
assert_eq!(condition, expected);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn parentheses() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
|
||||
// Set the filterable fields to be the channel.
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_searchable_fields(vec![S("channel"), S("timestamp")]); // to keep the fields order
|
||||
builder.set_filterable_fields(hashset! { S("channel"), S("timestamp") });
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
// Test that the facet condition is correctly generated.
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
let condition = FilterCondition::from_str(
|
||||
&rtxn,
|
||||
&index,
|
||||
"channel = gotaga OR (timestamp 22 TO 44 AND channel != ponce)",
|
||||
)
|
||||
.unwrap();
|
||||
let expected = Or(
|
||||
Box::new(Operator(0, Operator::Equal(None, S("gotaga")))),
|
||||
Box::new(And(
|
||||
Box::new(Operator(1, Between(22.0, 44.0))),
|
||||
Box::new(Operator(0, Operator::NotEqual(None, S("ponce")))),
|
||||
)),
|
||||
);
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
let condition = FilterCondition::from_str(
|
||||
&rtxn,
|
||||
&index,
|
||||
"channel = gotaga OR NOT (timestamp 22 TO 44 AND channel != ponce)",
|
||||
)
|
||||
.unwrap();
|
||||
let expected = Or(
|
||||
Box::new(Operator(0, Operator::Equal(None, S("gotaga")))),
|
||||
Box::new(Or(
|
||||
Box::new(Or(
|
||||
Box::new(Operator(1, LowerThan(22.0))),
|
||||
Box::new(Operator(1, GreaterThan(44.0))),
|
||||
)),
|
||||
Box::new(Operator(0, Operator::Equal(None, S("ponce")))),
|
||||
)),
|
||||
);
|
||||
assert_eq!(condition, expected);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn reserved_field_names() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
|
||||
assert!(FilterCondition::from_str(&rtxn, &index, "_geo = 12").is_err());
|
||||
|
||||
assert!(FilterCondition::from_str(&rtxn, &index, r#"_geoDistance <= 1000"#).is_err());
|
||||
|
||||
assert!(FilterCondition::from_str(&rtxn, &index, r#"_geoPoint > 5"#).is_err());
|
||||
|
||||
assert!(FilterCondition::from_str(&rtxn, &index, r#"_geoPoint(12, 16) > 5"#).is_err());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn geo_radius() {
|
||||
let path = tempfile::tempdir().unwrap();
|
||||
let mut options = EnvOpenOptions::new();
|
||||
options.map_size(10 * 1024 * 1024); // 10 MB
|
||||
let index = Index::new(options, &path).unwrap();
|
||||
|
||||
// Set the filterable fields to be the channel.
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_searchable_fields(vec![S("_geo"), S("price")]); // to keep the fields order
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
let mut wtxn = index.write_txn().unwrap();
|
||||
let mut builder = Settings::new(&mut wtxn, &index, 0);
|
||||
builder.set_filterable_fields(hashset! { S("_geo"), S("price") });
|
||||
builder.execute(|_, _| ()).unwrap();
|
||||
wtxn.commit().unwrap();
|
||||
|
||||
let rtxn = index.read_txn().unwrap();
|
||||
// basic test
|
||||
let condition =
|
||||
FilterCondition::from_str(&rtxn, &index, "_geoRadius(12, 13.0005, 2000)").unwrap();
|
||||
let expected = Operator(0, GeoLowerThan([12., 13.0005], 2000.));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// basic test with latitude and longitude at the max angle
|
||||
let condition =
|
||||
FilterCondition::from_str(&rtxn, &index, "_geoRadius(90, 180, 2000)").unwrap();
|
||||
let expected = Operator(0, GeoLowerThan([90., 180.], 2000.));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// basic test with latitude and longitude at the min angle
|
||||
let condition =
|
||||
FilterCondition::from_str(&rtxn, &index, "_geoRadius(-90, -180, 2000)").unwrap();
|
||||
let expected = Operator(0, GeoLowerThan([-90., -180.], 2000.));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// test the negation of the GeoLowerThan
|
||||
let condition =
|
||||
FilterCondition::from_str(&rtxn, &index, "NOT _geoRadius(50, 18, 2000.500)").unwrap();
|
||||
let expected = Operator(0, GeoGreaterThan([50., 18.], 2000.500));
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// composition of multiple operations
|
||||
let condition = FilterCondition::from_str(
|
||||
&rtxn,
|
||||
&index,
|
||||
"(NOT _geoRadius(1, 2, 300) AND _geoRadius(1.001, 2.002, 1000.300)) OR price <= 10",
|
||||
)
|
||||
.unwrap();
|
||||
let expected = Or(
|
||||
Box::new(And(
|
||||
Box::new(Operator(0, GeoGreaterThan([1., 2.], 300.))),
|
||||
Box::new(Operator(0, GeoLowerThan([1.001, 2.002], 1000.300))),
|
||||
)),
|
||||
Box::new(Operator(1, LowerThanOrEqual(10.))),
|
||||
);
|
||||
assert_eq!(condition, expected);
|
||||
|
||||
// georadius don't have any parameters
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error.to_string().contains(
|
||||
"The _geoRadius filter expect three arguments: _geoRadius(latitude, longitude, radius)"
|
||||
));
|
||||
|
||||
// georadius don't have any parameters
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius()");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error.to_string().contains(
|
||||
"The _geoRadius filter expect three arguments: _geoRadius(latitude, longitude, radius)"
|
||||
));
|
||||
|
||||
// georadius don't have enough parameters
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius(1, 2)");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error.to_string().contains(
|
||||
"The _geoRadius filter expect three arguments: _geoRadius(latitude, longitude, radius)"
|
||||
));
|
||||
|
||||
// georadius have too many parameters
|
||||
let result =
|
||||
FilterCondition::from_str(&rtxn, &index, "_geoRadius(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error.to_string().contains(
|
||||
"The _geoRadius filter expect three arguments: _geoRadius(latitude, longitude, radius)"
|
||||
));
|
||||
|
||||
// georadius have a bad latitude
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius(-100, 150, 10)");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error
|
||||
.to_string()
|
||||
.contains("Latitude must be contained between -90 and 90 degrees."));
|
||||
|
||||
// georadius have a bad latitude
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius(-90.0000001, 150, 10)");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error
|
||||
.to_string()
|
||||
.contains("Latitude must be contained between -90 and 90 degrees."));
|
||||
|
||||
// georadius have a bad longitude
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius(-10, 250, 10)");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error
|
||||
.to_string()
|
||||
.contains("Longitude must be contained between -180 and 180 degrees."));
|
||||
|
||||
// georadius have a bad longitude
|
||||
let result = FilterCondition::from_str(&rtxn, &index, "_geoRadius(-10, 180.000001, 10)");
|
||||
assert!(result.is_err());
|
||||
let error = result.unwrap_err();
|
||||
assert!(error
|
||||
.to_string()
|
||||
.contains("Longitude must be contained between -180 and 180 degrees."));
|
||||
}
|
||||
|
||||
    #[test]
    fn from_array() {
        let path = tempfile::tempdir().unwrap();
        let mut options = EnvOpenOptions::new();
        options.map_size(10 * 1024 * 1024); // 10 MB
        let index = Index::new(options, &path).unwrap();

        // Set the filterable fields to be channel and timestamp.
        let mut wtxn = index.write_txn().unwrap();
        let mut builder = Settings::new(&mut wtxn, &index, 0);
        builder.set_searchable_fields(vec![S("channel"), S("timestamp")]); // to keep the fields order
        builder.set_filterable_fields(hashset! { S("channel"), S("timestamp") });
        builder.execute(|_, _| ()).unwrap();
        wtxn.commit().unwrap();

        // Test that the facet condition is correctly generated.
        let rtxn = index.read_txn().unwrap();
        let condition = FilterCondition::from_array(
            &rtxn,
            &index,
            vec![
                Either::Right("channel = gotaga"),
                Either::Left(vec!["timestamp = 44", "channel != ponce"]),
            ],
        )
        .unwrap()
        .unwrap();
        let expected = FilterCondition::from_str(
            &rtxn,
            &index,
            "channel = gotaga AND (timestamp = 44 OR channel != ponce)",
        )
        .unwrap();
        assert_eq!(condition, expected);
    }
}

@@ -1,11 +1,9 @@
 pub use self::facet_distribution::FacetDistribution;
 pub use self::facet_number::{FacetNumberIter, FacetNumberRange, FacetNumberRevRange};
 pub use self::facet_string::FacetStringIter;
-pub use self::filter_condition::{FilterCondition, Operator};
-pub(crate) use self::parser::Rule as ParserRule;
+pub use self::filter::Filter;

 mod facet_distribution;
 mod facet_number;
 mod facet_string;
-mod filter_condition;
-mod parser;
+mod filter;

@@ -1,12 +0,0 @@
-use once_cell::sync::Lazy;
-use pest::prec_climber::{Assoc, Operator, PrecClimber};
-
-pub static PREC_CLIMBER: Lazy<PrecClimber<Rule>> = Lazy::new(|| {
-    use Assoc::*;
-    use Rule::*;
-    pest::prec_climber::PrecClimber::new(vec![Operator::new(or, Left), Operator::new(and, Left)])
-});
-
-#[derive(Parser)]
-#[grammar = "search/facet/grammar.pest"]
-pub struct FilterParser;

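The module deleted above was the pest precedence climber: `OR` and `AND` were both declared left-associative, with `OR` listed first and therefore binding loosest. In a nom-based parser the same precedence is usually expressed by the shape of the grammar itself. The snippet below is only a hedged sketch of that idea against nom 7, not the actual `filter_parser` code; `Expr`, `expression`, `and` and `atom` are invented names, and the atom rule is a placeholder for a real comparison:

```rust
use nom::{
    bytes::complete::tag_no_case,
    character::complete::{alphanumeric1, multispace0},
    multi::many0,
    sequence::{delimited, preceded},
    IResult,
};

#[derive(Debug, PartialEq)]
enum Expr<'a> {
    Or(Box<Expr<'a>>, Box<Expr<'a>>),
    And(Box<Expr<'a>>, Box<Expr<'a>>),
    Atom(&'a str),
}

// expression := and ("OR" and)*  -- OR is parsed on top of AND, so it binds loosest.
fn expression(input: &str) -> IResult<&str, Expr> {
    let (input, first) = and(input)?;
    let (input, rest) =
        many0(preceded(delimited(multispace0, tag_no_case("OR"), multispace0), and))(input)?;
    Ok((input, rest.into_iter().fold(first, |acc, e| Expr::Or(Box::new(acc), Box::new(e)))))
}

// and := atom ("AND" atom)*  -- left-associative, like the old `Operator::new(and, Left)`.
fn and(input: &str) -> IResult<&str, Expr> {
    let (input, first) = atom(input)?;
    let (input, rest) =
        many0(preceded(delimited(multispace0, tag_no_case("AND"), multispace0), atom))(input)?;
    Ok((input, rest.into_iter().fold(first, |acc, e| Expr::And(Box::new(acc), Box::new(e)))))
}

// atom := a bare word, standing in for a real comparison such as `price <= 10`.
fn atom(input: &str) -> IResult<&str, Expr> {
    let (input, name) = delimited(multispace0, alphanumeric1, multispace0)(input)?;
    Ok((input, Expr::Atom(name)))
}

fn main() {
    let (_rest, ast) = expression("a AND b OR c").unwrap();
    // AND binds tighter than OR: Or(And(Atom("a"), Atom("b")), Atom("c"))
    println!("{:?}", ast);
}
```

Because `expression` folds the `OR` branches over results of `and`, `a AND b OR c` groups as `(a AND b) OR c`, the same grouping the old `PrecClimber` configuration encoded.
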
@@ -14,8 +14,7 @@ use meilisearch_tokenizer::{Analyzer, AnalyzerConfig};
 use once_cell::sync::Lazy;
 use roaring::bitmap::RoaringBitmap;

-pub(crate) use self::facet::ParserRule;
-pub use self::facet::{FacetDistribution, FacetNumberIter, FilterCondition, Operator};
+pub use self::facet::{FacetDistribution, FacetNumberIter, Filter};
 pub use self::matching_words::MatchingWords;
 use self::query_tree::QueryTreeBuilder;
 use crate::error::UserError;

@@ -35,7 +34,8 @@ mod query_tree;

 pub struct Search<'a> {
     query: Option<String>,
-    filter: Option<FilterCondition>,
+    // this should be linked to the String in the query
+    filter: Option<Filter<'a>>,
     offset: usize,
     limit: usize,
     sort_criteria: Option<Vec<AscDesc>>,

@@ -97,7 +97,7 @@ impl<'a> Search<'a> {
         self
     }

-    pub fn filter(&mut self, condition: FilterCondition) -> &mut Search<'a> {
+    pub fn filter(&mut self, condition: Filter<'a>) -> &mut Search<'a> {
         self.filter = Some(condition);
         self
     }

@@ -567,7 +567,7 @@ mod tests {

     use super::*;
     use crate::update::{IndexDocuments, Settings};
-    use crate::FilterCondition;
+    use crate::Filter;

     #[test]
     fn delete_documents_with_numbers_as_primary_key() {

@@ -667,7 +667,7 @@ mod tests {
         builder.delete_external_id("1_4");
         builder.execute().unwrap();

-        let filter = FilterCondition::from_str(&wtxn, &index, "label = sign").unwrap();
+        let filter = Filter::from_str("label = sign").unwrap();
         let results = index.search(&wtxn).filter(filter).execute().unwrap();
         assert!(results.documents_ids.is_empty());

@@ -526,7 +526,7 @@ mod tests {
     use super::*;
     use crate::error::Error;
     use crate::update::IndexDocuments;
-    use crate::{Criterion, FilterCondition, SearchResult};
+    use crate::{Criterion, Filter, SearchResult};

     #[test]
     fn set_and_reset_searchable_fields() {

@@ -1068,7 +1068,8 @@ mod tests {
         wtxn.commit().unwrap();

         let rtxn = index.read_txn().unwrap();
-        FilterCondition::from_str(&rtxn, &index, "toto = 32").unwrap_err();
+        let filter = Filter::from_str("toto = 32").unwrap();
+        let _ = filter.evaluate(&rtxn, &index).unwrap_err();
     }

     #[test]

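This hunk captures the behavioural change on the milli side: parsing a filter no longer needs a transaction or the index, so `Filter::from_str("toto = 32")` succeeds and the unknown-attribute error only surfaces when the filter is evaluated against the index. A minimal sketch of that two-step flow, using only calls that appear in this diff (`Filter::from_str`, `Filter::evaluate`, `index.search(..).filter(..)`); the function name and the filter string are illustrative:

```rust
use milli::{Filter, Index};

// Hedged sketch; `index` is assumed to be an already-opened milli Index
// with `channel` declared filterable.
fn filtered_search(index: &Index) {
    let rtxn = index.read_txn().unwrap();

    // Step 1: parsing is index-independent and only reports syntax errors.
    let filter = Filter::from_str("channel = gotaga").unwrap();

    // Step 2: attribute checks and document-id resolution happen against
    // the index, either through `evaluate` or through a search.
    let results = index.search(&rtxn).filter(filter).execute().unwrap();
    let _ = results.documents_ids;
}
```
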
@@ -1,5 +1,5 @@
 use either::{Either, Left, Right};
-use milli::{Criterion, FilterCondition, Search, SearchResult};
+use milli::{Criterion, Filter, Search, SearchResult};
 use Criterion::*;

 use crate::search::{self, EXTERNAL_DOCUMENTS_IDS};

@@ -13,11 +13,7 @@ macro_rules! test_filter {
             let rtxn = index.read_txn().unwrap();

             let filter_conditions =
-                FilterCondition::from_array::<Vec<Either<Vec<&str>, &str>>, _, _, _>(
-                    &rtxn, &index, $filter,
-                )
-                .unwrap()
-                .unwrap();
+                Filter::from_array::<Vec<Either<Vec<&str>, &str>>, _>($filter).unwrap().unwrap();

             let mut search = Search::new(&rtxn, &index);
             search.query(search::TEST_QUERY);

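The rewritten `from_array` call in this macro keeps the nested-array semantics of the old API (outer entries are combined with `AND`, inner vectors with `OR`, as the `from_array` test earlier in this diff asserts) but drops the transaction and index arguments. A hedged sketch of a direct call; the meaning attributed to the two `unwrap`s is an assumption read off the macro above:

```rust
use either::Either;
use milli::Filter;

fn array_filter_sketch() {
    // Assumed to be equivalent to
    // "channel = gotaga AND (timestamp = 44 OR channel != ponce)".
    let filter = Filter::from_array(vec![
        Either::Right("channel = gotaga"),
        Either::Left(vec!["timestamp = 44", "channel != ponce"]),
    ])
    .unwrap() // outer Result: a syntax error in one of the expressions
    .unwrap(); // inner Option: assumed to be None when the array is empty
    let _ = filter;
}
```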