mpostma aa6c5df0bc Implement documents format
document reader transform

remove update format

support document sequences

fix document transform

clean transform

improve error handling

add documents! macro

fix transform bug

fix tests

remove csv dependency

Add comments on the transform process

replace search cli


review edits

fix http ui

fix clippy warnings

Revert "fix clippy warnings"

This reverts commit a1ce3cd96e603633dbf43e9e0b12b2453c9c5620.

fix review comments

remove smallvec in transform loop

review edits
2021-09-21 16:58:33 +02:00

36 lines
1.3 KiB
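
The `UpdateIndexingStep` enum in this file describes which indexing phase an update is in. As a rough illustration of how a caller might turn such a value into a progress line, here is a minimal sketch. It repeats a simplified copy of the enum so it compiles on its own. The `TOTAL_STEPS` constant, the `progress` method and the `describe` function are hypothetical names introduced for this example and are not part of the file. The count of four steps is an assumption derived from the four variants and their 0 to 3 indices.

```rust
/// Simplified copy of the enum from this file, repeated so the sketch is self-contained.
#[derive(Debug, Clone, Copy)]
pub enum UpdateIndexingStep {
    RemapDocumentAddition { documents_seen: usize },
    ComputeIdsAndMergeDocuments { documents_seen: usize, total_documents: usize },
    IndexDocuments { documents_seen: usize, total_documents: usize },
    MergeDataIntoFinalDatabase { databases_seen: usize, total_databases: usize },
}

/// Assumption: one step per variant, matching the 0..=3 indices returned by `step()`.
pub const TOTAL_STEPS: usize = 4;

impl UpdateIndexingStep {
    /// Zero-based index of the step, as in the file.
    pub const fn step(&self) -> usize {
        match self {
            Self::RemapDocumentAddition { .. } => 0,
            Self::ComputeIdsAndMergeDocuments { .. } => 1,
            Self::IndexDocuments { .. } => 2,
            Self::MergeDataIntoFinalDatabase { .. } => 3,
        }
    }

    /// Hypothetical helper: `(seen, total)` inside the current step.
    /// Returns `None` for the remap step, which carries no total.
    pub fn progress(&self) -> Option<(usize, usize)> {
        match *self {
            Self::RemapDocumentAddition { .. } => None,
            Self::ComputeIdsAndMergeDocuments { documents_seen, total_documents }
            | Self::IndexDocuments { documents_seen, total_documents } => {
                Some((documents_seen, total_documents))
            }
            Self::MergeDataIntoFinalDatabase { databases_seen, total_databases } => {
                Some((databases_seen, total_databases))
            }
        }
    }
}

/// Hypothetical formatter: renders a one-based "step i/n" label, plus the
/// inner counters when the step exposes a total.
pub fn describe(step: &UpdateIndexingStep) -> String {
    let prefix = format!("step {}/{}", step.step() + 1, TOTAL_STEPS);
    match step.progress() {
        Some((seen, total)) => format!("{}: {}/{}", prefix, seen, total),
        None => prefix,
    }
}
```

A progress callback could call `describe` on each value it receives and log the result. Keeping the formatting outside the enum leaves the type itself a plain, `Copy` data carrier.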

use UpdateIndexingStep::*;

#[derive(Debug, Clone, Copy)]
pub enum UpdateIndexingStep {
    /// Remap the document addition fields to the ones present in the database, adding new
    /// fields to the schema on the go.
    RemapDocumentAddition { documents_seen: usize },

    /// This step checks the external document ids, computes the internal ids and merges
    /// the documents that are already present in the database.
    ComputeIdsAndMergeDocuments { documents_seen: usize, total_documents: usize },

    /// Extract the documents' words using the tokenizer and compute the documents'
    /// facets. Store those words, facets and document ids on disk.
    IndexDocuments { documents_seen: usize, total_documents: usize },

    /// Merge the previously extracted data (words and facets) into the final LMDB database.
    /// These extracted data are split into multiple databases.
    MergeDataIntoFinalDatabase { databases_seen: usize, total_databases: usize },
}

impl UpdateIndexingStep {
    pub const fn step(&self) -> usize {
        match self {
            RemapDocumentAddition { .. } => 0,
            ComputeIdsAndMergeDocuments { .. } => 1,
            IndexDocuments { .. } => 2,
            MergeDataIntoFinalDatabase { .. } => 3,
        }
    }

    pub const fn number_of_steps(&self) -> usize {