What is a pipeline?

Answer

A pipeline is a grouping of pre-defined data processing steps. There are two main categories of pipelines:

Query pipelines

Once a user types a query and presses Enter, the query is sent to a query pipeline. You can think of the pipeline as a series of steps that rewrite and enrich the query before it is run against the index.

Example steps:

  • Synonym step: applies each matching synonym. This creates an alternate query that runs in tandem with the original query.

  • Spell-checking step: applies spell-checking by using a probability matrix to find the most likely correct words and phrases.

  • Boosting step: can boost certain results depending on the search query.
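
To make the idea of chained steps concrete, here is a minimal sketch in Python. It is not the product's actual API: the `Query` class, the step functions, and the `SYNONYMS` and `CORRECTIONS` tables are all hypothetical names invented for illustration. The spell-checking step uses a simple lookup table as a stand-in for the probability matrix described above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Query:
    """Hypothetical query object that each step reads and modifies."""
    text: str
    alternates: List[str] = field(default_factory=list)   # extra queries run alongside the original
    boosts: Dict[str, float] = field(default_factory=dict)  # boost name -> multiplier


Step = Callable[[Query], Query]

# Toy data, assumed for this example only.
SYNONYMS = {"sofa": ["couch"]}
CORRECTIONS = {"sofaa": "sofa"}  # stand-in for a real probability-based spell checker


def spell_check_step(query: Query) -> Query:
    # Replace each known misspelling with its correction.
    words = [CORRECTIONS.get(word, word) for word in query.text.split()]
    query.text = " ".join(words)
    return query


def synonym_step(query: Query) -> Query:
    # For every synonym of every word, add an alternate query.
    words = query.text.split()
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word, []):
            alternate = words[:i] + [synonym] + words[i + 1:]
            query.alternates.append(" ".join(alternate))
    return query


def boosting_step(query: Query) -> Query:
    # Boost discounted items when the query mentions "sale".
    if "sale" in query.text.split():
        query.boosts["on_sale"] = 2.0
    return query


def run_pipeline(query: Query, steps: List[Step]) -> Query:
    # A pipeline is just the steps applied in order.
    for step in steps:
        query = step(query)
    return query


result = run_pipeline(
    Query("sofaa sale"),
    [spell_check_step, synonym_step, boosting_step],
)
print(result.text)        # sofa sale
print(result.alternates)  # ['couch sale']
print(result.boosts)      # {'on_sale': 2.0}
```

Order matters in this sketch: spell-checking runs first so that the synonym step sees the corrected word.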

Record pipelines

A record pipeline defines how records and their associated information are processed and transformed (modified or augmented) before each record is stored in the collection.

Example steps:

  • index: defines which fields should be indexed.

  • train autocomplete: improves the autocomplete model by training it with new records.
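
The two record steps above can be sketched the same way. This is an illustrative toy in Python, not the product's implementation: the `Collection` class, the `INDEXED_FIELDS` list, and the `suggest` helper are hypothetical, and the "autocomplete model" here is just a count of title terms.

```python
from collections import Counter, defaultdict
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Assumed configuration: only these fields are searchable.
INDEXED_FIELDS = ["title", "description"]


class Collection:
    """Hypothetical in-memory collection."""

    def __init__(self) -> None:
        self.records: List[Record] = []
        self.index = defaultdict(set)  # term -> positions of records containing it
        self.term_counts = Counter()   # toy autocomplete model


RecordStep = Callable[[Record, Collection], Record]


def index_step(record: Record, collection: Collection) -> Record:
    # Index only the configured fields; other fields are stored but not searchable.
    record_id = len(collection.records)
    for field_name in INDEXED_FIELDS:
        for term in str(record.get(field_name, "")).lower().split():
            collection.index[term].add(record_id)
    return record


def train_autocomplete_step(record: Record, collection: Collection) -> Record:
    # "Train" the toy model by counting the terms in the new record's title.
    collection.term_counts.update(str(record.get("title", "")).lower().split())
    return record


def add_record(record: Record, collection: Collection, steps: List[RecordStep]) -> None:
    # Run every step, then store the (possibly transformed) record.
    for step in steps:
        record = step(record, collection)
    collection.records.append(record)


def suggest(prefix: str, collection: Collection, limit: int = 3) -> List[str]:
    # Most frequent terms starting with the prefix, ties broken alphabetically.
    matches = [(t, c) for t, c in collection.term_counts.items() if t.startswith(prefix)]
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in matches[:limit]]


collection = Collection()
steps = [index_step, train_autocomplete_step]
add_record({"title": "Red Sofa", "description": "comfy", "sku": "A1"}, collection, steps)
add_record({"title": "Red Chair"}, collection, steps)

print(sorted(collection.index["red"]))  # [0, 1]
print("a1" in collection.index)         # False, because "sku" is not an indexed field
print(suggest("r", collection))         # ['red']
```

Each new record passes through the steps before it is stored, so both the index and the autocomplete suggestions stay up to date as records arrive.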