9.5. Plugin components

An additional set of components can be added to the Data Ingestion Engine: the plugin components. A plugin is a piece of software written by a DISQOVER data scientist or admin user, or by a third party, that is embedded in the DISQOVER ecosystem. In the context of pipelines and components, a plugin can be a component type that is used just like any other pipeline component.

We provide two kinds of component plugins, from which any number of custom component types can be derived. We say that someone implements a plugin type in order to create a specific component type. The two generic plugin types are ‘Batch Transform’ and ‘Batch Aggregate and Transform’. For example, based on the Batch Transform plugin type, you can implement a custom pipeline component that takes existing resources from a Class as input and produces a set of new predicates for those resources. How these resources are transformed is up to the implementation of the plugin component, so different Batch Transform components can transform resources in very different ways. For example, a plugin component that simply manipulates strings is very different from a plugin component that queries a third-party service for data about the input resources, but both can be derived from the Batch Transform plugin type.

A plugin component has options, just like a regular component. The most prominent options include the Class(es) of the resources, and the input and output predicate names. You can also filter resources based on expressions written in the expression language you are familiar with from the regular components. A plugin component can have many more options, as defined by its implementation. We refer to these options as the ‘custom options’. They can be used, for example, to slightly alter the behaviour of a plugin component.

Below we describe the two plugin types: ‘Batch Transform’ and ‘Batch Aggregate and Transform’. Note that these are only generic plugin types, not concrete plugin component types.

If you want to implement a custom plugin component yourself, refer to the System Administrator manual for more information on plugin implementation and configuration.

9.5.1. Batch Transform

Represents a plugin that receives batches of resources, transforms the resources and returns the resulting batches of resources. How the resources are transformed depends on the specific plugin component implementation.

Description

The Batch Transform Plugin is intended for use cases where resources must be processed in a way that cannot be covered by the standard pipeline components of the Data Ingestion Engine. Examples include processing resources with an algorithm developed in-house, or calling a third-party API.
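To make the batch-in, batch-out idea concrete, the following is a minimal Python sketch. It does not use the actual DISQOVER plugin interface, which is described in the System Administrator manual. The function name `batch_transform`, its parameters, and the representation of a resource as a dictionary with a `uri` key are all assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Hypothetical representation: a resource is a dict holding a "uri"
# plus predicate names mapped to lists of values.
Resource = Dict[str, Any]


def batch_transform(
    batches: Iterable[List[Resource]],
    input_predicate: str,
    output_predicate: str,
    transform: Callable[[str], str],
    resource_filter: Callable[[Resource], bool] = lambda resource: True,
) -> Iterator[List[Resource]]:
    """Yield one output batch per input batch, carrying only the new predicate."""
    for batch in batches:
        output_batch = []
        for resource in batch:
            if not resource_filter(resource):
                continue  # resource excluded, in the spirit of the Filter option
            values = resource.get(input_predicate, [])
            output_batch.append(
                {"uri": resource["uri"], output_predicate: [transform(v) for v in values]}
            )
        yield output_batch


# Example: a simple string manipulation applied to a single batch.
batches = [[{"uri": "r1", "name": ["aspirin"]}, {"uri": "r2", "name": ["ibuprofen"]}]]
result = list(batch_transform(batches, "name", "name_upper", str.upper))
# result == [[{"uri": "r1", "name_upper": ["ASPIRIN"]},
#             {"uri": "r2", "name_upper": ["IBUPROFEN"]}]]
```

In this sketch the `transform` argument stands in for whatever the plugin implementation does per value; a call to a third-party service would take its place in the same position.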

Options

Target Class

  • Class : Class containing the predicates to transform
  • Filter [Optional] : Boolean expression returning true for resources which should be included.

Custom Options

See the documentation for the plugin.

Warnings

  • Minimal count for warning “The plugin issued a warning when processing a resource.” [Optional] : Suppress warnings if the number of warnings is less than this number, for warnings of type “Error while processing a resource.” The default value is 1.
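The threshold rule for this option can be written out as a one-line check. This is only an illustration of the rule as stated above; the function name `should_report` is hypothetical and is not part of DISQOVER.

```python
def should_report(warning_count: int, minimal_count: int = 1) -> bool:
    """Return True when the warning is shown, False when it is suppressed.

    The warning is suppressed if fewer than `minimal_count` warnings occurred.
    """
    return warning_count >= minimal_count


# With the default value of 1, a single warning is already reported.
print(should_report(1))                     # True
# With a minimal count of 5, three warnings are suppressed.
print(should_report(3, minimal_count=5))    # False
```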

9.5.2. Batch Aggregate and Transform

Represents a plugin that receives batches of resources, aggregates the predicate values into variables placed in a ‘store’, and uses these variables to transform resources into batches of output predicates. How values are aggregated from the input resources, which variables are stored, and how these are used to create transformed output resources depends on the specific plugin component implementation.

Description

The Batch Aggregate and Transform Plugin can be used when predicate values must first be accumulated over multiple resources before the resources can be transformed. In that respect, the use cases are very similar to those of the native Aggregate and Transform component, with the added benefit that third-party APIs or very specific in-house algorithms can be used.

A plugin component of this type operates in two phases: aggregation and transformation. The two phases can use resources in the same Class, or in two different Classes. If the Class is identical, even the input predicate(s) for the aggregation phase can be the same as the predicate(s) for the transformation phase.
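The two phases can be sketched in Python as follows. This is an illustration under stated assumptions, not the DISQOVER plugin interface: the function names `aggregate` and `transform`, the use of a plain dictionary as the ‘store’, and the choice of a maximum as the aggregated variable are all hypothetical. The example uses the same Class and the same input predicate in both phases.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Hypothetical representation: a resource is a dict holding a "uri"
# plus predicate names mapped to lists of numeric values.
Resource = Dict[str, Any]


def aggregate(
    batches: Iterable[List[Resource]], predicate: str, store: Dict[str, float]
) -> Dict[str, float]:
    """Aggregation phase: keep the maximum value of `predicate` in the store."""
    for batch in batches:
        for resource in batch:
            for value in resource.get(predicate, []):
                store["max"] = max(store.get("max", value), value)
    return store


def transform(
    batches: Iterable[List[Resource]],
    predicate: str,
    output_predicate: str,
    store: Dict[str, float],
) -> Iterator[List[Resource]]:
    """Transformation phase: scale each value by the stored maximum."""
    maximum = store.get("max")
    for batch in batches:
        output_batch = []
        for resource in batch:
            values = resource.get(predicate, [])
            # No output values when nothing was aggregated or the maximum is 0.
            scaled = [v / maximum for v in values] if maximum else []
            output_batch.append({"uri": resource["uri"], output_predicate: scaled})
        yield output_batch


batches = [[{"uri": "r1", "score": [2.0]}], [{"uri": "r2", "score": [4.0]}]]
store = aggregate(batches, "score", {})
# store == {"max": 4.0}
result = list(transform(batches, "score", "score_scaled", store))
# result == [[{"uri": "r1", "score_scaled": [0.5]}],
#            [{"uri": "r2", "score_scaled": [1.0]}]]
```

The transformation cannot start until every batch has passed through the aggregation phase, because the stored maximum is only final once all resources have been seen.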

Options

The options are divided into two sections, one for each phase. The plugin implementer can choose to put custom options under each of those default sections as well.

Options in section ‘Aggregation Phase’

  • Class : Class containing the predicates to aggregate
  • Filter [Optional] : Boolean expression returning true for resources which should be included.

Options in section ‘Transformation Phase’

  • Class : Class containing the predicates to transform
  • Filter [Optional] : Boolean expression returning true for resources which should be included.

Custom Options

See the documentation for the plugin.

Warnings

  • Minimal count for warning “The plugin issued a warning while aggregating resources.” [Optional] : Suppress warnings if the number of warnings is less than this number, for warnings of type “Error while aggregating resources.” The default value is 1.
  • Minimal count for warning “The plugin issued a warning while transforming resources.” [Optional] : Suppress warnings if the number of warnings is less than this number, for warnings of type “Error while transforming resources.” The default value is 1.