Modules Reference
Renderer
Module containing the render engine.
class renderer.ModuleMap
The main mapping that links modules to their names through the get method (essentially an enum or dictionary).
classmethod add_module(module_name, module)
Set a new module in the mapping.
Parameters:
- module_name (str) – Name of the new module
- module (BaseModule) – Module class to add to the mapping
classmethod get(module_type)
Returns the module corresponding to the name passed as an argument.
Parameters: module_type (str) – The desired module type.
Return type: Type[BaseModule]
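The registry behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class-level dictionary `_modules`, the stand-in `BaseModule`, the hypothetical `MyModule`, and the choice to raise `KeyError` for unknown names are all assumptions.

```python
from typing import Dict, Type


class BaseModule:
    """Stand-in for modules.base_module.BaseModule (illustration only)."""


class ModuleMap:
    """Illustrative registry; not the library's actual implementation."""

    # Assumed storage: one dictionary shared by the whole class.
    _modules: Dict[str, Type[BaseModule]] = {}

    @classmethod
    def add_module(cls, module_name: str, module: Type[BaseModule]) -> None:
        # Register (or overwrite) the class under the given name.
        cls._modules[module_name] = module

    @classmethod
    def get(cls, module_type: str) -> Type[BaseModule]:
        # Unknown names raise KeyError in this sketch.
        return cls._modules[module_type]


class MyModule(BaseModule):
    """Hypothetical user-defined module."""


ModuleMap.add_module("MyModule", MyModule)
resolved = ModuleMap.get("MyModule")
```

Because both methods are classmethods, no `ModuleMap` instance is needed: the mapping is looked up by name and returns the class itself, which the caller can then instantiate.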
class renderer.Renderer(module_list, template_dir)
The render engine that builds the DAG, checks the integrity of the operation graph, and generates the rendered Scala code.
Parameters:
- module_list (List[Dict[str, Any]]) – The list of module specifications to be parsed and added to the operation graph.
- template_dir (str) – The path to the template directory.
check_integrity()
Check the integrity of the graph. Should be called after all the modules have been added to the graph (i.e. after initialization).
get_rendered()
Get the rendered code from the module list.
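To give a sense of what an integrity check over a list of module specifications might involve, here is a hedged sketch using only the standard library. It is not the Renderer's actual logic. The `inputs` key, the `moduleType` values, and the function name `check_module_list` are hypothetical; only the `name` and `moduleType` fields come from this reference.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Any, Dict, List


def check_module_list(module_list: List[Dict[str, Any]]) -> List[str]:
    """Return module names in dependency order, or raise ValueError."""
    names = [spec["name"] for spec in module_list]
    if len(names) != len(set(names)):
        raise ValueError("module names must be unique")

    graph: Dict[str, List[str]] = {}
    for spec in module_list:
        # "inputs" is a hypothetical key listing upstream module names.
        upstream = spec.get("inputs", [])
        for parent in upstream:
            if parent not in names:
                raise ValueError(
                    f"{spec['name']}: unknown upstream module {parent!r}"
                )
        graph[spec["name"]] = upstream

    try:
        # static_order() yields each node after all of its predecessors.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError("the operation graph contains a cycle") from err


specs = [
    {"name": "src", "moduleType": "Importer"},
    {"name": "clean", "moduleType": "Filter", "inputs": ["src"]},
    {"name": "out", "moduleType": "Writer", "inputs": ["clean"]},
]
order = check_module_list(specs)
```

The sketch checks three properties that a DAG of operations plausibly needs: unique names, upstream references that resolve, and the absence of cycles.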
Base Module
The module containing the abstract base class for all operation modules used throughout the rest of the code.
class modules.base_module.BaseModule(module, env, named_modules)
The abstract base class for modules. All modules are subclasses of BaseModule. Every module object passed to the constructor must contain the moduleType and name fields. All modules expose the following common API.
Parameters:
- module (dict) – The dict containing the specification of the module. Every module has this parameter, which should contain the fields required by all its parent classes.
- env (jinja2.Environment) – The Jinja environment from which the templates can be retrieved.
- named_modules (Dict[str, Type[BaseModule]]) – A mapping from names to all the other modules of the DAG.
add_to_graph(graph)
Adds the module to a graphviz graph instance.
Parameters: graph (graphviz.dot.Digraph) – A graphviz Digraph object
check_integrity()
Performs checks on the upstream modules and types, when necessary, to ensure the integrity of the DAG.
get_out_type()
Returns the output type of the module as a list of strings.
Return type: List[str]
rendered_result()
Returns a pair of strings containing the rendered lines of code and the definitions of external classes or objects.
Return type: Tuple[str, str]
to_graph_repr()
Generates the representation of the node in the form Name Type: $moduleType. Used for PDF graph generation.
Return type: str
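The contract described above can be illustrated with a simplified stand-in for BaseModule. This is a sketch under stated assumptions, not the library's class: it omits the `env` and `named_modules` arguments, raises `ValueError` for missing fields (an assumed behaviour), and uses a hypothetical `ConstantSource` subclass.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class SketchBaseModule(ABC):
    """Simplified stand-in for BaseModule; env and named_modules are omitted."""

    def __init__(self, module: Dict[str, Any]) -> None:
        # Assumption: a missing required field is reported with ValueError.
        for field in ("moduleType", "name"):
            if field not in module:
                raise ValueError(f"module specification lacks {field!r}")
        self.module_type: str = module["moduleType"]
        self.name: str = module["name"]

    @abstractmethod
    def get_out_type(self) -> List[str]:
        """Output type of the module as a list of strings."""

    @abstractmethod
    def rendered_result(self) -> Tuple[str, str]:
        """Pair of (rendered code, external definitions)."""

    def check_integrity(self) -> None:
        """No checks by default; subclasses override when needed."""


class ConstantSource(SketchBaseModule):
    """Hypothetical module that emits a fixed Scala value."""

    def get_out_type(self) -> List[str]:
        return ["String"]

    def rendered_result(self) -> Tuple[str, str]:
        # The second element is empty because no external classes are needed.
        return f'val {self.name} = "constant"', ""


source = ConstantSource({"moduleType": "ConstantSource", "name": "greeting"})
code, external = source.rendered_result()
```

Declaring `get_out_type` and `rendered_result` as abstract methods means a subclass that forgets either one cannot be instantiated, which surfaces the mistake before any code is rendered.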
Adding Modules
To create a module with new functionality, subclass one of the following base classes:
- BaseModule: Works for any new module.
- FileImporter: For extractor modules with files as input.
- UnaryOperation: For modules that operate on a single input data flow.
- BinaryOperation: For modules that implement an operation on two separate inputs.
- FileOutput: For output modules with files as output.
When implementing a new module, one should use the following template:
from typing import List, Tuple

from jinja2 import Environment

from modules.base_module import BaseModule


class MyModule(BaseModule):  # Any of the base classes
    """ Documentation of the module

    Args:
        module (dict): Description of the module dict to
            be passed as argument.
    """

    def __init__(self, module, env: Environment, named_modules):
        super().__init__(module, env, named_modules)
        self.template_path = ...  # Path to template
        self.template = self.env.get_template(self.template_path)

    def rendered_result(self) -> Tuple[str, str]:
        return self.template.render(
            name=self.name,
            # Other arguments
        ), ''  # or ext template if applicable

    def get_out_type(self) -> List[str]:
        # This function should return the output type of the module
        # as a list of strings.
        raise NotImplementedError

    def check_integrity(self):
        # This function performs integrity checks if applicable.
        pass
Each module should have an associated Scala template that is used to generate the corresponding code:
// ===== My module {{name}} =====
// Insert code here
val {{name}} = // The Flink Dataset
Once the module is defined, it can be made available to the rendering engine, for example by adding it directly to the ModuleMap class with add_module.