Modules Reference
Renderer
Module containing the render engine.
class renderer.ModuleMap
The main mapping that links modules to their names through the get method (essentially an enum or dictionary).
classmethod add_module(module_name, module)
Set a new module in the mapping.
Parameters:
- module_name (str) – Name of the new module
- module (BaseModule) – Module class to add to the mapping
classmethod get(module_type)
Returns the module corresponding to the name passed as argument.
Parameters: module_type (str) – The desired module type.
Return type: Type[BaseModule]
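The mapping described above can be pictured as a class-level registry. The following is a minimal sketch, not the library's actual code: the `_modules` attribute, the stand-in `BaseModule`, the `CsvImporter` class and the `"csvImporter"` name are all hypothetical, and the real `ModuleMap` may handle unknown names differently.

```python
from typing import Dict, Type


class BaseModule:
    """Stand-in for the real BaseModule (assumption: only used as a type here)."""


class ModuleMap:
    """Hypothetical re-implementation of a name-to-class registry."""

    # Shared by all callers because it lives on the class, not on instances.
    _modules: Dict[str, Type[BaseModule]] = {}

    @classmethod
    def add_module(cls, module_name: str, module: Type[BaseModule]) -> None:
        # Set (or overwrite) the class registered under module_name.
        cls._modules[module_name] = module

    @classmethod
    def get(cls, module_type: str) -> Type[BaseModule]:
        # Raises KeyError for unknown names; the real class may behave differently.
        return cls._modules[module_type]


class CsvImporter(BaseModule):
    """Hypothetical module class used only for this example."""


ModuleMap.add_module("csvImporter", CsvImporter)
resolved = ModuleMap.get("csvImporter")
```

Because both methods are classmethods, no `ModuleMap` instance is needed: registration and lookup go through the class itself.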
class renderer.Renderer(module_list, template_dir)
The render engine that can build the DAG, check the integrity of the operation graph and generate the rendered Scala code.
Parameters:
- module_list (List[Dict[str, Any]]) – The list of module specifications to be parsed and added to the operation graph.
- template_dir (str) – The path to the template directory.
check_integrity()
Checks the integrity of the graph. Should be called after all the modules have been added to the graph (i.e. after initialization).
get_rendered()
Returns the rendered code generated from the module list.
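The reference does not say how the integrity of the operation graph is verified. As an illustration only, the sketch below shows one common way to validate a DAG built from a list of module specifications: check that every upstream reference resolves, then use Kahn's algorithm to detect cycles. The `sources` field, the `check_graph` function and the `moduleType` values are assumptions made for this example and are not part of the documented API.

```python
from collections import deque
from typing import Any, Dict, List


def check_graph(module_list: List[Dict[str, Any]]) -> List[str]:
    """Return module names in dependency order, or raise ValueError."""
    # "sources" is a hypothetical field naming the upstream modules of a spec.
    upstream = {m["name"]: list(m.get("sources", [])) for m in module_list}

    # Every referenced upstream module must exist in the graph.
    for name, sources in upstream.items():
        for src in sources:
            if src not in upstream:
                raise ValueError(f"{name} depends on unknown module {src}")

    # Count unresolved upstreams per node and index the reverse edges.
    pending = {name: len(sources) for name, sources in upstream.items()}
    downstream: Dict[str, List[str]] = {name: [] for name in upstream}
    for name, sources in upstream.items():
        for src in sources:
            downstream[src].append(name)

    # Kahn's algorithm: repeatedly emit nodes whose upstreams are all emitted.
    ready = deque(name for name, count in pending.items() if count == 0)
    order: List[str] = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for child in downstream[current]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    # Nodes left unemitted can only be part of a cycle.
    if len(order) != len(upstream):
        raise ValueError("the operation graph contains a cycle")
    return order


specs = [
    {"name": "joined", "moduleType": "join", "sources": ["left", "right"]},
    {"name": "left", "moduleType": "csvImporter"},
    {"name": "right", "moduleType": "csvImporter"},
]
order = check_graph(specs)
```

The returned order is also a valid order in which to emit the generated code, since every module appears after the modules it reads from.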
Base Module
The module containing the abstract base class for all operation modules used throughout the rest of the code.
class modules.base_module.BaseModule(module, env, named_modules)
The abstract base class for modules. All modules are subclasses of BaseModule. Every module object passed to the constructor must contain the moduleType and name fields. All modules expose the following common API.
Parameters:
- module (dict) – The dict containing the specification of the module. Every module has this parameter, which should contain the fields from all its parent classes.
- env (jinja2.Environment) – The Jinja environment from which the templates can be retrieved.
- named_modules (Dict[str, Type[BaseModule]]) – A mapping of all the other modules of the DAG, keyed by name.
add_to_graph(graph)
A method for adding the module to a graphviz graph instance.
Parameters: graph (graphviz.dot.Digraph) – A graphviz Digraph object
check_integrity()
Performs checks on the upstream modules and types, when necessary, to ensure the integrity of the DAG.
get_out_type()
Returns the output type of the module as a list of strings.
Return type: List[str]
rendered_result()
Returns a pair of strings containing the rendered lines of code and the definitions of any external classes or objects.
Return type: Tuple[str, str]
to_graph_repr()
Generates the representation of the node in the form
Name Type: $moduleType
Used for PDF graph generation.
Return type: str
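To make the shape of the rendered_result pair concrete, here is a small sketch of how a caller might consume it. Everything in it is hypothetical: `StubModule` is a stand-in that only mimics the documented return type, and `assemble` is an illustrative helper, not a description of how the Renderer actually combines results.

```python
from typing import Iterable, Tuple


class StubModule:
    """Hypothetical stand-in that mimics only rendered_result()."""

    def __init__(self, code: str, ext: str = "") -> None:
        self.code = code
        self.ext = ext

    def rendered_result(self) -> Tuple[str, str]:
        # First element: rendered lines of code.
        # Second element: external class or object definitions ('' if none).
        return self.code, self.ext


def assemble(modules: Iterable[StubModule]) -> Tuple[str, str]:
    """Collect code lines and external definitions separately."""
    bodies, externals = [], []
    for module in modules:
        code, ext = module.rendered_result()
        bodies.append(code)
        if ext:  # Skip modules that define nothing external.
            externals.append(ext)
    return "\n".join(bodies), "\n".join(externals)


body, definitions = assemble([
    StubModule("val a = 1", "case class A()"),
    StubModule("val b = a"),
])
```

Keeping the two strings apart lets external definitions be emitted once, outside the main body, while the code lines stay in graph order.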
Adding Modules
To create new modules with new functionalities, one can subclass any of the following base classes:
- BaseModule: Works for any new module.
- FileImporter: For extractor modules with files as input.
- UnaryOperation: For modules that operate on a single input data flow.
- BinaryOperation: For modules that implement an operation on two separate inputs.
- FileOutput: For output modules with files as output.
When implementing a new module, one should use the following template:
from typing import List, Tuple

from jinja2 import Environment

from modules.base_module import BaseModule


class MyModule(BaseModule):  # Any of the base classes
    """Documentation of the module.

    Args:
        module (dict): Description of the module dict to
            be passed as argument.
    """

    def __init__(self, module, env: Environment, named_modules):
        super().__init__(module, env, named_modules)
        self.template_path = ...  # Replace ... with the path to the template
        self.template = self.env.get_template(self.template_path)

    def rendered_result(self) -> Tuple[str, str]:
        return self.template.render(
            name=self.name,
            # Other arguments
        ), ''  # or ext template if applicable

    def get_out_type(self) -> List[str]:
        # This function should return the output type of the module
        # as a list of strings.
        raise NotImplementedError

    def check_integrity(self):
        # This function performs integrity checks if applicable.
        pass
The module should have an associated Scala template for generating the corresponding code:
// ===== My module {{name}} =====
// Insert code here
val {{name}} = // The Flink Dataset
Once the module is defined, it can be made available to the rendering engine, for example by registering it directly on the ModuleMap class with add_module, passing the module name and the module class.