################ Developing Tasks ################ The `Using Tasks` chapter describes how to customize existing tasks by specifying parameter values and using compound tasks to compose tasks from a collection of sub-tasks. When adding a new tool or capability, it's likely that more fine-grained control will be needed. In these cases, it makes sense to provide a programming-language implementation for a task. Task Execution ============== It is most common to provide an implementation for a task's execution behavior. This is most-commonly done in Python, but shell scripts can also be used. Tasks provide the `run` and `shell` parameters to specify the implementation. External Pytask --------------- A `pytask` implementation for a task is provided by a Python async method that accepts input parameters from the DV Flow runtime system and returns data to the system. When the `pytask` implementation is external, the `run` parameter specifies the name of the Python method. .. code-block:: yaml package: name: my_tool tasks: - name: my_task uses: my_package.MyTask shell: pytask run: my_package.my_module.MyTask with: msg: type: str The task definition above specifies that a `pytask` implementation for the task is provided by a Python method named `my_package.my_module.MyTask`. .. code-block:: python3 async def MyTask(ctxt, input): print("Message: %s" % input.params.msg) See the :doc:`../pytask_api` documentation for more information about the Python API available to task implementations. This "external" implementation makes the most sense when the task implementation is moderately complex or lengthy. Inline Pytask ------------- When the task implementation is simple, the code can be in-lined within the YAML. .. code-block:: yaml package: name: my_tool tasks: - name: my_task uses: my_package.MyTask with: msg: type: str shell: pytask run: | print("Message: %s" % input.params.msg) When this task is executed, the body of the `run` entry will be evaluated as the body of an async Python method that has `ctxt`, and `input` parameters. Task-Graph Expansion ==================== Sometimes build flows need to run multiple variations of the same core step. For example, we may wish to run multiple UVM tests that only vary in the input arguments. The `matrix` strategy can work well in these cases. .. code-block:: yaml package: name: my_pkg tasks: - name: SayHi strategy: matrix: who: ["Adam", "Mary", "Joe"] body: - name: Output uses: std.Message with: msg: "Hello ${{ matrix.who }}!" The `matrix` strategy is only valid on compound tasks. The body tasks are evaluated once for each combination of matrix variables. Body-task parameters can reference the matrix variables. In this case, we would expect the `SayHi` task to look like this when expanded: .. mermaid:: flowchart TD A[SayHi.in] B[Hello Adam!] C[Hello Mary!] D[Hello Joe!] E[SayHi] A --> B A --> C A --> D B --> E C --> E D --> E Task-Graph Generation ===================== It is sometimes useful to generate task graphs programmatically instead of capturing them manually or generating them textually in YAML. A `generate` strategy can be provided to algorithmically define a task subgraph. Note that generation is done statically as part of graph elaboration. This means that the generated graph structure may only depend on values, such as parameter values, that are known during elaboration. The graph structure cannot be created using data conveyed as dataflow between tasks. .. code-block:: yaml package: name: my_pkg tasks: - name: SayHi with: count: type: int value: 1 strategy: generate: my_pkg.my_mod.GenGraph The `generate` strategy specifies that the containing task will be a compound task whose sub-tasks are provided by the specified generator. As with other task implementations, the generator code can be specified externally in a Python module or inline. .. code-block:: python3 def GenGraph(ctxt, input): count = input.params.count for i in range(count): ctxt.addTask(ctxt.mkTaskNode( "std.Message", with={"count": 1}) name=ctxt.mkName("SayHi%d" % i), msg="Hello World% %d!" % (i+1))) See the :doc:`../pytask_api` documentation for more information about the Python task-graph generation API. PyTask Class-Based API ====================== For more complex tasks, DV Flow Manager provides a class-based API using the ``PyTask`` base class. This approach provides better organization for tasks with substantial logic or state. Defining a PyTask ----------------- A PyTask is defined as a dataclass that inherits from ``dv_flow.mgr.PyTask``: .. code-block:: python from dv_flow.mgr import PyTask import dataclasses as dc @dc.dataclass class MyCompiler(PyTask): desc = "Compiles HDL sources" doc = """ This task compiles HDL sources using a configurable compiler. Supports multiple file types and optimization levels. """ @dc.dataclass class Params: sources: list = dc.field(default_factory=list) optimization: str = "O2" debug: bool = False async def __call__(self) -> str: # Access parameters via self.params print(f"Compiling {len(self.params.sources)} files") print(f"Optimization: {self.params.optimization}") # Access context via self._ctxt rundir = self._ctxt.rundir # Perform compilation work here # ... # Return None for pytask execution, or a string for shell execution return None The ``__call__`` method is the main entry point and receives the task context automatically through the ``_ctxt`` and ``_input`` fields. Using PyTask in YAML -------------------- Reference a PyTask class in your flow definition: .. code-block:: yaml package: name: my_tools tasks: - name: compile shell: pytask run: my_package.my_module.MyCompiler with: sources: - src/file1.v - src/file2.v optimization: "O3" debug: true The PyTask class provides several advantages: * **Type safety**: Parameters are defined with Python type hints * **Documentation**: Docstrings become part of the task documentation * **Organization**: Related logic stays together in a class * **Reusability**: Classes can inherit from other classes * **Testing**: Easier to unit test than inline code Returning Commands ------------------ A PyTask can return a shell command instead of executing directly: .. code-block:: python @dc.dataclass class MyTool(PyTask): @dc.dataclass class Params: input_file: str output_file: str async def __call__(self) -> str: # Generate command string cmd = f"my_tool -i {self.params.input_file} -o {self.params.output_file}" return cmd When a string is returned, DV Flow executes it as a shell command using the configured shell (default: pytask for Python execution). PyPkg Package Factory ===================== For advanced use cases, DV Flow supports defining packages entirely in Python using the ``PyPkg`` class. This enables dynamic package construction and programmatic task registration. Defining a PyPkg ---------------- .. code-block:: python from dv_flow.mgr import PyPkg, pypkg import dataclasses as dc @dc.dataclass class MyToolPackage(PyPkg): name = "mytool" @dc.dataclass class Params: version: str = "1.0" enable_debug: bool = False The ``@pypkg`` decorator registers tasks with the package: .. code-block:: python @pypkg(MyToolPackage) @dc.dataclass class Compile(PyTask): @dc.dataclass class Params: sources: list = dc.field(default_factory=list) async def __call__(self): # Task implementation pass @pypkg(MyToolPackage) @dc.dataclass class Link(PyTask): @dc.dataclass class Params: objects: list = dc.field(default_factory=list) async def __call__(self): # Task implementation pass PyPkg Benefits -------------- Using PyPkg provides several advantages: * **Code reuse**: Share common Python code across tasks * **Dynamic generation**: Programmatically create task definitions * **Type checking**: Full Python type checking for package definitions * **Version control**: Package and task versions managed together * **Testing**: Unit test entire packages in Python PyPkg packages can be distributed as Python packages and installed via pip, making them easy to share and version. Note: PyPkg is an advanced feature. For most use cases, YAML-based package definitions with PyTask implementations provide the right balance of simplicity and power.