Assets

What it is

Assets are the elements that let noob, which is primarily a Directed Acyclic Graph (DAG) processor, handle cycles and persistence. An asset is, in a sense, a static node: it does not process anything, but holds objects, connections, data, or anything else you’d like to persist beyond a single node processing event. Its lifespan is determined by the scope setting.

Why we made it

Some objects need to span multiple epochs, some nodes emit massive arrays, and some connections should persist across a group of nodes. In these cases we do not want to copy the object every time it is passed from one node to another, or wipe it out when we move on to the next epoch.

How it works

Basics

Any Python callable that returns a stateful object can produce an asset. Usually the callable is a function or a class. How an asset persists depends on its scope.

Spec

The YAML specification for an asset is almost identical to its Node counterpart.

asset_id:
  type: absolute.python.path
  params:
    param1: ...
  scope: runner  # or process or node
  depends: node.signal

asset_id must be unique. type is the absolute Python path to the callable. The other fields (params, scope, and depends) have additional nuances, covered below.
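
To illustrate what "absolute Python path to the callable" means in practice, here is a minimal sketch of how such a path could be resolved and called with the spec's params. It is not noob's actual loader: resolve_callable and build_asset are hypothetical helpers, and a standard-library callable (collections.deque) stands in for a real asset initializer.

```python
import importlib
from typing import Any, Callable


def resolve_callable(path: str) -> Callable[..., Any]:
    """Import the object named by an absolute dotted path (hypothetical helper)."""
    module_name, _, attr = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def build_asset(spec: dict) -> Any:
    """Call the resolved callable with the spec's params (hypothetical helper)."""
    factory = resolve_callable(spec["type"])
    return factory(**spec.get("params", {}))


# A stdlib callable stands in for a real asset initializer.
spec = {
    "type": "collections.deque",
    "params": {"maxlen": 3},
    "scope": "runner",
}
asset = build_asset(spec)
asset.extend([1, 2, 3, 4])
print(list(asset))  # [2, 3, 4]
```

The returned object is stateful (the deque remembers what was appended), which is the property that makes a callable's output usable as an asset.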

Scopes

There are three different scopes for an asset: runner, process, and node.

Runner

A runner-scoped asset persists as long as the runner does. It carries over from one epoch to the next, remaining stateful throughout the entire run.

    sequenceDiagram
        activate Assets
        Assets->>Epochs: Inject
        activate Epochs
        Epochs->>Assets: Update
        deactivate Epochs
        Assets->>Epochs: Inject
        activate Epochs
        Epochs->>Assets: Update
        deactivate Epochs
        Assets->>Epochs: Inject
        activate Epochs
        Epochs->>Assets: Update
        deactivate Epochs
        deactivate Assets
    
assets:
  db: # unique asset id
    type: noob.testing.array  # absolute Python path to initializer
    scope: runner
    depends: z.result  # exits the process loop from the last node

nodes:
  a:
    type: noob.testing.row_sum
    params:
      row_index: 0
    depends:
      - right: assets.db  # enters the process loop via the first node

  # ...

  z:
    type: noob.testing.multiply
    params:
      multiplier: 2
    depends:
      array: y.output  # takes the asset directly from the previous node

    flowchart LR
        asset_db -- "inject" --> node_a
        node_a --> ...
        ... --> node_z
        node_z -- "update" --> asset_db
    
    
Depends

The meaning of the depends entry of the asset spec is different from its equivalent in Node and is only used when scope: runner. Here, depends should point to the last node that changes the value of the asset. The idea is that the asset enters the processing loop through the first Node that modifies its value, travels through the rest of the nodes, and the last node to modify it puts it back where it came from. Therefore, an asset can only depend on a single node.

CAUTION

When using a runner-scoped asset, make sure the asset depends on the correct node (the last one to modify it), and be mindful of race conditions. These can occur even in synchronous mode if your graph has branching or merging operations. For example, if nodes B and C below both modify an asset in place, the value that reaches node D can be nondeterministic.

    flowchart LR
        A --> B
        A --> C
        B --> D
        C --> D
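
The hazard can be reproduced without noob at all. The sketch below is a toy model, assuming two hypothetical nodes that mutate a shared object in place; it shows that the value node D would observe depends on whether B or C happens to run first.

```python
def node_b(asset: list) -> list:
    asset[0] += 1  # mutates the shared object in place
    return asset


def node_c(asset: list) -> list:
    asset[0] *= 2  # also mutates the same object in place
    return asset


def run(order) -> int:
    asset = [1]  # the asset as it leaves node A
    for node in order:
        node(asset)
    return asset[0]  # the value node D would observe


b_then_c = run([node_b, node_c])  # (1 + 1) * 2
c_then_b = run([node_c, node_b])  # 1 * 2 + 1
print(b_then_c, c_then_b)  # 4 3
```

If the execution order of sibling branches is not guaranteed, one way to avoid the ambiguity is to let only one branch modify the asset.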
    

Process

A process-scoped asset persists for the duration of a runner’s process() method. It is recreated on every epoch and remains stateful within that epoch.

    sequenceDiagram
        Assets->>Epochs: Inject
        activate Assets
        activate Epochs
        Epochs-->Assets: No Update
        deactivate Epochs
        deactivate Assets
        Assets->>Epochs: Inject
        activate Assets
        activate Epochs
        Epochs-->Assets: No Update
        deactivate Epochs
        deactivate Assets
        Assets->>Epochs: Inject
        activate Assets
        activate Epochs
        Epochs-->Assets: No Update
        deactivate Epochs
        deactivate Assets
    

Notice the missing update step here, in contrast to the runner-scoped asset.

    flowchart LR
        asset_db -- "inject" --> node_a
        node_a --> ...
        ... --> node_z
    
    

Node

A node-scoped asset serves a similar purpose to an input: its value is initialized anew on every call to a node’s process() method.

    flowchart LR
        asset_db -- "inject" --> node_a
        asset_db -- "inject" --> ...
        asset_db -- "inject" --> node_z
        node_a --> ...
        ... --> node_z
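
The three lifetimes can be compared side by side with a toy simulation. This is only a sketch of the lifetimes described above, not noob's runner: Counter, node, and simulate are hypothetical names, and each epoch simply calls the same node twice.

```python
class Counter:
    """Stand-in for a stateful asset."""

    def __init__(self) -> None:
        self.value = 0


def node(asset: Counter) -> Counter:
    asset.value += 1  # each node call modifies the asset in place
    return asset


def simulate(scope: str, epochs: int = 3, nodes_per_epoch: int = 2) -> list:
    """Return the asset value observed at the end of each epoch."""
    observed = []
    asset = Counter()  # runner scope: created once for the whole run
    for _ in range(epochs):
        if scope == "process":
            asset = Counter()  # recreated at the start of every epoch
        for _ in range(nodes_per_epoch):
            if scope == "node":
                asset = Counter()  # recreated for every node call
            node(asset)
        observed.append(asset.value)
    return observed


print(simulate("runner"))   # [2, 4, 6]
print(simulate("process"))  # [2, 2, 2]
print(simulate("node"))     # [1, 1, 1]
```

Only the runner-scoped counter accumulates across epochs; the process-scoped one accumulates within an epoch, and the node-scoped one never sees a previous call's changes.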
    
    

Function Assets

Here’s an example of an asset defined by a function:

import numpy as np

def asset_func(x, y) -> np.ndarray:
    return np.random.random((x, y))  # x-by-y array of random floats

The spec below declares the output of the above function an asset:

assets:
  array: # Unique asset ID
    type: my_package.assets.asset_func  # Absolute Python path
    scope: runner
    params:
      x: 2
      y: 5
    depends: node.signal  # Roundtrip endpoint

The spec is scaffolded almost identically to a node’s. array is the unique asset ID (no two assets may share one), and type is the absolute Python path to the function. The remaining fields, however, diverge from the Node spec. All function parameters must be defined under params.
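
A quick way to check that rule before handing a spec to noob is to compare the params mapping against the callable's signature. The helper below is a hypothetical pre-flight check built on the standard inspect module; it is not part of noob, and the stand-in asset_func returns a nested list so the sketch has no third-party dependencies.

```python
import inspect
from typing import Any, Callable


def missing_params(factory: Callable[..., Any], params: dict) -> list:
    """List parameters of `factory` that `params` does not define (hypothetical helper)."""
    names = inspect.signature(factory).parameters
    return [name for name in names if name not in params]


def asset_func(x, y):
    # Stand-in for the function above; builds an x-by-y nested list of zeros.
    return [[0.0] * y for _ in range(x)]


print(missing_params(asset_func, {"x": 2, "y": 5}))  # []
print(missing_params(asset_func, {"x": 2}))          # ['y']
```

The same check works for a class, since inspect.signature on a class reports the parameters of its __init__ without self.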

Class Assets

Here’s an example of an asset defined by a class:

class AssetCls:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def some_method(self): ...

The spec below declares an instance of the above class an asset:

assets:
  array: # Unique asset ID
    type: my_package.assets.AssetCls  # Absolute Python path
    scope: runner
    params:
      x: 2
      y: 5
    depends: node.signal  # Roundtrip endpoint

The format does not change much for a class-based asset. As with function-based assets, all __init__ parameters must be defined in the params section.

How do you use an Asset?

Once defined, an asset can be listed in the depends of any node and used like any regular Python object:

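from typing import Annotated

import numpy as np
from noob import Name  # import path assumed; Name labels the node's output signal


def use_func_asset(
        param, event, asset: np.ndarray
) -> Annotated[np.ndarray, Name("modified_asset")]:
    asset[0][0] = asset.size + param * event  # modifies the asset in place
    return asset


def use_cls_asset(
        param, event, asset: AssetCls
) -> Annotated[AssetCls, Name("modified_asset")]:
    asset.x += param + event * asset.y  # modifies the asset in place
    return asset

# Only use_cls_asset is demonstrated here, for brevity.
assets: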
  array: # Unique asset ID
    type: my_package.assets.AssetCls  # Absolute Python path
    scope: runner
    params:
      x: 2
      y: 5
    depends: b.modified_asset  # Roundtrip endpoint 

nodes:
  a:
    type: my_package.nodes.constant  # a node that outputs a constant value
    params:
      value: 1
  b:
    type: my_package.nodes.use_cls_asset
    params:
      param: 2
    depends:
      - event: a.output
      - asset: assets.array

Here, node b outputs an instance of AssetCls whose attribute x has been updated from 2 to 2 + (2 + 1 * 5) = 9.
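
The arithmetic can be checked by calling the node function by hand, outside of noob. The sketch below repeats the class and node definitions without the type annotations so that it runs on its own. The second iteration assumes, as described for runner scope, that the updated instance is carried into the next epoch.

```python
class AssetCls:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def use_cls_asset(param, event, asset):
    asset.x += param + event * asset.y
    return asset


asset = AssetCls(x=2, y=5)  # params from the asset spec
xs = []
for _ in range(2):  # two epochs
    # node b: param comes from its spec, event is a.output (the constant 1)
    asset = use_cls_asset(param=2, event=1, asset=asset)
    xs.append(asset.x)

print(xs)  # [9, 16]
```

Each epoch adds param + event * y = 7 to x, so a runner-scoped asset yields 9 after the first epoch and 16 after the second.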