Related Projects
Streaming processors
-
Nice examples of what pipeline systems need, even if they're not exactly how we’d do them.
See, e.g., “zip” vs “combine_latest”.
-
Good discussion of design, even though the application is very different: massive scale, with simple one-step processing replicated over many machines.
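To make the “zip” vs “combine_latest” distinction mentioned above concrete, here is a minimal sketch in plain Python. It does not use any reactive library: the function names `zip_events` and `combine_latest_events` and the `(source, value)` event format are assumptions made for illustration, following the usual meaning of the two operators (pair items by position vs. re-emit on every update with the most recent value from each source).

```python
from collections import deque


def zip_events(events):
    """Pair the n-th value from source "a" with the n-th value from source "b"."""
    queues = {"a": deque(), "b": deque()}
    out = []
    for source, value in events:
        queues[source].append(value)
        # Emit only when both sources have an unpaired value waiting.
        if queues["a"] and queues["b"]:
            out.append((queues["a"].popleft(), queues["b"].popleft()))
    return out


def combine_latest_events(events):
    """Emit the latest (a, b) pair whenever either source produces a value."""
    latest = {}
    out = []
    for source, value in events:
        latest[source] = value
        # Nothing is emitted until every source has produced at least once.
        if len(latest) == 2:
            out.append((latest["a"], latest["b"]))
    return out


# Hypothetical interleaving of two sources, in arrival order.
events = [("a", 1), ("a", 2), ("b", "x"), ("a", 3), ("b", "y")]

print(zip_events(events))             # [(1, 'x'), (2, 'y')]
print(combine_latest_events(events))  # [(2, 'x'), (3, 'x'), (3, 'y')]
```

The zip variant never drops or repeats a value but has to buffer the faster source, while the combine-latest variant never buffers but can skip values (`1` is never emitted) and repeat them (`'x'` appears twice). Which of those trade-offs a pipeline wants is exactly the kind of design question these examples raise.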
Batch-based
-
Code locations are an interesting way of handling multiple conflicting dependency trees/environments in the same pipeline: “just don’t specify the dependencies” (except in Dagster+, where the cloud is the dependencies!) and specify the venv instead. Each venv is independent and integrated over RPC.
Some analogies in the division of labor: op -> node, graph -> tube, job -> tube + runner (not sure), i/o manager -> store…
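As a rough illustration of the “each venv is independent and integrated over RPC” idea above, the sketch below runs one step in a separate interpreter process and exchanges JSON over stdin/stdout. This is not how Dagster implements code locations; it is a minimal stand-in under stated assumptions. `STEP_SOURCE`, `run_in_environment`, and the payload shape are hypothetical, and `sys.executable` stands in for the path to a venv's own `python` binary.

```python
import json
import subprocess
import sys

# Hypothetical step that would live inside the isolated environment.
# It reads a JSON payload from stdin and writes a JSON result to stdout.
STEP_SOURCE = """
import json, sys
payload = json.load(sys.stdin)
json.dump({"total": sum(payload["values"])}, sys.stdout)
"""


def run_in_environment(python_executable, payload):
    """Run the step with the given interpreter and return its decoded result."""
    result = subprocess.run(
        [python_executable, "-c", STEP_SOURCE],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return json.loads(result.stdout)


# In a real setup this would be the path to a specific venv's interpreter
# (an assumption for illustration); here we reuse the current one.
output = run_in_environment(sys.executable, {"values": [1, 2, 3]})
print(output)  # {'total': 6}
```

Because the caller only ever sees serialized inputs and outputs, the two sides never need to import each other's packages, which is what lets conflicting dependency trees coexist in one pipeline.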