DR-301i · Module 1
Scheduling & Orchestration
The orchestrator is the conductor that coordinates the pipeline's five components. It manages the execution schedule: which collectors run when, how the normalizer processes the queue, when the analyzer runs its batch operations, and when the synthesizer produces its output. It also enforces the dependencies between stages. The normalizer cannot process a batch until the collector has finished with it, and the analyzer cannot run until the normalizer has processed the current batch. The synthesizer is the exception: it runs on a cadence independent of collection and processes whatever analyzed data is available at that moment.
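The dependency chain described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the names `run_collection_cycle`, `synthesize`, and `analyzed_store` are hypothetical, and the stages are modeled as plain callables running in a single process.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def run_collection_cycle(
    collectors: Dict[str, Callable[[], Iterable[dict]]],
    normalize: Callable[[dict], dict],
    analyze: Callable[[List[dict]], List[dict]],
    analyzed_store: List[dict],
) -> None:
    """Run one collect -> normalize -> analyze pass in dependency order."""
    staging = deque()

    # Stage 1: every collector finishes before normalization starts.
    for collect in collectors.values():
        staging.extend(collect())

    # Stage 2: the normalizer drains the staging queue for this batch.
    batch = []
    while staging:
        batch.append(normalize(staging.popleft()))

    # Stage 3: the analyzer only sees the fully normalized batch.
    analyzed_store.extend(analyze(batch))


def synthesize(analyzed_store: List[dict]) -> dict:
    """Runs on its own cadence and uses whatever analyzed data exists now."""
    return {"items": len(analyzed_store)}
```

Because `synthesize` only reads `analyzed_store`, it can be called at any time, including before the first collection cycle has completed.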
Do This
- Run collectors on source-specific schedules — match frequency to the source's update velocity
- Run the normalizer as a continuous process that drains the staging queue — not on a fixed schedule
- Run the synthesizer on the delivery cadence, not the collection cadence — synthesis should match consumer needs
- Implement dead letter queues for data that fails processing — investigate later, do not block the pipeline
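As a sketch of the dead letter queue recommendation, the loop below drains a staging queue and sets failed records aside instead of stopping. The function name `drain_staging` and the in-memory `dead_letters` list are assumptions for illustration; a real pipeline would more likely use a durable queue or table.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def drain_staging(
    staging: Deque[dict],
    normalize: Callable[[dict], dict],
    dead_letters: List[Tuple[dict, str]],
) -> List[dict]:
    """Normalize everything in the staging queue without blocking on failures."""
    normalized = []
    while staging:
        record = staging.popleft()
        try:
            normalized.append(normalize(record))
        except Exception as exc:
            # Keep the raw record and the error for later investigation
            # instead of dropping it or halting the pipeline.
            dead_letters.append((record, repr(exc)))
    return normalized
```

The failed record is stored unmodified alongside the error text, so it can be inspected and replayed once the cause is understood.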
Avoid This
- Running all pipeline components on the same schedule: they have different optimal cadences
- Blocking the pipeline when one source fails: isolate the failure and continue processing other sources
- Running synthesis on every collection cycle: synthesis is expensive and should match the delivery cadence
- Silently dropping data that fails normalization: it might be the most important signal in the batch
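Two of the points above, source-specific schedules and failure isolation, can be combined in one small scheduler tick. This is a sketch under stated assumptions: `run_due_collectors` is a hypothetical helper, intervals are given in seconds, and the caller supplies the current time so the logic stays easy to test.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def run_due_collectors(
    collectors: Dict[str, Tuple[float, Callable[[], Iterable[dict]]]],
    last_run: Dict[str, float],
    now: float,
) -> Tuple[List[dict], Dict[str, str]]:
    """Run only the collectors whose interval has elapsed; isolate failures."""
    records: List[dict] = []
    failures: Dict[str, str] = {}
    for name, (interval, collect) in collectors.items():
        # A source that has never run successfully is always due.
        if now - last_run.get(name, float("-inf")) < interval:
            continue
        try:
            records.extend(collect())
            last_run[name] = now
        except Exception as exc:
            # Record the failure and move on to the next source.
            # last_run is left untouched, so the source is retried next tick.
            failures[name] = repr(exc)
    return records, failures
```

A failing source shows up in `failures` while the other sources still deliver their records, and each source keeps its own interval.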