DR-301i · Module 1

Scheduling & Orchestration

The orchestrator is the conductor that coordinates the pipeline's five components. It manages the execution schedule: which collectors run when, how the normalizer processes the queue, when the analyzer runs its batch operations, and when the synthesizer produces its output. It also enforces dependencies between stages. The normalizer cannot run until the collector has finished, and the analyzer cannot run until the normalizer has processed the current batch. The synthesizer is the exception: it runs on a cadence independent of collection and processes whatever analyzed data is available.
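The dependency ordering described above can be sketched as a toy in-memory pipeline. This is a minimal illustration, not a reference implementation: the `Pipeline` class, its method names, and the trivial "normalization" (trim and lowercase) are all assumptions made for the example.

```python
from collections import deque


class Pipeline:
    """Toy in-memory pipeline; every name here is hypothetical."""

    def __init__(self):
        self.staging = deque()   # raw items waiting for the normalizer
        self.normalized = []     # current batch waiting for the analyzer
        self.analyzed = []       # results available to the synthesizer

    def collect(self, source, items):
        # Collectors only write to staging; they never call later stages.
        self.staging.extend((source, item) for item in items)

    def normalize(self):
        # Runs after collection and drains everything that was staged.
        while self.staging:
            source, item = self.staging.popleft()
            self.normalized.append(f"{source}:{item.strip().lower()}")

    def analyze(self):
        # Refuses to run while the current batch is still unnormalized.
        if self.staging:
            raise RuntimeError("normalizer has not finished the current batch")
        self.analyzed.extend(self.normalized)
        self.normalized.clear()

    def synthesize(self):
        # Independent cadence: works on whatever analyzed data exists now.
        return f"{len(self.analyzed)} analyzed items"


def run_cycle(pipeline, collected):
    """One collection cycle: collect, then normalize, then analyze."""
    for source, items in collected.items():
        pipeline.collect(source, items)
    pipeline.normalize()
    pipeline.analyze()
```

Note that `run_cycle` never calls `synthesize`. In this sketch the synthesizer would be triggered separately, on its own schedule, and simply reads whatever `analyzed` contains at that moment.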

Do This

  • Run collectors on source-specific schedules — match frequency to the source's update velocity
  • Run the normalizer as a continuous process that drains the staging queue — not on a fixed schedule
  • Run the synthesizer on the delivery cadence, not the collection cadence — synthesis should match consumer needs
  • Implement dead letter queues for data that fails processing — investigate later, do not block the pipeline
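As a rough sketch of the queue-draining and dead letter guidance above, the example below processes every staged item and parks failures for later investigation instead of stopping. The `normalize` rule (a JSON record with a `title` field) and the shape of the dead letter entries are assumptions chosen for illustration only.

```python
import json
from collections import deque


def normalize(raw):
    """Hypothetical normalizer: parse a JSON record and require a 'title'."""
    record = json.loads(raw)
    return {"title": record["title"].strip()}


def drain(staging, dead_letters):
    """Process every staged item; park failures instead of stopping."""
    normalized = []
    while staging:
        raw = staging.popleft()
        try:
            normalized.append(normalize(raw))
        except (ValueError, KeyError, TypeError, AttributeError) as exc:
            # Keep the original payload and the reason so it can be
            # investigated later without blocking the rest of the queue.
            dead_letters.append({"raw": raw, "error": repr(exc)})
    return normalized
```

A usage example under the same assumptions:

```python
staging = deque(['{"title": " Outage report "}', "not json", '{"name": "x"}'])
dead_letters = []
clean = drain(staging, dead_letters)
# clean holds the one valid record; dead_letters holds the two failures.
```

In a long-running deployment `drain` would be called in a loop (or replaced by a blocking queue consumer) rather than triggered on a timer, which is what makes the normalizer continuous rather than scheduled.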

Avoid This

  • Run all pipeline components on the same schedule — they have different optimal cadences
  • Block the pipeline when one source fails — isolate the failure and continue processing other sources
  • Run synthesis on every collection cycle — synthesis is expensive and should match the delivery cadence
  • Silently drop data that fails normalization — it might be the most important signal in the batch
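Two of the pitfalls above, a shared schedule and a pipeline that blocks on one failed source, can be avoided with very little code. The sketch below is illustrative: the component names, the interval values in `INTERVALS`, and the collector callables are all hypothetical.

```python
# Hypothetical cadences in seconds: each component gets its own interval.
INTERVALS = {
    "collector:rss": 300,      # fast-moving source, polled every 5 minutes
    "collector:api": 3600,     # slower source, polled hourly
    "synthesizer": 86400,      # delivery cadence, once a day
}


def due(last_runs, now, intervals=INTERVALS):
    """Return the components whose interval has elapsed since their last run."""
    return [
        name
        for name, interval in intervals.items()
        # A component that has never run is always due.
        if now - last_runs.get(name, float("-inf")) >= interval
    ]


def collect_all(collectors):
    """Run each collector; a failure in one source does not stop the others."""
    results, failures = {}, {}
    for source, collect in collectors.items():
        try:
            results[source] = collect()
        except Exception as exc:  # broad on purpose: isolate any source failure
            failures[source] = repr(exc)
    return results, failures
```

The `failures` mapping is returned rather than swallowed, so the orchestrator can log or alert on the broken source while the healthy sources continue through the pipeline.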