DR-301a · Module 1

Scheduling & Trigger-Based Research


Collection systems run on two clocks. The scheduled clock fires at fixed intervals — check this RSS feed every four hours, pull this API every morning at 6 AM, scrape this page every Monday. The event clock fires in response to external triggers — a competitor published a press release, a patent filing appeared, a pricing page changed. Scheduled collection ensures baseline coverage. Event-triggered collection ensures you do not miss time-sensitive intelligence that arrives between scheduled runs.

Scheduling frequency is a trade-off between freshness and cost. A four-hour collection cycle means you are never more than four hours behind on any source, but it also means six API calls per source per day. Multiply that by a hundred sources and you are making six hundred calls daily. For most competitive intelligence use cases, a daily morning collection is sufficient for stable sources like annual reports and regulatory filings, a four-hour cycle suits moderately active sources like news feeds and blogs, and hourly polling is worth its cost only for fast-moving sources like social media. Match the collection frequency to the source's update velocity, not to your anxiety about missing something.
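The cost side of that trade-off is simple arithmetic, and it helps to compute it before committing to a schedule. The sketch below is a minimal illustration: the function names and the mix of sources per interval are hypothetical, and it assumes one API call per source per poll.

```python
from math import ceil


def calls_per_day(interval_hours: float) -> int:
    """Number of polls in a 24-hour day for a fixed polling interval."""
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    return ceil(24 / interval_hours)


def daily_calls(sources_by_interval: dict[float, int]) -> int:
    """Total daily calls, given {interval in hours: number of sources}."""
    return sum(calls_per_day(hours) * count
               for hours, count in sources_by_interval.items())


# The case from the text: 100 sources on a four-hour cycle.
print(calls_per_day(4) * 100)  # 6 calls per source * 100 sources = 600

# Hypothetical mix: 60 daily, 30 four-hourly, and 10 hourly sources.
print(daily_calls({24: 60, 4: 30, 1: 10}))  # 60*1 + 30*6 + 10*24 = 480
```

In this made-up mix, the ten hourly sources account for half of the daily calls, which is why the fastest tier is worth reserving for sources that really do change that often.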

  1. Scheduled Collection: Fixed-interval polling using cron expressions. Daily for stable sources (industry reports, regulatory filings), every four hours for moderate sources (news feeds, blogs), hourly for fast-moving sources (social media, stock data). Each source's frequency is configured independently in the source registry.
  2. Event-Triggered Collection: Webhooks or change-detection scripts that fire when specific events occur. When a competitor's pricing page changes, trigger a full competitive analysis collection. When a patent is filed in your technology area, trigger an IP landscape collection. Events bypass the schedule and collect immediately.
  3. Hybrid Collection: Scheduled baseline with event-triggered supplements. The schedule ensures nothing is missed. The events ensure time-sensitive intelligence is captured immediately. Both feed into the same storage layer and downstream pipeline. Most production systems use the hybrid approach.
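The three modes above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a production scheduler: the source names, the `Source` and `ChangeDetector` classes, and `plan_jobs` are all hypothetical, and the cron matcher handles only the minute and hour fields with `*`, `*/n`, or a single integer. A real deployment would more likely hand the expressions to cron itself or to a scheduling library.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Source:
    name: str  # hypothetical identifier
    cron: str  # five-field cron: minute hour day-of-month month day-of-week


# Illustrative source registry; each source carries its own frequency.
REGISTRY = [
    Source("regulatory_filings", "0 6 * * *"),  # stable: daily at 06:00
    Source("news_feed", "0 */4 * * *"),         # moderate: every four hours
    Source("social_media", "0 * * * *"),        # fast-moving: hourly
]


def _field_matches(expr: str, value: int) -> bool:
    """Match one cron field. Supports only '*', '*/n', and a single integer."""
    if expr == "*":
        return True
    if expr.startswith("*/"):
        return value % int(expr[2:]) == 0
    return value == int(expr)


def is_due(cron: str, now: datetime) -> bool:
    """True if `now` falls on a minute that the expression selects."""
    minute, hour, day, month, weekday = cron.split()
    if (day, month, weekday) != ("*", "*", "*"):
        raise ValueError("this sketch only handles the minute and hour fields")
    return _field_matches(minute, now.minute) and _field_matches(hour, now.hour)


class ChangeDetector:
    """Remembers a SHA-256 digest per source and reports when it changes."""

    def __init__(self) -> None:
        self._digests: dict[str, str] = {}

    def changed(self, name: str, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        previous = self._digests.get(name)
        self._digests[name] = digest
        # The first observation only sets the baseline; it is not an event.
        return previous is not None and previous != digest


def plan_jobs(now: datetime, events: list[str]) -> list[tuple[str, str]]:
    """Hybrid plan: scheduled jobs that are due, plus event-triggered jobs."""
    jobs = [(s.name, "scheduled") for s in REGISTRY if is_due(s.cron, now)]
    jobs += [(name, "event") for name in events]
    return jobs


detector = ChangeDetector()
detector.changed("pricing_page", b"Pro plan: $49")  # establishes the baseline
events = []
if detector.changed("pricing_page", b"Pro plan: $59"):  # content differs
    events.append("pricing_page")

# At 08:00 the four-hourly and hourly sources are due; the daily one is not.
print(plan_jobs(datetime(2024, 1, 1, 8, 0), events))
```

Scheduled and event-triggered jobs come back in one list, mirroring the point that both paths feed the same storage layer and downstream pipeline. Hashing the raw page bytes is the crudest form of change detection; pages with timestamps or rotating markup would need the relevant content extracted before hashing.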