MS Fabric RTI Data Product Run Pattern

In a Data-as-a-Product operating model, the biggest orchestration mistake is treating the platform like a single “master scheduler.” It scales poorly, couples domains, and turns every change into a coordination exercise.

A cleaner approach is signal-driven triggering:

  • Upstream data products publish facts about readiness (not “please run job X at 02:00”).
  • Downstream data products react only when the contract is compatible and the data is actually usable.
  • The platform provides governance, observability, and safety rails—without taking ownership away from domains.

Microsoft Fabric’s Real-Time Intelligence (RTI) workload provides a pragmatic backbone for this pattern using:
Eventstream (ingestion) → Eventhouse (event storage + query) → Activator (rules + actions), complemented by User-Defined Functions in Eventhouse to standardize and enrich readiness signals.

Your data products remain linked through Output Ports implemented as Lakehouse assets (tables/files), which are consumed contract-first.

The pattern at a glance

Components and responsibilities

1) Data Products (Domain-owned)

  • Produce data and publish it to an Output Port (Lakehouse-backed).
  • Emit a “Snapshot Sealed / Output Ready” event when the output is consumable.

2) Output Ports (Contract boundary)

  • Stable identifier + version.
  • Clear semantic meaning and expectations (schema/quality/freshness).

3) Eventstream (Ingestion)

  • Captures readiness events from many domains/products and forwards them reliably into RTI.

4) Eventhouse (Event storage + query)

  • Persists events and enables fast querying.
  • Hosts user-defined functions to normalize events, compute derived fields, and keep rules consistent.

5) Activator (Rules + actions)

  • Watches for conditions (e.g., “Gold output sealed and quality passed”).
  • Triggers downstream actions (webhook, pipeline kickoff, notification, ticket, Service Bus message—whatever your platform standard is).

6) Platform Management (Central governance broker)

  • Owns the policy gates: allowed dependencies, no cycles, naming rules, entitlement rules.
  • Provides the canonical registry of products, output ports, versions, and subscriptions.
  • Optionally exposes a single “Trigger API” so Activator doesn’t need to know downstream implementation details.

End-to-end flow (what actually happens)

Step 1 — Upstream publishes to its Output Port (Lakehouse)

The upstream product writes its output (table/file) into the Lakehouse location representing the Output Port.

Key point: Publishing data and emitting the “ready” signal are separate concerns. The data is the product artifact; the event is the operational signal.

Step 2 — Upstream emits a “SnapshotSealed” event

Once data is successfully written and validated, the upstream emits an event such as:

{
  "eventType": "OutputPortSnapshotSealed",
  "eventId": "b7a9b2a0-7f2a-4b5f-9e74-3e9d4d2d1b1a",
  "occurredAt": "2025-12-18T10:15:42Z",
  "producer": {
    "dataProduct": "dp.customer.customer360",
    "domain": "customer"
  },
  "outputPort": {
    "key": "op.customer.customer360.profile",
    "version": "1.2.0",
    "stage": "prod"
  },
  "snapshot": {
    "watermark": "2025-12-18T10:00:00Z",
    "recordCount": 1842231,
    "schemaHash": "sha256:9c8e…",
    "quality": {
      "status": "passed",
      "ruleset": "gold-minimum",
      "score": 0.98
    }
  }
}

Practical advice:

  • Always include an idempotency key (eventId) and a stable snapshot identifier (watermark, batchId, partition).
  • Include schemaHash and contract version to enable safe downstream validation.

Step 3 — Eventstream ingests and routes into Eventhouse

Eventstream receives the event (from your chosen source) and lands it in Eventhouse for storage and querying.
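
One way to prepare the Eventhouse side is a landing table that stores each event as a single dynamic column. A minimal sketch, assuming the table and mapping names RawSignals and RawSignalsMapping (both illustrative; configured as the Eventstream destination):

// Landing table for readiness signals; the name "RawSignals" is an assumption
.create table RawSignals (EventData: dynamic)

// JSON ingestion mapping that keeps the whole event document in one dynamic column
.create table RawSignals ingestion json mapping "RawSignalsMapping"
'[{"column":"EventData","path":"$","datatype":"dynamic"}]'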

Step 4 — Eventhouse normalizes via a User-Defined Function

Instead of duplicating transformation logic across rules, define a reusable function that standardizes events.

Example (illustrative) KQL function, taking the raw events as tabular input and flattening the dynamic EventData column:

.create-or-alter function with (folder="dp-signals") NormalizeSnapshotSealed(T:(EventData:dynamic))
{
  // Flatten the raw event document into a stable, typed signal shape
  T
  | project
      EventId = tostring(EventData.eventId),
      OccurredAt = todatetime(EventData.occurredAt),
      ProducerDp = tostring(EventData.producer.dataProduct),
      OutputPortKey = tostring(EventData.outputPort.key),
      OutputPortVersion = tostring(EventData.outputPort.version),
      Stage = tostring(EventData.outputPort.stage),
      Watermark = todatetime(EventData.snapshot.watermark),
      SchemaHash = tostring(EventData.snapshot.schemaHash),
      QualityStatus = tostring(EventData.snapshot.quality.status),
      RecordCount = tolong(EventData.snapshot.recordCount)
}

This gives you a consistent “signal table shape” for:

  • Activator rules
  • ad-hoc debugging queries
  • dashboards and SLO tracking (“event-to-ready”, “ready-to-consumed”)
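
For example, an ad-hoc debugging query over the normalized shape (using the assumed RawSignals landing table from Step 3):

// Most recent sealed snapshots for one output port
RawSignals
| invoke NormalizeSnapshotSealed()
| where OutputPortKey == "op.customer.customer360.profile"
| top 20 by OccurredAt desc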

Step 5 — Activator evaluates rules and triggers the next product

A typical rule is:
IF (OutputPortSnapshotSealed AND QualityStatus == "passed" AND Stage == "prod" AND the contract version is compatible)
THEN trigger a downstream product run for all subscribed consumers.
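
Expressed against the normalized signal shape, the condition side of that rule could look like the following sketch (the table name and the version check are illustrative simplifications):

// Candidate trigger events: sealed, quality-passed, prod-stage, compatible major version
RawSignals
| invoke NormalizeSnapshotSealed()
| where QualityStatus == "passed"
| where Stage == "prod"
| where OutputPortVersion startswith "1."   // naive check; real compatibility policy belongs in Platform Management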

In practice, the cleanest implementation is:

  • Activator calls the Platform Management Trigger API with the normalized event.
  • Platform Management resolves who is subscribed and what is allowed.
  • Platform Management publishes a downstream trigger message (for example onto your existing messaging backbone) or invokes a specific downstream job.

This keeps Activator simple and prevents it from becoming the new “central orchestrator.”
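
As a sketch, the downstream trigger message that Platform Management publishes could look like this (the field names and the consumer product are illustrative, not a Fabric or Service Bus standard):

{
  "messageType": "DownstreamRunRequested",
  "causationEventId": "b7a9b2a0-7f2a-4b5f-9e74-3e9d4d2d1b1a",
  "outputPort": {
    "key": "op.customer.customer360.profile",
    "version": "1.2.0",
    "stage": "prod"
  },
  "consumer": {
    "dataProduct": "dp.marketing.campaign-insights"
  },
  "watermark": "2025-12-18T10:00:00Z"
}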

Step 6 — Downstream product executes and reads the Output Port

The downstream product runs its own pipelines/notebooks and reads the upstream Output Port from the Lakehouse using the contract identifiers (key + version + stage).

Why this pattern works for Data Products

It enforces product boundaries

Downstream teams depend on:

  • the Output Port contract
  • the readiness signal

They do not depend on upstream internals.

It reduces coupling and incident blast radius

Instead of chained pipelines with hidden dependencies, you get:

  • explicit subscriptions
  • observable edges in the product graph
  • policy-enforced constraints (including cycle prevention)

It unlocks real operational governance

Eventhouse becomes your operational truth:

  • which products publish reliably
  • how often outputs are “sealed”
  • how long downstream consumers take to react
  • where quality gates fail
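
A simple SLO-style query over the normalized signals (again assuming the RawSignals landing table) makes this operational truth concrete:

// Publish frequency and recency per output port over the last 7 days
RawSignals
| invoke NormalizeSnapshotSealed()
| where OccurredAt > ago(7d)
| summarize SealedCount = count(), LastSealed = max(OccurredAt) by ProducerDp, OutputPortKey
| order by LastSealed desc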

Governance “must-haves” (so it stays safe)

  1. No cycles / no self-consumption
  • Platform Management should reject subscriptions that introduce cycles.
  • This is non-negotiable once you go event-driven.
  2. Version compatibility rules
  • Activator triggers should check the contract version and schema hash before running downstream.
  • Treat breaking changes as explicit events, not surprises.
  3. Idempotent triggering
  • Assume events can be duplicated or arrive late.
  • Downstream jobs should ignore an already-processed eventId/watermark (see the sketch after this list).
  4. Stage isolation
  • Never mix dev/test/prod signals.
  • Add the stage to the event and enforce it in policies.
  5. SLOs and back-pressure
  • If an upstream publishes too frequently, downstream might lag.
  • Define a maximum trigger rate and use buffering (Eventhouse) plus controlled replay.
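
For point 3, deduplication can be expressed directly over the normalized signals; a minimal sketch, assuming the RawSignals landing table:

// One trigger candidate per snapshot: duplicates (same eventId) and
// re-emissions (same watermark) collapse to the first occurrence
RawSignals
| invoke NormalizeSnapshotSealed()
| summarize arg_min(OccurredAt, *) by OutputPortKey, Watermark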

Minimal MVP implementation roadmap

MVP (2–3 weeks of effort in most platforms)

  • Define event schema + naming conventions.
  • Emit SnapshotSealed from one upstream product.
  • Land events in Eventhouse.
  • Implement one Activator rule that calls the Platform Management Trigger API.
  • Trigger one downstream product and prove idempotency.

Scale-out

  • Add subscription registry in Platform Management.
  • Add policy gate (cycle detection, medallion rules, entitlements).
  • Add standard UDFs for normalization and derived metrics.
  • Add SLO dashboards (publish frequency, quality pass rate, lead time).

About the author

I’m a data platform leader with 10+ years of experience in data modelling and Business Intelligence. Today, I lead the IT Data Platform at SWICA, working at the intersection of business needs and modern data engineering to turn complex data into reliable, valuable outcomes for the organization—and ultimately for our customers.

In my current role, I’m responsible for the operation and continuous evolution of a future-ready data platform. I focus on building scalable, cloud-based capabilities that enable teams to move faster while staying aligned with governance, security, and quality expectations. My strength lies in translating ambiguity into clear data products, robust pipelines, and BI solutions that people can trust.
