Integrating RocqStat and VectorCAST into CI: automated WCET checks for safety‑critical builds
Integrate RocqStat and VectorCAST into CI to automate WCET checks. Practical steps, pipeline templates, and 2026 trends for safety‑critical automotive builds.
Stop discovering timing regressions at release time
If you build safety‑critical automotive or embedded software, you know the pain: a functional test passes in CI, but a late-stage integration or HIL run shows a missed real‑time deadline — and suddenly a sprint or release is derailed. Fragmented toolchains, inconsistent instrumented builds, and ad‑hoc timing checks make worst‑case execution time (WCET) surprises expensive.
This article is a practical, hands‑on guide for integrating RocqStat and VectorCAST into CI pipelines so WCET checks run automatically, regressions fail the build, and timing evidence is archived for audits. Target audience: embedded and automotive engineers, CI/CD leads, and QA owners who must make timing verification part of the pipeline.
Why timing analysis belongs in CI in 2026
From late 2024 through 2026 the industry shifted from ad‑hoc, offline WCET processes to continuous timing verification. Three forces drive this change:
- Increased complexity of E/E architectures and centralized compute in vehicles demands earlier timing validation.
- Regulators and OEMs expect tool evidence and traceable artifacts for ISO 26262 and SOTIF workflows — CI pipelines provide authoritative provenance.
- Tool vendors, including VectorCAST's publisher and timing tool providers such as RocqStat, improved CI APIs and container packaging, making automated timing checks practical on standard CI runners or cloud HIL fleets.
High‑level integration pattern
Integrate WCET verification as a pipeline stage following the build and unit/integration testing stages. The canonical stages look like this:
- Checkout and toolchain setup (containerized tool image)
- Compile: instrumented & production builds (two artifacts)
- VectorCAST functional and unit tests (coverage + trace capture)
- Exercise on target (emulator or HIL) to collect execution traces
- RocqStat timing analysis against collected traces + static info
- Threshold check: pass/fail build; produce JUnit/XML and HTML report
- Archive artifacts and evidence for audits
Key concepts
- Instrumented vs production builds — Instrumented builds capture traces but can alter timing, so never read production WCETs from them directly. Use them to capture high‑coverage traces, then combine those traces with static analysis to derive WCET for the non‑instrumented production binary. Keep both targets in your build system so the pair stays in sync (a minimal Makefile sketch follows this list).
- Deterministic runners — Containerize tool versions and pin CPU configurations for emulator runs so results are reproducible across CI runs. Benchmark candidate runner hardware once, then standardize on a single configuration.
- Gating thresholds — Keep strict, versioned thresholds in the repo and treat the WCET budget as a code artifact under review, the same way modern toolchains treat policies and budgets as code.
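As a concrete starting point, here is a minimal Makefile sketch for the instrumented/production pair. The BUILD variable, toolchain name, and instrumentation flags are assumptions to adapt to your project (recipe lines must be tab‑indented):

# Minimal sketch — assumes an ARM cross toolchain and function-level
# instrumentation hooks; adapt names, flags, and sources to your project.
BUILD ?= production
CC    := arm-none-eabi-gcc
SRCS  := $(wildcard src/*.c)

ifeq ($(BUILD),instrumented)
  # Instrumentation perturbs timing; use only for trace/coverage capture
  CFLAGS := -O2 -g -finstrument-functions -DTRACE_ENABLED
else
  CFLAGS := -O2 -g
endif

build/$(BUILD).elf: $(SRCS)
	mkdir -p build
	$(CC) $(CFLAGS) $^ -o $@

.PHONY: clean
clean:
	rm -rf build

Writing each variant to its own path (build/instrumented.elf, build/production.elf) lets the CI jobs below keep both artifacts side by side.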
Concrete CI examples
Below are actionable pipeline snippets and scripts you can adapt for GitHub Actions, GitLab CI, or Jenkins. The examples assume your repo includes a Docker image that bundles the cross‑compiler, VectorCAST automation, and the RocqStat CLI utilities.
1) GitHub Actions: run VectorCAST tests and RocqStat
name: CI-WCET
on: [push, pull_request]
jobs:
  build-and-wcet:
    runs-on: ubuntu-22.04
    container:
      image: ghcr.io/yourorg/embedded-ci:2026.01
    steps:
      - uses: actions/checkout@v4
      - name: Set up toolchain
        # 'export PATH' does not persist across steps; GITHUB_PATH does
        run: echo "/opt/crosstools/bin" >> "$GITHUB_PATH"
      - name: Build instrumented
        run: make clean && make BUILD=instrumented -j$(nproc)
      - name: Build production
        # no 'make clean' here: it would delete the instrumented binary used below
        run: make BUILD=production -j$(nproc)
      - name: Run VectorCAST unit tests
        run: vectorcast-cli run --project tests/vcast_project --output traces/vcast_trace.xml
      - name: Run on emulator to collect traces
        run: ./scripts/run_on_qemu.sh build/instrumented.elf --trace traces/qemu_trace.log
      - name: Run RocqStat timing analysis
        run: rocqstat analyze --binary build/production.elf --traces traces/* --output reports/rocqstat.json
      - name: WCET threshold check
        run: python tools/check_wcet.py reports/rocqstat.json thresholds/wcet_thresholds.yaml
The important points: keep tools in a container, produce both instrumented and production artifacts, and run timing analysis as a reproducible step. Make sure the traces and reports are captured as build artifacts, with a predictable naming and tagging scheme so audit reviewers can find evidence quickly.
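The workflow above stops at the threshold check. A sketch of the archival step, using the standard upload-artifact action (if: always() preserves evidence even when the gate fails):

      - name: Archive timing evidence
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: wcet-evidence-${{ github.sha }}
          path: |
            traces/
            reports/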
2) GitLab CI: gating on WCET
stages:
  - build
  - test
  - wcet
  - archive

build-prod:
  stage: build
  image: registry.gitlab.com/yourorg/embedded-ci:2026.01
  script:
    - make BUILD=production
  artifacts:
    paths:
      - build/production.elf

unit-tests:
  stage: test
  image: registry.gitlab.com/yourorg/embedded-ci:2026.01
  script:
    - make BUILD=instrumented
    - vectorcast-cli run --project tests/vcast_project --output traces/vcast_trace.xml
  artifacts:
    paths:
      - build/instrumented.elf   # needed by the wcet stage's emulator run
      - traces/*

wcet-analysis:
  stage: wcet
  image: registry.gitlab.com/yourorg/embedded-ci:2026.01
  script:
    - ./scripts/run_on_qemu.sh build/instrumented.elf --trace traces/qemu_trace.log
    - rocqstat analyze --binary build/production.elf --traces traces/* --output reports/rocqstat.json
    - python tools/check_wcet.py reports/rocqstat.json thresholds/wcet_thresholds.yaml
  artifacts:
    paths:
      - reports/*
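The archive stage declared above has no job yet. One minimal sketch: because GitLab restores artifacts from earlier stages into the workspace, a final job can simply re‑publish the evidence with a longer retention:

archive-evidence:
  stage: archive
  image: registry.gitlab.com/yourorg/embedded-ci:2026.01
  script:
    - ls -R reports traces   # artifacts from earlier stages are already restored here
  artifacts:
    expire_in: never         # audit evidence should outlive the default artifact expiry
    paths:
      - reports/*
      - traces/*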
3) Jenkins declarative pipeline
pipeline {
  agent { docker { image 'registry/embedded-ci:2026.01' } }
  stages {
    stage('Build') {
      steps { sh 'make BUILD=production' }
      post { always { archiveArtifacts artifacts: 'build/production.elf' } }
    }
    stage('Unit tests') {
      steps { sh 'make BUILD=instrumented && vectorcast-cli run --project tests/vcast_project --output traces/vcast_trace.xml' }
    }
    stage('WCET') {
      steps {
        sh '''
          ./scripts/run_on_qemu.sh build/instrumented.elf --trace traces/qemu_trace.log
          rocqstat analyze --binary build/production.elf --traces traces/* --output reports/rocqstat.json
          python tools/check_wcet.py reports/rocqstat.json thresholds/wcet_thresholds.yaml
        '''
      }
    }
  }
}
Practical scripts and parsing
You need a small script to parse RocqStat output and fail the pipeline when deadlines are missed. Below is a minimal Python checker that reads the versioned thresholds YAML, converts RocqStat JSON to JUnit XML, and exits non‑zero on violations.
#!/usr/bin/env python3
"""Fail the build when RocqStat WCET results exceed the versioned budgets."""
import json
import os
import sys
from xml.etree.ElementTree import Element, SubElement, tostring

import yaml  # PyYAML; pin it in the CI image

rocqfile, thresholds_file = sys.argv[1], sys.argv[2]

# Budgets live in a reviewed YAML file mapping task name to microseconds
with open(thresholds_file) as f:
    thresholds = yaml.safe_load(f)

with open(rocqfile) as f:
    data = json.load(f)

suite = Element('testsuite', name='wcet')
failed = 0
for entry in data.get('tasks', []):
    name = entry['name']
    wcet = entry['wcet_us']
    case = SubElement(suite, 'testcase', name=name)
    budget = thresholds.get(name)
    if budget is not None and wcet > budget:
        SubElement(case, 'failure', message=f'WCET {wcet}us > budget {budget}us')
        failed += 1

# Emit JUnit XML for CI dashboards and PR gating
junit = tostring(suite).decode()
print(junit)
os.makedirs('reports', exist_ok=True)
with open('reports/wcet_junit.xml', 'w') as f:
    f.write(junit)

if failed:
    sys.exit(2)
Store thresholds in YAML in the repo and update them through code review. That turns timing budgets into traceable artifacts.
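For example, a thresholds file matching the checker above might look like this (task names and budgets are illustrative):

# thresholds/wcet_thresholds.yaml — budgets in microseconds, changed only via reviewed PRs
taskA: 120000
taskB: 50000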
Trace capture strategies
Reliable WCET inference needs high coverage and representative input stimuli. Use a layered approach:
- Unit and integration VectorCAST tests with mocked peripherals to exercise paths deterministically and generate coverage; make these tests part of the standard developer checklist so coverage grows with the codebase.
- Component fuzzing in CI nightlies to explore edge conditions and expand the trace corpus.
- HIL smoke runs on every release candidate or nightly, using a cloud HIL or on‑prem board farm to validate timing with real I/O latency; script the farm orchestration so runs start, collect, and tear down without manual steps.
- Deterministic replay — record inputs during HIL runs and replay them on an emulator to reproduce timing anomalies for analysis (a minimal QEMU wrapper sketch follows this list).
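The scripts/run_on_qemu.sh wrapper referenced in the pipelines is not standardized; here is one sketch for a Cortex‑M target, where the QEMU machine type and trace flags are assumptions to adapt to your board:

#!/usr/bin/env bash
# scripts/run_on_qemu.sh — run a binary under QEMU and capture an execution trace.
# Usage: run_on_qemu.sh <elf> --trace <tracefile>
set -euo pipefail
ELF="$1"; shift
TRACE="traces/qemu_trace.log"
if [ "${1:-}" = "--trace" ]; then TRACE="$2"; fi
mkdir -p "$(dirname "$TRACE")"
# -icount makes instruction counting deterministic across runs;
# -d exec logs each executed translation block to the trace file.
qemu-system-arm -machine mps2-an385 -nographic \
  -icount shift=0,align=off \
  -d exec,nochain -D "$TRACE" \
  -kernel "$ELF"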
Making results trustworthy and auditable
To satisfy safety auditors and OEMs, you must produce reproducible artifacts and a clear chain of custody:
- Archive build artifacts: binary, map files, compile logs, and RocqStat JSON outputs, under a predictable naming and tagging scheme so reviewers can find evidence quickly.
- Store tool versions: pin VectorCAST and RocqStat versions in CI images, record them in the report, and validate the images themselves (checksums, signed registries) so the toolchain is tamper‑evident.
- Attach unique trace IDs to each run so HIL logs, traces and analysis outputs can be correlated (a manifest sketch follows this list).
- Publish a pass/fail artifact (JUnit) for CI dashboards and integrate into PR gating.
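A small helper can generate that chain of custody automatically. The sketch below (a hypothetical tools/make_manifest.py) records a unique run ID, git commit, binary checksum, and tool image tag alongside every analysis run:

#!/usr/bin/env python3
"""Write a run manifest tying traces and reports to the exact commit, binary, and tool image."""
import hashlib
import json
import os
import subprocess
import sys
import uuid

binary = sys.argv[1]  # e.g. build/production.elf

def sha256(path):
    # Stream the file so large binaries are not loaded into memory at once
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()

manifest = {
    'run_id': str(uuid.uuid4()),
    'git_commit': subprocess.check_output(['git', 'rev-parse', 'HEAD'], text=True).strip(),
    'binary': binary,
    'binary_sha256': sha256(binary),
    'ci_image': os.environ.get('CI_IMAGE', 'unknown'),  # record the pinned tool image tag
}
os.makedirs('reports', exist_ok=True)
with open('reports/run_manifest.json', 'w') as f:
    json.dump(manifest, f, indent=2)
print(manifest['run_id'])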
Best practices and advanced strategies
1) Treat timing budgets as code
Keep a thresholds directory under version control and require code review for changes. Use semantic versioning for budgets and run automated diff checks that surface threshold changes to reviewers, as sketched below.
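A minimal version of such a check as a GitHub Actions PR step (the gating branch name and thresholds path are assumptions):

      - name: Flag WCET budget changes for reviewers
        run: |
          git fetch origin main
          if ! git diff --quiet origin/main -- thresholds/; then
            echo "::warning::WCET budgets changed in this PR; extra review required"
            git diff origin/main -- thresholds/
          fi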
2) Separate experimental from gating pipelines
Use feature branches and experimental runs to gather traces from new functionality. Only merge into the gating branch when the team agrees budgets are updated.
3) Hybrid WCET: combine measurements and static inference
Measurements only establish lower bounds on the true WCET. For safety cases, combine RocqStat measurement data with control‑flow and microarchitectural models (cache, pipeline) to estimate safe upper bounds. RocqStat's analysis should accept static CFGs and trace hints to reduce pessimism while remaining safe.
4) Continuous regression detection
Maintain a history of WCET numbers in a time‑series database (Prometheus, InfluxDB). Alert on 5–10% upward trends before a hard threshold trips, and consider lightweight anomaly detection (ARIMA, isolation forests) to catch drift early; pair the metrics with an observability runbook so alerts are actionable. A sketch of exporting per‑task WCET metrics follows.
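As one illustration, per‑task WCET values from the RocqStat JSON can be pushed to a Prometheus Pushgateway with the prometheus_client library; the gateway address and metric name are assumptions:

#!/usr/bin/env python3
"""Export per-task WCET numbers so dashboards can alert on 5-10% upward drift."""
import json
import sys

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

with open(sys.argv[1]) as f:      # reports/rocqstat.json
    data = json.load(f)

registry = CollectorRegistry()
gauge = Gauge('wcet_microseconds', 'WCET per task from RocqStat analysis',
              ['task'], registry=registry)
for entry in data.get('tasks', []):
    gauge.labels(task=entry['name']).set(entry['wcet_us'])

# 'pushgateway.internal:9091' is a placeholder for your metrics endpoint
push_to_gateway('pushgateway.internal:9091', job='ci-wcet', registry=registry)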
5) Cloud HIL and scale
In 2026 more teams use cloud HIL providers. Key considerations: reproducible board images, synchronized firmware versions, and tight SLAs for latency‑sensitive runs. Use HIL only for nightly or release‑candidate runs to reduce cost; rely on emulator runs plus hybrid static analysis for PR gating, and automate HIL orchestration so it never becomes a manual bottleneck.
Common pitfalls and how to avoid them
- Pitfall: Using instrumented execution times directly as production WCETs. Fix: Use instrumented traces for path coverage and feed them into RocqStat to combine with static analysis to compute safe WCET for production binaries.
- Pitfall: Unpinned tool versions causing flakiness. Fix: Bake exact tool versions into CI images and record the checksum of the image used for each run.
- Pitfall: Poor trace coverage. Fix: Automate VectorCAST tests to achieve high path coverage; add fuzzing and scenario permutations to increase observed worst cases.
- Pitfall: No evidence linking trace to artifact. Fix: Use unique run IDs and include the git commit, binary checksum and board firmware version in every report.
Example: a real‑world outcome
A Tier‑1 supplier we advised adopted continuous RocqStat + VectorCAST checks in 2025 across three product lines. After introducing CI gating and nightly HIL runs, they caught a timing regression introduced by a library upgrade within two days of the change, preventing a multi‑week integration delay. Their average time‑to‑detect WCET regressions dropped from 14 days to under 24 hours, and late‑stage rework fell by roughly 60%.
2026 trends & what’s next
Looking ahead through 2026, expect these developments:
- Timing as Code (TaC) — teams will express timing budgets, assumptions and trace harnesses as code artifacts that are versioned and reviewed.
- CI-native timing tools — vendors will ship certified, containerized toolchains and REST APIs for orchestrating HIL farms and retrieving traces programmatically.
- Better hybrid analysis — tighter integrations between dynamic traces (VectorCAST/RocqStat) and static microarchitectural models will reduce pessimism while preserving safety arguments.
- Cloud HIL adoption — managed HIL farms with secure logging and artifact export will make compliance and scaling easier for smaller teams.
"Shift‑left timing checks in CI are no longer optional for safety‑critical teams. They are a practical way to reduce risk and shorten development cycles." — CI/embedded engineering lead
Checklist: get started this week
- Containerize your toolchain (compiler, RocqStat, VectorCAST automation) and pin versions.
- Add an instrumented build target and a production build target to your Make/CMake.
- Create VectorCAST jobs that generate coverage and trace artifacts in CI.
- Automate a target/emulator run that collects traces and links them to the commit.
- Run RocqStat in CI, produce JSON + HTML, and implement a threshold checker that fails PRs on regressions.
- Archive all artifacts and expose JUnit results on the CI dashboard for reviewers.
Actionable templates & next steps
Use the pipeline snippets in this article as a starting point. Adapt the paths and tool invocations to the exact command names your VectorCAST and RocqStat releases provide. If you don't yet have HIL, emulate first with QEMU and extend to cloud HIL later. Prioritize reproducibility: if a run cannot be reproduced from artifacts, it won't survive a safety audit.
Call to action
Ready to stop finding timing surprises at release time? Start by adding a single CI job that runs VectorCAST unit tests, collects traces, and executes RocqStat analysis. Version your thresholds and fail PRs on regressions. If you want a jump‑start, download our CI templates and prebuilt toolchain image (examples tailored for GitHub Actions, GitLab CI, and Jenkins). Integrate timing checks today — reduce risk tomorrow.