# Gate Aggregation Scripts

![Gate Aggregation Scripts](/files/h6MiZUgu4ytvGBtW4u9d)

> **Intro:** The moment a pipeline uses more than one security tool, someone has to decide how the results become a single release decision. Gate aggregation scripts solve that problem explicitly: they ingest scanner outputs, normalize the findings, apply policy, and emit a final pass/fail decision that can block merge or release.
>
> **What this page includes**
>
> * why aggregation is better than scattered fail logic
> * bash and Python examples for multi-tool gate decisions
> * artifact patterns for GitLab release evidence and human review
> * practical rules for exceptions and new-code bias
>
> **Working assumptions**
>
> * every security tool has a different output format and different noise profile
> * release gates should be inspectable and reproducible, not buried in tribal knowledge

## Why aggregate at all?

Without aggregation:

* one job fails on any finding,
* another job only warns,
* Sonar waits asynchronously,
* DAST results live in another format,
* release logic becomes inconsistent and hard to explain.

With aggregation:

* jobs still generate native reports,
* a single policy job decides what counts as blocking,
* the pipeline leaves behind one summary artifact for humans and audits.

## Pattern: tool jobs are evidence producers, not final judges

A practical flow:

1. run Semgrep and Bandit for code findings;
2. query SonarQube quality gate status;
3. run SCA and DAST;
4. collect outputs in JSON, SARIF, XML, or text;
5. aggregate them into one summary;
6. fail the aggregation job only if policy says the release should stop.

## Example GitLab job for aggregation

```yaml
security_gate_aggregate:
  stage: security
  image: python:3.12-alpine
  needs:  # an upstream job that exits non-zero without allow_failure: true prevents this job from starting
    - semgrep_scan
    - bandit_scan
    - sonar_gate
    - dependency_check
    - zap_baseline
  script:
    - python3 snippets/ci/aggregate-security-gate.py
  artifacts:
    when: always  # upload the summary even when the gate fails the job
    paths:
      - security-gate-summary.json
      - security-gate-summary.md
    expire_in: 30 days
```

## Bash example for a simple threshold gate

```bash
#!/usr/bin/env bash
set -euo pipefail

semgrep_file="${1:-semgrep.json}"
bandit_file="${2:-bandit.json}"

critical_count="$(jq '[.results[]? | select(.extra.severity == "ERROR" or .extra.severity == "CRITICAL")] | length' "$semgrep_file")"
high_bandit="$(jq '[.results[]? | select(.issue_severity == "HIGH")] | length' "$bandit_file")"

echo "semgrep_critical=${critical_count}"
echo "bandit_high=${high_bandit}"

if [ "$critical_count" -gt 0 ] || [ "$high_bandit" -gt 0 ]; then
  echo "Security gate failed"
  exit 1
fi

echo "Security gate passed"
```

This works for a small team with two scanners, but it becomes hard to maintain once you add exceptions, changed-file logic, or more tools.

## Python example for a richer policy

```python
#!/usr/bin/env python3
"""Aggregate scanner reports into a single pass/fail gate decision."""
import json
from pathlib import Path

SUMMARY = {
    "tools": {},
    "missing_reports": [],
    "blocking_reasons": [],
    "status": "pass",
}

def load_json(path):
    """Parse a report; record a missing file instead of silently passing."""
    p = Path(path)
    if not p.exists():
        SUMMARY["missing_reports"].append(str(path))
        return {}
    return json.loads(p.read_text())

def semgrep_severity(result):
    return str(result.get("extra", {}).get("severity", "")).upper()

def count_semgrep(data):
    results = data.get("results", [])
    # Legacy Semgrep severities are ERROR/WARNING/INFO; newer rules may use
    # CRITICAL/HIGH/MEDIUM/LOW, so both vocabularies are accepted here.
    high = sum(1 for r in results if semgrep_severity(r) in {"ERROR", "HIGH", "CRITICAL"})
    med = sum(1 for r in results if semgrep_severity(r) in {"WARNING", "MEDIUM"})
    return {"high_or_above": high, "medium": med}

def count_bandit(data):
    results = data.get("results", [])
    high = sum(1 for r in results if r.get("issue_severity") == "HIGH")
    med = sum(1 for r in results if r.get("issue_severity") == "MEDIUM")
    return {"high": high, "medium": med}

def sonar_status(path="sonar-gate.json"):
    # Expects a saved quality gate response containing projectStatus.status.
    data = load_json(path)
    return data.get("projectStatus", {}).get("status", "UNKNOWN")

semgrep = count_semgrep(load_json("semgrep.json"))
bandit = count_bandit(load_json("bandit.json"))
sonar = sonar_status()

SUMMARY["tools"]["semgrep"] = semgrep
SUMMARY["tools"]["bandit"] = bandit
SUMMARY["tools"]["sonar"] = {"status": sonar}

# Fail closed: a missing report blocks the gate rather than counting as clean.
for report in SUMMARY["missing_reports"]:
    SUMMARY["blocking_reasons"].append(f"Required report missing: {report}")
if semgrep["high_or_above"] > 0:
    SUMMARY["blocking_reasons"].append("Semgrep high-or-above findings detected")
if bandit["high"] > 0:
    SUMMARY["blocking_reasons"].append("Bandit high findings detected")
if sonar not in {"OK", "NONE"}:
    SUMMARY["blocking_reasons"].append(f"Sonar quality gate status is {sonar}")

if SUMMARY["blocking_reasons"]:
    SUMMARY["status"] = "fail"

# Write both summaries before exiting so they exist even on failure.
Path("security-gate-summary.json").write_text(json.dumps(SUMMARY, indent=2))
Path("security-gate-summary.md").write_text(
    "# Security Gate Summary\n\n"
    f"- status: **{SUMMARY['status']}**\n"
    f"- blocking reasons: {', '.join(SUMMARY['blocking_reasons']) or 'none'}\n"
)

if SUMMARY["status"] == "fail":
    raise SystemExit(1)
```

## Policy ideas that age well

Good default rules:

* block **new** high/critical issues;
* allow existing debt to remain visible but not silently worsen;
* require explicit review of hotspots;
* allow exceptions only when ticketed and time-bound;
* fail closed when required reports are missing on protected release paths.
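The first two rules can be sketched as a baseline comparison. This is a minimal illustration, not a reference implementation: it assumes findings have already been normalized into dictionaries with `tool`, `rule_id`, `path`, and `severity` keys, and the identity function `finding_key` is a hypothetical choice that a real pipeline would likely replace with a tool-provided fingerprint.

```python
def finding_key(finding):
    """Identity used to match a finding across runs (hypothetical choice)."""
    return (finding["tool"], finding["rule_id"], finding["path"])


def split_new_and_existing(current, baseline):
    """Separate findings absent from the baseline from known debt."""
    baseline_keys = {finding_key(f) for f in baseline}
    new = [f for f in current if finding_key(f) not in baseline_keys]
    existing = [f for f in current if finding_key(f) in baseline_keys]
    return new, existing


def blocking_findings(current, baseline, blocking=frozenset({"HIGH", "CRITICAL"})):
    """Only new findings at a blocking severity stop the release."""
    new, _existing = split_new_and_existing(current, baseline)
    return [f for f in new if f["severity"].upper() in blocking]
```

Existing findings stay in the summary for visibility but do not fail the job, which keeps the gate focused on what the change introduced.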

## Example exception file

```yaml
exceptions:
  - tool: semgrep
    rule_id: python.lang.security.audit.subprocess-shell-true
    scope: services/legacy-worker/**
    reason: "Legacy migration in progress; tracked in SEC-142"
    expires_on: "2026-06-30"
  - tool: dependency-check
    package: "org.example:legacy-xml"
    cve: "CVE-2024-99999"
    reason: "Fix requires vendor patch; compensating controls documented"
    expires_on: "2026-03-31"
```

The aggregator can read this file and reject expired exceptions automatically.
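A rough sketch of that check follows. It assumes the YAML has already been parsed into a list of dictionaries (for example with a YAML library of your choice, which is not shown here) and that findings are normalized dictionaries with `tool`, `path`, and optional `rule_id`, `cve`, and `package` keys. The function names are illustrative, not part of any existing tool.

```python
from datetime import date
from fnmatch import fnmatchcase


def split_exceptions(exceptions, today):
    """Return (active, expired) based on each entry's expires_on date."""
    active, expired = [], []
    for exc in exceptions:
        expires = date.fromisoformat(exc["expires_on"])
        (active if expires >= today else expired).append(exc)
    return active, expired


def is_excepted(finding, active_exceptions):
    """True if any active exception matches the finding on every field it sets."""
    for exc in active_exceptions:
        if exc["tool"] != finding.get("tool"):
            continue
        # Each identifier present in the exception must match exactly.
        if any(key in exc and exc[key] != finding.get(key)
               for key in ("rule_id", "cve", "package")):
            continue
        # scope is treated as a shell-style glob over the finding's path.
        if "scope" in exc and not fnmatchcase(finding.get("path", ""), exc["scope"]):
            continue
        return True
    return False
```

Expired entries are returned rather than dropped so the summary can list them, which makes a lapsed exception visible instead of quietly turning into a failure.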

## GitLab job that publishes an MR-friendly summary

```yaml
security_gate_note:
  stage: security
  image: alpine:3.20
  needs: [security_gate_aggregate]
  script:
    - apk add --no-cache curl jq
    - |
      body="$(cat security-gate-summary.md)"
      echo "Would post note here with GitLab API or glab CLI"
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      when: always  # post the summary even if the aggregate job failed the gate
```

## Combining DAST and SCA inputs

Example conventions:

* `zap-report.json` for ZAP baseline or full scan
* `burp-report.json` for exported Burp Suite results
* `dependency-check-report.json` or XML for SCA

A good aggregator does not need every tool to use the same schema. It only needs a documented adapter for each source.
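One way to picture the adapter idea is a small function per tool that maps its report into a shared finding shape. The sketch below is an assumption-laden illustration: the field names (`site`, `alerts`, `riskcode`, `pluginid` for ZAP; `dependencies`, `vulnerabilities`, `severity` for Dependency-Check) reflect commonly seen report layouts and should be verified against the reports your tool versions actually emit. The normalized keys are hypothetical.

```python
# Assumed mapping of ZAP numeric risk codes to severity labels.
ZAP_RISK = {"0": "INFO", "1": "LOW", "2": "MEDIUM", "3": "HIGH"}


def from_zap(report):
    """Normalize a ZAP JSON report (assumed layout) into common findings."""
    sites = report.get("site", [])
    if isinstance(sites, dict):  # tolerate a single site object
        sites = [sites]
    findings = []
    for site in sites:
        for alert in site.get("alerts", []):
            findings.append({
                "tool": "zap",
                "rule_id": str(alert.get("pluginid", "")),
                "title": alert.get("alert") or alert.get("name", ""),
                "severity": ZAP_RISK.get(str(alert.get("riskcode", "")), "UNKNOWN"),
                "location": site.get("@name", ""),
            })
    return findings


def from_dependency_check(report):
    """Normalize a Dependency-Check JSON report (assumed layout)."""
    findings = []
    for dep in report.get("dependencies", []):
        for vuln in dep.get("vulnerabilities", []):
            findings.append({
                "tool": "dependency-check",
                "rule_id": vuln.get("name", ""),
                "title": vuln.get("name", ""),
                "severity": str(vuln.get("severity", "UNKNOWN")).upper(),
                "location": dep.get("fileName", ""),
            })
    return findings


# One documented adapter per report file; policy code sees only the common shape.
ADAPTERS = {
    "zap-report.json": from_zap,
    "dependency-check-report.json": from_dependency_check,
}
```

Adding a tool then means writing one adapter and registering it, without touching the policy logic.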

## Release-focused behavior

Use stricter rules for:

* tags that represent releases,
* production-targeting deploy jobs,
* protected branches,
* component or platform repositories that affect many pipelines.

Example:

```yaml
security_gate_aggregate:
  rules:
    - if: '$CI_COMMIT_TAG'
      variables:
        GATE_MODE: release
    - if: '$CI_MERGE_REQUEST_ID'
      variables:
        GATE_MODE: mr
```

The script can interpret `GATE_MODE=release` as a stricter threshold.
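A minimal sketch of that interpretation, assuming the aggregator already holds per-severity counts. The threshold numbers and the `THRESHOLDS` table are hypothetical placeholders, not recommended values.

```python
import os

# Hypothetical limits: the maximum count tolerated per severity bucket.
THRESHOLDS = {
    "release": {"high_or_above": 0, "medium": 0},
    "mr": {"high_or_above": 0, "medium": 10},
}


def current_mode():
    """Read GATE_MODE from the environment, defaulting to the strict profile."""
    return os.environ.get("GATE_MODE", "release")


def thresholds_for(mode):
    """Unknown modes fall back to the release profile (fail closed)."""
    return THRESHOLDS.get(mode, THRESHOLDS["release"])


def evaluate(counts, mode):
    """Return a list of human-readable violations for the given mode."""
    limits = thresholds_for(mode)
    return [
        f"{name}: {counts.get(name, 0)} > {limit}"
        for name, limit in limits.items()
        if counts.get(name, 0) > limit
    ]
```

Defaulting to the strict profile means a pipeline that forgets to set `GATE_MODE` is gated as a release rather than waved through.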

## Practical outputs to preserve

Keep these artifacts:

* normalized JSON summary;
* human-readable markdown summary;
* raw tool reports;
* exception manifest used for the decision;
* version of the policy logic or component that made the decision.

That turns the gate into something a reviewer can actually reconstruct later.
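One way to make that reconstruction practical is to record a hash of every input next to the decision. The sketch below is illustrative only: `POLICY_VERSION` is a hypothetical identifier, and the evidence layout is an assumption rather than a GitLab or scanner convention.

```python
import hashlib
from pathlib import Path

POLICY_VERSION = "example-policy-1"  # hypothetical version label


def sha256_of(path):
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_evidence(paths, policy_version=POLICY_VERSION):
    """Map each input path to its digest, or None when the file is absent."""
    files = {}
    for path in paths:
        p = Path(path)
        files[str(path)] = sha256_of(p) if p.is_file() else None
    return {"policy_version": policy_version, "files": files}
```

The returned dictionary can be merged into the JSON summary so a reviewer can confirm that the stored raw reports and exception manifest are the ones the gate actually evaluated.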

## Cross-links

* [Security Quality Gates and Release Blocking](/devsecops-cicd-and-supply-chain/index-1/security-quality-gates-and-release-blocking.md)
* [GitLab Release Evidence](/devsecops-cicd-and-supply-chain/index-1/gitlab-release-evidence.md)
* [SAST Noise Reduction](/application-security-and-secure-sdlc/index-1/sast-noise-reduction.md)
* [GitLab Mock Interview Pack](/learning-labs-interview-and-templates/index-1/gitlab-mock-interview.md)



