# GitLab CI YAML Deep Dive

![GitLab Pipeline Control Plane](/files/xg9AI2mAKinQJWtbJLMK)

> **Intro:** The main GitLab pipeline file is usually called **`.gitlab-ci.yml`**. Treat it as the control plane for build, test, security scanning, packaging, and release behavior. A strong file explains not only **what runs**, but also **when the pipeline exists**, **which jobs are included**, **where they run**, and **what can block release**.
>
> **What this page includes**
>
> * the major top-level blocks that shape a GitLab pipeline
> * how `workflow`, `include`, `stages`, `rules`, and `needs` interact
> * a security-first example pipeline with comments and release gating
> * cross-links to runner isolation, protected environments, and reusable components
>
> **Working assumptions**
>
> * pipeline creation, job presence, and job order are separate concerns
> * delivery security should be explicit in YAML instead of hidden in runner-side scripts

## Mental model

Read `.gitlab-ci.yml` in this order:

1. **`workflow`** decides whether a pipeline is created at all.
2. **`include`** brings in shared templates or reusable components.
3. **global keys** such as `default`, `variables`, and `stages` establish baseline behavior.
4. **jobs** define the actual work.
5. **`rules`** decide whether each job exists in the current pipeline.
6. **`needs`** refines execution order into a DAG.
7. **artifacts, reports, environments, and release jobs** preserve outputs and shape deploy behavior.

In practice, this means **pipeline existence**, **job existence**, and **job ordering** are three different layers of logic.

## Key top-level blocks

| Block                     | What it does                                                        | Security relevance                                              |
| ------------------------- | ------------------------------------------------------------------- | --------------------------------------------------------------- |
| `workflow:`               | decides whether to create a pipeline for push, MR, schedule, or tag | blocks duplicate or unsafe pipeline paths                       |
| `include:`                | imports shared YAML or components                                   | can standardize gates, but must be pinned and reviewed          |
| `default:`                | sets base image, tags, retry, or hooks                              | makes runner use and execution defaults predictable             |
| `variables:`              | defines project-level non-secret settings                           | secrets belong in protected variables or external secret stores |
| `stages:`                 | broad execution phases                                              | easy-to-read release order                                      |
| `rules:`                  | determines when a job exists                                        | keeps expensive or privileged jobs away from unsafe contexts    |
| `needs:`                  | creates explicit job dependencies                                   | shortens feedback loops and makes gate relationships clear      |
| `artifacts:` / `reports:` | preserves outputs and scanner reports                               | supports evidence, auditability, and GitLab features            |
| `environment:`            | models deploy targets                                               | connects jobs to protected environments and approvals           |

## Broad order vs exact order

### `stages` give the broad order

```yaml
stages:
  - prepare
  - build
  - security
  - release
  - deploy
```

This answers the high-level question: **what phases exist?**

### `needs` give the exact order

```yaml
semgrep_scan:
  stage: security
  needs: ["build_app"]
  script:
    - semgrep scan --config p/default --json --output semgrep.json
```

This answers the more precise question: **what must finish before this job can start?**

## A commented example pipeline

```yaml
# Create only the pipeline types we actually want.
workflow:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_TAG'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
    - when: never

# Import reusable scanner and evidence logic from a reviewed internal project.
include:
  - project: platform/ci-templates
    ref: v2.3.1
    file:
      - /security/common-gates.yml
      - /security/release-evidence.yml

default:
  image: alpine:3.20
  interruptible: true
  retry: 1
  tags:
    - ci-general

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
  # Use GitLab protected variables or CI/CD secrets for sensitive values.
  SONAR_HOST_URL: "https://sonarqube.example.com"

stages:
  - prepare
  - build
  - security
  - package
  - release
  - deploy

prepare:
  stage: prepare
  script:
    - echo "Preparing workspace"
  artifacts:
    paths: [.cache]
    expire_in: 1 day

build_app:
  stage: build
  needs: [prepare]
  script:
    - ./scripts/build.sh
  artifacts:
    paths:
      - dist/
    expire_in: 7 days

unit_tests:
  stage: security
  needs: [build_app]
  script:
    - ./scripts/run-tests.sh
  artifacts:
    reports:
      junit: junit.xml
    paths:
      - junit.xml

semgrep_scan:
  stage: security
  needs: [build_app]
  image: semgrep/semgrep:1.84.0
  script:
    - semgrep scan --config p/default --json --output semgrep.json
  artifacts:
    paths: [semgrep.json]

bandit_scan:
  stage: security
  needs: [build_app]
  image: python:3.12-alpine
  script:
    - pip install bandit
    - bandit -r app -f json -o bandit.json
  artifacts:
    # Bandit exits non-zero when it reports findings; keep the report as evidence anyway.
    when: always
    paths: [bandit.json]

sonar_gate:
  stage: security
  needs: [build_app]
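  # Pin this image to a reviewed tag or digest in a real pipeline; `latest` is used here only for brevity.
  image: sonarsource/sonar-scanner-cli:latest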
  script:
    - sonar-scanner -Dsonar.qualitygate.wait=true
  artifacts:
    paths: [sonar-report.txt]

security_gate_aggregate:
  stage: security
  image: python:3.12-alpine
  needs:
    - semgrep_scan
    - bandit_scan
    - sonar_gate
  script:
    - python3 snippets/ci/aggregate-security-gate.py
  artifacts:
    paths:
      - security-gate-summary.json
      - security-gate-summary.md
    expire_in: 30 days

package_image:
  stage: package
  tags: [ci-build]
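  needs:
    - build_app
    # With `needs`, this job ignores stage order, so tests must be listed to gate packaging.
    - unit_tests
    - security_gate_aggregate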
  rules:
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH || $CI_COMMIT_TAG'
  script:
    - ./scripts/build-image.sh
  artifacts:
    paths: [image-digest.txt]

create_release:
  stage: release
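  # Pin this image to a reviewed tag or digest in a real pipeline; `latest` is used here only for brevity.
  image: registry.gitlab.com/gitlab-org/cli:latest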
  needs:
    - package_image
  rules:
    - if: '$CI_COMMIT_TAG'
  script:
    - glab release create "$CI_COMMIT_TAG" --ref "$CI_COMMIT_SHA" --notes-file CHANGELOG.md

deploy_production:
  stage: deploy
  tags: [ci-deploy-prod]
  needs: [create_release]
  rules:
    - if: '$CI_COMMIT_TAG =~ /^v\d+\.\d+\.\d+$/'
      when: manual
    - when: never
  environment:
    name: production
    deployment_tier: production
  script:
    - ./scripts/deploy-prod.sh
```

## Why this example works

* `workflow` prevents unwanted pipeline creation.
* `include` keeps shared logic centralized and versioned.
* `stages` make the broad lifecycle readable.
* `needs` lets each security job start as soon as `build_app` finishes, so scanners run in parallel instead of waiting for a whole stage.
* `tags` route privileged jobs away from general runners.
* `environment` attaches the production deploy to protected-environment policy.

## `rules` patterns that matter

### Only run on merge requests

```yaml
rules:
  - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
```

### Only run on protected refs

```yaml
rules:
  - if: '$CI_COMMIT_REF_PROTECTED == "true"'
```

### Only run when a language or file type exists

```yaml
rules:
  - exists:
      - pyproject.toml
      - requirements.txt
```

### Never run on schedules unless explicitly allowed

```yaml
rules:
  - if: '$CI_PIPELINE_SOURCE == "schedule"'
    when: never
  # Without a fallback rule the job would never be added to any pipeline.
  - when: on_success
```

## `needs` for feedback speed

A common anti-pattern is to wait for the entire build stage before running scanners that only depend on one build artifact.

Better pattern:

```yaml
semgrep_scan:
  stage: security
  needs: [build_app]
```

This makes the pipeline behave more like a graph and less like a rigid waterfall.

## Where runner choice belongs

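Use YAML to make runner routing explicit with `tags` or scoped runners, but enforce the trust decision outside the file as well (for example with protected runners and runner scoping in GitLab settings), because anyone who can edit the YAML can change a tag.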
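A minimal sketch of explicit routing, reusing the tag names from the example pipeline above. It assumes that the runners behind `ci-deploy-prod` are registered as protected runners in GitLab settings; the YAML alone does not guarantee that.

```yaml
# Low-trust job: may execute merge request code, so it stays on general runners.
unit_tests:
  stage: security
  tags: [ci-general]
  script:
    - ./scripts/run-tests.sh

# Privileged job: routed to dedicated runners and limited to protected refs.
deploy_production:
  stage: deploy
  tags: [ci-deploy-prod]
  rules:
    - if: '$CI_COMMIT_REF_PROTECTED == "true"'
      when: manual
  environment:
    name: production
  script:
    - ./scripts/deploy-prod.sh
```

The tag makes the intent reviewable in the file, while the runner configuration is what actually stops an unprotected branch from reaching the deploy runner.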

Read next:

* [Runner Isolation and Trust Boundaries](/devsecops-cicd-and-supply-chain/index-1/runner-isolation-and-trust-boundaries.md)
* [Protected Environments and Deployment Approvals](/devsecops-cicd-and-supply-chain/index-1/protected-environments-and-deployment-approvals.md)

## Reuse without hiding behavior

A project should still be able to explain its release path even when it consumes shared pipeline logic.
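One way to keep the release path explainable is to import shared gates at a pinned version while leaving the release job in the project file. The sketch below is illustrative only: the component path, its version, and its `stage` input are hypothetical names, not taken from a real catalog.

```yaml
include:
  # Hypothetical component, pinned to an exact version instead of a moving ref.
  - component: $CI_SERVER_FQDN/platform/ci-components/secret-detection@1.4.0
    inputs:
      stage: security

stages:
  - security
  - release

# The release job stays local so reviewers can read the release path in this file.
create_release:
  stage: release
  rules:
    - if: '$CI_COMMIT_TAG'
  script:
    - ./scripts/release.sh
```

Here `./scripts/release.sh` is a placeholder for whatever the project already uses to publish a release.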

Read next:

* [Reusable GitLab Includes and Components](/devsecops-cicd-and-supply-chain/index-1/reusable-gitlab-includes-and-components.md)

## Cross-links

* [Security Quality Gates and Release Blocking](/devsecops-cicd-and-supply-chain/index-1/security-quality-gates-and-release-blocking.md)
* [GitLab Release Evidence](/devsecops-cicd-and-supply-chain/index-1/gitlab-release-evidence.md)
* [Gate Aggregation Scripts](/devsecops-cicd-and-supply-chain/index-1/gate-aggregation-scripts.md)


## Seven useful GitLab CI features that often clean up pipelines

These are not “security features” in isolation, but they frequently improve delivery hygiene and reduce weird CI behavior that later becomes security debt.

### 1) `resource_group`

Use it when only one deploy or stateful action should run at a time.

```yaml
deploy_prod:
  stage: deploy
  resource_group: production
  script:
    - ./deploy.sh
```

Good fit:

* production deploys
* schema migrations
* promotion steps that must not overlap

### 2) `allow_failure:exit_codes`

Useful when one tool exits with a special code for “findings exist” versus “the job is broken.”

```yaml
secret_scan:
  stage: security
  script:
    - ./scan-secrets.sh
  allow_failure:
    exit_codes: [3]
```

Use carefully. It should make semantics clearer, not hide real failures.

### 3) pipeline input ergonomics with variable options

Useful for manual or scheduled pipelines where reviewers should choose from a known set of targets rather than type free-form values.
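A short sketch of a constrained choice, assuming a hypothetical `DEPLOY_TARGET` variable and target names. When the pipeline is started manually from the GitLab UI, the variable is offered as a dropdown rather than a free-text field.

```yaml
variables:
  DEPLOY_TARGET:
    description: "Target environment for this run."
    value: "staging"        # default selection; must be one of the options
    options:
      - "staging"
      - "canary"
      - "production"
```

Jobs can then branch on `$DEPLOY_TARGET` in `rules` or scripts, knowing the value came from the listed set when the run was started through the form.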

### 4) `!reference`

Useful when several jobs share small fragments such as common `rules`, `before_script`, or scanner wrappers.
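As an illustration, the sketch below shares one `rules` list and one `before_script` fragment between jobs. The hidden job names (`.default_rules`, `.scanner_setup`) and the script paths are hypothetical.

```yaml
.default_rules:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

.scanner_setup:
  before_script:
    - mkdir -p reports

lint_scan:
  stage: security
  rules:
    # Pull in the shared rules, then add a job-specific rule if needed.
    - !reference [.default_rules, rules]
  before_script:
    - !reference [.scanner_setup, before_script]
  script:
    - ./scripts/lint.sh
```

Unlike `extends`, `!reference` copies only the named key, so the job keeps full control over everything else it defines.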

### 5) `coverage`

Still valuable for teams that want merge-request-visible test coverage without building a custom parser path for everything.
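A minimal sketch, assuming a Python project tested with `pytest` and `pytest-cov` and a package directory named `app`. The regular expression matches the `TOTAL` line that `pytest-cov` prints in its terminal report; other tools need a different pattern.

```yaml
unit_tests:
  image: python:3.12-alpine
  script:
    - pip install pytest pytest-cov
    - pytest --cov=app --cov-report=term
  # GitLab extracts the percentage from the job log using this pattern.
  coverage: '/TOTAL.*? (100(?:\.0+)?\%|[1-9]?\d(?:\.\d+)?\%)$/'
```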

### 6) `parallel` and `parallel:matrix`

Useful for large test or validation fan-outs, especially when different providers, regions, or service groups must be checked independently.
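A sketch of a matrix fan-out, with hypothetical provider and region values and a placeholder validation script. GitLab creates one job per combination and exposes the values as variables.

```yaml
validate_region:
  stage: security
  parallel:
    matrix:
      - PROVIDER: aws
        REGION: [eu-west-1, us-east-1]
      - PROVIDER: gcp
        REGION: [europe-west1]
  script:
    # Runs three times: aws/eu-west-1, aws/us-east-1, gcp/europe-west1.
    - ./scripts/validate.sh "$PROVIDER" "$REGION"
```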

### 7) `needs` with artifact awareness

Useful when you want faster DAG execution without accidentally pulling every artifact from previous stages.
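A sketch of explicit artifact flow, using a hypothetical `publish_digest` job alongside job names from the example pipeline. Each `needs` entry states whether its artifacts are downloaded.

```yaml
publish_digest:
  stage: release
  needs:
    - job: package_image
      artifacts: true      # download image-digest.txt
    - job: security_gate_aggregate
      artifacts: false     # wait for the gate, but skip its artifacts
  script:
    - cat image-digest.txt
```

Stating `artifacts:` per dependency keeps the job fast and makes it visible to reviewers which outputs a release step actually consumes.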

### Security-minded cautions

* do not use “clever YAML” to hide deploy logic reviewers cannot understand;
* keep privileged jobs visibly separate from low-trust jobs;
* be explicit about artifact flow when using `needs`;
* do not let pipeline optimization silently bypass review, evidence, or approval steps.

## Six high-value GitLab YAML features that often stay underused

### `resource_group`

Use it when only one job should mutate a shared target at a time.

```yaml
tf_apply:
  stage: deploy
  resource_group: terraform-prod
  script:
    - terraform apply -auto-approve
```

### `allow_failure:exit_codes`

Useful when a tool has one exit code for “soft issue” and another for real breakage.

```yaml
smoke_check:
  script: ./smoke.sh
  allow_failure:
    exit_codes: [42]
```

### `parallel` and `parallel:matrix`

Useful when test or build expansion is predictable and mechanical.
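For purely mechanical splitting, a plain `parallel` count is often enough. In this sketch the `--shard` and `--total` flags belong to a hypothetical test script; `CI_NODE_INDEX` (starting at 1) and `CI_NODE_TOTAL` are set by GitLab for each copy of the job.

```yaml
unit_tests_sharded:
  stage: security
  parallel: 4
  script:
    # Each of the four jobs runs its own slice of the test suite.
    - ./scripts/run-tests.sh --shard "$CI_NODE_INDEX" --total "$CI_NODE_TOTAL"
```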

### `!reference` and reuse patterns

Useful when one small piece of logic should be reused consistently across jobs or included files.
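A sketch of reusing script lines rather than whole jobs. The hidden job `.scanner_helpers` and the script path are hypothetical, and the hidden job could equally live in an included file.

```yaml
.scanner_helpers:
  script:
    - mkdir -p reports
    - echo "Scanner wrapper ready"

dependency_scan:
  stage: security
  script:
    # Shared lines run first, followed by the job-specific command.
    - !reference [.scanner_helpers, script]
    - ./scripts/dependency-scan.sh
```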

### `coverage`

Useful when you want test-coverage signal visible in merge requests without building custom parsing around it.

### `variables` with constrained manual choices

Useful for safer manual runs where the operator should pick from known-good values instead of typing arbitrary free-form strings.

See also [`../snippets/ci/gitlab/advanced-yaml-patterns.yml`](https://github.com/D3One/Product-Security-Gitbook/blob/main/snippets/ci/gitlab/advanced-yaml-patterns.yml).

