# Istio / Linkerd mTLS Operations and Certificate Rotation

> **Intro:** Teams often succeed at turning mTLS on, then discover that the real difficulty is operational: permissive mode is never tightened, trust anchors drift toward expiry, issuer rotation goes undocumented, and nobody can tell whether an application outage came from mesh policy or from a certificate lifecycle mistake.
>
> **What this page includes**
>
> * the operational model for Istio and Linkerd mTLS
> * what rotates automatically and what still needs operator ownership
> * production-safe certificate hierarchy patterns
> * review questions and failure modes

## Start with the ownership model

| Layer                              | Who should usually own it                         |
| ---------------------------------- | ------------------------------------------------- |
| Workload certificates              | platform / mesh operations                        |
| Issuer / intermediate certificates | platform security + mesh operators                |
| Trust anchor / root                | security / PKI owners, with strong change control |
| Authorization policy               | platform + application owners                     |

## Istio

### What Istio automates well

* workload identity and X.509 issuance to workloads;
* key and certificate rotation for workload certificates via the agent / `istiod` flow;
* strict or permissive mTLS enforcement through `PeerAuthentication` policy;
* policy scoping at mesh, namespace, or workload level.

### What still needs explicit operator design

* whether to keep the self-signed default CA or plug in an external CA;
* how trust anchors are managed across clusters;
* how strict mode rollout is staged;
* how issuer secrets are rotated and documented.

### Recommended hierarchy

Use an offline or tightly governed root CA and issue one intermediate to each cluster-local Istio CA. Avoid treating the default self-signed root as a long-term production solution.

```mermaid
flowchart TD
    A[Offline / Controlled Root CA] --> B[Cluster A Istio Intermediate]
    A --> C[Cluster B Istio Intermediate]
    B --> D[Workload Certs in Cluster A]
    C --> E[Workload Certs in Cluster B]
```
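One property of the hierarchy above that is easy to check mechanically is validity nesting: a certificate that outlives its issuer will stop validating when the issuer expires, whatever its own expiry date says. The sketch below is illustrative only. `CertInfo` is a hypothetical, simplified inventory record rather than a parsed X.509 object, and the names and dates are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CertInfo:
    """Hypothetical inventory record; not a parsed X.509 certificate."""
    name: str
    issuer: Optional[str]  # None marks the self-signed root
    not_after: datetime


def hierarchy_problems(certs: list[CertInfo]) -> list[str]:
    """Return findings for a root -> intermediate -> workload inventory."""
    by_name = {c.name: c for c in certs}
    problems = []
    for cert in certs:
        if cert.issuer is None:
            continue  # root: nothing above it to compare against
        parent = by_name.get(cert.issuer)
        if parent is None:
            problems.append(f"{cert.name}: issuer {cert.issuer!r} not in inventory")
        elif cert.not_after > parent.not_after:
            problems.append(f"{cert.name}: outlives its issuer {parent.name}")
    return problems


if __name__ == "__main__":
    utc = timezone.utc
    inventory = [
        CertInfo("root", None, datetime(2035, 1, 1, tzinfo=utc)),
        CertInfo("cluster-a-intermediate", "root", datetime(2027, 1, 1, tzinfo=utc)),
        CertInfo("cluster-b-intermediate", "root", datetime(2036, 1, 1, tzinfo=utc)),
    ]
    for finding in hierarchy_problems(inventory):
        print(finding)
```

In practice the inventory would be populated from the real certificates; how that is done depends on the CA tooling in use and is not shown here.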

### Operational steps

1. define trust domain and cluster boundaries;
2. use the self-signed default CA only for lab or low-risk environments;
3. load external CA material for production clusters;
4. move namespaces from permissive to strict mTLS intentionally;
5. test issuer rotation before expiry windows become urgent.
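Step 4 can be made explicit by generating one namespace-wide `PeerAuthentication` resource per namespace from a reviewed list of namespaces that are ready for strict mode. The following is a minimal sketch, not a rollout tool: the namespace names are hypothetical, the resource name `default` is only a convention, and `security.istio.io/v1beta1` is assumed to be served by the installed Istio version (newer releases also serve `v1`).

```python
import json


def peer_authentication(namespace: str, mode: str) -> dict:
    """Build a namespace-wide PeerAuthentication manifest as a plain dict."""
    if mode not in {"PERMISSIVE", "STRICT"}:
        raise ValueError(f"unsupported mTLS mode: {mode}")
    return {
        "apiVersion": "security.istio.io/v1beta1",  # assumption: check your Istio version
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": mode}},
    }


def rollout_plan(namespaces: list[str], strict_ready: set[str]) -> list[dict]:
    """STRICT for namespaces explicitly marked ready, PERMISSIVE for the rest."""
    return [
        peer_authentication(ns, "STRICT" if ns in strict_ready else "PERMISSIVE")
        for ns in namespaces
    ]


if __name__ == "__main__":
    # Hypothetical namespaces; only "payments" has been verified as fully meshed.
    plan = rollout_plan(["payments", "legacy-batch"], strict_ready={"payments"})
    for manifest in plan:
        print(json.dumps(manifest, indent=2))
```

Keeping the ready list in version control gives reviewers a record of which namespaces are still permissive and why.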

## Linkerd

### What Linkerd automates well

* automatic mTLS for meshed workloads;
* short-lived workload certificates;
* automatic rotation of workload certificates.

### What operators must still own

* trust anchor lifecycle;
* identity issuer certificate and key lifecycle;
* production-safe external certificate source;
* expiry monitoring and advance rotation rehearsals.
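Expiry monitoring for the trust anchor and issuer can start as a simple classification of the time remaining before `notAfter`. The sketch below assumes the expiry timestamps have already been extracted from the certificates by other means. The 60-day and 14-day thresholds are placeholders and should be set from how long a rehearsed rotation actually takes.

```python
from datetime import datetime, timedelta, timezone


def classify_expiry(
    not_after: datetime,
    now: datetime,
    warn: timedelta = timedelta(days=60),      # assumed threshold
    critical: timedelta = timedelta(days=14),  # assumed threshold
) -> str:
    """Classify a certificate by the time remaining until it expires."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= critical:
        return "critical"
    if remaining <= warn:
        return "warn"
    return "ok"


def expiry_report(certs: dict[str, datetime], now: datetime) -> dict[str, str]:
    """Map each certificate name to its expiry classification."""
    return {name: classify_expiry(not_after, now) for name, not_after in certs.items()}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical certificates and expiry dates for illustration.
    certs = {
        "trust-anchor": now + timedelta(days=400),
        "identity-issuer": now + timedelta(days=30),
    }
    for name, status in expiry_report(certs, now).items():
        print(f"{name}: {status}")
```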

### Practical production note

Out-of-the-box Linkerd installs can generate static self-signed credentials, which are fine for a quick start but not a production end state. Many teams use cert-manager or another external source to manage the issuer lifecycle. Trust anchor rotation still needs deliberate planning.

## Rotation runbook model

| Step                   | Istio                                    | Linkerd                                                 |
| ---------------------- | ---------------------------------------- | ------------------------------------------------------- |
| Workload cert rotation | mostly automatic                         | automatic                                               |
| Issuer rotation        | operator-owned workflow                  | operator-owned, often with cert-manager                 |
| Trust anchor rotation  | operator-owned high-risk change          | operator-owned high-risk change                         |
| Validation             | mesh health, cert expiry, authz behavior | `linkerd check`, workload and control-plane cert checks |
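Trust anchor rotation is high-risk in both meshes largely because of ordering: a common pattern is to distribute a bundle containing both the old and the new anchor, move issuers to the new anchor, and only then drop the old one. The sketch below models just the last gate, under the assumption that the bundle and the issuer-to-anchor mapping are tracked as plain data. The function name and all identifiers are hypothetical.

```python
def removal_blockers(
    anchor: str,
    bundle: set[str],
    issuer_signed_by: dict[str, str],
) -> list[str]:
    """Return reasons why removing `anchor` from the trust bundle is unsafe."""
    reasons = []
    if anchor not in bundle:
        reasons.append(f"{anchor} is not in the bundle")
    if not (bundle - {anchor}):
        reasons.append("removal would leave an empty trust bundle")
    for issuer, signer in sorted(issuer_signed_by.items()):
        if signer == anchor:
            reasons.append(f"issuer {issuer} still chains to {anchor}")
    return reasons


if __name__ == "__main__":
    # Hypothetical mid-rotation state: cluster-a has not moved to the new anchor yet.
    bundle = {"root-old", "root-new"}
    issuers = {"cluster-a": "root-old", "cluster-b": "root-new"}
    for reason in removal_blockers("root-old", bundle, issuers):
        print(reason)
```

An empty result only covers the issuer layer. Workload certificates issued under the old chain also need to have been reissued before the old anchor is removed, which this sketch does not check.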

## Common failure modes

1. **permissive mode becomes permanent**
2. **trust anchors near expiry with no rehearsed rotation**
3. **issuer rotation known by one operator only**
4. **mesh mTLS assumed to replace application authorization**
5. **cross-environment trust bundle too wide**
6. **debugging bypass paths not documented**
7. **cert-manager integration added without ownership clarity**

## Review prompts

* what is the trust domain?
* where do workload private keys live?
* how long are workload certs valid?
* who owns issuer rotation?
* who owns trust anchor rotation?
* how is expiry monitored?
* what is the break-glass plan if mesh cert issuance fails during production hours?

## Read next

* [🪪 mTLS and Service Identity Deep Dive](/architecture-api-crypto-and-identity/index-2/mtls-and-service-identity-deep-dive.md)
* [🔐 Internal PKI for Microservices — mTLS, Certificate Automation, and Trust Distribution](/cloud-kubernetes-and-infrastructure-security/index/internal-pki-for-microservices-mtls-and-certificate-automation.md)
* [☸️ Container / Kubernetes / Platform Security — Images, Admission, RBAC, Pod Hardening, Isolation, and GitOps / Deployment Plane](https://github.com/D3One/Product-Security-Gitbook/blob/main/09-container-and-kubernetes-security/container-kubernetes-platform-security-images-admission-rbac-pod-hardening-isolation-and-gitops-deployment-plane.md)

