# Mastering GitLab Parent‑Child Pipelines: Scalable CI for Multi‑Module Maven Monorepos

# Background

Recently, I redesigned my team's pipeline running on a multi-module Maven monorepo using GitLab CI. It wasn't that the previous setup was broken, but my team faced a few persistent issues that I hoped to resolve to bring the pipeline into a more robust state. I was the one who did the initial setup as well, but that was done when I was still relatively new to GitLab CI. As I worked on this redesign, I learned quite a few lessons along the way that I hope to share with the "future me."

# Objective

I set out to resolve several key pain points:

*   **Double Pipelines**: Preventing both branch and MR pipelines from triggering for the same commit.
    
*   **Artifact Management**: Resolving issues where build artifacts weren't reliably uploaded or shared.
    
*   **Visibility**: Fixing cases where test coverage (JaCoCo) wasn't being correctly captured.
    

Beyond just fixing issues, I also wanted to implement several strategic improvements:

*   **Scalability**: Creating a more robust setup using `!reference` tags and reusable job templates.
    
*   **Security**: Integrating automated SAST scans.
    
*   **Performance**: Reducing overall pipeline runtime through optimized cache strategies and surgical artifact sharing.
    

Read on to see the architectural patterns and specific pipeline designs that resolved these pain points and enabled us to scale effectively.

# Pipeline design

## Project Structure

This is a sample setup, but it closely mirrors my actual one, which has around 10 to 12 modules.

```plaintext
. (Root Aggregator)
├── .gitlab-ci.yml           # Root Orchestrator
├── .gitlab-ci-base.yml      # Central CI Blueprint
├── pom.xml                  # Root Aggregator POM
├── parent-pom/              # Shared configuration & dependency management
│   └── pom.xml
└── project/                 # Functional module aggregator
    ├── pom.xml
    ├── mmm-security/        # Foundation module
    │   ├── .gitlab-ci.yml   # Child Pipeline
    │   └── pom.xml
    ├── mmm-core/            # Core application module
    │   ├── .gitlab-ci.yml   # Child Pipeline
    │   └── pom.xml
    ├── mmm-search/          # Search module
    │   ├── .gitlab-ci.yml   # Child Pipeline
    │   └── pom.xml
    └── mmm-report/          # Coverage reporter
        └── pom.xml
```

## Pipeline setup

Configure a parent-child pipeline with the following components:

*   Root Orchestrator (`.gitlab-ci.yml`)
    
*   Central Blueprint (`.gitlab-ci-base.yml`)
    
*   Independent Child Pipelines (`.gitlab-ci.yml` housed within each module)
    

### Overview

![](https://cdn.hashnode.com/uploads/covers/60e9553dc551f16b0f84748b/911bb00a-a2d6-4010-811d-36b3b0053f4e.svg align="center")

* * *

### Hands-On: The Interactive CI/CD Playbook

If you'd like to explore this architecture in a more visual and hands-on way, I've built an [**Interactive CI/CD Playbook**](https://bwgjoseph.github.io/tutorials/mastering-gitlab-ci/). It allows you to dive into the project structure, explore how we manage artifact sharing across parent-child pipelines, and see the code snippets in action.

[**Explore the Playbook →**](https://bwgjoseph.github.io/tutorials/mastering-gitlab-ci/)

If you prefer to dig into the implementation details and the "why" behind these design decisions, read on—the following best practices break down the architecture component by component.

* * *

# Best Practices

## Root Orchestrator

This is the main controller that defines the `workflow` rules, the stages available, and when to trigger the downstream child pipelines.

```yaml
workflow:
  rules:
    - if: $CI_FULL_PIPELINE == "true"                  # Manual override
    - if: $CI_PIPELINE_SOURCE == "parent_pipeline"
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS # Double-pipeline suppression
      when: never
    - if: $CI_COMMIT_BRANCH                             # All branch pushes (including main)
```

*   `CI_FULL_PIPELINE` allows me to manually trigger a full pipeline run.
    
*   `CI_COMMIT_BRANCH && CI_OPEN_MERGE_REQUESTS` prevents **Double Pipelines**: a common issue where GitLab triggers both a branch pipeline and an MR pipeline for the same commit.
    

```yaml
stages:
  - configuration
  - foundation
  - application
  - test # setup to run GitLab built-in SAST scans
  - report

pre:
  stage: .pre
  script:
    - env
```

`.pre` is a built-in stage that always runs first. Dumping `env` here shows every environment variable available to the job, which makes troubleshooting easier.

> While `env` is great for debugging, use it with caution (or remove it before production): it prints every variable to the job log, and GitLab only masks variables that are explicitly configured as masked.

```yaml
.trigger-rules:
  rules:
    - if: "$CI_FULL_PIPELINE == 'true'"
      when: always
    # Merge Request: Accuracy check using compare_to
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        compare_to: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
        paths:
          - $MODULE_PATH/**/*
          - parent-pom/**/*
    # Main Branch: Standard change detection
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH && $CI_COMMIT_BEFORE_SHA != "0000000000000000000000000000000000000000"
      changes:
        - $MODULE_PATH/**/*
        - parent-pom/**/*

trigger-core:
  stage: application
  extends: .trigger-rules
  needs:
    - job: trigger-security
      optional: true
    - job: deploy-parent-pom
      optional: true
  variables:
    MODULE_PATH: "project/mmm-core"
    PARENT_SOURCE: $CI_PIPELINE_SOURCE
    PARENT_BRANCH: $CI_COMMIT_BRANCH
  trigger:
    include: project/mmm-core/.gitlab-ci.yml
    strategy: mirror
```

*   `.trigger-rules`: Define reusable rules to determine when the job should be triggered. I used `compare_to: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME` to ensure change detection is calculated accurately against the target branch—eliminating the "rebase noise" that often plagues monorepo pipelines.
    
*   `extends: .trigger-rules`: By extending this template, it inherits the centralized change-detection logic. This ensures that only file changes under a specific module (or the parent-pom) trigger a pipeline run—preventing unnecessary resource waste.
    
*   `needs`: Lists the upstream jobs this job depends on. If those modules changed, this job waits for them to finish before it runs. `optional: true` is the secret sauce for monorepos: it prevents the pipeline from failing when an upstream job was not added to the pipeline because its module had no changes.
    
*   `variables`: This is especially important in parent-child pipelines because it passes context to the child pipeline so that its rules evaluate correctly. Inside the child pipeline, `$CI_PIPELINE_SOURCE` is `parent_pipeline`, not the value seen by the parent, so the original source and branch are forwarded as `PARENT_SOURCE` and `PARENT_BRANCH`.
    
*   `trigger:strategy: mirror`: Ensures the trigger job in the parent pipeline accurately reflects the status of the downstream (child) pipeline. See the [docs](https://docs.gitlab.com/ci/yaml/#triggerstrategy) for a more detailed explanation.
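
To make the `variables` point concrete, here is a minimal sketch of how a child pipeline might consume the forwarded context. The job name matches the child pipeline shown later in this post, but the rules themselves are an assumption for illustration, not the exact rules from my setup.

```yaml
# project/mmm-core/.gitlab-ci.yml (sketch)
# Inside the child, $CI_PIPELINE_SOURCE is "parent_pipeline",
# so the rules read the values forwarded by the trigger job instead.
deploy-core-snapshot:
  rules:
    # Never deploy when the parent was a merge request pipeline
    - if: $PARENT_SOURCE == "merge_request_event"
      when: never
    # Deploy only when the parent ran on the default branch
    - if: $PARENT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - echo "Parent source was $PARENT_SOURCE on branch $PARENT_BRANCH"
```

Forwarding the values explicitly keeps the child's rules readable and avoids guessing what the predefined variables resolve to on the other side of the trigger.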
    

This parent-child isolation means a failure in the "Search" module doesn't block the "Core" module's deployment—significantly reducing the "blast radius" of failures in a large monorepo.

## Central Blueprint

This is where all the global variables, job-templates, stages, cache strategy, and reusable snippets are declared.

### Parameterized Templates (`spec:inputs`)

I treated our CI templates like "functions" with a defined interface using `spec:inputs`. This allows the Root Orchestrator to pass specific configurations (like forcing a full pipeline) without relying on fragile global variables.

```yaml
# In .gitlab-ci-base.yml
spec:
  inputs:
    full_pipeline:
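      default: "false"
---
# The rest of .gitlab-ci-base.yml follows the separator and can read
# the value as $[[ inputs.full_pipeline ]]

# Usage in .gitlab-ci.yml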
include:
  - local: '.gitlab-ci-base.yml'
    inputs:
      full_pipeline: $CI_FULL_PIPELINE
```
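
A template only benefits from an input once it interpolates it somewhere. The sketch below shows one way the blueprint could consume `full_pipeline`. The `FULL_PIPELINE` variable and the `.show-pipeline-mode` template are hypothetical names used for illustration only.

```yaml
# .gitlab-ci-base.yml (sketch)
spec:
  inputs:
    full_pipeline:
      default: "false"
---
variables:
  # $[[ ... ]] is replaced with the input value when the file is included
  FULL_PIPELINE: "$[[ inputs.full_pipeline ]]"

# Hypothetical template that surfaces the resolved value in the job log
.show-pipeline-mode:
  script:
    - echo "Full pipeline requested: $FULL_PIPELINE"
```

Note that the `spec` header has to be separated from the rest of the configuration with `---`, otherwise GitLab does not treat it as a header.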

### Cache Strategy

```yaml
cache: 
  key: "maven-$CI_COMMIT_REF_SLUG"
  paths:
    - .m2/repository/
  policy: pull-push
```

Using `$CI_COMMIT_REF_SLUG` as the key is a standard way to define a shared branch-level cache, so all modules reuse the same downloaded dependencies. Once the first job has pulled the necessary dependencies, they are cached locally on the runner (or remotely, such as in MinIO in my setup). Subsequent jobs and pipelines on the same branch reuse this cache instead of downloading again, which saves bandwidth and shaves time off every run.

```yaml
.deploy-snapshot-template:
  extends: .base-maven-job
  stage: release
  interruptible: false # Ensure deployment finishes once started
  cache:
    key: "maven-$CI_COMMIT_REF_SLUG"
    paths:
      - .m2/repository/
    policy: pull
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - |
      echo "Deploying SNAPSHOT for $MODULE_PATH to GitLab Maven Registry..."
      mvn $MAVEN_CLI_OPTS deploy -pl $MODULE_PATH -am -DskipTests
```

It is important to know when to override the cache policy. This job only reads dependencies, so `policy: pull` skips the unnecessary step of pushing `.m2/repository` back to the remote cache.

### !reference tag

```yaml
.coverage-parser:
  script:
    - |
      echo "Extracting coverage percentage for GitLab UI..."
      REPORT_PATH=${JACOCO_XML_PATH:-"$MODULE_PATH/target/site/jacoco/jacoco.xml"}
      echo "DEBUG: Parsing JaCoCo report at: $REPORT_PATH"
      
      if [ -f "$REPORT_PATH" ]; then
        # Use grep -o to extract ONLY the matching tag, then tail -1 to get the aggregate
        # This works correctly even if the entire XML is on a single line.
        LINE_COUNTER=$(grep -o '<counter type="LINE"[^>]*/>' "$REPORT_PATH" | tail -1 || true)

        if [ -n "$LINE_COUNTER" ]; then
          MISSED=$(echo "$LINE_COUNTER" | sed -n 's/.*missed="\([0-9]*\)".*/\1/p')
          COVERED=$(echo "$LINE_COUNTER" | sed -n 's/.*covered="\([0-9]*\)".*/\1/p')
          TOTAL=$((MISSED + COVERED))

          echo "DEBUG: Found LINE counter: missed=$MISSED, covered=$COVERED, total=$TOTAL"

          if [ "$TOTAL" -gt 0 ]; then
            PERCENT=$(awk -v c="$COVERED" -v t="$TOTAL" 'BEGIN {printf "%.2f", (c / t) * 100}')
            echo "Coverage: $PERCENT%"
          else
            echo "Coverage: 0.00%"
          fi
        else
          echo "DEBUG: No LINE counter line found in $REPORT_PATH"
        fi
      fi

# Template for application modules
.application-template:
  extends: .base-maven-job
  stage: build
  variables:
    JACOCO_XML_PATH: "$MODULE_PATH/target/site/jacoco/jacoco.xml"
  script:
    - mvn $MAVEN_CLI_OPTS verify -pl $MODULE_PATH -am
    - !reference [.coverage-parser, script]
```

This is similar to YAML anchors, which also let you reuse snippets across jobs. I find `!reference` more developer-friendly because it lets you select specific keys (like `script`) to reuse, and it also works across files pulled in with `include`, which anchors do not.

Note that YAML anchors are a native YAML feature and work outside of GitLab, while the `!reference` tag is GitLab-specific.
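
For comparison, here is a small single-file sketch that reuses the same command through both mechanisms. All job and template names in it are hypothetical.

```yaml
# Shared snippet, selectable key by key with !reference
.shared-verify:
  script:
    - echo "Running shared verification"

# YAML anchor: resolved by the YAML parser, so it must live in this file
.anchor-verify: &anchor-verify
  - echo "Running shared verification"

job-with-anchor:
  script:
    - *anchor-verify
    - echo "Module-specific step"

# !reference: resolved by GitLab, so .shared-verify may come from an include
job-with-reference:
  script:
    - !reference [.shared-verify, script]
    - echo "Module-specific step"
```

Both jobs end up with the same two commands. The difference shows up once `.shared-verify` moves into `.gitlab-ci-base.yml`: the `!reference` job keeps working, while the anchor would have to be redefined in every file that uses it.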

### Variables

```yaml
variables:
  # Performance: Use a project-relative path for caching
  MAVEN_REPO_LOCAL: ".m2/repository"
  SONAR_USER_HOME: ".sonar"  # Defines the location of the analysis task cache
  
  # JVM Tuning for CI (Merged with user preferences)
  MAVEN_OPTS: >-
    -Dhttps.protocols=TLSv1.2
    -Dorg.slf4j.simpleLogger.showDateTime=true
    -Djava.awt.headless=true
    -Dfile.encoding=UTF-8
    -Xmx2048m
    -XX:+TieredCompilation
    -XX:TieredStopAtLevel=1

  # Maven CLI optimization (Merged with user preferences)
  MAVEN_CLI_OPTS: >-
    --batch-mode
    --errors
    --fail-at-end
    --show-version
    --no-transfer-progress
    --threads 1C
    -DinstallAtEnd=true
    -DdeployAtEnd=true
    -s .mvn/settings-ci.xml

  GIT_DEPTH: "0"  # Disables shallow cloning so the full Git history is fetched, required by the analysis task

  # GitLab FastZip and Compression
  FF_USE_FASTZIP: "true"
  ARTIFACT_COMPRESSION_LEVEL: "fast"
  CACHE_COMPRESSION_LEVEL: "fast"
```

I want to draw attention to `MAVEN_CLI_OPTS` where `-s .mvn/settings-ci.xml` is defined. This is the unsung hero of our pipeline: it maps `${env.CI_JOB_TOKEN}` to our GitLab Maven repository, allowing seamless, credential-free publishing and dependency resolution within the CI environment.

I also enabled `FF_USE_FASTZIP` and `fast` compression. In a monorepo with 10+ modules, the time saved zipping and unzipping artifacts and cache across dozens of jobs adds up to several minutes per pipeline.

### Artifacts

```yaml
.application-template:
  extends: .base-maven-job
  stage: build
  script:
    - mvn $MAVEN_CLI_OPTS verify -pl $MODULE_PATH -am
  artifacts:
    when: always
    paths:
      - "$MODULE_PATH/target/"
    exclude:
      - "$MODULE_PATH/target/*.jar"
    reports:
      junit:
        - "$MODULE_PATH/target/surefire-reports/TEST-*.xml"
        - "$MODULE_PATH/target/failsafe-reports/TEST-*.xml"
      coverage_report:
        coverage_format: jacoco
        path: "$MODULE_PATH/target/site/jacoco/jacoco.xml"
    expire_in: 1 hour
  coverage: '/Coverage: (\d+(?:\.\d+)?)%/'
```

*   `when`: Set to `always` so that artifacts are uploaded even on failure, allowing debugging via HTML reports.
    
*   `paths`, `exclude`: Target specific files for upload. Since fat JAR files can be massive, excluding them prevents `413 Request Entity Too Large` errors.
    
*   `reports`: Submits the JUnit and JaCoCo reports so that test results and coverage data appear in the GitLab pipeline UI and the MR widget.
    

Sharing artifacts across jobs ensures that build products (like `target/classes`) can be reused in subsequent steps, such as `jib:build`. This significantly reduces container build times by skipping redundant recompilation of the source code.
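
As a rough sketch of that reuse, a downstream job could pull the build job's artifacts and call Jib directly. The job name `build-image-sketch` is hypothetical, and the example assumes the Jib plugin is already configured in the module's POM. It is not the actual `.deploy-image-template` from my setup.

```yaml
build-image-sketch:
  extends: .base-maven-job
  needs:
    - job: build-core
      artifacts: true   # restores $MODULE_PATH/target/ from the build job
  script:
    # Assumption: Jib is configured in the module's POM.
    # Invoking the goal directly runs no compile phase, so Jib packages
    # the classes restored from the artifact.
    - mvn $MAVEN_CLI_OPTS jib:build -pl $MODULE_PATH
```

Because the `target/` directory arrives through the artifact, the image job does not need to run `compile` again.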

### Sharing Artifacts Across Parent-Child Pipelines

One of the biggest hurdles in Parent-Child pipelines is that **child pipelines run in isolated workspaces.** Standard `needs: artifacts: true` cannot pull files from a triggered child pipeline back into the root orchestrator.

To solve this for our aggregated reports, I implemented an **ID-based API Collection pattern**:

1.  **Bridge API**: Query the parent pipeline's bridges to find the `downstream_pipeline.id`.
    
2.  **Jobs API**: Query that child pipeline to find the specific `build` job ID.
    
3.  **Artifact Download**: Use a **Project Access Token** to download the artifact zip directly via the API.
    

**Why a Project Access Token (PAT)?** Because GitLab's standard `$CI_JOB_TOKEN` is restricted for security and often cannot cross the pipeline boundary. A PAT with `read_api` scope ensures our aggregator has the necessary authority.

```bash
# Simplified aggregation logic in root orchestrator
for module in mmm-core mmm-search; do
  # 1. Resolve Child Pipeline ID
  CHILD_PIPELINE_ID=$(curl --silent --header "PRIVATE-TOKEN: ${PAT_TOKEN}" \
    "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CI_PIPELINE_ID}/bridges" \
    | jq -r ".[] | select(.name==\"trigger-${module#mmm-}\") | .downstream_pipeline.id")

  # 2. Resolve specific Job ID
  CHILD_JOB_ID=$(curl --silent --header "PRIVATE-TOKEN: ${PAT_TOKEN}" \
    "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/pipelines/${CHILD_PIPELINE_ID}/jobs" \
    | jq -r ".[] | select(.name==\"build-${module#mmm-}\") | .id")

  # 3. Securely Download
  curl --location --header "PRIVATE-TOKEN: ${PAT_TOKEN}" \
    "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/jobs/${CHILD_JOB_ID}/artifacts" \
    --output "${module}.zip"
done
```
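
For context, the loop above could be wired into the root orchestrator roughly like this. The job name, the script path `ci/collect-child-artifacts.sh`, and the `collected/` directory are hypothetical, and the job image is assumed to provide `curl`, `jq`, and `unzip`.

```yaml
# Root .gitlab-ci.yml (sketch)
aggregate-coverage:
  stage: report
  needs:
    # optional: true keeps the job valid when a module was not triggered
    - job: trigger-core
      optional: true
    - job: trigger-search
      optional: true
  script:
    # Hypothetical script containing the download loop
    - ./ci/collect-child-artifacts.sh
    # Unpack whatever was downloaded; skip cleanly if nothing matched
    - for f in mmm-*.zip; do [ -e "$f" ] || continue; unzip -o -q "$f" -d collected/; done
  artifacts:
    paths:
      - collected/
    expire_in: 1 hour
```

`PAT_TOKEN` would be stored as a masked CI/CD variable so that it never shows up in the job log.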

## Independent Child Pipelines

With `.gitlab-ci-base.yml` providing the blueprint for all child modules, the child pipelines are simple to set up and configure.

```yaml
# project/mmm-core/.gitlab-ci.yml
include:
  - local: '.gitlab-ci-base.yml'

variables:
  MODULE_PATH: "project/mmm-core"

build-core:
  extends: .application-template

deploy-core-snapshot:
  extends: .deploy-snapshot-template
  needs: ["build-core"]

deploy-core-image:
  extends: .deploy-image-template
  needs: ["build-core"]

deploy-manifests-prod:
  extends: .deploy-cluster-template
  needs: ["deploy-core-image"]
  resource_group: core-deploy-prod
  environment:
    name: prod
  variables:
    OVERLAY_PATH: "project/mmm-core/k8s/overlays/prod"
    DEPLOYMENT_NAME: "mmm-core"
```

### Deployment Safety (`resource_group`)

In a monorepo, multiple modules often deploy to the same namespace. If two pipelines trigger simultaneously, they might both try to run `kubectl apply` for the same module, leading to race conditions.

To prevent this, I implemented module-specific **Resource Groups** as a CI-level mutex:

```yaml
deploy-manifests-prod:
  extends: .deploy-cluster-template
  resource_group: core-deploy-prod
  environment:
    name: prod
```

By using the naming convention `[module]-deploy-[env]`:

*   **Serialization**: Only one pipeline can deploy `mmm-core` to `prod` at a time.
    
*   **Concurrency**: `mmm-core` and `mmm-search` can still deploy to `prod` simultaneously because they use different locks.
    

### Deployment Rollbacks (The Panic Button)

No matter how robust your pipeline is, things can still go wrong. To provide a safety net, I paired every deployment job with a manual **Rollback** job.

```yaml
rollback-prod:
  extends: .rollback-template
  needs: ["deploy-manifests-prod"]
  resource_group: core-deploy-prod
  variables:
    DEPLOYMENT_NAME: "mmm-core"
    K8S_NAMESPACE: "prod"
```

Using `kubectl rollout undo`, this job allows any team member to immediately revert a failed deployment to its previous stable revision with a single click in the GitLab UI. This "Panic Button" is essential for maintaining high availability in a fast-moving monorepo.
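
A minimal sketch of what such a rollback template could look like is shown below. The stage name and the timeout are assumptions, and the real `.rollback-template` may differ.

```yaml
.rollback-template:
  stage: release
  when: manual   # shows up as a play button in the pipeline UI
  script:
    # Revert the Deployment to its previous revision
    - kubectl rollout undo "deployment/$DEPLOYMENT_NAME" --namespace "$K8S_NAMESPACE"
    # Block until the rollback has finished rolling out
    - kubectl rollout status "deployment/$DEPLOYMENT_NAME" --namespace "$K8S_NAMESPACE" --timeout=120s
```

Sharing the `resource_group` with the deploy job means a rollback cannot run while a deployment of the same module is still in progress.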

# **Shift Left Security**

GitLab provides comprehensive built-in security scanning templates that you can adopt quickly and easily.

> Shift left security means building security testing, compliance checks, and secure coding practices into the earliest phases of the software development life cycle (SDLC).
> 
> Source: https://about.gitlab.com/topics/devsecops/shift-left-security/

```yaml
# .gitlab-ci.yml
include:
  - local: '.gitlab-ci-base.yml'
  - template: Jobs/SAST.gitlab-ci.yml
  - template: Jobs/Secret-Detection.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml

stages:
  - ...
  - test # setup to run GitLab built-in scans
  - report
```

That's it! You can override individual jobs if you want to, but that's optional, and some settings can be changed with global variables instead.
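
As an illustration of both options, the sketch below tunes SAST through a global variable and overrides one key of an included job. The values are examples only, and the job name `secret_detection` is the one I'd expect from the Secret Detection template, so check it against the template version you include.

```yaml
# Root .gitlab-ci.yml (sketch)
variables:
  # Keep Maven build output out of the SAST scan
  SAST_EXCLUDED_PATHS: "spec, test, tests, tmp, **/target/**"

# Override a single key of a job defined by the included template;
# GitLab merges this with the template's definition
secret_detection:
  allow_failure: false
```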

```xml
<properties>
	<sonar.version>5.5.0.6356</sonar.version>
	<sonar.projectKey>bwgjoseph:${project.artifactId}</sonar.projectKey>
	<sonar.projectName>bwgjoseph:${project.artifactId}</sonar.projectName>
	<!-- Not necessary for self-hosted -->
	<sonar.organization>bwgjoseph</sonar.organization>
	<sonar.coverage.jacoco.xmlReportPaths>${project.build.directory}/site/jacoco/jacoco.xml</sonar.coverage.jacoco.xmlReportPaths>

	<!-- SonarQube Scanner Properties -->
	<sonar.scanner.skipSystemTruststore>true</sonar.scanner.skipSystemTruststore>
	<sonar.scanner.skipJreProvisioning>true</sonar.scanner.skipJreProvisioning>
	<!-- Java 25 support: Allow Sonar plugins (like IaC) to access restricted native methods -->
	<sonar.scanner.javaOpts>--enable-native-access=ALL-UNNAMED</sonar.scanner.javaOpts>
	<!-- Use CI project dir for scanner home if available, fallback to local .sonar -->
	<sonar.userHome>${env.CI_PROJECT_DIR}/.sonar</sonar.userHome>
</properties>
```

The Maven properties above configure the SonarQube analysis. To ensure independent Quality Gates in SonarQube, I enforced unique `sonar.projectKey` values for each module (e.g., `bwgjoseph:${project.artifactId}`). Without this, every module analysis would overwrite the previous one in the Sonar dashboard!

# **The Extras - Expert Mode**

To take a pipeline from "Functional" to "Enterprise-Ready," I added these high-impact features:

*   **The Engineering Portal**: I used GitLab Pages to host a unified site for aggregated JaCoCo coverage at `/coverage` and Maven Documentation at `/site`. I added a "Gateway Index" in CI that dynamically builds a landing page to navigate between them.
    
*   **Human-Friendly Triggers**: I leveraged GitLab's `variables:options` to create a **dropdown menu** in the UI. Now, anyone on the team can manually trigger a full build without needing to remember CLI flags.
    
*   **The Ghost Aggregator**: I introduced a "code-free" module (`mmm-report`) purely for aggregation. It provides a clean target for our ID-based artifact collection and prevents our functional modules from being cluttered with aggregation logic.
    
*   **Inner-Loop Development**: Validate your complex parent-child YAMLs locally using [gitlab-ci-local](https://github.com/firecow/gitlab-ci-local). This shaves hours off the debugging cycle by allowing you to run jobs directly on your machine.
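
For the dropdown mentioned under Human-Friendly Triggers, a minimal sketch of the variable definition could look like this. The description text is illustrative.

```yaml
variables:
  CI_FULL_PIPELINE:
    description: "Run every module regardless of detected changes"
    value: "false"   # pre-selected option; must be one of the options
    options:
      - "false"
      - "true"
```

With this in the root `.gitlab-ci.yml`, the variable appears pre-filled as a dropdown when a pipeline is started manually from the GitLab UI.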
    

# Conclusion

At the end of the day, a CI/CD pipeline is only as good as the developer experience it provides. In this redesign, our goal was to make the monorepo feel "small" again—ensuring that a developer working on a single module isn't burdened by the weight of the entire project.

By automating the complex bits—like ID-based artifact collection and cross-pipeline reporting—and providing human-friendly tools like UI-driven triggers and local validation, I’ve built a system that stays out of the way while providing a safety net of "Shift Left" security.

Scaling a monorepo parent-child pipeline isn't just about build commands; it's about orchestration, visibility, and developer experience. Hopefully, these patterns help you (and the future me!) build better pipelines.

# **Source Code**

As usual, the full source code is available on [**GitHub**](https://github.com/bwgjoseph/tutorials/tree/main/mastering-gitlab-ci).
