Infrastructure standard

All HCS infrastructure deployments follow a declarative, config-driven model. Human operators do not run ad-hoc Azure CLI commands to create production resources. Every resource is defined in code, versioned, and deployed through a pipeline.


IaC tool selection

| Use case | Tool |
| --- | --- |
| Azure resource provisioning (VMs, VNets, RGs, Key Vaults, AKS) | Bicep |
| Multi-cloud deployments or complex state management | Terraform |
| OS configuration, role installation, domain join | PowerShell (DSC or Invoke-* scripts) |
| Large-scale configuration management across many nodes | Ansible |
| Hybrid: Azure resources + OS configuration | Bicep (Azure) + PowerShell (OS), orchestrated by ADO pipeline |

Do not mix tools for the same resource: each resource has exactly one owning tool. If Bicep provisions a VM, Bicep owns that VM's Azure-level configuration, and PowerShell configures only the OS inside the VM.


Deployment phases

All infrastructure deployments follow four phases in order. No phase is skipped, even for small changes.

| Phase | What happens | Tool |
| --- | --- | --- |
| Plan | Dry run: show what will change without making changes | az deployment group what-if / terraform plan |
| Provision | Create Azure resources | Bicep / Terraform |
| Configure | Install roles, configure OS, join domain, set registry keys | PowerShell |
| Validate | Assert that the deployed state matches the expected state | Pester contract tests |

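Every ADO pipeline that deploys to prd must implement all four phases and require a manual approval gate between Provision and Configure. The example below shows the stage ordering only. It does not show the approval gate, which in Azure DevOps is typically configured as an approval check on a protected resource such as an environment.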

stages:
  - stage: Plan
    jobs:
      - job: WhatIf
        steps:
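          - task: AzureCLI@2
            inputs:
              # Placeholder: name of the ADO service connection for the target subscription
              azureSubscription: REPLACE-WITH-SERVICE-CONNECTION
              scriptType: pscore
              scriptLocation: inlineScript
              inlineScript: |
                # Quote the @file argument: PowerShell treats a bare @ as the splatting operator
                az deployment group what-if `
                  --resource-group rg-hcs-platform-prd-eus-01 `
                  --template-file infra/main.bicep `
                  --parameters '@infra/parameters/parameters.prd.json'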

  - stage: Provision
    dependsOn: Plan
    jobs:
      - job: Deploy
        steps:
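          - task: AzureCLI@2
            inputs:
              # Placeholder: name of the ADO service connection for the target subscription
              azureSubscription: REPLACE-WITH-SERVICE-CONNECTION
              scriptType: pscore
              scriptLocation: inlineScript
              inlineScript: |
                # Quote the @file argument: PowerShell treats a bare @ as the splatting operator
                az deployment group create `
                  --resource-group rg-hcs-platform-prd-eus-01 `
                  --template-file infra/main.bicep `
                  --parameters '@infra/parameters/parameters.prd.json'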

  - stage: Configure
    dependsOn: Provision
    jobs:
      - job: OSConfig
        steps:
          - task: PowerShell@2
            inputs:
              filePath: scripts/Invoke-NodeConfiguration.ps1
              arguments: -ConfigPath config/variables.yml

  - stage: Validate
    dependsOn: Configure
    jobs:
      - job: ContractTest
        steps:
          - task: PowerShell@2
            inputs:
              targetType: inline
              script: |
                Install-Module Pester -Force -SkipPublisherCheck
                $r = Invoke-Pester -Path tests/contract/ -PassThru -Output Detailed
                if ($r.FailedCount -gt 0) { exit 1 }

Bicep conventions

File structure

infra/
├── main.bicep                  # Entry point — orchestrates modules
├── modules/
│   ├── network.bicep
│   ├── compute.bicep
│   └── identity.bicep
└── parameters/
    ├── parameters.dev.json
    ├── parameters.stg.json
    └── parameters.prd.json
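
As a rough illustration of what "orchestrates modules" can mean for this layout, the sketch below shows a main.bicep that calls the three modules. The module file paths come from the tree above; the module parameter and output names (subnetId, identityId) are assumptions made for the example, not names defined by this standard.

```bicep
// Hypothetical excerpt from infra/main.bicep.
// Module parameter and output names are assumptions for illustration.
targetScope = 'resourceGroup'

param environment string
param location string

module network 'modules/network.bicep' = {
  name: 'network-${environment}'
  params: {
    environment: environment
    location: location
  }
}

module identity 'modules/identity.bicep' = {
  name: 'identity-${environment}'
  params: {
    environment: environment
    location: location
  }
}

// Referencing another module's outputs makes Bicep deploy the modules in dependency order.
module compute 'modules/compute.bicep' = {
  name: 'compute-${environment}'
  params: {
    environment: environment
    location: location
    subnetId: network.outputs.subnetId
    identityId: identity.outputs.identityId
  }
}
```

Keeping main.bicep to parameters and module calls leaves each resource definition in exactly one module file.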

Parameter files

Never embed environment-specific values in .bicep files. All environment differences live in parameter files.

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "environment": { "value": "prd" },
    "location":    { "value": "eastus" },
    "instance":    { "value": "01" }
  }
}
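
The parameter file above needs matching declarations on the Bicep side. A minimal sketch, assuming the three parameter names shown and the dev/stg/prd environments implied by the parameter file names, could look like this. The nameSuffix variable and the output are illustrative only.

```bicep
// Hypothetical declarations in infra/main.bicep matching the parameter file above.
// No parameter has a default, so every value must come from parameters.<env>.json.

@description('Deployment environment.')
@allowed([
  'dev'
  'stg'
  'prd'
])
param environment string

@description('Azure region for all resources, for example eastus.')
param location string

@description('Instance suffix, for example 01.')
param instance string

// Illustrative only: a suffix composed from parameters instead of hard-coded values.
var nameSuffix = '${environment}-${location}-${instance}'

output resourceNameSuffix string = nameSuffix
```

Omitting defaults is a deliberate choice in this sketch: a deployment that forgets its parameter file fails at validation instead of silently using a value meant for another environment.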

Parameter files never contain secrets. Secret values are passed as secure parameters (the @secure() decorator in Bicep, securestring in ARM JSON) and resolved at pipeline runtime from ADO Variable Groups linked to Key Vault.
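
On the Bicep side, a secret parameter might be declared as in the sketch below. The parameter name adminPassword is hypothetical; the point is the @secure() decorator and the absence of a default value.

```bicep
// Hypothetical parameter name; the @secure() decorator is the point of the example.
// A secure parameter has no default and no entry in any parameters.<env>.json file.
@secure()
@description('Local administrator password, supplied by the pipeline at deploy time.')
param adminPassword string
```

With this declaration the pipeline would supply the value as an additional key=value pair on the --parameters argument, sourced from the Key Vault-linked variable group, alongside the parameter file.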

Required tags

Apply these tags to every resource group and resource via Bicep:

tags: {
  Owner:       'kris@hybridsolutions.cloud'
  Project:     projectName
  Environment: environment
  CostCenter:  'hcs-internal'
  ManagedBy:   'bicep'
}
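
One way to keep the tag set consistent is to define it once as a variable and reuse that object on every resource. The sketch below assumes a user-assigned managed identity as the example resource. identityName is a hypothetical parameter, and the API version is one known to exist for this resource type rather than one this standard prescribes.

```bicep
// Sketch only: identityName is a hypothetical parameter, not a name from this standard.
param projectName string
param environment string
param location string = resourceGroup().location
param identityName string

// Define the required tags once and reuse the same object on every resource.
var requiredTags = {
  Owner: 'kris@hybridsolutions.cloud'
  Project: projectName
  Environment: environment
  CostCenter: 'hcs-internal'
  ManagedBy: 'bicep'
}

resource workloadIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: identityName
  location: location
  tags: requiredTags
}
```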

Terraform conventions (when used)

State management

All Terraform state is stored in Azure Storage, never locally. State files are never committed to repos.

terraform {
  backend "azurerm" {
    resource_group_name  = "rg-hcs-platform-prd-eus-01"
    storage_account_name = "sthcsartifactsprdeus01"
    container_name       = "tfstate"
    key                  = "<project>/<env>/terraform.tfstate"
  }
}

Variable files

  • terraform.tfvars is gitignored — never committed
  • terraform.tfvars.example is committed with placeholder values
  • Secrets are passed as environment variables (TF_VAR_<name>) from ADO Variable Groups, never in .tfvars files

Config file contract

Every infrastructure repo ships a config/variables.example.yml that defines the complete configuration schema. Operators copy it to config/variables.yml and fill in environment-specific values before running scripts.

See variables standard for the keyvault:// URI scheme, snake_case key naming, and bootstrap policy.

The example file is the documentation. It must include a comment on every key explaining what the value controls and what format it expects:

compute:
  azure_local:
    # Name of the Azure Local cluster. Used as the resource name in Azure.
    # Format: lowercase-kebab-case, max 15 characters.
    cluster_name: REPLACE-WITH-YOUR-CLUSTER-NAME

    # DNS servers for cluster nodes, listed in priority order (primary first).
    # Format: list of IP addresses.
    dns_servers:
      - REPLACE-WITH-DNS-SERVER-1
      - REPLACE-WITH-DNS-SERVER-2

Resource lifecycle

| Action | Requires |
| --- | --- |
| Create a new resource | PR with Bicep/Terraform changes + plan output in PR comments |
| Modify an existing resource | PR + what-if / plan output showing the diff |
| Delete a resource | PR + explicit confirmation that deletion is intentional + approval gate in pipeline |
| Emergency manual change | Allowed for P1 incidents only; must be followed within 24h by a PR that codifies the manual change |

Any resource created or changed manually under the P1 exception must be codified in a follow-up PR. The pipeline is the source of truth.


Security baseline

Every deployed resource must conform to this minimum security baseline. Bicep modules in this (platform) repo enforce these as defaults.

| Control | Requirement |
| --- | --- |
| Diagnostic settings | All resources send logs to la-hcs-prd-eus-01 (Log Analytics workspace) |
| Managed identity | All compute resources use user-assigned managed identity; no client secrets for Azure auth |
| Key Vault access | Access via RBAC, not legacy access policies |
| Storage | Require HTTPS, disable public access, enable soft delete |
| Network | NSGs on all subnets; no inbound rules that use the * wildcard |
| Tags | All required tags present (Owner, Project, Environment, CostCenter, ManagedBy) |
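
To make the Storage row concrete, the sketch below shows how a module might set those controls as defaults. It is an illustration under assumptions, not the platform module itself: storageAccountName is a hypothetical parameter, the SKU and the 7-day retention period are placeholders this standard does not specify, and "disable public access" is read here as disabling anonymous blob access.

```bicep
// Sketch only: storageAccountName, the SKU, and the retention period are assumptions.
param location string = resourceGroup().location
param storageAccountName string
param tags object

resource storage 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageAccountName
  location: location
  tags: tags
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
  properties: {
    supportsHttpsTrafficOnly: true // Require HTTPS
    allowBlobPublicAccess: false   // Disable anonymous public access to blobs
  }
}

resource blobService 'Microsoft.Storage/storageAccounts/blobServices@2023-01-01' = {
  parent: storage
  name: 'default'
  properties: {
    // Soft delete for blobs and containers; 7 days is an assumed value.
    deleteRetentionPolicy: {
      enabled: true
      days: 7
    }
    containerDeleteRetentionPolicy: {
      enabled: true
      days: 7
    }
  }
}
```

If the baseline also intends to block public network access, the account's publicNetworkAccess property would be the place to express that.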