Configuring the Kubernetes Ingestor Backend Plugin#

This guide covers the configuration options available for the Kubernetes Ingestor backend plugin.

Configuration File#

The plugin is configured through your app-config.yaml. Here's a comprehensive example:

kubernetesIngestor:
  # Optional field to set the default owner of the ingested resources.
  defaultOwner: kubernetes-auto-ingested
  # Mappings of kubernetes resource metadata to backstage entity metadata
  # The list below shows the default values used when the mappings are not set in app-config.yaml
  # The recommended values are:
  # namespaceModel: 'cluster' # cluster, namespace, default
  # nameModel: 'name-cluster' # name-cluster, name-namespace, name-kind, name
  # titleModel: 'name' # name, name-cluster, name-namespace
  # systemModel: 'cluster-namespace' # cluster, namespace, cluster-namespace, default
  # referencesNamespaceModel: 'default' # default, same
  mappings:
    namespaceModel: 'cluster' # cluster, namespace, default
    nameModel: 'name-cluster' # name-cluster, name-namespace, name-kind, name
    titleModel: 'name' # name, name-cluster, name-namespace
    systemModel: 'namespace' # cluster, namespace, cluster-namespace, default
    referencesNamespaceModel: 'default' # default, same
  # A list of cluster names to ingest resources from. If empty, resources from all clusters under kubernetes.clusterLocatorMethods.clusters will be ingested.
  # allowedClusterNames:
  #   - my-cluster-name
  components:
    # Whether to enable creation of backstage components for Kubernetes workloads
    enabled: true
    # Whether to ingest Kubernetes workloads as Resource entities instead of Component entities (default: false)
    ingestAsResources: false
    taskRunner:
      # How often to query the clusters for data
      frequency: 10
      # Max time to process the data per cycle
      timeout: 600
    # Namespaces to exclude the resources from
    excludedNamespaces:
      - kube-public
      - kube-system
    # Custom Resource Types to also generate components for
    customWorkloadTypes:
      - group: pkg.crossplane.io
        apiVersion: v1
        plural: providers
        # singular: provider # explicit singular form - needed when auto-detection fails
    # By default all standard kubernetes workload types are ingested. This allows you to disable this behavior
    disableDefaultWorkloadTypes: false
    # Makes ingestion opt-in (true) or opt-out (false): when true, only resources annotated with terasky.backstage.io/add-to-catalog are ingested; when false, all resources are ingested except those annotated with terasky.backstage.io/exclude-from-catalog
    onlyIngestAnnotatedResources: false
  crossplane:
    # Whether to enable Crossplane-related code for both XRDs and claims. Defaults to true if not provided, for backwards compatibility
    enabled: true
    # This section applies to Crossplane v1 claims as well as Crossplane v2 XRs.
    # Once v1 and claims are deprecated this field will be renamed; for now it
    # keeps its current name for backwards compatibility.
    claims:
      # Whether to create components for all claim resources (v1) and XRs (v2) in your cluster
      ingestAllClaims: true
      # Whether to ingest claims and XRs as Resource entities instead of Component entities (default: false)
      ingestAsResources: false
    xrds:
      # Whether to ingest XRDs as API entities only without generating templates (default: false)
      ingestOnlyAsAPI: false
      # Settings related to the final steps of a software template
      publishPhase:
        # Base URLs of Git servers you want to allow publishing to
        allowedTargets: ['github.com', 'gitlab.com']
        # What to publish to. currently supports github, gitlab, bitbucket, bitbucketCloud and YAML (provides a link to download the file)
        target: github
        git:
          # Follows the backstage standard format which is github.com?owner=<REPO OWNER>&repo=<REPO NAME>
          repoUrl:
          targetBranch: main
        # Whether the user should be able to select the repo they want to push the manifest to or not
        allowRepoSelection: true
        # Whether to request user OAuth credentials when selecting a repository URL (defaults to false)
        requestUserCredentialsForRepoUrl: false
      # Whether to enable the creation of software templates for all XRDs
      enabled: true
      taskRunner:
        # How often to query the clusters for data
        frequency: 10
        # Max time to process the data per cycle
        timeout: 600
      # Makes XRD ingestion opt-out (true) or opt-in (false): when true, all XRDs are ingested except those annotated with terasky.backstage.io/exclude-from-catalog; when false, only XRDs annotated with terasky.backstage.io/add-to-catalog are ingested
      ingestAllXRDs: true
      # Will convert default values from the XRD into placeholders in the UI instead of always adding them to the generated manifest.
      convertDefaultValuesToPlaceholders: true
  genericCRDTemplates:
    # Whether to ingest CRDs as API entities only without generating templates (default: false)
    ingestOnlyAsAPI: false
    # Settings related to the final steps of a software template
    publishPhase:
      # Base URLs of Git servers you want to allow publishing to
      allowedTargets: ['github.com', 'gitlab.com']
      # What to publish to. currently supports github, gitlab, bitbucket, bitbucketCloud and YAML (provides a link to download the file)
      target: github
      git:
        # Follows the backstage standard format which is github.com?owner=<REPO OWNER>&repo=<REPO NAME>
        repoUrl:
        targetBranch: main
      # Whether the user should be able to select the repo they want to push the manifest to or not
      allowRepoSelection: true
      # Whether to request user OAuth credentials when selecting a repository URL (defaults to false)
      requestUserCredentialsForRepoUrl: false
    crdLabelSelector:
      key: terasky.backstage.io/generate-form
      value: "true"
    crds:
      - certificates.cert-manager.io
  kro:
    # Whether to enable KRO-related code for both RGDs and instances. Defaults to false if not provided
    enabled: false
    instances:
      # Whether to ingest KRO instances as Resource entities instead of Component entities (default: false)
      ingestAsResources: false
    rgds:
      # Whether to ingest RGDs as API entities only without generating templates (default: false)
      ingestOnlyAsAPI: false
      # Settings related to the final steps of a software template
      publishPhase:
        # Base URLs of Git servers you want to allow publishing to
        allowedTargets: ['github.com', 'gitlab.com']
        # What to publish to. currently supports github, gitlab, bitbucket, bitbucketCloud and YAML (provides a link to download the file)
        target: github
        git:
          # Follows the backstage standard format which is github.com?owner=<REPO OWNER>&repo=<REPO NAME>
          repoUrl:
          targetBranch: main
        # Whether the user should be able to select the repo they want to push the manifest to or not
        allowRepoSelection: true
        # Whether to request user OAuth credentials when selecting a repository URL (defaults to false)
        requestUserCredentialsForRepoUrl: false
      # Whether to enable the creation of software templates for all RGDs
      enabled: true
      taskRunner:
        # How often to query the clusters for data
        frequency: 10
        # Max time to process the data per cycle
        timeout: 600
  # Whether to automatically add the Argo CD plugin annotation to ingested components when the ingested resources carry the Argo CD tracking annotation. Defaults to false
  argoIntegration: true
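
As an illustration of the opt-in mode controlled by onlyIngestAnnotatedResources, a workload might be marked for ingestion as sketched below. The annotation key comes from the configuration comments above; the value "true", the Deployment name, and the namespace are assumptions made for this example, so check the plugin's annotation reference for the exact value it expects.

```yaml
# app-config.yaml: require an explicit annotation before a workload is ingested
kubernetesIngestor:
  components:
    enabled: true
    onlyIngestAnnotatedResources: true
---
# Hypothetical workload that opts in to ingestion
apiVersion: apps/v1
kind: Deployment
metadata:
  name: petstore          # hypothetical name
  namespace: shop         # hypothetical namespace
  annotations:
    # Key taken from the config comments; the "true" value is an assumption
    terasky.backstage.io/add-to-catalog: "true"
spec:
  # ... deployment spec
```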

Advanced Features#

Ingest Only as API#

The plugin supports ingesting Custom Resource Definitions (CRDs) as API entities only, without generating the corresponding Backstage templates. This is useful when you want to document your APIs but don't need users to create instances through Backstage.

XRDs (Crossplane Composite Resource Definitions)#

kubernetesIngestor:
  crossplane:
    xrds:
      enabled: true
      ingestOnlyAsAPI: false  # Default: false

When ingestOnlyAsAPI is set to true:
- API entities will be created for each XRD
- Software templates will NOT be generated
- Users can view the API documentation but cannot create new claims through Backstage

KRO RGDs (ResourceGraphDefinitions)#

kubernetesIngestor:
  kro:
    enabled: true
    rgds:
      enabled: true
      ingestOnlyAsAPI: false  # Default: false

When ingestOnlyAsAPI is set to true:
- API entities will be created for each RGD
- Software templates will NOT be generated
- Users can view the API documentation but cannot create new instances through Backstage

Generic CRDs#

kubernetesIngestor:
  genericCRDTemplates:
    ingestOnlyAsAPI: false  # Default: false
    crds:
      - certificates.cert-manager.io

When ingestOnlyAsAPI is set to true:
- API entities will be created for each configured CRD
- Software templates will NOT be generated
- Users can view the API documentation but cannot create new resources through Backstage

Use Cases for API-Only Ingestion:

  1. Documentation Only: You want to document your APIs in Backstage but don't want users to create instances through the platform
  2. External Creation: Resources are created through other systems (CI/CD pipelines, GitOps, etc.) and you only want API visibility
  3. Read-Only View: You want teams to browse available APIs without the ability to create new instances

Ingest as Resources#

The plugin supports ingesting Kubernetes objects as Backstage Resource entities instead of Component entities. In Backstage's data model, Resources represent infrastructure resources (databases, queues, storage) while Components represent software components (services, applications).

Components (Regular Kubernetes Resources)#

kubernetesIngestor:
  components:
    enabled: true
    ingestAsResources: false  # Default: false

When ingestAsResources is set to true:
- Regular Kubernetes workloads (Deployments, StatefulSets, etc.) will be ingested as Resource entities
- The entity kind will be Resource instead of Component
- Resource entities do not support providesApis and consumesApis relations (Component-specific)

Crossplane Claims and Composite Resources (XRs)#

kubernetesIngestor:
  crossplane:
    claims:
      ingestAsResources: false  # Default: false

When ingestAsResources is set to true:
- Both Crossplane claims and composite resources (XRs) will be ingested as Resource entities
- The entity kind will be Resource instead of Component
- Resource entities do not support consumesApis relations (Component-specific)
- All Crossplane-specific annotations will still be present
- This applies to both v1 claims/XRs and v2 claims/composites

Note: XRs use the same configuration as claims because they are tightly coupled - claims create XRs, so they should be treated consistently.

KRO Instances#

kubernetesIngestor:
  kro:
    instances:
      ingestAsResources: false  # Default: false

When ingestAsResources is set to true:
- KRO instances will be ingested as Resource entities
- The entity kind will be Resource instead of Component
- Resource entities do not support consumesApis relations (Component-specific)
- All KRO-specific annotations will still be present

Use Cases for Resource Ingestion:

  1. Infrastructure Resources: When Kubernetes objects represent infrastructure resources (databases, message queues, storage) rather than applications
  2. Clear Separation: To distinguish between applications (Components) and the infrastructure they depend on (Resources)
  3. Catalog Organization: To organize your catalog with proper entity types that match Backstage's data model

Combining Options#

You can use both ingestOnlyAsAPI and ingestAsResources together. For example:

kubernetesIngestor:
  crossplane:
    claims:
      ingestAsResources: true  # Existing claims are Resources
    xrds:
      enabled: true
      ingestOnlyAsAPI: true  # Only document APIs, no templates

This configuration:
- Documents the APIs without template generation
- Represents existing claim/XR instances as infrastructure resources
- Provides API documentation without the ability to create new instances through Backstage
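
The same combination should be expressible for KRO, since the reference configuration exposes equivalent keys under kro. A minimal sketch, using only keys shown earlier in this guide:

```yaml
kubernetesIngestor:
  kro:
    enabled: true
    instances:
      ingestAsResources: true  # Existing KRO instances become Resource entities
    rgds:
      enabled: true
      ingestOnlyAsAPI: true    # RGDs are documented as APIs, no templates
```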

API Auto-Registration from Workloads#

The plugin supports automatic registration of API entities from OpenAPI/Swagger definitions exposed by your workloads. This feature allows you to annotate Kubernetes resources (Deployments, Crossplane claims, KRO instances, etc.) to automatically fetch their API definitions and create corresponding API entities in the Backstage catalog.

When an API entity is auto-registered:
- The API entity is created with the same name as the component
- The API entity title is set to {Component Title} API (e.g., "Petstore API")
- The component's providesApis field is automatically updated to reference the API
- The API definition is stored in YAML format
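
To make the outcome concrete, the entities produced for a workload titled "Petstore" might look roughly like the sketch below. Only the name, title, providesApis link, and YAML-formatted definition follow from the list above; the spec.type values, lifecycle, owner, and the definition contents are placeholders assumed for illustration, not the plugin's guaranteed output.

```yaml
# Sketch of an auto-registered API entity (assumed values are marked)
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
  name: petstore                   # same name as the component
  title: Petstore API              # "{Component Title} API"
spec:
  type: openapi                    # assumed
  lifecycle: production            # assumed placeholder
  owner: kubernetes-auto-ingested  # assumed: the default owner
  definition: |
    openapi: 3.0.0
    info:
      title: Petstore
      version: 1.0.0
    paths: {}
---
# The component that provides the API
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: petstore
  title: Petstore
spec:
  type: service                    # assumed placeholder
  lifecycle: production            # assumed placeholder
  owner: kubernetes-auto-ingested  # assumed: the default owner
  providesApis:
    - petstore
```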

Option 1: Direct URL#

Use this annotation when you have a direct URL to the OpenAPI/Swagger definition:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: petstore
  annotations:
    terasky.backstage.io/title: "Petstore"
    terasky.backstage.io/provides-api-from-url: "http://petstore.example.com/swagger/openapi.json"
spec:
  # ... deployment spec

The URL can point to:
- OpenAPI 3.x specifications (JSON or YAML)
- Swagger 2.x specifications (JSON or YAML)

The plugin will automatically:

  1. Fetch the API definition from the URL
  2. Convert JSON to YAML if necessary
  3. Create an API entity named after the component
  4. Link the API to the component via providesApis

Option 2: Kubernetes Resource Reference#

Use this annotation when the API endpoint information needs to be extracted from another Kubernetes resource (e.g., a Service's LoadBalancer IP):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: petstore
  annotations:
    terasky.backstage.io/title: "Petstore"
    terasky.backstage.io/provides-api-from-resource-ref: |
      {
        "kind": "Service",
        "name": "petstore-svc",
        "apiVersion": "v1",
        "path": "/swagger/openapi.json",
        "target-protocol": "http",
        "target-port": "80",
        "target-field": ".status.loadBalancer.ingress[0].ip"
      }
spec:
  # ... deployment spec

Resource Reference Fields:

| Field | Required | Description |
|---|---|---|
| kind | Yes | The Kubernetes resource kind (e.g., "Service", "Ingress") |
| name | Yes | The name of the Kubernetes resource |
| apiVersion | Yes | The API version (e.g., "v1", "networking.k8s.io/v1") |
| namespace | No | The namespace (defaults to the annotated resource's namespace) |
| path | Yes | The path to append to the endpoint URL (e.g., "/swagger/openapi.json") |
| target-protocol | Yes | The protocol to use: "http" or "https" |
| target-port | Yes | The port number to use |
| target-field | Yes | JSONPath-like expression to extract the endpoint from the resource |
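
For comparison with the Service-based example, a reference to an Ingress in a different namespace could look like the following sketch. It uses only the fields from the table above, including the optional namespace field. The Ingress name "petstore-ingress", the namespace "edge", and port 443 are hypothetical values chosen for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: petstore
  annotations:
    terasky.backstage.io/title: "Petstore"
    terasky.backstage.io/provides-api-from-resource-ref: |
      {
        "kind": "Ingress",
        "name": "petstore-ingress",
        "apiVersion": "networking.k8s.io/v1",
        "namespace": "edge",
        "path": "/swagger/openapi.json",
        "target-protocol": "https",
        "target-port": "443",
        "target-field": ".spec.rules[0].host"
      }
spec:
  # ... deployment spec
```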

Supported target-field Examples:

# Get LoadBalancer IP from a Service
"target-field": ".status.loadBalancer.ingress[0].ip"

# Get LoadBalancer hostname from a Service
"target-field": ".status.loadBalancer.ingress[0].hostname"

# Get cluster IP from a Service
"target-field": ".spec.clusterIP"

# Get external IP from a Service (first one)
"target-field": ".spec.externalIPs[0]"

# Get FQDN from Ingress
"target-field": ".spec.rules[0].host"

The plugin will:

  1. Fetch the referenced Kubernetes resource
  2. Extract the endpoint using the target-field expression
  3. Construct the full URL: {target-protocol}://{extracted-endpoint}:{target-port}{path}
  4. Fetch the API definition from the constructed URL
  5. Create an API entity and link it to the component
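
As a worked example of the URL construction step, assume the referenced Service reports a load balancer IP of 203.0.113.10 (an address from the documentation range, chosen here purely for illustration). With the annotation values from the Option 2 example, the URL would be assembled as shown in the comments:

```yaml
# Referenced Service (trimmed; the IP is hypothetical)
apiVersion: v1
kind: Service
metadata:
  name: petstore-svc
status:
  loadBalancer:
    ingress:
      - ip: 203.0.113.10   # value selected by .status.loadBalancer.ingress[0].ip

# URL pieces taken from the annotation:
#   target-protocol: http
#   target-port:     80
#   path:            /swagger/openapi.json
#
# Resulting URL, following {target-protocol}://{extracted-endpoint}:{target-port}{path}:
#   http://203.0.113.10:80/swagger/openapi.json
```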

Error Handling#

If the plugin fails to fetch the API definition (network error, invalid URL, resource not found, etc.):
- A warning is logged with details about the failure
- The component is still created without the API reference
- The plugin continues processing other resources

This ensures that a failing API endpoint doesn't prevent the rest of your catalog from being ingested.

Use Cases#

  1. Microservices with Swagger UI: Annotate deployments to auto-register their OpenAPI specs
  2. External APIs: Reference APIs exposed via LoadBalancer services
  3. Internal Services: Use ClusterIP or service DNS for internal API documentation
  4. Crossplane-provisioned APIs: Auto-register APIs from infrastructure provisioned by Crossplane
  5. KRO Application Stacks: Document APIs from multi-resource applications managed by KRO

Mapping Models#

Namespace Model#

Controls how Kubernetes namespaces map to Backstage:
- cluster: Use cluster name
- namespace: Use namespace name
- default: Use default namespace

Name Model#

Determines entity name generation:
- name-cluster: Combine name and cluster
- name-namespace: Combine name and namespace
- name-kind: Combine name and resource kind
- name: Use resource name only

Title Model#

Controls entity title generation:
- name: Use resource name
- name-cluster: Combine name and cluster
- name-namespace: Combine name and namespace

System Model#

Defines system mapping:
- cluster: Use cluster name
- namespace: Use namespace name
- cluster-namespace: Combine both
- default: Use default system
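
As a sketch of how these models combine, consider a hypothetical Deployment named petstore in namespace shop on a cluster named prod. The comments describe the intended effect of each setting in terms of the option descriptions above. The exact generated strings (for example, the separator used when values are combined) are not specified in this guide, so they are left out.

```yaml
kubernetesIngestor:
  mappings:
    namespaceModel: 'cluster'         # entity namespace derived from the cluster name (prod)
    nameModel: 'name-cluster'         # entity name combines petstore and prod
    titleModel: 'name'                # entity title uses the resource name (petstore)
    systemModel: 'cluster-namespace'  # system combines prod and shop
```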

Ownership#

Default Owner#

If a resource does not define an owner annotation, the ingestor uses kubernetesIngestor.defaultOwner.

Default: kubernetes-auto-ingested

kubernetesIngestor:
  defaultOwner: platform-engineering-team

Component Configuration#

Task Runner Settings#

taskRunner:
  # Run every 10 seconds
  frequency: 10

  # Allow up to 10 minutes per cycle
  timeout: 600

Resource Type Configuration#

components:
  # Custom resource types
  customWorkloadTypes:
    - group: apps.example.com
      apiVersion: v1
      plural: applications
      singular: application

  # Exclude system namespaces
  excludedNamespaces:
    - kube-system
    - kube-public

Crossplane Integration#

Claims Configuration#

crossplane:
  enabled: true
  claims:
    # Auto-ingest all claims
    ingestAllClaims: true

XRD Configuration#

xrds:
  # Template generation settings
  publishPhase:
    target: github
    allowRepoSelection: true
    requestUserCredentialsForRepoUrl: false
    git:
      repoUrl: github.com?owner=org&repo=templates
      targetBranch: main

  # Processing settings
  convertDefaultValuesToPlaceholders: true
  ingestAllXRDs: true

Repository Selection Options#

When allowRepoSelection is enabled, you can configure the repository selection user experience:

  • requestUserCredentialsForRepoUrl (default: false): When set to true, enables an enhanced repository picker that allows users to select repositories and organizations from dropdown lists instead of manually typing the repository name. This adds the requestUserCredentials option to the RepoUrlPicker with secretsKey: USER_OAUTH_TOKEN, which enables the dropdown pickers for a better user experience.

This is useful when you want to:
- Provide a more user-friendly repository selection experience
- Allow users to browse and select from their accessible repositories
- Reduce errors from manual repository name entry
- Enable organization/owner selection from dropdown lists
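
A minimal sketch of enabling the enhanced picker for XRD templates, using only keys from the reference configuration. Because the picker relies on a user OAuth token, it presumably also needs a matching auth provider configured in Backstage, which is outside the scope of this example.

```yaml
kubernetesIngestor:
  crossplane:
    xrds:
      publishPhase:
        target: github
        allowedTargets: ['github.com']
        allowRepoSelection: true
        # Enables the dropdown-based repository picker
        requestUserCredentialsForRepoUrl: true
```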

KRO Integration#

RGD Configuration#

kro:
  enabled: true
  rgds:
    # Template generation settings
    publishPhase:
      target: github
      allowRepoSelection: true
      requestUserCredentialsForRepoUrl: false
      git:
        repoUrl: github.com?owner=org&repo=templates
        targetBranch: main

    # Processing settings
    enabled: true
    taskRunner:
      frequency: 10
      timeout: 600

Repository Selection Options#

Similar to XRD configuration, when allowRepoSelection is enabled for KRO RGDs:

  • requestUserCredentialsForRepoUrl (default: false): When set to true, enables an enhanced repository picker that allows users to select repositories and organizations from dropdown lists instead of manually typing the repository name.

This provides the same improved user experience as described in the XRD configuration section, ensuring consistent repository selection behavior across both Crossplane and KRO resource templates.

Best Practices#

  1. Resource Mapping

    • Choose consistent mapping models
    • Use clear naming conventions
    • Consider namespace organization
    • Plan system boundaries
  2. Performance Tuning

    • Adjust task runner frequency
    • Set appropriate timeouts
    • Configure excluded namespaces
    • Optimize resource selection
  3. Template Management

    • Use version control
    • Maintain consistent structure
    • Document customizations
    • Test generated templates
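
As an illustration of the performance-tuning items, a larger installation might poll less often and skip more namespaces. The values below are arbitrary examples rather than recommendations, and the two extra namespace entries are hypothetical additions. The frequency is assumed to be in seconds, as in the Task Runner Settings example.

```yaml
kubernetesIngestor:
  components:
    enabled: true
    taskRunner:
      frequency: 60    # poll less often than the example value of 10
      timeout: 600
    excludedNamespaces:
      - kube-public
      - kube-system
      - kube-node-lease   # hypothetical addition
      - monitoring        # hypothetical addition
```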

For installation instructions, refer to the Installation Guide.