
Setting Up Azure OpenAI Authentication for APIM with Terraform

Use policy and managed identity

Posted by Damien Aicheh on 05/30/2024 · 21 mins

In this tutorial, you will configure the communication between API Management (APIM) and Azure OpenAI using a managed identity and Terraform!

Use case

By default, to access Azure OpenAI services, you need an access key and an endpoint. However, an access key can be exposed in the client application, which is a security risk. To avoid this, you can use a managed identity to authenticate with the backend service. In this tutorial, you will use Terraform to create an API Management resource, define an API based on OpenAI’s Swagger documentation, and assign a role to the APIM. You will also add a policy to the APIM to use its managed identity to authenticate with the Azure OpenAI resource.

What do you need?

To be able to do this tutorial you will need:

- An Azure subscription with access to the Azure OpenAI service
- Terraform installed on your machine

Init the project

In a new directory, let’s create a provider.tf file and declare the azurerm provider in it:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.105.0"
    }
  }

  backend "local" {}
}

provider "azurerm" {
  features {
  }
}

Then, create a variables.tf file and declare the variables you will use in this tutorial:

variable "application" {
  default     = "app"
  description = "Name of the application"
  type        = string
}

variable "environment" {
  description = "The environment deployed"
  type        = string
  default     = "dev"
  validation {
    condition     = can(regex("(dev|stg|pro)", var.environment))
    error_message = "The environment value must be a valid."
  }
}

variable "region" {
  description = "Azure deployment region"
  type        = string
  default     = "we"
}

variable "location" {
  description = "Azure deployment location"
  type        = string
  default     = "westeurope"
}

variable "owner" {
  description = "The name of the project's owner"
  type        = string
  default     = "me"
}

variable "resource_group_name_suffix" {
  type        = string
  default     = "01"
  description = "The resource group name suffix"
  validation {
    condition     = can(regex("[0-9]{2}", var.resource_group_name_suffix))
    error_message = "The resource group name suffix value must be two digits."
  }
}

variable "tags" {
  type        = map(any)
  description = "The custom tags for all resources"
  default     = {}
}
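These defaults are enough to follow along, but you can override any of them, for example through a terraform.tfvars file (the values below are purely illustrative):

# terraform.tfvars (illustrative values)
application = "chatbot"
owner       = "damien"
environment = "dev"

tags = {
  "CostCenter" = "ai-platform"
}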

Based on this you can create a locals.tf file and declare the locals you will use in this tutorial:

locals {
  resource_suffix           = [lower(var.environment), lower(var.region), substr(lower(var.application), 0, 3), substr(lower(var.owner), 0, 3), var.resource_group_name_suffix]
  resource_suffix_kebabcase = join("-", local.resource_suffix)

  chat_model_name = "gpt-35-turbo"

  tags = merge(
    var.tags,
    tomap(
      {
        "Owner"       = var.owner,
        "Environment" = var.environment,
        "Region"      = var.region,
        "Application" = var.application
      }
    )
  )
}

As you can see, the resource_suffix and resource_suffix_kebabcase locals generate a unique name for your resources. Also, because tagging resources is important, the default tags are merged with the tags input variable.
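To check what these locals produce, you can optionally expose them as outputs, for example in an outputs.tf file. With the default variable values, the suffix resolves to dev-we-app-me-01:

output "resource_suffix_kebabcase" {
  # Evaluates to "dev-we-app-me-01" with the default variable values
  value = local.resource_suffix_kebabcase
}

output "tags" {
  value = local.tags
}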

Create a rg.tf file and declare the resource group that will contain all the resources:

resource "azurerm_resource_group" "this" {
  name     = format("rg-%s", local.resource_suffix_kebabcase)
  location = var.location
  tags     = local.tags
}

Deploy Azure API Management and Azure OpenAI

Now, you have to create an API Management resource and an API based on OpenAI’s Swagger documentation. The identity block within this code sets the identity type to SystemAssigned, which means Azure will automatically manage the identity of this APIM instance.

Inside an apim.tf file, you can declare the following:

resource "azurerm_api_management" "this" {
  name                 = format("apim-%s", local.resource_suffix_kebabcase)
  location             = azurerm_resource_group.this.location
  resource_group_name  = azurerm_resource_group.this.name
  publisher_name       = "Me"
  publisher_email      = "admin@me.io"
  sku_name             = "Developer_1"
  tags                 = local.tags

  identity {
    type = "SystemAssigned"
  }
}

Next, let’s define an API based on OpenAI’s Swagger documentation. The following block of Terraform code registers an API in the APIM instance. The specification file can be found in the official Azure OpenAI Swagger documentation or in the GitHub repository of this tutorial.

Assuming the Swagger file is in an assets directory, you can use the following code to read the file and import it into the APIM. The template_file data source is now deprecated, so let’s read the file directly with the built-in file function:

locals {
  open_ai_spec = file("${path.module}/assets/api_spec/open_ai_spec.json")
}

Then, you can add the API using the following code:

resource "azurerm_api_management_api" "open_ai" {
  name                = "api-azure-open-ai"
  resource_group_name = azurerm_resource_group.this.name
  api_management_name = azurerm_api_management.this.name
  revision            = "1"
  display_name        = "Azure OpenAI"
  path                = "openai"
  protocols           = ["https"]

  import {
    content_format = "openapi+json"
    content_value  = local.open_ai_spec
  }
}

The API is now created and imported into the APIM. Let’s deploy the Azure OpenAI resource. Put all resources related to Azure OpenAI in an open_ai.tf file.

resource "azurerm_cognitive_account" "open_ai" {
  name                  = format("oai-%s", local.resource_suffix_kebabcase)
  location              = azurerm_resource_group.this.location
  resource_group_name   = azurerm_resource_group.this.name
  kind                  = "OpenAI"
  sku_name              = "S0"
  tags                  = local.tags
  custom_subdomain_name = format("oai-%s", local.resource_suffix_kebabcase)

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_cognitive_deployment" "chat_model" {
  name                 = local.chat_model_name
  cognitive_account_id = azurerm_cognitive_account.open_ai.id
  model {
    format  = "OpenAI"
    name    = local.chat_model_name
    version = "0301"
  }

  scale {
    type     = "Standard"
    capacity = 5
  }
}

While deploying the Azure OpenAI service, it’s important to specify a custom subdomain: token-based authentication, such as with a managed identity, only works when the resource has one. The custom_subdomain_name can be the same name as the Azure OpenAI service.

The next step is to declare the Azure OpenAI service as a backend service in the APIM:

resource "azurerm_api_management_backend" "open_ai" {
  name                = "azure-open-ai-backend"
  resource_group_name = azurerm_resource_group.this.name
  api_management_name = azurerm_api_management.this.name
  protocol            = "http"
  url                 = format("%sopenai", azurerm_cognitive_account.open_ai.endpoint)
}

As you can see, the url is the endpoint of the Azure OpenAI service, which ends with a /, combined with the openai path. With the naming convention above, this gives something like https://oai-dev-we-app-me-01.openai.azure.com/openai.

Authenticate with Managed Identity

Finally, you can assign the Cognitive Services OpenAI User role to the managed identity of the APIM. This role gives the APIM the permissions it needs to interact with the Azure OpenAI service. Inside a role.tf file, you can declare the role assignment:

resource "azurerm_role_assignment" "this" {
  scope                = azurerm_cognitive_account.open_ai.id
  role_definition_name = "Cognitive Services OpenAI User"
  principal_id         = azurerm_api_management.this.identity[0].principal_id
}

Then you can add a policy to the API so that it uses the APIM managed identity to authenticate with the Azure OpenAI service. The XML block below sets up several policies for how requests and responses should be handled.

<policies>
    <inbound>
        <set-backend-service backend-id="azure-open-ai-backend" />
        <authentication-managed-identity resource="https://cognitiveservices.azure.com" output-token-variable-name="msi-access-token" ignore-error="false" />
        <set-header name="Authorization" exists-action="override">
            <value>@("Bearer " + (string)context.Variables["msi-access-token"])</value>
        </set-header>
        <base />
    </inbound>
    <backend>
        <forward-request />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>

The set-backend-service policy specifies the backend service to use for the API; in this scenario, the Azure OpenAI service.

The authentication-managed-identity policy specifies that your APIM should use its managed identity to authenticate with the Azure OpenAI service. The set-header policy adds an Authorization header to each request, using the access token obtained from the managed identity authentication. The base element in each section includes the default policies for that section. For more details regarding this policy, check the official documentation.

Assuming this policy is in the assets directory, you can read the file the same way and apply it to the APIM.

locals {
  global_open_ai_policy = file("${path.module}/assets/policies/global_open_ai_policy.xml")
}

Finally, you can apply this policy to the API using the following code:

resource "azurerm_api_management_api_policy" "global_open_ai_policy" {
  api_name            = azurerm_api_management_api.open_ai.name
  resource_group_name = azurerm_resource_group.this.name
  api_management_name = azurerm_api_management.this.name

  xml_content = local.global_open_ai_policy
}

Run Terraform

Let’s initialize Terraform and apply it:

terraform init
terraform plan -out=plan.out

Then:

terraform apply plan.out

This will take a few minutes to create all the resources. Make sure to set up the Terraform backend to save the state file remotely in a secure location such as an Azure Storage Account.
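For example, you could replace the local backend declared in provider.tf with an azurerm backend. Here is a minimal sketch, assuming the resource group, storage account and container for the state already exist (all names are placeholders):

terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"    # placeholder: pre-existing resource group
    storage_account_name = "sttfstate001"  # placeholder: pre-existing storage account
    container_name       = "tfstate"       # placeholder: pre-existing blob container
    key                  = "apim-openai.tfstate"
  }
}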

Testing the API

To test the API, open your API Management instance in the Azure portal, select the Azure OpenAI API and use the Test tab. If you enable the Trace option, you can see the request and response details, including the Bearer token added to the request.

Here is an example of parameters to pass to test the API:

deployment-id: gpt-35-turbo
api-version: 2024-02-01

{
    "temperature": 1,
    "top_p": 1,
    "stream": false,
    "stop": null,
    "max_tokens": 2000,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "logit_bias": {},
    "user": "user-1234",
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant that helps people find information"
        },
        {
            "role": "user",
            "content": "When Microsoft company was created?."
        }
    ],
    "n": 1
}
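Behind the scenes, the Test console sends this request to your APIM gateway, which forwards it to the Azure OpenAI backend. With the naming convention used in this tutorial, the request URL has roughly this shape (hypothetical host):

https://apim-dev-we-app-me-01.azure-api.net/openai/deployments/gpt-35-turbo/chat/completions?api-version=2024-02-01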

You will then get this kind of answer:

{
    "choices": [{
        "content_filter_results": {
            "hate": {
                "filtered": false,
                "severity": "safe"
            },
            "self_harm": {
                "filtered": false,
                "severity": "safe"
            },
            "sexual": {
                "filtered": false,
                "severity": "safe"
            },
            "violence": {
                "filtered": false,
                "severity": "safe"
            }
        },
        "finish_reason": "stop",
        "index": 0,
        "message": {
            "content": "Microsoft was created on April 4, 1975.",
            "role": "assistant"
        }
    }],
    "created": 1717078187,
    "id": "chatcmpl-9UalPYoIMtfCe2d0LSavtobQX3RUk",
    "model": "gpt-4",
    "object": "chat.completion",
    "prompt_filter_results": [{
        "prompt_index": 0,
        "content_filter_results": {
            "hate": {
                "filtered": false,
                "severity": "safe"
            },
            "self_harm": {
                "filtered": false,
                "severity": "safe"
            },
            "sexual": {
                "filtered": false,
                "severity": "safe"
            },
            "violence": {
                "filtered": false,
                "severity": "safe"
            }
        }
    }],
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 12,
        "prompt_tokens": 27,
        "total_tokens": 39
    }
}

Final touch

In this tutorial, you have learned how to use Terraform to create an APIM instance, define an API based on OpenAI’s Swagger documentation, and assign a role to the APIM. You have also added a policy to the API to use the APIM managed identity to authenticate with the Azure OpenAI resource. You will find the complete source code in this GitHub repository.

Do not hesitate to follow me so you do not miss my next tutorial!