Models

A model corresponds to an inference endpoint served either by an external model provider (e.g. OpenAI or Anthropic) or by an inference engine deployed in your own infrastructure.

Model Overview

Properties


secrets object

Secrets which can be used to reference secret values in designated places.

Default: None

key Key required

Reference to an existing entity in AI GO!.

Pattern: ^[a-zA-Z0-9_\-\$]+$
Max Length: 250
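The pattern and length limit above can be checked locally before a model definition is submitted. The following is a minimal sketch; the helper name is_valid_key is hypothetical and not part of AI GO!, and only the regular expression and the 250-character limit come from this page.

```python
import re

# Pattern and length limit copied from the Key property above.
KEY_PATTERN = re.compile(r"^[a-zA-Z0-9_\-\$]+$")
KEY_MAX_LENGTH = 250


def is_valid_key(key: str) -> bool:
    """Return True if `key` satisfies the documented pattern and length limit."""
    # fullmatch rejects a trailing newline, which `$` alone would tolerate.
    return len(key) <= KEY_MAX_LENGTH and KEY_PATTERN.fullmatch(key) is not None
```

Note that the pattern does not allow a dot, so a provider model name such as gpt-4.1-nano cannot be used as a key unchanged.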

display_name string required

The model's name displayed to the user.


description string

Short description of the model.

Default: None

rate_limit integer

The maximum allowed number of requests per minute.

Default: None

max_concurrent_requests integer

The maximum allowed number of concurrent requests.

Default: None
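To make the meaning of a per-minute cap concrete, here is a small sliding-window sketch for rate_limit. It is purely illustrative: this page does not describe how AI GO! enforces rate_limit or max_concurrent_requests, and the class name PerMinuteLimiter and the sliding-window strategy are assumptions made for the example.

```python
from collections import deque


class PerMinuteLimiter:
    """Sliding-window check for a requests-per-minute cap (illustrative only)."""

    def __init__(self, rate_limit: int, window_seconds: float = 60.0) -> None:
        self.rate_limit = rate_limit
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps (seconds) of accepted requests

    def allow(self, now: float) -> bool:
        """Return True and record the request if it fits in the current window."""
        # Forget requests that are at least one full window old.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.rate_limit:
            return False
        self._sent.append(now)
        return True
```

With rate_limit set to 60, as in the examples on this page, such a limiter would accept at most 60 requests in any 60-second span.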

task enum MLTask

The ML task of the model.

Default: chat_completion

Possible MLTask values

The type of machine learning task to be performed.

Allowed Values:

  • chat_completion
  • embeddings
  • custom

config SDKModelCustomConnectionConfig, SDKCustomInferenceModelConfig, ModelProviderConnectionConfig required

Model configuration.
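The top-level properties above can be summarized as a quick pre-flight check: key, display_name and config are required, and task falls back to chat_completion when omitted. The sketch below assumes the model definition has already been loaded into a Python dict; the function name check_model_definition is hypothetical.

```python
# Values taken from the property list above.
REQUIRED_FIELDS = ("key", "display_name", "config")
ALLOWED_TASKS = {"chat_completion", "embeddings", "custom"}
DEFAULT_TASK = "chat_completion"


def check_model_definition(model: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in model
    ]
    # `task` is optional and defaults to chat_completion.
    task = model.get("task", DEFAULT_TASK)
    if task not in ALLOWED_TASKS:
        problems.append(f"unsupported task: {task!r}")
    return problems
```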

# Example: model defined with a custom connection
display_name: "OpenAI GPT-4.1 Nano"
key: "openai-gpt-4-1-nano"
description: >
  Fastest, most cost-efficient version of GPT-4.1. GPT-4.1 nano excels at instruction
  following and tool calling.
rate_limit: 60
task: "chat_completion"
config:
  adapter:
    key: "openai-chat-completion"
  connection_type: "custom_connection"
  url: "https://api.openai.com/v1/chat/completions"
  api_key: $OPENAI_API_KEY
  model_key: "gpt-4.1-nano"
# Example: model defined with custom inference
display_name: "OpenAI GPT-4.1 Nano (Custom Inference)"
key: "gpt-4-1-nano-custom-inference"
description: "OpenAI's GPT-4.1 Nano defined as a model with custom inference."
rate_limit: 60
task: "chat_completion"
config:
  connection_type: "custom_inference"
  adapter:
    key: "latticeflow$openai_chat_completion"
  run_inference_snippet: !include "./run_inference.py"
  environment:
    MODEL_ENDPOINT_URL: "https://api.openai.com/v1/chat/completions"
    MODEL_ENDPOINT_API_KEY: $OPENAI_API_KEY
    MODEL_KEY: "gpt-4.1-nano"
  timeout: 15
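# run_inference.py
from __future__ import annotations

import json
from typing import Any

import httpx


def run_inference(body: str, environment: dict[str, Any]) -> str:
    """Forward the JSON request body to the model endpoint and return the raw response text."""
    # Inject the model name from the environment into the request body.
    body_dict = json.loads(body)
    body_dict["model"] = environment["MODEL_KEY"]

    response = httpx.post(
        environment["MODEL_ENDPOINT_URL"],
        headers={
            "Authorization": f"Bearer {environment['MODEL_ENDPOINT_API_KEY']}",
            "Content-Type": "application/json",
        },
        content=json.dumps(body_dict).encode(),
        timeout=10.0,  # httpx timeout in seconds, below the snippet timeout of 15
        verify=True,  # verify the server's TLS certificate
    )
    # Raise httpx.HTTPStatusError on 4xx/5xx responses.
    response.raise_for_status()
    return response.text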

Definitions

ReferencedKey

Properties


key Key required

Reference to an existing entity in AI GO!.

Pattern: ^[a-zA-Z0-9_\-\$]+$
Max Length: 250
...
config:
  adapter:
    key: "openai-chat-completion"
Tip: Use the CLI command lf model-adapters to list all available model adapters.

SDKModelCustomConnectionConfig

Properties


connection_type Literal "custom_connection" required

The type of connection config.


adapter ReferencedKey

The model adapter responsible for converting the endpoint inputs and outputs into a standardized format.

Default: {'key': 'latticeflow$identity_chat_completion'}

url string required

The model endpoint URL.


api_key SecretTemplate, string

The key to be passed in the authorization header (Authorization: Bearer API_KEY). Can reference an existing secret.

Default: None

model_key string

Use this field when the model is specified in the request body rather than in the URL. For the "openai" adapter, the value is passed as the "model" parameter. For custom adapters, it is available as model_info.model_key.

Default: None

tls_context SDKTLSContext

TLS configuration for secure connections to the model endpoint.

Default: None

custom_headers object

Additional headers to include in requests to the model endpoint. Can reference existing secrets.

Default: None
...
config:
  adapter:
    key: "openai-chat-completion"
  connection_type: "custom_connection"
  url: "https://api.openai.com/v1/chat/completions"
  api_key: $OPENAI_API_KEY
  model_key: "gpt-4.1-nano"
...
config:
  adapter:
    key: "latticeflow$openai_chat_completion"
  connection_type: "custom_connection"
  url: "https://api.example.ai/v1/"
  api_key: ""
  custom_headers:
    X-API-Key: $X_API_KEY
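The two examples above differ in how credentials reach the endpoint: the first relies on api_key (sent as Authorization: Bearer API_KEY), the second leaves api_key empty and supplies X-API-Key through custom_headers. The sketch below shows one way the resulting request headers could be assembled. Treating an empty api_key as "send no Authorization header" is an assumption, not documented behavior, and build_headers is a hypothetical helper.

```python
from typing import Optional


def build_headers(api_key: Optional[str],
                  custom_headers: Optional[dict] = None) -> dict:
    """Combine the bearer token (if any) with additional custom headers."""
    headers = {}
    # Assumption: an empty or missing api_key means no Authorization header.
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    # Custom headers are added as-is, e.g. {"X-API-Key": "..."}.
    headers.update(custom_headers or {})
    return headers
```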

ModelProviderConnectionConfig

Connection configuration for a model that is retrieved from a well-known provider integrated with the system.

Properties


connection_type Literal "provider_connection" required

The type of connection config.


provider_id ModelProviderId required

The id of the model provider.


model_key string required

A key used to identify the model in the external provider.

SDKCustomInferenceModelConfig

Properties


connection_type Literal "custom_inference" required

The type of connection config.


adapter ReferencedKey

The model adapter responsible for converting the inputs and outputs into a standardized format.

Default: {'key': 'latticeflow$identity_chat_completion'}

run_inference_snippet string required

The Python code snippet that makes the call to the model.


environment object required

Environment variables required to run the model client snippet. Can reference existing secrets.


timeout number required

Timeout in seconds for the total runtime of the Python snippet.
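The three config types are told apart by connection_type, and each has its own set of required fields. As a rough illustration, the sketch below lists the fields marked required on this page and reports which ones a given config dict lacks. The function name missing_config_fields is hypothetical, and the check covers field presence only, not value types.

```python
# Required fields per connection_type, as marked in the definitions on this page
# (connection_type itself is required in all three).
REQUIRED_BY_CONNECTION_TYPE = {
    "custom_connection": ("url",),
    "provider_connection": ("provider_id", "model_key"),
    "custom_inference": ("run_inference_snippet", "environment", "timeout"),
}


def missing_config_fields(config: dict) -> list:
    """Return the required fields absent from `config` for its connection_type."""
    connection_type = config.get("connection_type")
    if connection_type not in REQUIRED_BY_CONNECTION_TYPE:
        raise ValueError(f"unknown connection_type: {connection_type!r}")
    required = REQUIRED_BY_CONNECTION_TYPE[connection_type]
    return [name for name in required if name not in config]
```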

...
config:
  connection_type: "custom_inference"
  adapter:
    key: "latticeflow$openai_chat_completion"
  run_inference_snippet: !include "./run_inference.py"
  environment:
    MODEL_ENDPOINT_URL: "https://api.openai.com/v1/chat/completions"
    MODEL_ENDPOINT_API_KEY: $OPENAI_API_KEY
    MODEL_KEY: "gpt-4.1-nano"
  timeout: 15
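# run_inference.py
from __future__ import annotations

import json
from typing import Any

import httpx


def run_inference(body: str, environment: dict[str, Any]) -> str:
    """Forward the JSON request body to the model endpoint and return the raw response text."""
    # Inject the model name from the environment into the request body.
    body_dict = json.loads(body)
    body_dict["model"] = environment["MODEL_KEY"]

    response = httpx.post(
        environment["MODEL_ENDPOINT_URL"],
        headers={
            "Authorization": f"Bearer {environment['MODEL_ENDPOINT_API_KEY']}",
            "Content-Type": "application/json",
        },
        content=json.dumps(body_dict).encode(),
        timeout=10.0,  # httpx timeout in seconds, below the snippet timeout of 15
        verify=True,  # verify the server's TLS certificate
    )
    # Raise httpx.HTTPStatusError on 4xx/5xx responses.
    response.raise_for_status()
    return response.text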
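When developing a snippet like run_inference, it can help to separate request preparation from the network call so the former can be checked without contacting the endpoint. The following sketch mirrors the body and header handling of the example above using only the standard library. The helper build_request is hypothetical and is not required by AI GO!; the environment keys are the ones used in the example config.

```python
import json
from typing import Any


def build_request(body: str, environment: dict[str, Any]) -> tuple:
    """Prepare (url, headers, content) for the POST without sending it."""
    body_dict = json.loads(body)
    # Same model injection as in run_inference.
    body_dict["model"] = environment["MODEL_KEY"]
    headers = {
        "Authorization": f"Bearer {environment['MODEL_ENDPOINT_API_KEY']}",
        "Content-Type": "application/json",
    }
    content = json.dumps(body_dict).encode()
    return environment["MODEL_ENDPOINT_URL"], headers, content
```

The returned values can be passed to httpx.post in the same way as in run_inference.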

ModelProviderId

SDKTLSContext

Properties


validation_context SDKCertificateValidationContext

Settings for validating server certificates.

Default: None
...
config:
  ...
  tls_context:
    validation_context:
      trusted_ca: $SSL_CERTIFICATE
      trust_chain_verification: "verify_trust_chain"
...
config:
  ...
  tls_context:
    validation_context:
      trust_chain_verification: "accept_untrusted"

SDKCertificateValidationContext

Properties


trusted_ca SecretTemplate, string

Base64 representation of PEM-encoded certificate(s). Can reference an existing secret.

Default: None
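Since trusted_ca expects the base64 form of the PEM text rather than the PEM text itself, a small helper can produce the value to store in a secret. This is a sketch under the assumption that standard base64 (with padding, no line breaks) over the full PEM text, including the BEGIN/END lines, is what is expected; the function name is hypothetical.

```python
import base64


def encode_pem_for_trusted_ca(pem_text: str) -> str:
    """Base64-encode PEM text (one or more certificates) as a single line."""
    return base64.b64encode(pem_text.encode("ascii")).decode("ascii")
```

For a certificate stored on disk, the PEM text could be read with pathlib.Path(...).read_text() and passed to this function.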

trust_chain_verification enum TrustChainVerification

Settings for verifying the trust chain of the server certificate.

Default: None

Possible TrustChainVerification values

How the server certificate's trust chain is verified.

  • verify_trust_chain (default): verifies the server certificate against the configured CA trust.
  • accept_untrusted: performs no server certificate verification. NOTE: This is a security hazard and should be avoided.

Allowed Values:

  • verify_trust_chain
  • accept_untrusted
...
config:
  ...
  tls_context:
    validation_context:
      trusted_ca: $SSL_CERTIFICATE
      trust_chain_verification: "verify_trust_chain"
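For readers who think in terms of Python's ssl module, the two trust_chain_verification values roughly correspond to the contexts built below. This only illustrates the semantics and is not how AI GO! implements them; the function make_ssl_context is hypothetical, and whether a configured trusted_ca replaces or extends the system CA store is not stated on this page (the sketch extends it).

```python
import base64
import ssl
from typing import Optional


def make_ssl_context(trust_chain_verification: str = "verify_trust_chain",
                     trusted_ca_b64: Optional[str] = None) -> ssl.SSLContext:
    """Build an SSLContext that mirrors the documented verification modes."""
    context = ssl.create_default_context()  # verifies certificates by default
    if trust_chain_verification == "accept_untrusted":
        # No server certificate verification: a security hazard.
        context.check_hostname = False  # must be disabled before CERT_NONE
        context.verify_mode = ssl.CERT_NONE
    elif trust_chain_verification == "verify_trust_chain":
        if trusted_ca_b64:
            # trusted_ca holds base64 of PEM text; decode back to PEM.
            pem_text = base64.b64decode(trusted_ca_b64).decode("ascii")
            context.load_verify_locations(cadata=pem_text)
    else:
        raise ValueError(f"unknown value: {trust_chain_verification!r}")
    return context
```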