Models
A model corresponds to an inference endpoint served either by an external model provider (e.g. OpenAI or Anthropic) or by an inference engine deployed in your own infrastructure.
Model Overview
Properties
secrets object
Secrets which can be used to reference secret values in designated places.
Default: None

key Key required
Reference to an existing entity in AI GO!.
Pattern: ^[a-zA-Z0-9_\-\$]+$
Max Length: 250

display_name string required
The model's name displayed to the user.

description string
Short description of the model.
Default: None

rate_limit integer
The maximum allowed number of requests per minute.
Default: None

max_concurrent_requests integer
The maximum allowed number of concurrent requests.
Default: None

task enum MLTask
The ML task of the model.
Default: chat_completion
Possible MLTask values
The type of machine learning task to be performed.
Allowed Values:
chat_completion
embeddings
custom

config SDKModelCustomConnectionConfig, SDKCustomInferenceModelConfig, ModelProviderConnectionConfig required
Model configuration.
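
The constraints listed in the properties above can be checked before a model definition is submitted. The following is a minimal sketch, not part of AI GO!: the function name validate_model and the error messages are hypothetical, and it assumes that the maximum key length of 250 is inclusive and that the definition has already been loaded into a Python dict.

```python
from __future__ import annotations

import re
from typing import Any

# Constraints copied from the property reference above.
KEY_PATTERN = re.compile(r"^[a-zA-Z0-9_\-\$]+$")
MAX_KEY_LENGTH = 250  # assumption: the limit is inclusive
ML_TASKS = {"chat_completion", "embeddings", "custom"}
REQUIRED_FIELDS = ("key", "display_name", "config")


def validate_model(model: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no problem was found."""
    errors: list[str] = []

    for field in REQUIRED_FIELDS:
        if field not in model:
            errors.append(f"missing required field: {field}")

    key = model.get("key")
    if isinstance(key, str):
        if len(key) > MAX_KEY_LENGTH:
            errors.append("key is longer than 250 characters")
        # fullmatch rejects a trailing newline that "$" alone would tolerate.
        if KEY_PATTERN.fullmatch(key) is None:
            errors.append("key contains characters outside [a-zA-Z0-9_-$]")

    # task defaults to chat_completion when it is omitted.
    task = model.get("task", "chat_completion")
    if task not in ML_TASKS:
        errors.append(f"unknown task: {task!r}")

    # Both limits are optional integers.
    for field in ("rate_limit", "max_concurrent_requests"):
        value = model.get(field)
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            errors.append(f"{field} must be an integer")

    return errors
```

The sketch only covers the top-level properties; the config object has its own required fields, which depend on connection_type.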

display_name: "OpenAI GPT-4.1 Nano"
key: "openai-gpt-4-1-nano"
description: >
  Fastest, most cost-efficient version of GPT-4.1. GPT-4.1 nano excels at instruction
  following and tool calling.
rate_limit: 60
task: "chat_completion"
config:
  adapter:
    key: "openai-chat-completion"
  connection_type: "custom_connection"
  url: "https://api.openai.com/v1/chat/completions"
  api_key: $OPENAI_API_KEY
  model_key: "gpt-4.1-nano"

display_name: "OpenAI GPT-4.1 Nano (Custom Inference)"
key: "gpt-4-1-nano-custom-inference"
description: "OpenAI's GPT-4.1 Nano defined as a model with custom inference."
rate_limit: 60
task: "chat_completion"
config:
  connection_type: "custom_inference"
  adapter:
    key: "latticeflow$openai_chat_completion"
  run_inference_snippet: !include "./run_inference.py"
  environment:
    MODEL_ENDPOINT_URL: "https://api.openai.com/v1/chat/completions"
    MODEL_ENDPOINT_API_KEY: $OPENAI_API_KEY
    MODEL_KEY: "gpt-4.1-nano"
  timeout: 15

from __future__ import annotations

import json
from typing import Any

import httpx


def run_inference(body: str, environment: dict[str, Any]) -> str:
    # Parse the request body and set the model name from the environment.
    body_dict = json.loads(body)
    body_dict["model"] = environment["MODEL_KEY"]
    # Forward the request to the endpoint, authenticating with a bearer token.
    response = httpx.post(
        environment["MODEL_ENDPOINT_URL"],
        headers={
            "Authorization": f"Bearer {environment['MODEL_ENDPOINT_API_KEY']}",
            "Content-Type": "application/json",
        },
        content=json.dumps(body_dict).encode(),
        timeout=10.0,
        verify=True,
    )
    # Fail on HTTP error statuses, otherwise return the raw response body.
    response.raise_for_status()
    return response.text

Definitions
ReferencedKey
Properties
key Key required
Reference to an existing entity in AI GO!.
Pattern: ^[a-zA-Z0-9_\-\$]+$
Max Length: 250

...
config:
  adapter:
    key: "openai-chat-completion"

Use the CLI command lf model-adapters to list all available model adapters.
SDKModelCustomConnectionConfig
Properties
connection_type Literal "custom_connection" required
The type of connection config.

adapter ReferencedKey
The model adapter responsible for converting the endpoint inputs and outputs into a standardized format.
Default: {'key': 'latticeflow$identity_chat_completion'}

url string required
The model endpoint URL.

api_key SecretTemplate, string
The key to be passed as the authorization header (Authorization: Bearer API_KEY). Can reference an existing secret.
Default: None

model_key string
Used when the model is specified in the request body rather than in the URL. For the "openai" adapter, this is passed as the "model" parameter. For custom adapters, this value is available as model_info.model_key.
Default: None

tls_context SDKTLSContext
TLS configuration for secure connections to the model endpoint.
Default: None

custom_headers object
Additional headers to include in requests to the model endpoint. Can reference existing secrets.
Default: None

...
config:
  adapter:
    key: "openai-chat-completion"
  connection_type: "custom_connection"
  url: "https://api.openai.com/v1/chat/completions"
  api_key: $OPENAI_API_KEY
  model_key: "gpt-4.1-nano"

...
config:
  adapter:
    key: "latticeflow$openai_chat_completion"
  connection_type: "custom_connection"
  url: "https://api.example.ai/v1/"
  api_key: ""
  custom_headers:
    X-API-Key: $X_API_KEY
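
To illustrate how the custom_connection fields described above could map onto an HTTP request, here is a minimal sketch. It is not the platform's implementation: the function build_request is hypothetical, secrets are assumed to be already resolved to plain strings, the Content-Type header is an assumption, and skipping the Authorization header when api_key is empty is a guess based on the second example, which sets api_key to "" and authenticates through custom_headers instead.

```python
from __future__ import annotations

from typing import Any


def build_request(
    config: dict[str, Any], body: dict[str, Any]
) -> tuple[str, dict[str, str], dict[str, Any]]:
    """Return (url, headers, payload) for one call to the model endpoint."""
    headers: dict[str, str] = {"Content-Type": "application/json"}  # assumption

    # api_key is sent as "Authorization: Bearer API_KEY".
    api_key = config.get("api_key")
    if api_key:  # assumption: an empty or missing key means no Authorization header
        headers["Authorization"] = f"Bearer {api_key}"

    # custom_headers are added on top of the defaults.
    headers.update(config.get("custom_headers") or {})

    # model_key goes into the body as the "model" parameter (as for the "openai" adapter).
    payload = dict(body)
    if config.get("model_key") is not None:
        payload["model"] = config["model_key"]

    return config["url"], headers, payload
```

For the first example above, a call such as build_request(config, {"messages": [...]}) would yield a payload with "model" set to "gpt-4.1-nano" and a bearer token in the headers; for the second, only the X-API-Key header carries the credential.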

ModelProviderConnectionConfig
Connection configuration for a model that is retrieved from a well-known provider integrated with the system.
Properties
connection_type Literal "provider_connection" required
The type of connection config.
provider_id ModelProviderId required
The ID of the model provider.
model_key string required
A key used to identify the model in the external provider.

SDKCustomInferenceModelConfig
Properties
connection_type Literal "custom_inference" required
The type of connection config.

adapter ReferencedKey
The model adapter responsible for converting the inputs and outputs into a standardized format.
Default: {'key': 'latticeflow$identity_chat_completion'}

run_inference_snippet string required
The code snippet to make a call to the model.

environment object required
Environment variables required to run the model client snippet. Can reference existing secrets.

timeout number required
Timeout in seconds for the total runtime of the Python snippet.

...
config:
  connection_type: "custom_inference"
  adapter:
    key: "latticeflow$openai_chat_completion"
  run_inference_snippet: !include "./run_inference.py"
  environment:
    MODEL_ENDPOINT_URL: "https://api.openai.com/v1/chat/completions"
    MODEL_ENDPOINT_API_KEY: $OPENAI_API_KEY
    MODEL_KEY: "gpt-4.1-nano"
  timeout: 15

from __future__ import annotations

import json
from typing import Any

import httpx


def run_inference(body: str, environment: dict[str, Any]) -> str:
    # Parse the request body and set the model name from the environment.
    body_dict = json.loads(body)
    body_dict["model"] = environment["MODEL_KEY"]
    # Forward the request to the endpoint, authenticating with a bearer token.
    response = httpx.post(
        environment["MODEL_ENDPOINT_URL"],
        headers={
            "Authorization": f"Bearer {environment['MODEL_ENDPOINT_API_KEY']}",
            "Content-Type": "application/json",
        },
        content=json.dumps(body_dict).encode(),
        timeout=10.0,
        verify=True,
    )
    # Fail on HTTP error statuses, otherwise return the raw response body.
    response.raise_for_status()
    return response.text
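
Before wiring a snippet into a model definition, it can be useful to exercise the run_inference(body, environment) contract locally. The sketch below is an assumption-laden helper, not something provided by AI GO!: call_with_timeout and fake_run_inference are hypothetical names, and the thread-based timeout only approximates the timeout property (how the platform enforces it is not described here). It also assumes the snippet returns a JSON string, as the example above does.

```python
from __future__ import annotations

import json
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Any, Callable


def call_with_timeout(
    run_inference: Callable[[str, dict[str, Any]], str],
    body: dict[str, Any],
    environment: dict[str, Any],
    timeout: float,
) -> dict[str, Any]:
    """Call a run_inference-style function and parse its JSON reply."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        # The snippet receives the body as a JSON string, not as a dict.
        future = pool.submit(run_inference, json.dumps(body), environment)
        try:
            raw = future.result(timeout=timeout)
        except FutureTimeout:
            raise TimeoutError(f"snippet exceeded {timeout} seconds") from None
    finally:
        # Do not block on a snippet that is still running after the timeout.
        pool.shutdown(wait=False)
    return json.loads(raw)


def fake_run_inference(body: str, environment: dict[str, Any]) -> str:
    """Stand-in for a real snippet: echoes the body with the model filled in."""
    body_dict = json.loads(body)
    body_dict["model"] = environment["MODEL_KEY"]
    return json.dumps(body_dict)
```

Swapping fake_run_inference for the real function from run_inference.py, with real values in the environment dict, turns this into a quick smoke test against the actual endpoint.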

ModelProviderId

SDKTLSContext
Properties
validation_context SDKCertificateValidationContext
Settings for validating server certificates.
Default: None

...
config:
  ...
  tls_context:
    validation_context:
      trusted_ca: $SSL_CERTIFICATE
      trust_chain_verification: "verify_trust_chain"

...
config:
  ...
  tls_context:
    validation_context:
      trust_chain_verification: "accept_untrusted"

SDKCertificateValidationContext
Properties
trusted_ca SecretTemplate, string
Base64 representation of PEM-encoded certificate(s). Can reference an existing secret.
Default: None

trust_chain_verification enum TrustChainVerification
Settings for verifying the trust chain of the server certificate.
Default: None
Possible TrustChainVerification values
How to verify the server certificate's trust chain.
verify_trust_chain (default) verifies the server certificate against the configured trusted CA.
accept_untrusted does not perform server certificate verification. NOTE: This is a security hazard and should be avoided.
Allowed Values:
verify_trust_chain
accept_untrusted
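
As an illustration of the two settings above, the sketch below shows one way to produce a base64 value for trusted_ca from PEM text and how the two trust_chain_verification values could translate into a standard-library ssl context. This is an assumption about the semantics, not the platform's implementation: encode_trusted_ca, decode_trusted_ca and build_ssl_context are hypothetical helpers, and the exact base64 form expected by trusted_ca (for example with or without line wrapping) should be checked against your deployment.

```python
from __future__ import annotations

import base64
import ssl


def encode_trusted_ca(pem_text: str) -> str:
    """Return the base64 representation of PEM-encoded certificate text."""
    return base64.b64encode(pem_text.encode("ascii")).decode("ascii")


def decode_trusted_ca(value: str) -> str:
    """Inverse of encode_trusted_ca; rejects input that is not valid base64."""
    return base64.b64decode(value, validate=True).decode("ascii")


def build_ssl_context(
    trust_chain_verification: str = "verify_trust_chain",
    trusted_ca: str | None = None,
) -> ssl.SSLContext:
    """Map the validation settings onto an ssl.SSLContext (illustrative only)."""
    if trust_chain_verification == "accept_untrusted":
        # No server certificate verification: a security hazard, avoid if possible.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    if trust_chain_verification == "verify_trust_chain":
        # Verify against the configured CA, or the system defaults if none is given.
        cadata = decode_trusted_ca(trusted_ca) if trusted_ca else None
        return ssl.create_default_context(cadata=cadata)
    raise ValueError(f"unknown trust_chain_verification: {trust_chain_verification!r}")
```

The output of encode_trusted_ca is what would typically be stored in the secret referenced by trusted_ca, such as $SSL_CERTIFICATE in the examples.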

...
config:
  ...
  tls_context:
    validation_context:
      trusted_ca: $SSL_CERTIFICATE
      trust_chain_verification: "verify_trust_chain"