Testing Models

The lf test model command verifies a registered model end to end in seconds — without configuring a dataset, defining tasks, or launching a full evaluation. It walks through the model's complete request/response pipeline one step at a time and prints the result at each stage, making problems easy to spot before they show up mid-evaluation.

Run it immediately after registering a new model to confirm:

  • The model endpoint is reachable and authentication is working.
  • The input adapter correctly translates AI GO!'s model I/O format into the payload your endpoint expects.
  • The output adapter correctly maps the raw endpoint response back into AI GO!'s model I/O format.

The same functionality is available in the AI GO! UI — navigate to your model and click Test Model.

What Is Tested

The command runs four phases in sequence and stops at the first failure:

Phase | Description
1. Connection check | Probes the model endpoint and verifies it returns a successful response. Skipped for custom inference models, which have no fixed HTTP endpoint.
2. Input adapter | Applies the adapter's Jinja input template to transform a sample AI GO! model input into the raw payload the endpoint expects.
3. Inference | Sends the transformed payload to the endpoint and records the HTTP status code and headers.
4. Output adapter | Applies the adapter's Jinja output template to map the raw response into AI GO!'s canonical model I/O format.
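
The stop-at-first-failure behaviour can be pictured with a short Python sketch. This is not AI GO!'s implementation: run_phases, the phase callables, and the simulated 401 error are all hypothetical and only illustrate the control flow described above.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical illustration only: each phase is a (name, callable) pair whose
# callable raises an exception on failure.
Phase = Tuple[str, Callable[[], None]]


def run_phases(phases: List[Phase]) -> Optional[str]:
    """Run phases in order; return the name of the first failing phase, or None."""
    for number, (name, check) in enumerate(phases, start=1):
        print(f"{number}. {name}")
        try:
            check()
        except Exception as exc:
            print(f"Failed at phase {number}: {exc}")
            return name  # later phases are never run
    return None


def failing_inference() -> None:
    # Simulated failure, standing in for a rejected request.
    raise RuntimeError("status code was '401'")


executed = []
phases = [
    ("Connection check", lambda: executed.append("connection")),
    ("Input adapter", lambda: executed.append("input")),
    ("Inference", failing_inference),
    ("Output adapter", lambda: executed.append("output")),
]
failed_phase = run_phases(phases)
```

Because the simulated inference phase raises, the output adapter phase is never reached, which matches how the command reports only the first failure.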

Usage

lf test model <key>
lf test model <key> --model-input ./input.json

For chat_completion models, the command uses a default input of {"messages": [{"role": "user", "content": "Hello!"}]} when no --model-input file is given. For other ML tasks, use --model-input to pass a JSON file in AI GO!'s model I/O format.
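
As a minimal sketch, the Python snippet below writes a file that can be passed to --model-input. The payload reuses the chat_completion shape shown above with a different prompt, purely as an illustration; the schema for other ML tasks is not covered here, and input.json is simply the file name from the usage line.

```python
import json
from pathlib import Path

# Illustrative payload: the chat_completion default shape with a custom prompt.
model_input = {
    "messages": [{"role": "user", "content": "Summarize HTTP in one sentence."}]
}

input_path = Path("input.json")
input_path.write_text(json.dumps(model_input, indent=2), encoding="utf-8")

# Reading the file back confirms it is valid JSON before handing it to the CLI.
loaded = json.loads(input_path.read_text(encoding="utf-8"))
```

The file can then be used with lf test model <key> --model-input ./input.json.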

Custom Connection Model

The following example uses a model registered with the model-openai-chat-completion guide — an OpenAI-compatible chat_completion model with a custom Jinja adapter.

lf test model openai-gpt-4-1-nano

Phase 1: Connection Check

1. Checking connection to model.
- Key: openai-gpt-4-1-nano
- URL: https://api.openai.com/v1/chat/completions
- API key: ***
- Model key: gpt-4.1-nano
Successfully connected to model with key 'openai-gpt-4-1-nano'.

AI GO! prints the connection details — endpoint URL, masked API key, and model key — and confirms the endpoint is reachable.

Phase 2: Input Adapter

2. Transforming model input.
Model input in LatticeFlow AI format:
{"messages": [{"role": "user", "content": "Hello!"}]}
--------------------------------------------------------------
Jinja input transform (from adapter 'OpenAI Chat Completion'):
{
    "model": "{{ model_info.model_key }}",
    "messages": [
        {% for message in input.messages %}
        {
            "role": "{{ message.role }}",
            "content": {{ message.content | tojson }}
        }{% if not loop.last %},{% endif %}
        {% endfor %}
    ]
    ...
}
--------------------------------------------------------------
Model input in the format expected by the model (used for inference):
{
    "model": "gpt-4.1-nano",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
}

The output shows three things in sequence: the AI GO! model input, the Jinja template from the adapter, and the resulting JSON payload that will be sent to the endpoint. If the input adapter is misconfigured, you can see exactly where the transformation diverges from what you expect.
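
One way to check the rendered payload is to build the expected structure in plain Python and compare it with the JSON printed in phase 2. The sketch below mirrors only the model and messages fields visible in the template above; the helper name expected_payload is hypothetical, and any fields hidden behind the ... in the template are not reproduced.

```python
import json


def expected_payload(model_key: str, model_input: dict) -> dict:
    """Mirror the visible part of the input template: model key plus role/content pairs."""
    return {
        "model": model_key,
        "messages": [
            {"role": m["role"], "content": m["content"]}
            for m in model_input["messages"]
        ],
    }


# Payload copied from the phase 2 output above.
rendered = """
{
    "model": "gpt-4.1-nano",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        }
    ]
}
"""

model_input = {"messages": [{"role": "user", "content": "Hello!"}]}
# Comparing parsed structures ignores whitespace differences in the rendering.
matches = json.loads(rendered) == expected_payload("gpt-4.1-nano", model_input)
```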

Phase 3: Inference

3. Running inference.
Request headers:
  Authorization: Bearer ***
  Content-Type: application/json
--------------------------------------------------------------
Status code: 200
Response headers:
  content-type: application/json
  x-ratelimit-remaining-requests: 29999
  x-ratelimit-remaining-tokens: 149999995
  ...

The request headers confirm which authentication scheme is in use (here, Authorization: Bearer *** indicates bearer authentication). The HTTP status code and the response headers are printed as well, so that problems such as an unexpected status or an exhausted rate limit are easy to identify.

Phase 4: Output Adapter

4. Transforming model output.
Model output in the format returned by the model:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 9,
    ...
  }
}
--------------------------------------------------------------
Jinja output transform (from adapter 'OpenAI Chat Completion'):
{% if status_code != 200 %}
    {"error": "A non-200 status code ({{ status_code }}) was returned by the model."}
{% else %}
    ...
{% endif %}
--------------------------------------------------------------
Model output in the format expected by LatticeFlow AI:
output='{"choices":[{"message":{"role":"assistant","content":"Hello! How can I assist you today?"}}],"usage":{"num_completion_tokens":9,"num_prompt_tokens":9}}'
==============================================================
Successfully tested configuration of model with key 'openai-gpt-4-1-nano'.

The output adapter section mirrors the input adapter: raw endpoint response, the Jinja template, and the final model I/O-formatted result that AI GO! consumes in evaluations. Confirm that the choices and usage fields match the expected schema.
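
The mapping visible in this example can be written out in plain Python for comparison. The body of the output template is elided above, so the function below is not the adapter's logic. It is a hypothetical sketch (to_model_io is an invented name) that reproduces only what the sample input and output show: role and content are kept, finish_reason is dropped, and the token counts are renamed.

```python
import json


def to_model_io(status_code: int, raw: dict) -> dict:
    """Reproduce the raw-to-canonical mapping observed in the example output."""
    if status_code != 200:
        return {
            "error": f"A non-200 status code ({status_code}) was returned by the model."
        }
    return {
        "choices": [
            {"message": {"role": c["message"]["role"], "content": c["message"]["content"]}}
            for c in raw["choices"]
        ],
        "usage": {
            "num_completion_tokens": raw["usage"]["completion_tokens"],
            "num_prompt_tokens": raw["usage"]["prompt_tokens"],
        },
    }


# Raw response copied from the phase 4 output above (elided fields omitted).
raw_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Hello! How can I assist you today?",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 9},
}

result = to_model_io(200, raw_response)
compact = json.dumps(result, separators=(",", ":"))
error_result = to_model_io(500, raw_response)
```

Serialized compactly, the result matches the string shown after output= in the phase 4 log.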

Custom Inference Models

For custom inference models the connection check is skipped — there is no fixed HTTP endpoint to probe since inference is handled by a Python snippet. The remaining three phases run identically to a custom connection model.

The following example uses a model registered with the model-custom-inference guide:

lf test model gpt-4-1-nano-custom-inference
1. Checking connection to model.
Skipping connection check for custom inference model.
==============================================================
2. Transforming model input.
Model input in LatticeFlow AI format:
{"messages": [{"role": "user", "content": "Hello!"}]}
...
==============================================================
3. Running inference.
--------------------------------------------------------------
Status code: 200
==============================================================
4. Transforming model output.
...
==============================================================
Successfully tested configuration of model with key 'gpt-4-1-nano-custom-inference'.

Note that phase 3 no longer shows request headers — the Python inference snippet manages the HTTP call directly, so AI GO! only surfaces the status code returned by the snippet.

Diagnosing Failures

The command stops at the first failing phase and prints an error with as much context as possible. Three common failure patterns are shown below.

Connection Failure

When the endpoint is unreachable or the API key is wrong, the command fails at phase 1. The error includes the HTTP status code and the full response body from the server:

lf test model openai-gpt-4-1-nano
1. Checking connection to model.
- Key: openai-gpt-4-1-nano
- URL: https://api.openai.com/v1/chat/completions
- API key: sk-invalid-key-for-testing
- Model key: gpt-4.1-nano

CLITestConfigurationError: Failed to test configuration of model with key 'openai-gpt-4-1-nano'.

CLIError: Connection to model with key 'openai-gpt-4-1-nano' was not successful.
Returned message: Failed to get a successful response from the model endpoint, status code was '401'. Response: {
  "error": {
    "message": "Incorrect API key provided: sk-inval**************ting.",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

The output includes the endpoint URL and the API key as configured, so you can immediately see whether the wrong secret was loaded (e.g. a missing environment variable).

Faulty Input Adapter

If the input adapter's Jinja template produces an invalid payload (for example, by omitting the | tojson filter so that a string value is rendered without quotes), the request is rejected as invalid JSON and the command fails at phase 3. The phase 2 output shows the rendered payload before it was sent, making the bug visible:

2. Transforming model input.
Model input in LatticeFlow AI format:
{"messages": [{"role": "user", "content": "Hello!"}]}
--------------------------------------------------------------
Jinja input transform (from adapter 'OpenAI Chat Completion'):
{
    "model": "{{ model_info.model_key }}",
    "messages": [
        {% for message in input.messages %}
        {
            "role": "{{ message.role }}",
            "content": {{ message.content }}
        }{% if not loop.last %},{% endif %}
        {% endfor %}
    ]
}
--------------------------------------------------------------
Model input in the format expected by the model (used for inference):
{
    "model": "gpt-4.1-nano",
    "messages": [
        {
            "role": "user",
            "content": Hello!
        }
    ]
}
==============================================================
3. Running inference.
CLITestConfigurationError: Failed to test configuration of model with key 'openai-gpt-4-1-nano'.

API error: 400 Bad Request: Model input is not a valid JSON for content type 'application/json'.
JSON decode error: Expecting value: line 7 column 24 (char 119)

The rendered payload ("content": Hello! instead of "content": "Hello!") pinpoints the bug: the | tojson filter is missing on the message.content variable in the input template.
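
The effect of the missing filter can be reproduced without Jinja. In the sketch below, json.dumps stands in for | tojson (an approximation: Jinja's filter additionally escapes some HTML-sensitive characters), and plain string interpolation stands in for the unfiltered {{ message.content }}.

```python
import json

content = "Hello!"

# Without the filter: the string is pasted in verbatim, with no quotes.
broken = '{"role": "user", "content": %s}' % content
# With JSON encoding (standing in for `| tojson`): quotes and escapes are added.
fixed = '{"role": "user", "content": %s}' % json.dumps(content)

try:
    json.loads(broken)
    broken_is_valid = True
except json.JSONDecodeError as exc:
    broken_is_valid = False
    print(f"JSON decode error: {exc}")

parsed = json.loads(fixed)
```

Wrapping the variable in literal quotes instead would work for Hello! but would break as soon as the content contains a double quote or a newline, which is why JSON-encoding the value is the safer choice.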

Exception Raised in Custom Inference Snippet

For custom inference models, Python exceptions inside the run_inference function are surfaced with a traceback. In the example below the snippet tries to subscript the response string as a dictionary:

3. Running inference.
CLITestConfigurationError: Failed to test configuration of model with key 'gpt-4-1-nano-custom-inference'.

API error: 400 Bad Request: Model prediction failed: PythonCodeExecutionException: Code execution failed.

Traceback:
  line 25 (in function 'run_inference')
    return response.text["choices"]
  TypeError: string indices must be integers, not 'str'

The traceback points directly at the offending line in the snippet. The fix here is to parse the response body as JSON first: return json.loads(response.text)["choices"]. More commonly, you would return response.text and let the output adapter handle the parsing.
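
The failure and the fix can be reproduced in isolation. The sketch below uses a stand-in object with a text attribute in place of a real HTTP response; the helper names are hypothetical, and the actual signature of run_inference is not shown on this page.

```python
import json
from types import SimpleNamespace

# Stand-in for an HTTP response whose body is a JSON string.
response = SimpleNamespace(
    text='{"choices": [{"message": {"role": "assistant", "content": "Hi"}}]}'
)


def choices_buggy(resp):
    # Indexes a str with a str key, which raises TypeError.
    return resp.text["choices"]


def choices_fixed(resp):
    # Parse the body into a dict first, then index it.
    return json.loads(resp.text)["choices"]


try:
    choices_buggy(response)
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

choices = choices_fixed(response)
```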