Core Types

This page documents the public data types exported from latticeflow.core.dtypes. These are the shared types used across the AI GO! SDK — messages, traces, model inputs and outputs, solver outputs, scoring structures, and the enums and type aliases that tie them together. Every type below can be imported directly from the package:

from latticeflow.core.dtypes import Trace, SampleScore, SolverTrace, ModelResponse

Each section lists a type's fields and, where applicable, its public properties and methods. Enums list their allowed values, and the Type Aliases section maps each alias to its members.

Models

ActionRecord

A record of an action taken for a sample.

Properties


action ActionRuleAction required


rule_key string required

The key of the action rule that created the action record.

Pattern: ^[a-zA-Z0-9_\-]+$

Max Length: 250

AssistantMessage

A message with role assistant.

Properties


type Literal "message"

The type of the message. Always set to message.

Default: message


id string required

The unique ID of the message.


status MessageStatus required


role Literal "assistant"

Role

Default: assistant


content array[InputTextContent, OutputTextContent, TextContent, SummaryTextContent, ReasoningTextContent, RefusalContent, InputImageContent, InputFileContent, InputVideoContent] required

The content of the message

BaseTraceEvent

Base class for all trace events, providing common metadata fields.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None

ChatCompletionInput

Properties


messages array[ChatCompletionInputMessage, ChatCompletionOutputMessage] required

Messages


response_format ChatCompletionResponseFormatJSONSchema, ChatCompletionResponseFormatText

Response Format

Default: None

ChatCompletionInputMessage

Properties


role string required

Role


content string, array[FileContentItem] required

Content

ChatCompletionJSONSchema

Properties


name string required

The name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.


description string

A description of what the response format is for, used by the model to determine how to respond in the format.

Default: None


schema object

The schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.

Default: None


strict boolean required

Whether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.

ChatCompletionJudgeInput

Properties


sample object required

Sample


solver_output SingleSolverOutput, GroupedSolverOutput, SolverTrace, GroupedSolverTrace required

Solver Output


messages array[ChatCompletionInputMessage, Any, array[string], ChatCompletionOutputMessage, array[array[number]]] required

Messages


model_output OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any required

Model Output


input_prompt string required

Input Prompt

ChatCompletionModelOutput

Properties


choices array[ChatCompletionModelOutputChoice] required

Choices


usage ModelUsage

Default: None

ChatCompletionModelOutputChoice

Properties


message ChatCompletionOutputMessage required

ChatCompletionOutputMessage

Properties


role string required

Role


content string required

Content


refusal string

Refusal

Default: None

ChatCompletionResponseFormatJSONSchema

Properties


type Literal "json_schema" required

The type of response format being defined. Always json_schema.


json_schema ChatCompletionJSONSchema required

Structured Outputs configuration options, including a JSON Schema.

ChatCompletionResponseFormatText

Properties


type Literal "text" required

The type of response format being defined. Always text.

CompactionEvent

Records a compaction boundary where the conversation context was shortened.

After this event the next ModelCallEvent.input_context will reflect the
compacted context rather than the full history.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "compaction"

Type

Default: compaction


strategy string required

Strategy


tokens_before integer

Tokens Before

Default: None


tokens_after integer

Tokens After

Default: None

CustomEvent

A fallback event type for arbitrary structured data.

Serves as an escape hatch for event types not yet modelled (e.g.
SandboxEvent, ApprovalEvent from inspect-ai), custom solver
instrumentation, and forward compatibility with new external event types.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "custom"

Type

Default: custom


name string required

Name


data object required

Data

CustomTaskInputMessage

A custom task input item carrying an opaque user-defined payload.

Properties


type Literal "custom_task_input_message"

Type

Default: custom_task_input_message


content Any required

The opaque user-defined payload.

CustomTaskOutputMessage

A custom task output item carrying an opaque user-defined payload.

Properties


type Literal "custom_task_output_message"

Type

Default: custom_task_output_message


content Any required

The opaque user-defined payload.

DatasetProgressState

Properties


num_total_samples integer required

Num Total Samples


num_samples_generated integer required

Num Samples Generated

DirectModelIO

Raw model endpoint I/O before/after adapter conversion.

Properties


direct_input ModelEndpointInput required

The raw request sent to the model endpoint.


direct_output ModelEndpointOutput required

The raw response received from the model endpoint.

EmbeddingsModelOutput

Properties


embeddings array[array[number]] required

Embeddings


usage ModelUsage

Default: None

ErrorEvent

Records an error that occurred during execution.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "error"

Type

Default: error


message string required

Message


traceback string

Traceback

Default: None

FieldMetadata

Metadata for a single field in a tabular evidence table.

Properties


display_name string

Display name for the field.

Default: None


description string

Description of the field semantics.

Default: None


primary boolean

Whether the field is directly relevant to the understanding of the main correctness result.

Default: True

FileContentItem

A file content item within an input message.

Properties


type Literal "file" required

Type


file FileRef required

FileRef

A reference to a file by its identifier.

Properties


file_id string required

The identifier of the file.

FunctionCall

A function tool call that was generated by the model.

Properties


type Literal "function_call"

The type of the item. Always function_call.

Default: function_call


id string required

The unique ID of the function call item.


call_id string required

The unique ID of the function tool call that was generated.


name string required

The name of the function that was called.


arguments string required

The arguments JSON string that was generated.


status FunctionCallStatus required

FunctionCallEvent

Records a tool call lifecycle as a single event.

Absorbs what was previously two items (FunctionCall + FunctionCallOutput)
into one event with execution metadata.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "function_call_event"

Type

Default: function_call_event


call_id string required

Call Id


function string required

Function


arguments string required

Arguments


result string, array[InputTextContent, InputImageContent, InputFileContent] required

Result


status FunctionCallStatus required


working_time number

Working Time

Default: None


error string

Error

Default: None


agent string

Agent

Default: None


agent_span_id string

Agent Span Id

Default: None


model_call_id string

Model Call Id

Default: None

FunctionCallOutput

A function tool call output that was returned by the tool.

Properties


type Literal "function_call_output"

The type of the function tool call output. Always function_call_output.

Default: function_call_output


id string required

The unique ID of the function tool call output. Populated when this item is returned via API.


call_id string required

The unique ID of the function tool call generated by the model.


output string, array[InputTextContent, InputImageContent, InputFileContent] required

Output


status FunctionCallOutputStatusEnum required

GroupedSolverOutput

Properties


solver_outputs array[SingleSolverOutput], object required

Solver Outputs

GroupedSolverTrace

Grouped solver output using Open Responses types.

Produced when GroupedSingleTurnSolver runs with message_format="open_responses".
Each element of solver_outputs corresponds to one sub-call made during solving.

Properties


solver_outputs array[SolverTrace], object required

Solver Outputs

InputFileContent

A file input to the model.

Properties


type Literal "input_file"

The type of the input item. Always input_file.

Default: input_file


filename string

The name of the file to be sent to the model.

Default: None


file_url string

The URL of the file to be sent to the model.

Default: None

InputImageContent

An image input to the model. Learn about image inputs.

Properties


type Literal "input_image"

The type of the input item. Always input_image.

Default: input_image


image_url string

Image Url

Default: None


detail ImageDetail required

InputTextContent

A text input to the model.

Properties


type Literal "input_text"

The type of the input item. Always input_text.

Default: input_text


text string required

The text input to the model.

JudgeInput

Properties


sample object required

Sample

LFBaseModel

A BaseModel which excludes unset fields by default when serialising.

No properties defined.

LogProb

The log probability of a token.

Properties


token string required

Token


logprob number required

Logprob


bytes array[integer] required

Bytes


top_logprobs array[TopLogProb] required

Top Logprobs

Message

A message to or from the model.

Properties


type Literal "message"

The type of the message. Always set to message.

Default: message


id string required

The unique ID of the message.


status MessageStatus required


role MessageRole required


content array[InputTextContent, OutputTextContent, TextContent, SummaryTextContent, ReasoningTextContent, RefusalContent, InputImageContent, InputFileContent, InputVideoContent] required

The content of the message

MessageEvent

Records that a conversation item was added to the trace.

This is the incremental conversation record — each message (user, assistant,
system) gets its own event.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "message_event"

Type

Default: message_event


item Message, CustomTaskInputMessage, CustomTaskOutputMessage required

Item


model_call_id string

Model Call Id

Default: None

MetricData

An object that contains the metric scores.

Properties


values object required

Values


metric_key string required

The key of the metric.


metric_type string

The type of the metric. Present for newly computed results and may be missing for legacy results.

Default: None


scorer_name string

The display name of the scorer to which the metric belongs. Present only for benchmark tasks.

Default: None


scorer_key string

The key of the scorer to which the metric belongs. Present only for benchmark tasks.

Default: None


scorer_purpose ScorerPurpose

The purpose of the scorer. Present only for benchmark tasks.

Default: score


reason string

A freeform explanation for the metric value. Present only for system tasks.

Default: None

ModelCallEvent

Records a model API call with the full input context, output, and metadata.

input_context captures the actual context window at each model call.

Properties


id string

Id


span_id string

Span Id

Default: None


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "model_call_event"

Type

Default: model_call_event


model string required

Model


input_context array[Message, FunctionCall, FunctionCallOutput, CustomTaskInputMessage, CustomTaskOutputMessage] required

Input Context


output_items array[Message, FunctionCall, FunctionCallOutput, CustomTaskInputMessage, CustomTaskOutputMessage] required

Output Items


usage ModelUsage

Default: None


tools array[string]

Tools

Default: None


total_time number

Total Time

Default: None


error string

Error

Default: None

ModelEndpointInput

The raw request body sent to the model endpoint after adapter conversion.

Properties


body string required

The raw request body as a JSON string.

ModelEndpointOutput

The raw response received from the model endpoint before adapter conversion.

Properties


body string required

The raw response body as a JSON string.


status_code integer required

The HTTP status code of the response.


headers object

The HTTP response headers.

Default: None

ModelResponse

A useful container for a model response that contains the raw model output
as well as the trace items derived from that model output.

Properties


raw_output OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any required

Raw Output


items array[Message, FunctionCall, FunctionCallOutput, CustomTaskInputMessage, CustomTaskOutputMessage] required

Items

Computed Properties


text str

Text content of the last assistant message.

Raises:
ValueError: If no assistant message is present in items.

ModelUsage

Token usage statistics for a model inference call.

Properties


num_completion_tokens integer

The number of completion tokens used.

Default: None


num_prompt_tokens integer

The number of prompt tokens used.

Default: None

OpenResponsesModelOutput

Model output in Open Responses format.

Used when the model returns a response object containing
a sequence of OpenResponse output items (assistant messages, function calls,
function call outputs, etc.) from a single model.predict() call.

Properties


items array[Message, FunctionCall, FunctionCallOutput, CustomTaskInputMessage, CustomTaskOutputMessage] required

Items


usage ModelUsage

Default: None

OutputTextContent

A text output from the model.

Properties


type Literal "output_text"

The type of the output text. Always output_text.

Default: output_text


text string required

The text output from the model.


annotations array[UrlCitationBody, TextCitationBody] required

The annotations of the text output.


logprobs array[LogProb]

Logprobs

Default: None

PolicyRuleMetricsInput

Properties


evaluation_key string required

Evaluation Key


evaluation_id string required

Evaluation Id


task_result_id string required

Task Result Id


task_specification_key string required

Task Specification Key


task_specification_display_name string required

Task Specification Display Name


scorer_key string required

Scorer Key


metric_key string required

Metric Key


values object required

Values

RAGCompletionModelOutputChoice

Properties


message ChatCompletionOutputMessage required


references array[RAGReference] required

References

RAGCompletionOutput

Properties


choices array[RAGCompletionModelOutputChoice] required

Choices


usage ModelUsage

Default: None

RAGReference

A reference retrieved from the knowledge base during RAG inference.

Properties


content string required

The text content of the reference.

ReasoningTextContent

Reasoning text from the model.

Properties


type Literal "reasoning_text"

The type of the reasoning text. Always reasoning_text.

Default: reasoning_text


text string required

The reasoning text from the model.

RefusalContent

A refusal from the model.

Properties


type Literal "refusal"

The type of the refusal. Always refusal.

Default: refusal


refusal string required

The refusal explanation from the model.

RunEvidence

Properties


index integer required

Index


metrics array[MetricData] required

The metrics. If an error occurred, the metrics will be None.


samples array[SampleEvidence]

The sample evidence (as produced by the given run of a task).

Default: None


errors array[TaskResultError]

A list of task-level errors.

Default: []


failures TaskResultFailures

Default: None

SampleData

The raw sample data for a single evaluated sample.

Properties


data object required

The sample's field values.

SampleEvidence

Properties


sample_id string, integer required

Sample Id


sample SampleData required

Sample data. Only present for legacy evidence or when computing repeatability, otherwise present in the trials.


solver SolverData required

Solver data. Only present for legacy evidence, otherwise present in the trials.


scores array[ScoresData] required

Scores data. Only present for legacy evidence, or when score aggregation occurred (such as when using multiple trials with score aggregation or when assessing repeatability).


action_records array[ActionRecord]

Action Records

Default: None


errors array[TaskResultError] required

Errors


trials array[SampleTrialEvidence]

Trials

Methods


build classmethod

build(model_input: LFModelInput | None, model_output: LFModelOutput | None, score_values: LFBaseModel | ScoreValues, score_metadata: LFBaseModel | dict[str, Any] | None = None, sample_id: int | str | None = None, sample_data: dict[str, Any] | None = None, scorer_key: str | None = None, solver_model_direct_input: ModelEndpointInput | None = None, solver_model_direct_output: ModelEndpointOutput | None = None, scorer_model_direct_input: ModelEndpointInput | None = None, scorer_model_direct_output: ModelEndpointOutput | None = None, message_format: TraceFormat = 'open_responses') -> SampleEvidence

build_with_1_trial classmethod

build_with_1_trial(*, sample_id: str | int, sample: SampleData | None, solver: SolverData | None, scores: list[ScoresData] | None, errors: list[TaskResultError]) -> SampleEvidence

SampleScore

Properties


values object required

Values


metadata object

Metadata

Default: None


direct_ios array[DirectModelIO]

Direct Ios

Default: []

SampleTrialEvidence

Properties


index integer required

Index


sample_id string, integer required

Sample Id


sample SampleData required


solver SolverData required


scores array[ScoresData] required

Scores


errors array[TaskResultError] required

Errors

ScoresData

The score values produced by a single scorer for a single sample.

Properties


scorer_key string required

The key identifying the scorer.


scorer_purpose ScorerPurpose

Default: score


scorer_name string

Optional display name of the scorer.

Default: None


values object required

The score values produced by the scorer.


metadata object

Optional metadata associated with the scorer output.

Default: None


direct_ios array[DirectModelIO]

Raw model endpoint I/O for each prediction call (if scorer uses a model).

Default: []

Secret

An object representing a secret.

Properties


name string required

The name of the secret.

Pattern: ^[A-Za-z0-9_-]+$

SingleSolverJudgeInput

Properties


sample object required

Sample


solver_output SingleSolverOutput, GroupedSolverOutput, SolverTrace, GroupedSolverTrace required

Solver Output


messages array[ChatCompletionInputMessage, Any, array[string], ChatCompletionOutputMessage, array[array[number]]] required

Messages


model_output OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any required

Model Output

SingleSolverOutput

Properties


messages array[ChatCompletionInputMessage, Any, array[string], ChatCompletionOutputMessage, array[array[number]]] required

Messages


output OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any required

Output


direct_ios array[DirectModelIO]

Direct Ios

Default: []

Methods


from_input_and_output classmethod

from_input_and_output(model_input: LFModelInput, model_output: LFModelOutput, direct_model_input: ModelEndpointInput | None = None, direct_model_output: ModelEndpointOutput | None = None) -> SingleSolverOutput

SolverData

Properties


output SingleSolverOutput, GroupedSolverOutput, SolverTrace, GroupedSolverTrace required

Output

SolverJudgeInput

Properties


sample object required

Sample


solver_output SingleSolverOutput, GroupedSolverOutput, SolverTrace, GroupedSolverTrace required

Solver Output

SolverTrace

Solver output using Open Responses types with structured trace.

Produced when the solver's message_format is "open_responses".
Wraps a full :class:Trace and preserves the raw model outputs for
each model.predict() call made during solving.

Properties


trace Trace required


raw_outputs array[OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any] required

Raw Outputs


direct_ios array[DirectModelIO]

Direct Ios

Default: []

Computed Properties


items list[TraceItem]

Shorthand for self.trace.items.


messages list[LFMessage]

Legacy SingleSolverOutput-style view of the trace as LFMessages.


output LFModelOutput | None

Legacy SingleSolverOutput-style view of the last raw output.

Methods


add_model_response method

add_model_response(response: ModelResponse) -> None

Record a model response: extend the trace items and append the raw output.


append method

append(item: TraceItem) -> None

Append a single item to the trace.


append_custom_task_input_message method

append_custom_task_input_message(content: Any) -> None

Append a custom task input item with an opaque payload.


append_system_message method

append_system_message(content: str) -> None

Append a system message with plain text content.


append_user_message method

append_user_message(content: str) -> None

Append a user message with plain text content.


extend method

extend(items: list[TraceItem]) -> None

Extend the trace with a list of items.

SolverTraceJudgeInput

Properties


sample object required

Sample


trace Trace required


model_outputs array[OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any] required

Model Outputs


solver_output SingleSolverOutput, GroupedSolverOutput, SolverTrace, GroupedSolverTrace

Solver Output

Default: None


messages array[ChatCompletionInputMessage, Any, array[string], ChatCompletionOutputMessage, array[array[number]]]

Messages

Default: None


model_output OpenResponsesModelOutput, RAGCompletionOutput, ChatCompletionModelOutput, EmbeddingsModelOutput, Any

Model Output

Default: None


input_prompt string

Input Prompt

Default: None

SpanBeginEvent

Marks the beginning of a named execution span.

Spans define hierarchical boundaries for agents, tools, and other execution
phases.

Properties


id string

Id


span_id string required

Span Id


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "span_begin"

Type

Default: span_begin


parent_span_id string

Parent Span Id

Default: None


name string required

Name


span_type string

Span Type

Default: None

SpanEndEvent

Marks the end of a named execution span.

Properties


id string

Id


span_id string required

Span Id


timestamp string

Timestamp

Default: None


metadata object

Metadata

Default: None


type Literal "span_end"

Type

Default: span_end

SummaryTextContent

A summary text from the model.

Properties


type Literal "summary_text"

The type of the object. Always summary_text.

Default: summary_text


text string required

A summary of the reasoning output from the model so far.

SystemMessage

A message with role system.

Properties


type Literal "message"

The type of the message. Always set to message.

Default: message


id string required

The unique ID of the message.


status MessageStatus required


role Literal "system"

Role

Default: system


content array[InputTextContent, OutputTextContent, TextContent, SummaryTextContent, ReasoningTextContent, RefusalContent, InputImageContent, InputFileContent, InputVideoContent] required

The content of the message

SystemTaskMetricEntry

A single metric entry produced by a system task's compute_evidence function.

Properties


value number, integer required

The numeric value of the metric.


reason string

A freeform explanation for the metric value. Mapped to the reason field in MetricData.

Default: None

SystemTaskOutput

The expected return value of the compute_evidence function in a system task snippet.

Properties


metrics object required

A mapping from metric key to the metric entry.

TLSContext

Defines the TLS context.

Properties


validation_context CertificateValidationContext

Settings for validating server certificates.

Default: None

TaskExecution

Timing and resource usage information for a task execution.

Properties


runtime number required

The runtime of the task in seconds.


started_at integer required

A Unix timestamp in seconds.


ended_at integer required

A Unix timestamp in seconds.


model_usage ModelUsageStats

Default: None

TaskProgressState

Properties


num_total_samples integer required

Num Total Samples


num_processed_samples integer required

Num Processed Samples


num_samples_with_errors integer required

Num Samples With Errors

TaskResultError

Properties


error_type string required

The type of the error.


message string required

The specific error message that occurred during evaluation.


hint string

The suggestion to try out to fix the issue.

Default: None


stage TaskResultErrorStage

Default: None

TaskResultEvidence

Properties


metrics array[MetricData] required

The metrics. If an error occurred, the metrics will be None.


samples array[SampleEvidence]

The sample evidence (as produced by tasks).

Default: None


runs array[RunEvidence]

Per-run evidence for repeatability task results. None when repeatability is not assessed.

Default: None


errors array[TaskResultError]

A list of task-level errors.

Default: []


failures TaskResultFailures

Default: None

Methods


adapt_metrics_if_needed classmethod

adapt_metrics_if_needed(value: Any) -> Any

build_flat_metrics_dict method

build_flat_metrics_dict() -> MetricValues

TaskResultFailures

Properties


num_errors integer required

Num Errors


num_total integer required

Num Total

TaskResultLog

Properties


format_version Literal "v1" required

Format Version


app_version string required

The version of AI GO that computed this task result log.


status string required

Status


evidence TaskResultEvidence required


specification TaskResultSpecification required


execution TaskExecution required


errors array[TaskResultError] required

Errors

TaskResultSpecification

The task specification stored inside a TaskResultLog, capturing what was evaluated and how.

Properties


display_name string required

The display name of the evaluation.


task StoredTask required


config object required

Task configuration used for this evaluation.


evaluated_entity StoredDataset, StoredModel

The dataset or model that was evaluated. Present only for benchmark tasks.

Default: None


run_config EvaluationConfig required


repeatability_config RepeatabilityConfig

Default: None

TextCitationBody

A citation referencing a plain-text source (e.g. a retrieved knowledge-base chunk).

Properties


type Literal "text_citation"

The type of the text citation. Always text_citation.

Default: text_citation


content string required

The text content of the cited source.

TopLogProb

The top log probability of a token.

Properties


token string required

Token


logprob number required

Logprob


bytes array[integer] required

Bytes

Trace

Represents a conversation trace between a user and an agent.

A trace stores a sequence of items in the Open Responses format,
including user messages, assistant messages, function calls, and
function call outputs. It provides helper methods to extract
individual turns, find function calls, and inspect the conversation.

The preamble property exposes everything before the first user
message (system messages, assistant greetings, initial function calls,
etc.). Everything from the first user message onward is accessible via
turns.

For multi-agent traces, an optional events field provides a richer
execution record with span markers encoding agent hierarchy. Use
Trace.from_events() to construct event-based traces; items is
derived automatically.

Properties


FORMAT string

Format

Default: open_responses


items array[Message, FunctionCall, FunctionCallOutput, CustomTaskInputMessage, CustomTaskOutputMessage] required

Items


metadata TraceMetadata

Default: None


events array[MessageEvent, FunctionCallEvent, ModelCallEvent, SpanBeginEvent, SpanEndEvent, CompactionEvent, ErrorEvent, CustomEvent]

Events

Default: None


span_id string

Span Id

Default: None


span_name string

Span Name

Default: None


span_type string

Span Type

Default: None

Computed Properties


assistant_messages list[AssistantMessage]

Return all assistant messages across the trace (excluding preamble).


conversation_items list[TraceItem]

Return items from the first user message onward (excludes preamble).


function_calls list[FunctionCall]

Return all function calls across the trace (excluding preamble).


function_outputs list[FunctionCallOutput]

Return all function call outputs across the trace (excluding preamble).


preamble list[TraceItem]

Return all items before the first user message.


system_messages list[SystemMessage]

Return all system messages in the preamble in order.


turns list[Turn]

Extract individual conversation turns.


user_messages list[UserMessage]

Return all user messages across the entire trace.

Methods


from_events classmethod

from_events(events: list[TraceEvent], *, span_id: str | None = None, **kwargs: Any) -> Trace

Construct a Trace from an event stream.

span_id identifies which span the Trace represents. Defaults to
None for a root Trace; pass a span id when constructing a Trace for
a specific span.

The item-derivation strategy is picked from the event stream itself:

  • If any MessageEvent is present, items come from direct-span
    MessageEvent and FunctionCallEvent values (native LF traces).
  • Otherwise, items are reconstructed from direct-span ModelCallEvent.
    This path handles event streams imported from inspect-ai-style event streams
    that don't produce MessageEvents.

from_items classmethod

from_items(items: list[TraceItem], **kwargs: Any) -> Trace

Construct a Trace from conversation items only (no events).

Use this for simple or legacy single-agent traces where no execution
metadata is needed.


get_first_system_prompt method

get_first_system_prompt() -> str | None

Return the text of the first system message, if any.


get_function_call_arguments method

get_function_call_arguments(call: FunctionCall) -> dict

Parse and return the JSON arguments of a function call.


get_function_call_pairs method

get_function_call_pairs() -> list[tuple[FunctionCall, FunctionCallOutput | None]]

Return all (function_call, function_output) pairs matched by call_id.


get_function_calls_by_name method

get_function_calls_by_name(name: str) -> list[FunctionCall]

Return all function calls with the given function name.


get_function_output_for_call method

get_function_output_for_call(call_id: str) -> FunctionCallOutput | None

Return the function output matching a given call_id, if any.


get_function_output_text method

get_function_output_text(output: FunctionCallOutput) -> str

Extract the text content from a function call output.


get_last_assistant_text method

get_last_assistant_text() -> str | None

Return the text content of the last assistant message, if any.


get_last_user_text method

get_last_user_text() -> str | None

Return the text of the last user message, if any.


spans method

spans() -> list[Trace]

Return immediate child spans as Trace objects, or an empty list if there
are no child spins.

Each child span is a Trace with its own items, events, and span metadata.
Spans are returned in chronological order (matching the event stream order).
Call .spans() on a child recursively to get sub-sub-agent spans.

TraceMetadata

Trace-level metadata capturing identity, provenance, and summary information.

All fields are optional. Only the trace data itself (items/events) is required.
Metadata enriches the trace for filtering, grouping, and analysis.

Properties


trace_id string

Trace Id

Default: None


source_type string

Source Type

Default: None


source_uri string

Source Uri

Default: None


agent string

Agent

Default: None


model string

Model

Default: None


tags array[string]

Tags

Default: None


created_at string

Created At

Default: None


total_time number

Total Time

Default: None


total_tokens integer

Total Tokens

Default: None


message_count integer

Message Count

Default: None


error string

Error

Default: None


extra object

Extra

Default: None

Turn

A single conversational turn initiated by a user message.

A turn starts with a user message and includes all subsequent items
until the next user message (or end of trace). This typically includes:

  • The user message itself
  • Zero or more assistant actions (function calls, function outputs,
    assistant messages) that form the response to the user message.

Properties


user_message UserMessage required


assistant_items array[Message, FunctionCall, FunctionCallOutput, CustomTaskInputMessage, CustomTaskOutputMessage] required

Assistant Items

Computed Properties


assistant_messages list[AssistantMessage]

Return all assistant messages in this turn in order.


function_call_pairs list[tuple[FunctionCall, FunctionCallOutput | None]]

Return pairs of (function_call, function_output) matched by call_id.


function_calls list[FunctionCall]

Return all function calls in this turn in order.


function_outputs list[FunctionCallOutput]

Return all function call outputs in this turn in order.

UrlCitationBody

A citation for a web resource used to generate a model response.

Properties


type Literal "url_citation"

The type of the URL citation. Always url_citation.

Default: url_citation


url string required

The URL of the web resource.


start_index integer required

The index of the first character of the URL citation in the message.


end_index integer required

The index of the last character of the URL citation in the message.


title string required

The title of the web resource.

UserMessage

A message with role user.

Properties


type Literal "message"

The type of the message. Always set to message.

Default: message


id string required

The unique ID of the message.


status MessageStatus required


role Literal "user"

Role

Default: user


content array[InputTextContent, OutputTextContent, TextContent, SummaryTextContent, ReasoningTextContent, RefusalContent, InputImageContent, InputFileContent, InputVideoContent] required

The content of the message

Enums

FunctionCallOutputStatusEnum

Similar to FunctionCallStatus. All three options are allowed here for compatibility, but because in practice these items will be provided by developers, only completed should be used.

Allowed Values:

  • in_progress
  • completed
  • incomplete

FunctionCallStatus

Allowed Values:

  • in_progress
  • completed
  • incomplete

MessageRole

Allowed Values:

  • user
  • assistant
  • system
  • developer

MessageStatus

Allowed Values:

  • in_progress
  • completed
  • incomplete

TaskResultDataStatus

The execution status of a task result.

Allowed Values:

  • pending
  • cancelled
  • success
  • failed

Type Aliases


ChatCompletionResponseFormat = ChatCompletionResponseFormatJSONSchema | ChatCompletionResponseFormatText


ConversationItem = Message | CustomTaskInputMessage | CustomTaskOutputMessage


DType = Type | Tuple


InputMessageContent = str | List


JSONType = NoneType | int | str | bool | float | List | Mapping


LFInputMessage = ChatCompletionInputMessage | Any | list


LFMessage = ChatCompletionInputMessage | Any | list | ChatCompletionOutputMessage | list


LFModelInput = ChatCompletionInput | list | Any


LFModelOutput = OpenResponsesModelOutput | RAGCompletionOutput | ChatCompletionModelOutput | EmbeddingsModelOutput | Any


LFOutputMessage = ChatCompletionOutputMessage | Any | list


ResultDType = pd.DataFrame | NoneType | int | str | bool | float | List | Mapping | BaseModel


RuleDefinition = ExistsRuleDefinition | ThresholdRuleDefinition


RuleScope = PolicyRuleSimpleScope | PolicyRuleFinegrainedScope


TraceEvent = MessageEvent | FunctionCallEvent | ModelCallEvent | SpanBeginEvent | SpanEndEvent | CompactionEvent | ErrorEvent | CustomEvent


TraceItem = Message | FunctionCall | FunctionCallOutput | CustomTaskInputMessage | CustomTaskOutputMessage

Supporting Types

ActionRule

Properties


key string required

Key: 1-250 chars, allowed: a-z A-Z 0-9 _ -

Pattern: ^[a-zA-Z0-9_\-]+$

Max Length: 250


action ActionRuleAction required

The action to be applied to samples that match the filter.


filter FilterComparison, FilterMembership, FilterUnary required

The filter that determines which samples the action applies to.

ActionRuleAction

Allowed Values:

  • exclude_from_metrics

BenchmarkTaskDefinitionTemplate

Properties


type Literal "benchmark_task" required

The type of task definition.


evaluated_entity_type EvaluatedEntityType required


dataset TaskDatasetTemplate

The dataset used by this task

Default: None


solver TaskSolverTemplate

The solver used by this task

Default: None


scorers array[TaskScorerTemplate] required

The scorers used by this task


trials TrialsDefinitionTemplate

Default: None


actions array[ActionRule]

The actions used by this task

Default: None

BooleanParameterSpec

Properties


type Literal "boolean" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value boolean

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

CachePolicy

The caching policy to use for the task results in the evaluation. Supported values:

  • reuse - Use a cached task result if one is available (the default). Partial task
    results are also reused automatically - if a task is the same as another, completed
    task for all of its configuration except the scorers configuration, then only the
    scores, metrics and errors and failures related to them will be recomputed. This saves
    queries to the model during the solver part of the evaluation.
  • update - Do not use cached task results, but cache the results of the execution.
  • no-cache - Do not use cached task results and do not cache the results of the execution.

Allowed Values:

  • reuse
  • update
  • no-cache

CategoricalParameterSpec

Properties


type Literal "categorical" required

The type of parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


allowed_values array[string] required

Allowed Values


multiple boolean

Whether the parameter can have multiple values.

Default: False


default_value string

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False


values_mapping object

A mapping over the categorical values.

Default: None

CertificateValidationContext

Defines how server certificates should be validated.

Properties


trusted_ca string, Secret

base64 representation of PEM-encoded certificate(s).

Provide a raw base64 string or reference a secret.

For example: cat cert.pem \| base64 -w 0

Default: None


trust_chain_verification TrustChainVerification

Settings for verifying the trust chain of the server certificate.

Default: None

ConfigurationDatasetGenerationError

Properties


stage Literal "configuration" required

Stage


error_type string required

The type of the error.


message string required

The specific error message that occurred during generation.

CustomInferenceModelConfig

Client configuration for a model, that is provided manually by the user.

Properties


adapter_id string required

The ID of the model adapter to be used with this model.


connection_type Literal "custom_inference" required

The type of connection config.


run_inference_snippet string required

The code snippet to make a call to the model.


environment object required

Environment variables required to run the model client snippet. Values may reference secrets.


timeout number required

Timeout in seconds for the total runtime of the Python snippet.

DataSourceDatasetGenerationError

Properties


stage Literal "data_source" required

Stage


error_type string required

The type of the error.


message string required

The specific error message that occurred during generation.


iteration integer required

The iteration number of the data source generation that caused the error.

DatasetColumnParameterSpec

Properties


type Literal "dataset_column" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value string

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

DatasetGenerationDebugOptions

Properties


enabled boolean required

When true, the response will include a full pipeline trace for each source sample, which contains the source sample itself and the input and output at each synthesizer stage.


include_io boolean required

When true, the model input and output are included in the trace for each synthesizer call that produced I/O. Has no effect when enabled is false.

DatasetGenerationMetadata

Dataset generation metadata.

Properties


dataset_generator_id string

Dataset Generator Id

Default: None


execution_status ExecutionStatus required


dataset_generation_id string required

The dataset generation ID.


dataset_generation_request DatasetGenerationRequest required

The dataset generation request.


progress ExecutionProgress

Default: None


result_status ResultStatus

Default: None


errors array[ConfigurationDatasetGenerationError, SynthesizerDatasetGenerationError, DataSourceDatasetGenerationError]

List of errors that occurred during dataset generation.

Default: None

DatasetGenerationRequest

Properties


dataset_generator_config object required

The configuration used by the dataset generator.


num_samples integer required

The number of samples to generate. At least 1 sample must be requested.


debug DatasetGenerationDebugOptions

Default: None

DatasetMetadata

Dataset metadata.

Properties


num_rows integer required

Num Rows


columns array[string] required

Columns


download_url string required

URL to download the dataset in JSONL format.


data_version string required

Data Version

DatasetParameterSpec

Properties


type Literal "dataset" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value string

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

DictParameterSpec

Properties


type Literal "dict" required

The type of the parameter.


value_dtype ScalarDtype required

The data type of the values in the dict.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value object

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

EvaluatedEntityType

Allowed Values:

  • dataset
  • model

EvaluationConfig

Parameters required when starting an evaluation.

Properties


num_samples integer

The number of samples to evaluate. If not specified, all samples will be evaluated.

Default: None


subsampling Subsampling

Default: None


cache_policy CachePolicy

The caching policy to use for the task results in the evaluation. Supported values:

  • reuse - Use a cached task result if one is available (the default). Partial task
    results are also reused automatically - if a task is the same as another, completed
    task for all of its configuration except the scorers configuration, then only the
    scores, metrics and errors and failures related to them will be recomputed. This saves
    queries to the model during the solver part of the evaluation.
  • update - Do not use cached task results, but cache the results of the execution.
  • no-cache - Do not use cached task results and do not cache the results of the execution.

Default: reuse


trials_config TrialsConfig

Default: None

ExecutionProgress

Properties


progress number required

A progress indicator for the task result.


num_total_samples integer

The total number of samples to be processed for this task result.

Default: None


num_processed_samples integer

The number of samples already processed for this task result.

Default: None


num_samples_with_errors integer

The number of samples for which an error occurred for this task result.

Default: None

ExecutionStatus

Allowed Values:

  • not_started
  • pending
  • cancelled
  • finished

FilterComparison

Properties


op FilterComparisonOp required


expression string required

An expression encoding what to compare against the value.

Depending on the context, it can refer to different variables:

  • When filtering a dataset: it can refer to the sample and use dot or bracket
    notation to access the columns.
    If filtering a dataset with column names that are illegal under jinja
    substitution rules (e.g. containing spaces), use bracket notation to access
    the column.
  • When used within a task action: it can refer to the
    sample, the solver_output or the scores (which is a mapping between scorer
    keys and their corresponding score values dict).

value string, number, integer, boolean required

The value against which the expression is compared.

FilterComparisonOp

The comparison operator to apply.

Allowed Values:

  • equals
  • not_equals
  • greater_than
  • less_than
  • greater_or_equal
  • less_or_equal

FilterMembership

Properties


op FilterMembershipOp required


expression string required

An expression encoding what to check membership against the values.

Depending on the context, it can refer to different variables:

  • When filtering a dataset: it can refer to column values by name (ex: {{ category }}).
  • When used within a task action: it can refer to the
    sample, the solver_output or the scores (which is a mapping between scorer
    keys and their corresponding score values dict).

values array[string, number, boolean] required

The set of values to test membership against.

FilterMembershipOp

The membership operator to apply.

Allowed Values:

  • in
  • not_in

FilterUnary

Properties


op FilterUnaryOp required


expression string required

An expression encoding what to apply the unary operator to.

Depending on the context, it can refer to different variables:

  • When filtering a dataset: it can refer to column values by name (ex: {{ category }}).
  • When used within a task action: it can refer to the
    sample, the solver_output or the scores (which is a mapping between scorer
    keys and their corresponding score values dict).

FilterUnaryOp

The unary operator to apply.

Allowed Values:

  • exists
  • not_exists
  • is_true
  • is_false

FloatParameterSpec

Properties


type Literal "float" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


min number

The minimum value of the parameter.

Default: None


max number

The maximum value of the parameter.

Default: None


default_value number

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

ImageDetail

Allowed Values:

  • low
  • high
  • auto

InputVideoContent

A content block representing a video input to the model.

Properties


type Literal "input_video"

The type of the input content. Always input_video.

Default: input_video


video_url string required

A base64 or remote url that resolves to a video file.

IntParameterSpec

Properties


type Literal "int" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


min integer

The minimum value of the parameter.

Default: None


max integer

The maximum value of the parameter.

Default: None


default_value integer

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

IntegrationModelProviderId

The internal identifiers for all model providers known by the system.

Allowed Values:

  • anthropic
  • fireworks
  • gemini
  • latticeflow
  • novita
  • openai
  • sambanova
  • together

ListParameterSpec

Properties


type Literal "list" required

The type of the parameter.


dtype ScalarDtype required

The data type of the elements in the list.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value array[Any]

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

MLTask

The type of machine learning task to be performed.

Allowed Values:

  • chat_completion
  • embeddings
  • custom

MaxAggregator

Aggregates numeric scores by taking the maximum value.

Properties


function Literal "max" required

Function


score_name string

The name to give to the aggregated score.

Default: None

MeanAggregator

Aggregates numeric scores by computing the mean.

Properties


function Literal "mean" required

Function


score_name string

The name to give to the aggregated score.

Default: None

MinAggregator

Aggregates numeric scores by taking the minimum value.

Properties


function Literal "min" required

Function


score_name string

The name to give to the aggregated score.

Default: None

ModelCustomConnectionConfig

Connection configuration for a model, that is provided manually by the user.

Properties


connection_type Literal "custom_connection" required

The type of connection config.


adapter_id string required

The ID of the model adapter to be used with this model.


url string required

The model endpoint URL.


api_key string, Secret

The key to be passed as the authorization header (Authorization: Bearer API_KEY).
Provide a raw string (deprecated) or reference a secret.

Default: None


model_key string

This field is used in case the model is not specified in the URL but in the body instead. For the "openai" adapter, this will be passed as the "model" parameter. For custom adapters, this value is available as model_info.model_key.

Default: None


tls_context TLSContext

TLS configuration for secure connections to the model endpoint.

Default: None


custom_headers object

Additional headers to include in requests to the model endpoint. Values may reference secrets.

Default: None

ModelParameterSpec

Properties


type Literal "model" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value string

The default value to use.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False

ModelProviderConnectionConfig

Connection configuration for a model, that is retrieved from a well-known provider integrated with the system.

Properties


connection_type Literal "provider_connection" required

The type of connection config.


provider_id ModelProviderId required

The id of the model provider.


model_key string required

A key used to identify the model in the external provider.

ModelProviderId

ModelUsageStats

An object that contains the model usage summary for the task result.

Properties


num_samples integer required

Num Samples


num_completion_tokens integer

Num Completion Tokens

Default: None


num_prompt_tokens integer

Num Prompt Tokens

Default: None

PassAtKAggregator

Aggregates binary (True/False) scores using the pass@k estimator. Estimates the probability that at least one of k independent attempts will succeed, computed as 1 - (1 - p)^k where p is the empirical pass rate across trials.

Properties


function Literal "pass@k" required

Function


k integer required

The number of independent attempts in the scenario being modelled.


score_name string

The name to give to the aggregated score.

Default: None

PassPowerKAggregator

Aggregates binary (True/False) scores using the pass^k estimator. Estimates the probability that an agent would succeed on all k independent attempts, computed as p^k where p is the empirical pass rate across trials.

Properties


function Literal "pass^k" required

Function


k integer required

The number of independent attempts in the scenario being modelled.


score_name string

The name to give to the aggregated score.

Default: None

RepeatabilityConfig

Configuration for a repeatability task result.

Properties


num_runs integer required

Number of times to run the task.

ResultStatus

Allowed Values:

  • succeeded
  • failed

ScalarDtype

The scalar data type.

Allowed Values:

  • string
  • integer
  • float
  • boolean

ScoreAggregator

Aggregation configuration for one or more score keys.

Properties


score_name string required

The name of the score that will be aggregated.


aggregator MeanAggregator, MinAggregator, MaxAggregator, PassAtKAggregator, PassPowerKAggregator required

Aggregator

ScorerPurpose

Allowed Values:

  • score
  • qa

StoredDataset

Properties


display_name string required

The display name of the dataset.


description string

An optional description of the dataset.

Default: None


long_description string

Long description of the dataset in Markdown format.

Default: None


key string required

Key: 1-250 chars, allowed: a-z A-Z 0-9 _ -

Pattern: ^[a-zA-Z0-9_\-]+$

Max Length: 250


id string required

Id


dataset_metadata DatasetMetadata

Dataset metadata.

Default: None


dataset_generation_metadata DatasetGenerationMetadata

Dataset generation metadata.

Default: None


created_at integer

Unix timestamp (in seconds).

Default: None


updated_at integer

Unix timestamp (in seconds).

Default: None


tags array[StoredTag]

Tags associated with the dataset.

Default: []

StoredModel

Properties


id string required

Id


display_name string required

The name of the Model.


key string required

Unique identifier assigned to the entity in AI GO!.

Pattern: ^((together|gemini|openai|fireworks|sambanova|anthropic|novita|latticeflow)\$)?[a-zA-Z0-9_-]+$

Max Length: 250


description string

Description

Default: None


rate_limit integer

The maximum allowed number of requests per minute.

Default: None


max_concurrent_requests integer

The maximum number of concurrent inference requests.

Default: None


task MLTask required


config ModelCustomConnectionConfig, CustomInferenceModelConfig, ModelProviderConnectionConfig required

The configuration for connecting to the model.


adapter_id string required

The ID of the model adapter to be used with this model.


created_at integer

Unix timestamp (in seconds).

Default: None


updated_at integer

Unix timestamp (in seconds).

Default: None

StoredTag

Properties


id string required

Id


value string required

The text value of the tag.


color string required

The color (#RRGGBB or #RGB) associated with the tag, used for UI representation.

Pattern: ^#([0-9a-fA-F]{6}|[0-9a-fA-F]{3})$

StoredTask

Properties


id string required

Id


key string required

Key: 1-250 chars, allowed: a-z A-Z 0-9 _ -

Pattern: ^[a-zA-Z0-9_\-]+$

Max Length: 250


display_name string required

The display name of the task.


description string required

The description of the task.


long_description string

Long description of the task in Markdown format.

Default: None


tasks array[MLTask]

The ML tasks for which the task is applicable.

Default: []


config_spec array[FloatParameterSpec, IntParameterSpec, BooleanParameterSpec, StringParameterSpec, ModelParameterSpec, DatasetParameterSpec, DatasetColumnParameterSpec, ListParameterSpec, DictParameterSpec, CategoricalParameterSpec] required

Config Spec


definition BenchmarkTaskDefinitionTemplate, SystemTaskDefinitionTemplate required

Definition


provider TaskProvider required

The provider of the task.


tags array[StoredTag] required

Tags associated with the task.


created_at integer

Unix timestamp (in seconds).

Default: None


updated_at integer

Unix timestamp (in seconds).

Default: None

StringKind

Specifies the kind of string parameter.

Allowed Values:

  • freeform
  • python
  • jinja

StringParameterExample

Properties


value string required

The example value for the string parameter.


display_name string required

The display name of the example.

StringParameterSpec

Properties


type Literal "string" required

The type of the parameter.


key string required

The key of the parameter.


display_name string required

The display name of the parameter.


description string

The description of the parameter.

Default: None


default_value string

The default value of the parameter.

Default: None


nullable boolean

Whether this parameter is nullable.

Default: False


string_kind StringKind

Default: freeform


examples array[StringParameterExample]

Examples for the string parameter.

Default: None

Subsampling

The subsampling strategy to use when selecting samples for evaluation. Supported values:

  • head - Select the first N samples.
  • random - Select N random samples. The random seed is fixed for reproducibility.
    If not specified, defaults to 'head'.

Allowed Values:

  • head
  • random

SynthesizerDatasetGenerationError

Properties


stage Literal "synthesizer" required

Stage


error_type string required

The type of the error.


message string required

The specific error message that occurred during generation.


source_sample object required

The source sample for which an error occurred.


synthesizer_index integer required

The index of the synthesizer that caused the error.

SystemTaskDefinitionTemplate

Properties


type Literal "system_task" required

The type of task definition.


compute_evidence_snippet string required

Python source code defining a def compute_evidence() function (sync or async) that returns metrics and optional metadata.

TaskDatasetTemplate

The dataset that will be used to evaluate the model.

Properties


id string required

Id

TaskMetricTemplate

Properties


key string

The key of the metric.

Default: None


type string required

The type of metric.

TaskProvider

Allowed Values:

  • latticeflow
  • user

TaskResultErrorStage

Allowed Values:

  • configuration
  • dataset
  • solver
  • score
  • metric
  • action

TaskScorerTemplate

Properties


key string

The key of the scorer.

Default: None


type string required

The type of the scorer.


display_name string

The display name of the scorer.

Default: None


purpose ScorerPurpose

The purpose of the scorer.

Default: score


metrics array[TaskMetricTemplate]

The metrics associated with this scorer, which will produce per-task metrics.

Default: None

TaskSolverTemplate

Properties


type string required

The type of the solver.

TextContent

A text content.

Properties


type Literal "text"

Type

Default: text


text string required

Text

TrialsConfig

Configuration for trials in a task result/specification. Only relevant for benchmark tasks.

Properties


num_trials integer required

Number of trials to run per sample.

TrialsDefinitionTemplate

Configuration for trials in a benchmark task definition template.

Properties


num_trials integer, string required

Number of trials to run per sample.


score_aggregators array[ScoreAggregator]

Score aggregators that compute an aggregated score given the score values for the different trials. Scores with no matching aggregator default to mean for numeric and boolean values (for other dtypes, no default aggregation is computed).

Default: None

TrustChainVerification

How to trust the CA trust chain.

  • verify_trust_chain (default) will verify the server certificate against the configured CA trust.
  • accept_untrusted will not perform server certificate verification. NOTE: This is a
    security hazard and should be avoided.

Allowed Values:

  • verify_trust_chain
  • accept_untrusted