Environment Anatomy
A deep dive into the structure of OpenEnv environments.
Components
Every OpenEnv environment consists of:
my_env/
├── openenv.yaml        # Manifest file
├── my_env/
│   ├── __init__.py
│   ├── client.py       # Client classes
│   ├── server.py       # Server/Environment
│   └── models.py       # Pydantic models
├── Dockerfile          # Container definition
├── pyproject.toml      # Package metadata
└── README.md           # Documentation
The Manifest (openenv.yaml)
name: my_env
version: 0.1.0
description: My custom environment
client:
  class_name: MyEnvClient
  module: my_env.client
action:
  class_name: MyAction
  module: my_env.models
observation:
  class_name: MyObservation
  module: my_env.models
default_image: my-env:latest
spec_version: 1
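Each of the client, action, and observation entries names a class by a module and class_name pair. As a rough illustration of how such a pair can be turned into a Python class, the sketch below uses importlib. This is not OpenEnv's actual loader: the helper name resolve_class is hypothetical, and the entry points at a standard-library class so the snippet runs without my_env installed.

```python
import importlib


def resolve_class(entry: dict) -> type:
    """Import entry["module"] and return the attribute named entry["class_name"]."""
    module = importlib.import_module(entry["module"])
    return getattr(module, entry["class_name"])


# Stand-in for one parsed manifest section. A real entry would read
# {"module": "my_env.client", "class_name": "MyEnvClient"}.
client_entry = {"module": "collections", "class_name": "OrderedDict"}

client_cls = resolve_class(client_entry)
```

Keeping the module path and the class name as separate keys means the manifest can be read and checked without importing any environment code until a class is actually needed.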
Models (Pydantic)
Custom Action, Observation, and State types subclass the base classes from openenv.core.env_server.types, not pydantic.BaseModel directly. The base Observation already carries done and reward fields, which step() populates; Action and State add metadata plumbing used by the server.
from openenv.core.env_server.types import Action, Observation, State


class MyAction(Action):
    command: str
    args: list[str] = []


class MyObservation(Observation):
    output: str
    success: bool


class MyState(State):
    history: list[str] = []
Environment Class
Environments subclass the abstract Environment[ActT, ObsT, StateT] base and implement reset, step, and the state property. Reward and termination are carried on the returned observation; they are not a tuple return value.
from openenv.core.env_server.interfaces import Environment


class MyEnvironment(Environment[MyAction, MyObservation, MyState]):
    def reset(self, seed=None, episode_id=None, **kwargs) -> MyObservation:
        ...

    def step(self, action: MyAction, timeout_s=None, **kwargs) -> MyObservation:
        ...

    @property
    def state(self) -> MyState:
        ...
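To make the reset/step/state contract concrete, here is a minimal, dependency-free sketch of a toy environment. It is an illustration only: the dataclasses are stand-ins for the OpenEnv base types (which are Pydantic models), and CounterAction, CounterObservation, CounterState, and CounterEnvironment are hypothetical names, as are the episode_id and step_count fields on the stand-in state.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-ins for the OpenEnv base types, so the sketch runs without openenv.
@dataclass
class CounterAction:
    amount: int


@dataclass
class CounterObservation:
    total: int
    reward: Optional[float] = None
    done: bool = False


@dataclass
class CounterState:
    episode_id: Optional[str] = None
    step_count: int = 0
    total: int = 0


class CounterEnvironment:
    """Hypothetical environment: the episode ends once the total reaches a target."""

    def __init__(self, target: int = 3) -> None:
        self._target = target
        self._state = CounterState()

    def reset(self, seed=None, episode_id=None, **kwargs) -> CounterObservation:
        # Start a fresh episode and return the initial observation.
        self._state = CounterState(episode_id=episode_id)
        return CounterObservation(total=0)

    def step(self, action: CounterAction, timeout_s=None, **kwargs) -> CounterObservation:
        self._state.step_count += 1
        self._state.total += action.amount
        done = self._state.total >= self._target
        # Reward and termination travel on the observation, not in a tuple.
        return CounterObservation(
            total=self._state.total,
            reward=1.0 if done else 0.0,
            done=done,
        )

    @property
    def state(self) -> CounterState:
        return self._state


env = CounterEnvironment(target=3)
first = env.reset(episode_id="demo")
obs = env.step(CounterAction(amount=1))
obs = env.step(CounterAction(amount=2))
```

A caller reads obs.done and obs.reward directly from the returned observation; env.state exposes the episode bookkeeping separately.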
Server (FastAPI)
Use create_app from openenv.core.env_server to wrap the environment as a FastAPI application. Pass the environment class (used as a factory so each WebSocket session gets its own instance) along with the action and observation types:
from openenv.core.env_server import create_app

app = create_app(
    MyEnvironment,
    MyAction,
    MyObservation,
    env_name="my_env",
)
This is what the environment's server/app.py entry point typically does; see envs/echo_env/server/app.py for a minimal real example.
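The reason for passing the class rather than an instance can be illustrated with a small factory pattern. The sketch below is not how create_app is implemented; SessionRegistry and DemoEnvironment are hypothetical names used only to show why a factory gives each session isolated state.

```python
from typing import Callable, Dict


class DemoEnvironment:
    """Hypothetical environment that just counts steps."""

    def __init__(self) -> None:
        self.steps = 0

    def step(self) -> int:
        self.steps += 1
        return self.steps


class SessionRegistry:
    """Hypothetical helper: creates one environment per session id on first use."""

    def __init__(self, env_factory: Callable[[], DemoEnvironment]) -> None:
        self._env_factory = env_factory
        self._sessions: Dict[str, DemoEnvironment] = {}

    def get(self, session_id: str) -> DemoEnvironment:
        if session_id not in self._sessions:
            # Calling the factory builds a fresh, isolated instance.
            self._sessions[session_id] = self._env_factory()
        return self._sessions[session_id]


registry = SessionRegistry(DemoEnvironment)  # pass the class, not an instance
registry.get("a").step()
registry.get("a").step()
registry.get("b").step()
```

Had a single shared instance been passed instead, steps taken in session "a" would leak into session "b".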
Rewards via the Rubric
Rewards are computed inside the environment, not by external code. The base Environment accepts an optional rubric in __init__: pass it to super().__init__(rubric=...), call self._reset_rubric() from reset, and call self._apply_rubric(action, observation) from step (or _apply_rubric_async from step_async). The Rubrics tutorial covers the composable API end-to-end.
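The wiring described above can be sketched without OpenEnv installed. Everything here is a stand-in: BaseEnvStandIn only mimics the method names mentioned in the text, and its method bodies (including the assumption that _apply_rubric returns a float score) are guesses for illustration, not the library's implementation. The Obs type, the EchoEnv class, and the lambda rubric are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Obs:
    text: str
    reward: Optional[float] = None
    done: bool = False


class BaseEnvStandIn:
    """Stand-in for the OpenEnv base class; the method bodies are assumptions."""

    def __init__(self, rubric: Optional[Callable[[str, Obs], float]] = None) -> None:
        self.rubric = rubric

    def _reset_rubric(self) -> None:
        # Assumption: clears per-episode rubric state if the rubric has a reset().
        reset = getattr(self.rubric, "reset", None)
        if callable(reset):
            reset()

    def _apply_rubric(self, action: str, observation: Obs) -> float:
        # Assumption: returns the rubric's score, or 0.0 when no rubric is set.
        return 0.0 if self.rubric is None else float(self.rubric(action, observation))


class EchoEnv(BaseEnvStandIn):
    def __init__(self) -> None:
        # Hypothetical rubric: reward 1.0 when the echoed text is non-empty.
        super().__init__(rubric=lambda action, obs: 1.0 if obs.text else 0.0)

    def reset(self) -> Obs:
        self._reset_rubric()
        return Obs(text="")

    def step(self, action: str) -> Obs:
        obs = Obs(text=action)
        # The environment scores its own observation before returning it.
        obs.reward = self._apply_rubric(action, obs)
        return obs


env = EchoEnv()
env.reset()
scored = env.step("hello")
empty = env.step("")
```

The point of the pattern is that the reward is attached to the observation inside step, so clients never compute it themselves.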
Next Steps
Deployment - Deploy your environment
Your First Environment - Build step by step