Environment

dsbx.Env

Gym-style single-agent environment for dynamic scheduling.

The environment is a thin wrapper around:

- the existing generator (via dsbx.Gen), and
- the new simulator (dsbx.Sim.DynaSchedSim).

It exposes observations and legal actions that are intentionally compatible with the current algorithms.llm_scheduler.env.LLMEnv interface so that LLM-based schedulers can be migrated incrementally.

class dsbx.Env.DynaSchedEnv(model=None, *, events=None, auto_generate_events=True, track_trajectory=True, traj_stream_path=None, sim_backend=None)

Bases: object

Single-agent environment wrapper around DynaSchedSim.

The wrapper exposes an observation/action interface compatible with the scheduler stack: observations contain time, ready_ops, and machines; legal actions contain job_id, machine_group, and machine_candidates and may include a concrete machine_id at execution time. Heuristic, RL, and LLM schedulers can run directly on this environment, while the environment records a Trajectory for later evaluation and visualization.

Parameters:

model (Optional[InputModel])
events (Optional[List[Event]])
auto_generate_events (bool)
track_trajectory (bool)
traj_stream_path (Optional[PathLike])
sim_backend (Optional[Any])

classmethod from_json(path, *, auto_generate_events=True)

Build an environment from a JSON configuration file.

This mirrors the DynaSchedBench generation workflow, but keeps everything in memory instead of writing artifacts to disk.

Return type:

DynaSchedEnv

Parameters:

path (str | Path)
auto_generate_events (bool)

classmethod from_jms_jsonl(path, *, track_trajectory=True, traj_stream_path=None)

Build an environment backed by JMSSim from a JMS/GEN-Bench JSONL file.

This constructor reads a JMSBench/GEN-Bench-style JSONL instance with static_info and dynamic_events, builds JMSSim, and exposes DynaSchedSim-compatible snapshots and legal actions through JMSSimBackend. This path does not depend on InputModel.

Return type:

DynaSchedEnv

Parameters:

path (str | Path)
track_trajectory (bool)
traj_stream_path (str | Path | None)

reset()

Reset the environment and return the initial observation.

If no event list has been provided, this will generate one using the same generator that backs the CLI (FastPathConstructor), but without running calibration or exporting artifacts.

Return type: Dict[str, Any]

step(action)

Apply a scheduling action and return (obs, reward, done, info).

Return type: Tuple[Dict[str, Any], float, bool, Dict[str, Any]]
Parameters: action (Dict[str, Any])

legal_actions()

Return all currently legal actions.

Each action has the form:

{"job_id": str, "machine_group": str, "machine_candidates": List[str]}

When executing an action via step(), callers may optionally add a specific machine choice:

{"machine_id": str}

Return type: List[Dict[str, Any]]
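As an illustration of the two shapes above, the following sketch turns a legal action into an executable one by attaching a machine_id. The helper name choose_machine and the sample values are hypothetical and not part of dsbx; picking the first candidate is only a placeholder policy.

```python
from typing import Any, Dict, List


def choose_machine(action: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a legal action with a concrete machine_id attached.

    Hypothetical helper: it picks the first candidate, whereas a real
    policy would rank the candidates.
    """
    candidates: List[str] = action["machine_candidates"]
    if not candidates:
        raise ValueError("action has no machine candidates")
    # Copy so the dict returned by legal_actions() is left untouched.
    return {**action, "machine_id": candidates[0]}


# Hand-written action in the documented shape (the values are made up).
legal = {
    "job_id": "J1",
    "machine_group": "G_mill",
    "machine_candidates": ["M1", "M2"],
}
concrete = choose_machine(legal)
print(concrete["machine_id"])  # M1
```

Copying the action instead of mutating it keeps the list returned by legal_actions() reusable for scoring or logging.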

advance_if_idle()

Advance time to the next decision point when there is no ready op.

This mirrors LLMEnv.advance_time_if_idle behavior and is useful for simulators that want to automatically skip idle periods.

Return type: Dict[str, Any]
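A rough sketch of how reset(), legal_actions(), step(), advance_if_idle(), and done() might be combined into an episode loop. ToyEnv is a hypothetical stand-in written only so the example runs without dsbx; it shares the documented method names but none of the real scheduling logic. How the real environment interleaves idle periods and decision points is an assumption here.

```python
from typing import Any, Dict, List, Tuple


class ToyEnv:
    """Hypothetical stand-in with the documented method names.

    NOT the dsbx implementation: each step() finishes one operation
    of the chosen job and advances time by one unit.
    """

    def __init__(self, ops_per_job: Dict[str, int]) -> None:
        self._initial = dict(ops_per_job)
        self._left: Dict[str, int] = {}
        self._time = 0.0

    def _obs(self) -> Dict[str, Any]:
        ready = [job for job, n in self._left.items() if n > 0]
        return {"time": self._time, "ready_ops": ready, "machines": ["M1"]}

    def reset(self) -> Dict[str, Any]:
        self._left = dict(self._initial)
        self._time = 0.0
        return self._obs()

    def legal_actions(self) -> List[Dict[str, Any]]:
        return [
            {"job_id": job, "machine_group": "G1", "machine_candidates": ["M1"]}
            for job, n in self._left.items()
            if n > 0
        ]

    def step(
        self, action: Dict[str, Any]
    ) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
        self._left[action["job_id"]] -= 1
        self._time += 1.0
        return self._obs(), 0.0, self.done(), {}

    def advance_if_idle(self) -> Dict[str, Any]:
        self._time += 1.0
        return self._obs()

    def done(self) -> bool:
        return all(n == 0 for n in self._left.values())


def run_episode(env: Any, max_iters: int = 1000) -> int:
    """Drive env with a first-legal-action policy; return step() calls made."""
    env.reset()
    steps = 0
    for _ in range(max_iters):  # hard cap guards against a stuck environment
        if env.done():
            break
        actions = env.legal_actions()
        if not actions:
            # Nothing to schedule right now: skip to the next decision point.
            env.advance_if_idle()
            continue
        env.step(actions[0])
        steps += 1
    return steps


steps = run_episode(ToyEnv({"J1": 2, "J2": 1}))
print(steps)  # 3
```

The iteration cap is a defensive choice for the sketch; with a real environment, total_operations() could serve as a more informed budget.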

get_snapshot()

Return type: Snapshot

get_light_state()

Return type: Dict[str, Any]

inject_external_event(*, kind, ev_id, time=None)

Return type:

None

Parameters:

kind (str)
ev_id (Any)
time (float | None)

inject_external_events(events)

Return type: None
Parameters: events (List[Dict[str, Any]])

get_trajectory()

Return type: Trajectory

property routing: Dict[Any, Any]

total_operations()

Return the total number of operations across all jobs.

This helper preserves the LLMEnv.total_operations semantics so upper-level algorithms can estimate an upper bound on scheduling steps without inspecting snapshot internals.

Return type: int

done()

Check whether the environment has finished all jobs.

The behavior matches LLMEnv.done and the internal _check_done: all jobs must be completed or cancelled, and no future ArrivalEvent may remain.

Return type: bool

get_next_process_time(job_id)

Return the processing time of the next operation for a given job.

This method preserves LLMEnv.get_next_process_time semantics. It is mainly used by heuristics such as SPT to rank jobs without modifying the underlying state.

Return type: Optional[float]
Parameters: job_id (str)
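A minimal sketch of an SPT (shortest processing time) selection built on get_next_process_time. StubEnv and spt_pick are hypothetical names used only for illustration. The source does not say when None is returned, so ranking such jobs last is an assumption.

```python
from typing import Any, Dict, List, Optional


class StubEnv:
    """Hypothetical stand-in exposing only get_next_process_time."""

    def __init__(self, next_times: Dict[str, Optional[float]]) -> None:
        self._next_times = next_times

    def get_next_process_time(self, job_id: str) -> Optional[float]:
        return self._next_times.get(job_id)


def spt_pick(env: Any, actions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Pick the action whose job has the shortest next processing time."""

    def key(action: Dict[str, Any]) -> float:
        t = env.get_next_process_time(action["job_id"])
        # Assumption: a missing time (None) ranks last.
        return float("inf") if t is None else t

    return min(actions, key=key)


env = StubEnv({"J1": 5.0, "J2": 2.0, "J3": None})
actions = [
    {"job_id": job, "machine_group": "G1", "machine_candidates": ["M1"]}
    for job in ("J1", "J2", "J3")
]
chosen = spt_pick(env, actions)
print(chosen["job_id"])  # J2
```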

get_remaining_work(job_id)

Return the remaining work content for a given job.

This aligns with LLMEnv.get_remaining_work by returning the total processing time of unfinished operations. When available, the method prefers remaining_work_content from the snapshot.

Return type: float
Parameters: job_id (str)

get_machine_queue_length(target)

Return type: int
Parameters: target (str)

get_action_timing(action)

Return type: Dict[str, Any]
Parameters: action (Dict[str, Any])

estimate_action_score(action)

Delegate to the underlying simulator’s estimate_action_score.

This provides an LLMEnv.estimate_action_score-compatible interface so LLMPolicy can run on the new environment.

Return type: float
Parameters: action (Dict[str, Any])

quick_rollout_score(action, steps=1)

Delegate to the underlying simulator’s quick_rollout_score.

This matches LLMEnv.quick_rollout_score semantics: perform a short rollout on a local copy and return the negative estimated completion time.

Return type:

float

Parameters:

action (Dict[str, Any])
steps (int)
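Because the score is described as the negative estimated completion time, a higher score means an earlier estimated finish, so a greedy policy can take the maximum. The sketch below assumes that reading; StubEnv, its fixed estimates, and best_by_rollout are hypothetical and do not perform any real rollout.

```python
from typing import Any, Dict, List


class StubEnv:
    """Hypothetical stand-in: returns fixed scores instead of rolling out."""

    def __init__(self, est_completion: Dict[str, float]) -> None:
        self._est = est_completion

    def quick_rollout_score(self, action: Dict[str, Any], steps: int = 1) -> float:
        # Mirrors the documented convention: negative estimated completion time.
        return -self._est[action["job_id"]]


def best_by_rollout(
    env: Any, actions: List[Dict[str, Any]], steps: int = 1
) -> Dict[str, Any]:
    """Return the action with the highest rollout score."""
    return max(actions, key=lambda a: env.quick_rollout_score(a, steps=steps))


env = StubEnv({"J1": 40.0, "J2": 25.0})
actions = [
    {"job_id": "J1", "machine_group": "G1", "machine_candidates": ["M1"]},
    {"job_id": "J2", "machine_group": "G1", "machine_candidates": ["M1"]},
]
best = best_by_rollout(env, actions, steps=2)
print(best["job_id"])  # J2
```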

property static_bottlenecks: Dict[str, float]

close()

Close any resources held by the environment (e.g., trajectory streams).

Return type: None

class dsbx.Env.JMSRawEnv(sim)

Bases: object

Environment wrapper around JMSSim.

This environment is dedicated to raw JMS/GEN-Bench JSONL instances under data/jms and data/genbench. It does not use dsbx.Gen.InputModel or runs/* directories and is therefore semantically separated from DynaSchedEnv.

Parameters: sim (JMSSim)

classmethod from_jsonl(path)

Build an environment directly from a JMS/GEN-Bench JSONL file.

Return type: JMSRawEnv
Parameters: path (str | Path)

reset()

Reset simulator and return initial observation.

Return type: Dict[str, Any]

step(action)

Apply a scheduling action and return (obs, reward, done, info).

The action format is:

{"job_id": int | str, "machine_id": int}

Reward is zero by default; downstream code can compute metrics from the final trajectory if needed.

Return type: tuple[Dict[str, Any], float, bool, Dict[str, Any]]
Parameters: action (Dict[str, Any])

legal_actions()

Return all currently legal (job, machine) assignments.

Return type: List[Dict[str, Any]]

advance_if_idle()

Advance time to the next decision point when no ops are ready.

Return type: Dict[str, Any]

property time: float

done()

Return type: bool

get_light_state()

Return type: Dict[str, Any]

get_gantt()

Return a simple Gantt-like schedule from the underlying simulator.

If the wrapped JMSSim does not implement get_gantt, an empty list is returned.

Return type: List[Dict[str, Any]]