volcano
volcano
¶
Kubernetes Volcano dispatch utilities.
Provides kubectl wrappers for submitting Volcano Jobs, shipping code to PVCs, and rendering the volcano_job.yaml template.
get_queue_availability(queue_name: str, gpu_resource_key: str = 'nvidia.com/gpu', kubeconfig: str | None = None, context: str | None = None, timeout: float = 30.0) -> tuple[int, int] | None
¶
Return (available, capacity) GPU count for a Volcano queue.
Fetches spec.capability and status.allocated from the Queue CRD.
Returns None on error (queue not found, kubectl failure, etc.).
apply_job(yaml_content: str, namespace: str = 'default', kubeconfig: str | None = None, context: str | None = None, timeout: float = 60.0) -> RunResult
¶
Submit a Volcano Job via kubectl apply -f -.
delete_job(job_name: str, namespace: str = 'default', kubeconfig: str | None = None, context: str | None = None, timeout: float = 60.0) -> RunResult
¶
Delete a Volcano Job.
get_job_status(job_name: str, namespace: str = 'default', kubeconfig: str | None = None, context: str | None = None, timeout: float = 30.0) -> dict[str, Any] | None
¶
Get Volcano Job status as a dict, or None if not found.
get_pod_logs(job_name: str, namespace: str = 'default', kubeconfig: str | None = None, context: str | None = None, timeout: float = 30.0, tail_lines: int = 100) -> RunResult
¶
Get logs from pods belonging to a Volcano Job.
wait_completed(job_name: str, namespace: str = 'default', kubeconfig: str | None = None, context: str | None = None, poll_interval: float = 10.0, timeout: float | None = None) -> str | None
¶
Poll until a Volcano Job reaches a terminal state.
Returns the final state string ("Completed", "Failed", "Aborted", etc.) or None on timeout.
render_volcano_job(job_name: str, host_config: VolcanoHostConfig, bootstrap_command: str, n_chips: int | None = None, cluster_env: dict[str, str] | None = None) -> str
¶
Render the volcano_job.yaml template with concrete values.
Returns the rendered YAML string ready for kubectl apply.
ship_and_write_to_pvc(tarball: bytes, script: str, bootstrap_pys: dict[str, str], pvc_name: str, dest_path: str, queue: str, namespace: str = 'default', pvc_mount_path: str = '/workspace', kubeconfig: str | None = None, context: str | None = None, timeout: float = 120.0, helper_resources: dict[str, str] | None = None) -> RunResult
¶
Ship code + bootstrap files to a PVC via a temporary Volcano Job.
Uses a single Volcano Job (vcjob) as the helper instead of a plain Pod,
so it goes through the Volcano scheduler like real workloads. Code is
streamed via tar | kubectl exec -i … tar -xpf - (no intermediate
files). Bootstrap scripts are written via kubectl exec -i … cat.
The helper vcjob is cleaned up on exit.