GoogleCloudAiplatformV1MachineSpec
import type { GoogleCloudAiplatformV1MachineSpec } from "https://googleapis.deno.dev/v1/aiplatform:v1.ts";
Specification of a single machine.
§Properties
The number of accelerators to attach to the machine. For accelerator-optimized machine types (https://cloud.google.com/compute/docs/accelerator-optimized-machines), one may set accelerator_count from 1 to N for a machine with N GPUs. If accelerator_count is less than or equal to N / 2, Vertex will co-schedule the replicas of the model into the same VM to save cost. For example, if the machine type is a3-highgpu-8g, which has 8 H100 GPUs, one can set accelerator_count to any value from 1 to 8. If accelerator_count is 1, 2, 3, or 4, Vertex will co-schedule 8, 4, 2, or 2 replicas of the model, respectively, into the same VM. When co-scheduling, the CPU, memory, and storage on the VM are distributed among the replicas on that VM. For example, a co-scheduled replica requesting 2 GPUs out of an 8-GPU VM can expect to receive 25% of the CPU, memory, and storage of the VM. Note that this feature is not compatible with multihost_gpu_node_count: when multihost_gpu_node_count is set, co-scheduling is not enabled.
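The replica counts in the a3-highgpu-8g example are consistent with dividing the GPUs on the VM by accelerator_count and rounding down. The sketch below encodes that reading. The helper name `replicasPerVm` is hypothetical, and the floor-division rule is inferred from the single example above rather than stated by the API, so treat it as an assumption.

```typescript
// Hypothetical helper, not part of the API: estimates how many model replicas
// would share one VM, assuming replicas = floor(gpusPerVm / acceleratorCount).
// This rule is inferred from the documented example (8 GPUs; 1, 2, 3, 4 -> 8, 4, 2, 2).
function replicasPerVm(gpusPerVm: number, acceleratorCount: number): number {
  if (!Number.isInteger(gpusPerVm) || !Number.isInteger(acceleratorCount)) {
    throw new RangeError("GPU counts must be integers");
  }
  if (acceleratorCount < 1 || acceleratorCount > gpusPerVm) {
    throw new RangeError("acceleratorCount must be between 1 and gpusPerVm");
  }
  return Math.floor(gpusPerVm / acceleratorCount);
}

// Reproduce the documented a3-highgpu-8g cases, plus one count above N / 2.
for (const count of [1, 2, 3, 4, 5]) {
  console.log(`accelerator_count=${count}: ${replicasPerVm(8, count)} replica(s) per VM`);
}
```

Under this assumption a result of 1 means no co-scheduling, which agrees with the stated condition that co-scheduling applies only when accelerator_count is at most N / 2.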
Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.
Optional. Immutable. The Nvidia GPU partition size. When specified, the requested accelerators will be partitioned into smaller GPU partitions. For example, if the request is for 8 units of NVIDIA A100 GPUs, and gpu_partition_size="1g.10gb", the service will create 8 * 7 = 56 partitioned MIG instances. The partition size must be a value supported by the requested accelerator. Refer to Nvidia GPU Partitioning for the available partition sizes. If set, the accelerator_count should be set to 1.
Immutable. The type of the machine. See the list of machine types supported for prediction. See the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
Optional. Immutable. Configuration controlling how this resource pool consumes reservations.
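As a rough illustration of how these properties fit together, the sketch below builds a machine spec and applies the machine type rule described above. It uses a local stand-in interface so that it runs without network access. The camelCase property names, the `MachineSpecSketch` interface, the `effectiveMachineType` helper, and the `NVIDIA_H100_80GB` accelerator value are assumptions made for this example, and the defaulting shown here only mirrors the documented behavior; the service itself applies the default.

```typescript
// Local stand-in for GoogleCloudAiplatformV1MachineSpec. Property names are
// assumed to be the camelCase forms of the fields described above.
interface MachineSpecSketch {
  machineType?: string;
  acceleratorType?: string;
  acceleratorCount?: number;
  gpuPartitionSize?: string;
}

type SpecContext = "DeployedModel" | "BatchPredictionJob" | "WorkerPoolSpec";

// Hypothetical helper mirroring the documented rule: machine type is optional
// for DeployedModel (default n1-standard-2) and required in the other contexts.
function effectiveMachineType(spec: MachineSpecSketch, context: SpecContext): string {
  if (spec.machineType) {
    return spec.machineType;
  }
  if (context === "DeployedModel") {
    return "n1-standard-2";
  }
  throw new Error(`machineType is required for ${context}`);
}

// Illustrative spec: 2 of the 8 GPUs on an a3-highgpu-8g machine.
const spec: MachineSpecSketch = {
  machineType: "a3-highgpu-8g",
  acceleratorType: "NVIDIA_H100_80GB", // assumed enum value, for illustration only
  acceleratorCount: 2,
};

console.log(effectiveMachineType(spec, "WorkerPoolSpec"));
console.log(effectiveMachineType({}, "DeployedModel"));
```

Validating the required case on the client is a design choice of this sketch: it surfaces a missing machine type before a request is sent rather than relying on an error from the service.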