startClusterHealthCheck method

Future<StartClusterHealthCheckResponse> startClusterHealthCheck({
  required String clusterName,
  required List<InstanceGroupHealthCheckConfiguration> deepHealthCheckConfigurations,
})

Starts deep health checks for a SageMaker HyperPod cluster. Use the DescribeClusterNode API to track the progress of the deep health checks. Unhealthy nodes are automatically rebooted or replaced. For details, see Resilience-related Kubernetes labels by SageMaker HyperPod.

May throw ResourceNotFound.
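
Because this call only starts the checks, a caller usually has to poll for progress afterwards. The following is a minimal sketch of such a polling loop, not part of this library: `pollUntil`, its parameters, and the placeholder status strings are hypothetical names introduced for illustration. In real code the `fetch` callback would wrap a DescribeClusterNode call, and `isDone` would inspect whatever status field that response exposes.

```dart
import 'dart:async';

/// Hypothetical helper: calls [fetch] repeatedly until [isDone] accepts the
/// result, waiting [interval] between attempts. Throws a [TimeoutException]
/// once [maxAttempts] calls have been made without success.
Future<T> pollUntil<T>({
  required Future<T> Function() fetch,
  required bool Function(T value) isDone,
  Duration interval = const Duration(seconds: 30),
  int maxAttempts = 20,
}) async {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    final value = await fetch();
    if (isDone(value)) return value;
    // Skip the wait after the final attempt; we are about to give up.
    if (attempt < maxAttempts) await Future<void>.delayed(interval);
  }
  throw TimeoutException('Condition not met after $maxAttempts attempts');
}

Future<void> main() async {
  // Placeholder status values standing in for a real DescribeClusterNode
  // response; they are assumptions, not values defined by the service.
  final fakeStatuses = ['InProgress', 'InProgress', 'Done'];
  var call = 0;

  final result = await pollUntil<String>(
    fetch: () async => fakeStatuses[call++],
    isDone: (status) => status == 'Done',
    interval: const Duration(milliseconds: 10),
  );
  print(result);
}
```

Keeping the completion test as a caller-supplied predicate avoids hard-coding status names that the service may define differently.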

Parameter clusterName : The string name or the Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.

Parameter deepHealthCheckConfigurations : A list of configurations containing instance group names, EC2 instance IDs, and deep health checks to perform.
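
To make the two parameters concrete, here is a hedged sketch of the JSON body that a request of this kind would carry. The top-level keys `ClusterName` and `DeepHealthCheckConfigurations` match the implementation on this page. The nested keys (`InstanceGroupName`, `InstanceIds`, `DeepHealthChecks`) and the check names are assumptions based on the parameter description above, and `buildStartClusterHealthCheckPayload` is a hypothetical helper that uses plain maps instead of the library's `InstanceGroupHealthCheckConfiguration` class. All identifiers in `main` are placeholder values.

```dart
import 'dart:convert';

/// Hypothetical helper: assembles the request body using plain maps.
/// Only the two top-level keys are confirmed by the implementation.
Map<String, dynamic> buildStartClusterHealthCheckPayload({
  required String clusterName,
  required List<Map<String, dynamic>> deepHealthCheckConfigurations,
}) {
  return {
    'ClusterName': clusterName,
    'DeepHealthCheckConfigurations': deepHealthCheckConfigurations,
  };
}

void main() {
  final payload = buildStartClusterHealthCheckPayload(
    clusterName: 'my-hyperpod-cluster', // name or ARN of the cluster
    deepHealthCheckConfigurations: [
      {
        // Assumed field names; check the service API reference.
        'InstanceGroupName': 'worker-group-1',
        'InstanceIds': ['i-0123456789abcdef0'],
        'DeepHealthChecks': ['InstanceStress', 'InstanceConnectivity'],
      },
    ],
  );

  // Pretty-print the body that would be sent with the
  // 'X-Amz-Target: SageMaker.StartClusterHealthCheck' header.
  print(const JsonEncoder.withIndent('  ').convert(payload));
}
```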

Implementation

Future<StartClusterHealthCheckResponse> startClusterHealthCheck({
  required String clusterName,
  required List<InstanceGroupHealthCheckConfiguration>
      deepHealthCheckConfigurations,
}) async {
  final headers = <String, String>{
    'Content-Type': 'application/x-amz-json-1.1',
    'X-Amz-Target': 'SageMaker.StartClusterHealthCheck'
  };
  final jsonResponse = await _protocol.send(
    method: 'POST',
    requestUri: '/',
    exceptionFnMap: _exceptionFns,
    // TODO queryParams
    headers: headers,
    payload: {
      'ClusterName': clusterName,
      'DeepHealthCheckConfigurations': deepHealthCheckConfigurations,
    },
  );

  return StartClusterHealthCheckResponse.fromJson(jsonResponse.body);
}