dataproc/v1 library
Cloud Dataproc API - v1
Manages Hadoop-based clusters and jobs on Google Cloud Platform.
For more information, see https://cloud.google.com/dataproc/.
Create an instance of DataprocApi to access the resource classes listed under Classes below, starting from ProjectsResource.
Classes
- AcceleratorConfig
- Specifies the type and number of accelerator cards attached to the instances of an instance group.
- AnalyzeBatchRequest
- A request to analyze a batch workload.
- AutoscalingConfig
- Autoscaling Policy config associated with the cluster.
- AutoscalingPolicy
- Describes an autoscaling policy for the Dataproc cluster autoscaler.
- AutotuningConfig
- Autotuning configuration of the workload.
- AuxiliaryNodeGroup
- Node group identification and configuration information.
- AuxiliaryServicesConfig
- Auxiliary services configuration for a Cluster.
- BasicAutoscalingAlgorithm
- Basic algorithm for autoscaling.
- BasicYarnAutoscalingConfig
- Basic autoscaling configurations for YARN.
- Batch
- A representation of a batch workload in the service.
- Binding
- Associates members, or principals, with a role.
- Cluster
- Describes the identifying information, config, and status of a Dataproc cluster.
- ClusterConfig
- The cluster config.
- ClusterMetrics
- Contains cluster daemon metrics, such as HDFS and YARN stats. Beta Feature: This report is available for testing purposes only.
- ClusterSelector
- A selector that chooses the target cluster for jobs based on metadata.
- ClusterStatus
- The status of a cluster and its instances.
- ConfidentialInstanceConfig
- Confidential Instance Config for clusters using Confidential VMs (https://cloud.google.com/compute/confidential-vm/docs).
- DataprocApi
- Manages Hadoop-based clusters and jobs on Google Cloud Platform.
- DataprocMetricConfig
- Dataproc metric config.
- DiagnoseClusterRequest
- A request to collect cluster diagnostic information.
- DiskConfig
- Specifies the config of disk options for a group of VM instances.
- DriverSchedulingConfig
- Driver scheduling configuration.
- EncryptionConfig
- Encryption settings for the cluster.
- EndpointConfig
- Endpoint config for this cluster.
- EnvironmentConfig
- Environment configuration for a workload.
- ExecutionConfig
- Execution configuration for a workload.
- FlinkJob
- A Dataproc job for running Apache Flink applications on YARN.
- GceClusterConfig
- Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster.
- GetIamPolicyRequest
- Request message for GetIamPolicy method.
- GkeClusterConfig
- The cluster's GKE config.
- GkeNodeConfig
- Parameters that describe cluster nodes.
- GkeNodePoolAcceleratorConfig
- A GkeNodePoolAcceleratorConfig represents a hardware accelerator request for a node pool.
- GkeNodePoolAutoscalingConfig
- GkeNodePoolAutoscalingConfig contains information the cluster autoscaler needs to adjust the size of the node pool to the current cluster usage.
- GkeNodePoolConfig
- The configuration of a GKE node pool used by a Dataproc-on-GKE cluster (https://cloud.google.com/dataproc/docs/concepts/jobs/dataproc-gke#create-a-dataproc-on-gke-cluster).
- GkeNodePoolTarget
- GKE node pools that Dataproc workloads run on.
- GoogleCloudDataprocV1WorkflowTemplateEncryptionConfig
- Encryption settings for encrypting workflow template job arguments.
- HadoopJob
- A Dataproc job for running Apache Hadoop MapReduce (https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) jobs on Apache Hadoop YARN (https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html).
- HiveJob
- A Dataproc job for running Apache Hive (https://hive.apache.org/) queries on YARN.
- IdentityConfig
- Identity-related configuration, including service-account-based secure multi-tenancy user mappings.
- InjectCredentialsRequest
- A request to inject credentials into a cluster.
- InstanceFlexibilityPolicy
- Instance flexibility policy that allows a mixture of VM shapes and provisioning models.
- InstanceGroupAutoscalingPolicyConfig
- Configuration for the size bounds of an instance group, including its proportional size to other groups.
- InstanceGroupConfig
- The config settings for Compute Engine resources in an instance group, such as a master or worker group.
- InstanceReference
- A reference to a Compute Engine instance.
- InstanceSelection
- Defines machine types and the rank to which those machine types belong.
- InstanceSelectionResult
- Defines a mapping from machine types to the number of VMs that are created with each machine type.
- InstantiateWorkflowTemplateRequest
- A request to instantiate a workflow template.
- Interval
- Represents a time interval, encoded as a Timestamp start (inclusive) and a Timestamp end (exclusive). The start must be less than or equal to the end.
- Job
- A Dataproc job resource.
- JobPlacement
- Dataproc job config.
- JobReference
- Encapsulates the full scoping used to reference a job.
- JobScheduling
- Job scheduling options.
- JobStatus
- Dataproc job status.
- JupyterConfig
- Jupyter configuration for an interactive session.
- KerberosConfig
- Specifies Kerberos-related configuration.
- KubernetesClusterConfig
- The configuration for running the Dataproc cluster on Kubernetes.
- KubernetesSoftwareConfig
- The software configuration for this Dataproc cluster running on Kubernetes.
- LifecycleConfig
- Specifies the cluster auto-delete schedule configuration.
- ListAutoscalingPoliciesResponse
- A response to a request to list autoscaling policies in a project.
- ListBatchesResponse
- A list of batch workloads.
- ListClustersResponse
- The list of all clusters in a project.
- ListJobsResponse
- A list of jobs in a project.
- ListOperationsResponse
- The response message for Operations.ListOperations.
- ListSessionsResponse
- A list of interactive sessions.
- ListSessionTemplatesResponse
- A list of session templates.
- ListWorkflowTemplatesResponse
- A response to a request to list workflow templates in a project.
- LoggingConfig
- The runtime logging config of the job.
- ManagedCluster
- Cluster that is managed by the workflow.
- ManagedGroupConfig
- Specifies the resources used to actively manage an instance group.
- MetastoreConfig
- Specifies a Metastore configuration.
- Metric
- A Dataproc custom metric.
- NamespacedGkeDeploymentTarget
- Used only for the deprecated beta.
- NodeGroup
- Dataproc Node Group.
- NodeGroupAffinity
- Node Group Affinity for clusters using sole-tenant node groups.
- NodeInitializationAction
- Specifies an executable to run on a fully configured node and a timeout period for executable completion.
- NodePool
- Indicates a list of workers of the same type.
- Operation
- This resource represents a long-running operation that is the result of a network API call.
- OrderedJob
- A job executed by the workflow.
- ParameterValidation
- Configuration for parameter validation.
- PeripheralsConfig
- Auxiliary services configuration for a workload.
- PigJob
- A Dataproc job for running Apache Pig (https://pig.apache.org/) queries on YARN.
- Policy
- An Identity and Access Management (IAM) policy, which specifies access controls for Google Cloud resources. A Policy is a collection of bindings.
- PrestoJob
- A Dataproc job for running Presto (https://prestosql.io/) queries.
- ProjectsLocationsAutoscalingPoliciesResource
- ProjectsLocationsBatchesResource
- ProjectsLocationsOperationsResource
- ProjectsLocationsResource
- ProjectsLocationsSessionsResource
- ProjectsLocationsSessionTemplatesResource
- ProjectsLocationsWorkflowTemplatesResource
- ProjectsRegionsAutoscalingPoliciesResource
- ProjectsRegionsClustersNodeGroupsResource
- ProjectsRegionsClustersResource
- ProjectsRegionsJobsResource
- ProjectsRegionsOperationsResource
- ProjectsRegionsResource
- ProjectsRegionsWorkflowTemplatesResource
- ProjectsResource
- PyPiRepositoryConfig
- Configuration for a PyPI repository.
- PySparkBatch
- A configuration for running an Apache PySpark (https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html) batch workload.
- PySparkJob
- A Dataproc job for running Apache PySpark (https://spark.apache.org/docs/0.9.0/python-programming-guide.html) applications on YARN.
- QueryList
- A list of queries to run on a cluster.
- RegexValidation
- Validation based on regular expressions.
- RepairClusterRequest
- A request to repair a cluster.
- RepairNodeGroupRequest
- RepositoryConfig
- Configuration for dependency repositories.
- ReservationAffinity
- Reservation affinity for consuming a zonal reservation.
- ResizeNodeGroupRequest
- A request to resize a node group.
- RuntimeConfig
- Runtime configuration for a workload.
- RuntimeInfo
- Runtime information about workload execution.
- SecurityConfig
- Security-related configuration, including encryption, Kerberos, etc.
- Session
- A representation of a session.
- SessionStateHistory
- Historical state information.
- SessionTemplate
- A representation of a session template.
- SetIamPolicyRequest
- Request message for SetIamPolicy method.
- ShieldedInstanceConfig
- Shielded Instance Config for clusters using Compute Engine Shielded VMs (https://cloud.google.com/security/shielded-cloud/shielded-vm).
- SoftwareConfig
- Specifies the selection and config of software inside the cluster.
- SparkBatch
- A configuration for running an Apache Spark (https://spark.apache.org/) batch workload.
- SparkHistoryServerConfig
- Spark History Server configuration for the workload.
- SparkJob
- A Dataproc job for running Apache Spark (https://spark.apache.org/) applications on YARN.
- SparkRBatch
- A configuration for running an Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) batch workload.
- SparkRJob
- A Dataproc job for running Apache SparkR (https://spark.apache.org/docs/latest/sparkr.html) applications on YARN.
- SparkSqlBatch
- A configuration for running Apache Spark SQL (https://spark.apache.org/sql/) queries as a batch workload.
- SparkSqlJob
- A Dataproc job for running Apache Spark SQL (https://spark.apache.org/sql/) queries.
- SparkStandaloneAutoscalingConfig
- Basic autoscaling configurations for Spark Standalone.
- StartClusterRequest
- A request to start a cluster.
- StartupConfig
- Configuration to handle the startup of instances during the cluster create and update process.
- StateHistory
- Historical state information.
- StopClusterRequest
- A request to stop a cluster.
- SubmitJobRequest
- A request to submit a job.
- TemplateParameter
- A configurable parameter that replaces one or more fields in the template.
- TerminateSessionRequest
- A request to terminate an interactive session.
- TrinoJob
- A Dataproc job for running Trino (https://trino.io/) queries.
- UsageMetrics
- Usage metrics represent approximate total resources consumed by a workload.
- UsageSnapshot
- The usage snapshot represents the resources consumed by a workload at a specified time.
- ValueValidation
- Validation based on a list of allowed values.
- VirtualClusterConfig
- The Dataproc cluster config for a cluster that does not directly control the underlying compute resources, such as a Dataproc-on-GKE cluster (https://cloud.google.com/dataproc/docs/guides/dpgke/dataproc-gke-overview).
- WorkflowTemplate
- A Dataproc workflow template resource.
- WorkflowTemplatePlacement
- Specifies the workflow execution target. Either managed_cluster or cluster_selector is required.
- YarnApplication
- A YARN application created by a job.
Typedefs
- CancelJobRequest = $Empty
- A request to cancel a job.
- Empty = $Empty
- A generic empty message that you can re-use to avoid defining duplicated empty messages in your APIs.
- Expr = $Expr
- Represents a textual expression in the Common Expression Language (CEL) syntax.
- GetPolicyOptions = $GetPolicyOptions01
- Encapsulates settings provided to GetIamPolicy.
- Status = $Status
- The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs.
- TestIamPermissionsRequest = $TestIamPermissionsRequest01
- Request message for TestIamPermissions method.
- TestIamPermissionsResponse = $TestIamPermissionsResponse
- Response message for TestIamPermissions method.
Exceptions / Errors
- ApiRequestError
- Represents a general error reported by the API endpoint.
- DetailedApiRequestError
- Represents a specific error reported by the API endpoint.
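Example
The classes above are usually reached through a DataprocApi instance. The following is a minimal sketch of that flow, not a definitive implementation: it assumes the googleapis and googleapis_auth packages are listed as dependencies, that Application Default Credentials are configured on the machine, and that the project ID and region shown are hypothetical placeholders. It lists the clusters in one region and reports any DetailedApiRequestError returned by the endpoint.

```dart
import 'package:googleapis/dataproc/v1.dart';
import 'package:googleapis_auth/auth_io.dart';

Future<void> main() async {
  // Hypothetical placeholders; substitute your own project and region.
  const projectId = 'my-project';
  const region = 'us-central1';

  // Assumes Application Default Credentials are available locally.
  final client = await clientViaApplicationDefaultCredentials(
    scopes: [DataprocApi.cloudPlatformScope],
  );

  try {
    final api = DataprocApi(client);

    // projects -> regions -> clusters maps to ProjectsRegionsClustersResource.
    final ListClustersResponse response =
        await api.projects.regions.clusters.list(projectId, region);

    // The clusters field is nullable when the region has no clusters.
    for (final Cluster cluster in response.clusters ?? const <Cluster>[]) {
      print('${cluster.clusterName}: ${cluster.status?.state}');
    }
  } on DetailedApiRequestError catch (e) {
    // Raised when the endpoint returns a structured error response.
    print('Request failed (${e.status}): ${e.message}');
  } finally {
    // Release the underlying HTTP connections.
    client.close();
  }
}
```

The same pattern applies to the other resource classes: each level of the resource path (projects, regions or locations, then the collection) is a getter on the previous one, and request and response bodies use the message classes listed under Classes.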