maxParsingRequestsPerMin property

int? maxParsingRequestsPerMin
getter/setter pair

The maximum number of requests the job is allowed to make to the LLM model per minute.

Consult https://cloud.google.com/vertex-ai/generative-ai/docs/quotas and take your document size into account to set an appropriate value here. If unspecified, a default value of 5000 QPM is used.
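
A minimal usage sketch in Dart. The enclosing message class and import path depend on which API surface exposes this property; LlmParsingConfig below is an illustrative stand-in, not the library's actual class name.

// Illustrative stand-in for the config message that exposes this property.
class LlmParsingConfig {
  // Mirrors the documented getter/setter pair.
  int? maxParsingRequestsPerMin;
}

void main() {
  final config = LlmParsingConfig()
    // Cap LLM calls at 600 requests per minute for this job; leaving the
    // field null falls back to the documented default of 5000 QPM.
    ..maxParsingRequestsPerMin = 600;
  print(config.maxParsingRequestsPerMin);
}

In the real library the field is assigned the same way on the generated config object before the request is sent; the quota you choose should stay within the limits listed at the Vertex AI quotas page above.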

Implementation

core.int? maxParsingRequestsPerMin;