PrePackWeight property
\brief Optional function to pre-pack a constant tensor (i.e., a weight) to the kernel's preferred data layout.
For example, a Conv kernel can define this function to pack input W to the channel-last data layout before inference.
Pre-packing can operate in three different modes: no pre-packing mode, sharing mode, and non-sharing mode.
- No pre-packing mode: The kernel can forgo any weight pre-packing for the given input_index by setting is_packed to false and returning a successful OrtStatus. In this mode, the kernel's OrtKernelImpl::SetSharedPrePackedWeight() function is not called for that specific input_index.
- Sharing mode: Sharing is allowed if the prepacked_weight_cache argument is not NULL and the EP stores weight data in CPU-accessible memory. In this case, the kernel can optionally choose to share the packed weight with other kernels that use the same weight (compared by content hash). To do so, the kernel must allocate the packed weight with the provided allocator, store the packed weight data into prepacked_weight_cache via SharedPrePackedWeightCache_StoreWeightData(), set is_packed to true, and return a successful OrtStatus. ORT will subsequently call OrtKernelImpl::SetSharedPrePackedWeight() to provide this kernel with the actual shared weight data, whose memory location could differ (e.g., if the shared data was allocated by a previously processed kernel).
- Non-sharing mode: In non-sharing mode, the prepacked_weight_cache argument is ignored. The implementation allocates the packed data with the provided allocator, sets is_packed to true, and returns a successful OrtStatus. The kernel is ultimately responsible for releasing the packed data for the weight with allocator. ORT may release the original (unpacked) weight, which must not be accessed in OrtKernelImpl::Compute(). Note that in this mode, the kernel's OrtKernelImpl::SetSharedPrePackedWeight() function is not called by ORT for that specific input_index.
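The control flow of the three modes can be sketched in plain C. This is a minimal model, not the ONNX Runtime API: MockKernel, MockCache, and mock_pre_pack_weight are hypothetical stand-ins invented for illustration, malloc stands in for the provided allocator, a plain pointer assignment stands in for SharedPrePackedWeightCache_StoreWeightData(), and a return value of 0 stands in for a successful OrtStatus. The "packing" step is a simple row-major to column-major transpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for OrtSharedPrePackedWeightCache. */
typedef struct MockCache {
    void  *data;   /* packed buffer handed over by the kernel */
    size_t size;   /* size of that buffer in bytes */
} MockCache;

/* Hypothetical stand-in for a kernel's private state. */
typedef struct MockKernel {
    int    packed_input_index; /* the only input this kernel pre-packs */
    bool   wants_sharing;      /* kernel opts in to sharing when a cache is offered */
    float *packed;             /* packed weight owned by the kernel (non-sharing mode) */
    size_t packed_count;       /* number of floats in `packed` */
} MockKernel;

/* Models the PrePackWeight decision flow. Returns 0 on success
 * (stand-in for a successful OrtStatus), non-zero on allocation failure. */
static int mock_pre_pack_weight(MockKernel *kernel, const float *weight,
                                size_t rows, size_t cols, int input_index,
                                MockCache *cache, bool *is_packed)
{
    assert(kernel != NULL && weight != NULL && is_packed != NULL);
    *is_packed = false;

    /* No pre-packing mode: report is_packed = false and succeed. */
    if (input_index != kernel->packed_input_index) {
        return 0;
    }

    /* Allocate and fill the packed layout (malloc stands in for `allocator`). */
    size_t count = rows * cols;
    float *packed = malloc(count * sizeof *packed);
    if (packed == NULL) {
        return 1;
    }
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            packed[c * rows + r] = weight[r * cols + c];
        }
    }

    if (cache != NULL && kernel->wants_sharing) {
        /* Sharing mode: store the buffer in the cache. In the real API the
         * kernel later receives the shared data through
         * SetSharedPrePackedWeight(); that step is not modeled here. */
        cache->data = packed;
        cache->size = count * sizeof *packed;
    } else {
        /* Non-sharing mode: the kernel keeps the buffer and must free it. */
        kernel->packed = packed;
        kernel->packed_count = count;
    }

    *is_packed = true;
    return 0;
}
```

In this model a kernel that keeps `packed` itself is responsible for freeing it, mirroring the ownership rule stated for non-sharing mode. In sharing mode the kernel deliberately leaves its own pointer unset, because the data it eventually uses is whatever the later shared-weight callback supplies.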
\note This function is based on the internal OpKernel::PrePack() virtual function used within ORT.
\param[in] this_ptr The OrtKernelImpl instance.
\param[in] tensor The OrtValue instance representing the constant tensor (weight). Do not cache in the kernel.
\param[in] input_index The input index of the tensor in this kernel.
\param[in] allocator Allocator for allocating the pre-packed data. Its use is required in sharing mode and
recommended, but not required, in the non-sharing mode. This will be an allocator set by
the application for the session/environment (e.g., via CreateAndRegisterAllocatorV2
or RegisterAllocator), or an allocator on the OrtEpDevice (read-only or default) otherwise.
The allocator remains valid throughout the lifetime of the OrtKernelImpl instance.
\param[in] prepacked_weight_cache May be NULL. If not NULL, the kernel may choose to share a packed weight by
first storing it in the OrtSharedPrePackedWeightCache instance and then
receiving the actual shared weight data in the call to
OrtKernelImpl::SetSharedPrePackedWeight(). See the above description for
"sharing mode".
\param[out] is_packed Output parameter that the implementation sets to true if the kernel packed the tensor data.
\snippet{doc} snippets.dox OrtStatus Return Value
\note Implementation of this function is optional. If not implemented (set to NULL), ORT assumes the kernel
does not pre-pack weight data (i.e., is_packed defaults to false).
\since Version 1.24.
Implementation
external ffi.Pointer<
ffi.NativeFunction<
OrtStatusPtr Function(
ffi.Pointer<OrtKernelImpl> this_ptr,
ffi.Pointer<OrtValue> tensor,
ffi.Int input_index,
ffi.Pointer<OrtAllocator> allocator,
ffi.Pointer<OrtSharedPrePackedWeightCache> prepacked_weight_cache,
ffi.Pointer<ffi.Bool> is_packed,
)
>
>
PrePackWeight;