meshagent_dart_arrow 0.41.5
Apache Arrow type and IPC primitives for Meshagent Dart clients.
meshagent_dart_arrow
This package targets Arrow IPC metadata version V5 and is validated against Apache Arrow 21.0.0 in PyArrow, JavaScript apache-arrow, and .NET Apache.Arrow smoke interop tests.
IPC Support
Supported:
- Arrow IPC stream containers through ArrowIpcStreamReader and ArrowIpcStreamWriter.
- Arrow IPC file containers through ArrowIpcFileReader and ArrowIpcFileWriter.
- Schema messages, dictionary batches, record batches, and multi-batch tables.
- Full dictionary batches emitted as needed by the writer.
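The two container kinds listed above can be told apart from their framing bytes alone. The following is a minimal sketch in Python (standard library only), based on the Arrow IPC format rather than on this package's code: a file container starts and ends with the ASCII magic ARROW1, while a stream starts directly with an encapsulated message, which opens with the continuation marker 0xFFFFFFFF followed by a little-endian int32 metadata length. The function name sniff_ipc_container is hypothetical and is not part of meshagent_dart_arrow.

```python
import struct

FILE_MAGIC = b"ARROW1"      # leading and trailing magic of the IPC file format
CONTINUATION = 0xFFFFFFFF   # marker that opens each encapsulated stream message


def sniff_ipc_container(data: bytes) -> str:
    """Return 'file', 'stream', or 'unknown' by looking at framing bytes only.

    Hypothetical helper: it does not parse or validate the Flatbuffers metadata.
    """
    # File format: magic at both ends of the buffer.
    if (
        len(data) >= 2 * len(FILE_MAGIC)
        and data.startswith(FILE_MAGIC)
        and data.endswith(FILE_MAGIC)
    ):
        return "file"
    # Stream format: continuation marker, then an int32 metadata length.
    # A length of 0 is the end-of-stream marker, so 0 is accepted here.
    if len(data) >= 8:
        marker, metadata_len = struct.unpack_from("<Ii", data, 0)
        if marker == CONTINUATION and metadata_len >= 0:
            return "stream"
    return "unknown"
```

A check like this is only a routing hint, for example to choose between a stream reader and a file reader. It says nothing about whether the payload is otherwise well formed.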
Not currently supported:
- IPC body compression.
- Dictionary replacement or delta semantics.
- Tensor IPC.
- Zero-copy buffer views over the original IPC body.
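To make the dictionary item in this list concrete, here is a small Python sketch of what replacement and delta semantics mean in the Arrow IPC format. It illustrates the format's rules, not this package's reader: a non-delta dictionary batch replaces the dictionary registered under its id, and a delta batch appends to it. The function names and the plain-dict registry are assumptions made for the example.

```python
from typing import Dict, List, Optional


def apply_dictionary_batch(
    dictionaries: Dict[int, List[str]],
    dict_id: int,
    values: List[str],
    is_delta: bool = False,
) -> None:
    """Update the dictionary registry the way the IPC format describes."""
    if is_delta:
        # Delta: new values are appended; existing indices stay valid.
        dictionaries[dict_id] = dictionaries.get(dict_id, []) + list(values)
    else:
        # Full batch: defines the dictionary, or replaces an earlier one.
        dictionaries[dict_id] = list(values)


def decode_indices(
    dictionaries: Dict[int, List[str]],
    dict_id: int,
    indices: List[Optional[int]],
) -> List[Optional[str]]:
    """Resolve dictionary indices to values; None stands for a null slot."""
    values = dictionaries[dict_id]
    return [None if i is None else values[i] for i in indices]


registry: Dict[int, List[str]] = {}
apply_dictionary_batch(registry, 0, ["SFO", "LAX"])           # full batch
decoded = decode_indices(registry, 0, [1, 0, None, 1])
apply_dictionary_batch(registry, 0, ["JFK"], is_delta=True)   # delta batch
```

Since the package lists replacement and delta semantics as unsupported, a stream that redefines or extends a dictionary after its first full batch falls outside what this README covers.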
Type System
The Dart model exposes Arrow schema types directly instead of Meshagent legacy data type classes. ArrowSchema, ArrowField, and the ArrowDataType subclasses model the Arrow type tree, including nested, dictionary, union, run-end encoded, temporal, interval, decimal, and binary/view types.
Record batches expose typed Arrow arrays. toRows() is a convenience projection for UI and simple SDK cases; it is not the storage model.
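The difference between columnar storage and a row projection can be sketched in a few lines of Python. This illustrates the idea only and is not how toRows() is implemented: the to_rows function and its arguments are hypothetical, and plain lists stand in for typed Arrow arrays.

```python
from typing import Any, Dict, List


def to_rows(names: List[str], columns: List[List[Any]]) -> List[Dict[str, Any]]:
    """Project equal-length columns into a list of row maps."""
    if len(names) != len(columns):
        raise ValueError("one name per column is required")
    if len({len(column) for column in columns}) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*columns) walks the columns in lockstep, one tuple per row.
    return [dict(zip(names, row)) for row in zip(*columns)]


rows = to_rows(["code", "count"], [["SFO", "LAX"], [1, 2]])
```

The row maps are derived copies. The columns remain the source of truth, which is why a projection like this suits display code better than bulk processing.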
Arrow extension types are represented as standard Arrow field metadata:
- ARROW:extension:name
- ARROW:extension:metadata
Use ArrowField.withExtension(...), extensionName, and extensionMetadata for this metadata. Extension types are not modeled as separate ArrowDataType subclasses because Arrow IPC stores the physical storage type plus extension metadata.
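As a rough illustration of the storage-type-plus-metadata model, the Python sketch below mirrors the idea with a small dataclass. It is not the Dart API: the Field class, its attributes, and the string used for the storage type are assumptions for the example. Only the two metadata keys and the arrow.uuid canonical extension, which uses 16-byte fixed-size binary storage, come from the Arrow format.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Optional

EXTENSION_NAME_KEY = "ARROW:extension:name"
EXTENSION_METADATA_KEY = "ARROW:extension:metadata"


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field: storage type plus metadata."""

    name: str
    storage_type: str
    metadata: Dict[str, str] = field(default_factory=dict)

    def with_extension(self, extension_name: str, extension_metadata: str = "") -> "Field":
        # The storage type is untouched; only two metadata entries are added.
        merged = dict(self.metadata)
        merged[EXTENSION_NAME_KEY] = extension_name
        merged[EXTENSION_METADATA_KEY] = extension_metadata
        return replace(self, metadata=merged)

    @property
    def extension_name(self) -> Optional[str]:
        return self.metadata.get(EXTENSION_NAME_KEY)

    @property
    def extension_metadata(self) -> Optional[str]:
        return self.metadata.get(EXTENSION_METADATA_KEY)


uuid_field = Field("id", "fixed_size_binary[16]").with_extension("arrow.uuid")
```

A reader that does not know the extension can ignore the two keys and still decode the column through its storage type.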
Dart Value Representation
- int8, int16, int32, uint8, uint16, and uint32 decode to Dart int.
- int64 and uint64 decode to BigInt so VM and dart2js behavior stay exact.
- Floating point values decode to double.
- UTF-8 values decode to String.
- Binary values decode to Uint8List.
- date32 and date64 decode to ArrowDateValue.
- timestamp decodes to ArrowTimestampValue, preserving unit, raw value, and timezone.
- decimal128 and decimal256 decode to ArrowDecimalValue, preserving precision, scale, bit width, and scaled integer value.
- Lists, structs, maps, unions, dictionaries, and run-end encoded arrays preserve their Arrow array classes and also project through operator [] / toRows().
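The reasoning behind several of these mappings can be checked against the raw Arrow encodings. The Python sketch below (standard library only) decodes single values from their little-endian bytes. It is an illustration of the wire encodings, not of this package's decoder, and the function names are hypothetical. It shows why 64-bit integers need an exact integer type where numbers are IEEE doubles, how a decimal128 value combines a scaled integer with a scale, and that date32 counts days since the UNIX epoch.

```python
import struct
from datetime import date, timedelta
from decimal import Decimal


def decode_int64(raw: bytes) -> int:
    """Decode one little-endian signed 64-bit value without losing precision."""
    return struct.unpack("<q", raw)[0]


def decode_decimal128(raw: bytes, scale: int) -> Decimal:
    """Decode a 16-byte little-endian two's-complement scaled integer."""
    unscaled = int.from_bytes(raw, "little", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # Building from a tuple is exact and ignores the decimal context precision.
    return Decimal((sign, digits, -scale))


def decode_date32(raw: bytes) -> date:
    """Decode a little-endian int32 count of days since 1970-01-01."""
    days = struct.unpack("<i", raw)[0]
    return date(1970, 1, 1) + timedelta(days=days)


# 2**53 + 1 fits in int64 but not in a double, which is the number type
# that JavaScript (and therefore dart2js output) uses for plain numbers.
big = 2**53 + 1
exact = decode_int64(struct.pack("<q", big))
through_double = int(float(big))

price = decode_decimal128((12345).to_bytes(16, "little", signed=True), scale=2)
day = decode_date32(struct.pack("<i", 19723))
```

Here exact equals the original value while through_double is off by one, price is 123.45, and day is 2024-01-01. Keeping the scaled integer and scale together, as the bullet on ArrowDecimalValue describes, avoids routing decimals through a binary float.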
Examples
Dart stream write/read:
final schema = ArrowSchema([
  const ArrowField(name: 'code', type: ArrowUtf8Type(), nullable: false),
  const ArrowField(
    name: 'count',
    type: ArrowIntType(bitWidth: 32, signed: true),
    nullable: false,
  ),
]);

final batch = ArrowRecordBatch.fromColumns(
  schema: schema,
  columns: [
    ArrowArray(field: schema.fields[0], values: ['SFO', 'LAX']),
    ArrowArray(field: schema.fields[1], values: [1, 2]),
  ],
);

final bytes = ArrowIpcStreamWriter(schema: schema, batches: [batch]).write();
final table = ArrowIpcStreamReader(bytes).readTable();
Dart file write/read:
final fileBytes = ArrowIpcFileWriter.fromTable(table).write();
final fileTable = ArrowIpcFileReader(fileBytes).readTable();
Python pyarrow equivalent:
import pyarrow as pa
import pyarrow.ipc as ipc

schema = pa.schema([
    pa.field("code", pa.utf8(), nullable=False),
    pa.field("count", pa.int32(), nullable=False),
])

batch = pa.record_batch(
    [pa.array(["SFO", "LAX"], type=pa.utf8()), pa.array([1, 2], type=pa.int32())],
    schema=schema,
)

sink = pa.BufferOutputStream()
with ipc.new_stream(sink, schema) as writer:
    writer.write_batch(batch)

bytes_ = sink.getvalue().to_pybytes()
table = ipc.open_stream(bytes_).read_all()
JavaScript apache-arrow equivalent:
import { tableFromArrays, tableFromIPC, tableToIPC } from 'apache-arrow';

const table = tableFromArrays({
  code: ['SFO', 'LAX'],
  count: Int32Array.from([1, 2]),
});

const bytes = tableToIPC(table, 'stream');
const decoded = tableFromIPC(bytes);
.NET Apache.Arrow equivalent:
using System.IO;
using Apache.Arrow;
using Apache.Arrow.Ipc;
using Apache.Arrow.Types;

var schema = new Schema(
    new[]
    {
        new Field("code", StringType.Default, nullable: false),
        new Field("count", Int32Type.Default, nullable: false),
    },
    metadata: null
);

var batch = new RecordBatch(
    schema,
    new IArrowArray[]
    {
        new StringArray.Builder().Append("SFO").Append("LAX").Build(),
        new Int32Array.Builder().Append(1).Append(2).Build(),
    },
    length: 2
);

await using var stream = new MemoryStream();
using var writer = new ArrowStreamWriter(stream, schema);
await writer.WriteRecordBatchAsync(batch);
await writer.WriteEndAsync();

// Rewind so the reader starts at the schema message.
stream.Position = 0;
using var reader = new ArrowStreamReader(stream);
var decoded = await reader.ReadNextRecordBatchAsync();