A dataframe for Dart. An abstraction to help manipulate and analyse data.

Df #


Usage #

Create #

From csv:

   final df = await DataFrame.fromCsv('dataset/stocks.csv');

fromCsv parses files according to the CSV standard (see: RFC 4180), including support for escaped double quotes.
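To make the quoting rule concrete, here is a minimal sketch of how a single CSV line can be split under RFC 4180, where a doubled quote inside a quoted field stands for one literal quote. The helper `splitCsvLine` is hypothetical and is not part of this package's API; it also assumes the record fits on one line, so quoted fields containing line breaks are out of scope.

```dart
/// Splits one CSV line into fields, following the RFC 4180 quoting rules.
/// Hypothetical helper for illustration only; not part of the df API.
List<String> splitCsvLine(String line) {
  final fields = <String>[];
  final buffer = StringBuffer();
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    final char = line[i];
    if (inQuotes) {
      if (char == '"') {
        // A doubled quote inside a quoted field is an escaped quote.
        if (i + 1 < line.length && line[i + 1] == '"') {
          buffer.write('"');
          i++; // skip the second quote of the pair
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        buffer.write(char); // commas are literal inside quotes
      }
    } else if (char == '"') {
      inQuotes = true; // opening quote
    } else if (char == ',') {
      fields.add(buffer.toString());
      buffer.clear();
    } else {
      buffer.write(char);
    }
  }
  fields.add(buffer.toString()); // last field has no trailing comma
  return fields;
}
```

With this sketch, the line `a,"b ""quoted"", c",3` splits into three fields, the second being `b "quoted", c`.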

Note: the types of the records are inferred from the data. The first line of the CSV file must contain the headers for the column names.

Optional parameters:

dateFormat: the format string used to parse dates. Ex: MMM dd yyyy

timestampCol: the column to be treated as a timestamp. Ex: timestamp

timestampFormat: the format of the timestamp: seconds, milliseconds or microseconds. Ex: TimestampFormat.microseconds

verbose: set to true to print some info
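The sketch below illustrates two of the ideas above in plain Dart: inferring a value's type from its text, and turning an integer timestamp into a DateTime depending on its unit. It is an assumption about how such logic could look, not the package's actual implementation. `inferValue`, `epochToDateTime` and the `EpochUnit` enum are hypothetical names; `EpochUnit` only stands in for the unit chosen through timestampFormat.

```dart
/// Hypothetical stand-in for the unit selected by `timestampFormat`.
enum EpochUnit { seconds, milliseconds, microseconds }

/// Guesses the type of a raw CSV cell: int first, then double, else String.
/// Illustration only; the package may apply different rules.
Object inferValue(String raw) {
  final text = raw.trim();
  return int.tryParse(text) ?? double.tryParse(text) ?? text;
}

/// Converts an integer epoch value to a UTC DateTime for the given unit.
DateTime epochToDateTime(int value, EpochUnit unit) {
  if (unit == EpochUnit.seconds) {
    return DateTime.fromMillisecondsSinceEpoch(value * 1000, isUtc: true);
  }
  if (unit == EpochUnit.milliseconds) {
    return DateTime.fromMillisecondsSinceEpoch(value, isUtc: true);
  }
  return DateTime.fromMicrosecondsSinceEpoch(value, isUtc: true);
}
```

Trying int before double matters here: a cell such as `42` would otherwise be read as the double `42.0`.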

From records:

   final rows = <Map<String, dynamic>>[
      <String, dynamic>{'col1': 21, 'col2': 'foo', 'col3': DateTime.now()},
      <String, dynamic>{'col1': 22, 'col2': 'bar', 'col3': DateTime.now()},
   ];
   final df = DataFrame.fromRows(rows);

Select #

   final List<Map<String, dynamic>> rows = df.rows;
   // select a subset of rows
   final List<Map<String, dynamic>> subset = df.subset(0, 100);
   // select records for a column
   final List<double> values = df.colRecords<double>('col2');
   // select list of records
   final List<List<dynamic>> records = df.records;

Mutate #

Add data:

   // add a row
   df.addRow(<String,dynamic>{'col1': 1, 'col2': 2.0});
   // add a line of records
   df.addRecord(<dynamic>[1, 2.0]);

Remove data:

   // remove the third row
   // limit the dataframe to 100 rows starting from index 30
   df.limit(100, startIndex: 30);

Copy a dataframe:

   // get a new dataframe from the existing one
   final DataFrame df2 = df.copy_();
   // get a new dataframe with limited data
   final DataFrame df3 = df.limit_(100);

Count #

Nulls and zeros:

   final int nulls = df.countNulls_('col1');
   final int zeros = df.countZeros_('col1');


   final int mean = df.mean('col1');
   final int sum = df.sum('col1');
   final int max = df.max('col1');
   final int min = df.min('col1');
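For readers who want to see what these counts and aggregates amount to, here is a minimal sketch computed directly over a list of rows. The functions `countNulls`, `countZeros` and `mean` are hypothetical top-level helpers, not the package's methods, and skipping null and non-numeric values when averaging is an assumption of this sketch.

```dart
/// Counts the rows whose value for [column] is null or missing.
int countNulls(List<Map<String, dynamic>> rows, String column) =>
    rows.where((row) => row[column] == null).length;

/// Counts the rows whose value for [column] equals zero (0 or 0.0).
int countZeros(List<Map<String, dynamic>> rows, String column) =>
    rows.where((row) => row[column] == 0).length;

/// Averages the numeric values of [column].
/// Assumption: null and non-numeric values are skipped.
double mean(List<Map<String, dynamic>> rows, String column) {
  final values = rows.map((row) => row[column]).whereType<num>().toList();
  if (values.isEmpty) {
    throw StateError('No numeric values in column $column');
  }
  return values.reduce((a, b) => a + b) / values.length;
}
```

Note that a mean is generally not an integer, which is why this sketch returns a double.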

Info #

   final int numRecords = df.length;
   final List<DataFrameColumn> cols = df.columns;
   final List<String> colNames = df.columnsNames;
   // print info and sample data
   // like head with a bit more details

Conventions #

Dataframe operations mutate the dataframe in place. Methods that return an object instead end with an underscore. Example:

   // inplace
   // get a new dataframe with limited data
   final DataFrame df2 = df.limit_(30);
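The naming convention can be sketched with a tiny stand-in class. `MiniFrame` below is hypothetical and far simpler than the real DataFrame; it only shows the difference between a method that mutates in place and its underscore counterpart that returns a new object.

```dart
/// Hypothetical miniature frame used to illustrate the naming convention.
class MiniFrame {
  /// Copies the list so the frame owns its own row list.
  MiniFrame(List<Map<String, dynamic>> rows) : rows = List.of(rows);

  final List<Map<String, dynamic>> rows;

  /// In place: keeps at most [max] rows starting at [startIndex].
  void limit(int max, {int startIndex = 0}) {
    // Materialize the kept rows before clearing the list they come from.
    final kept = rows.skip(startIndex).take(max).toList();
    rows
      ..clear()
      ..addAll(kept);
  }

  /// Returns a new frame and leaves this one untouched.
  MiniFrame limit_(int max, {int startIndex = 0}) =>
      MiniFrame(rows.skip(startIndex).take(max).toList());
}
```

In this sketch the copy is shallow: the new frame has its own list, but the row maps themselves are shared with the original.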

Vocabulary conventions:

  • A row is a map of key/value pairs
  • A record is a single cell value
  • An index is a row position
  • An indice is a column position
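The vocabulary above can be illustrated with plain Dart collections. This is a sketch under the assumption that columns are kept as an ordered list of names; `recordAt` and `lineOfRecords` are hypothetical helpers, not part of the package.

```dart
/// Returns the record (cell value) found at row position [index]
/// and column position [indice].
dynamic recordAt(
  List<Map<String, dynamic>> rows,
  List<String> columns, {
  required int index,
  required int indice,
}) =>
    rows[index][columns[indice]];

/// Turns one row (a map) into a line of records ordered by column position.
List<dynamic> lineOfRecords(Map<String, dynamic> row, List<String> columns) =>
    [for (final name in columns) row[name]];
```

For rows `{'col1': 21, 'col2': 'foo'}` and `{'col1': 22, 'col2': 'bar'}` with columns `['col1', 'col2']`, index 1 and indice 1 designate the record `'bar'`.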
