koala

A poor man's version of a pandas DataFrame.
Collect, access & manipulate related data.

Examples

Create a DataFrame from a CSV file, from preexisting column names and data, or from map representations of the data. Alternatively, create an empty DataFrame and provide it with its properties later on:

final fromCsv = DataFrame.fromCsv(
    path: 'path/to/file.csv', 
    eolToken: '\n', 
    maxRows: 40,
    skipColumns: ['some-irrelevant-column'],
    convertDates: true,
    datePattern: 'dd-MM-yyyy'
);

final fromNamesAndData = DataFrame(
    ['a', 'b'], 
    [
      [1, 2],
      [3, 4],
      [69, 420]
    ]
);

The DataFrame class extends the list that holds its data matrix, so rows can be accessed through normal indexing. Columns, on the other hand, are accessed by calling the instance with the name of a contained column.

// get a row
final secondRow = df[1];

// get a column
final bColumn = df('b');
final typedBColumn = df<double?>('b');
final slicedBColumn = df('b', start: 1, end: 5);
final filteredBColumn = df.columnAsIterable('b').where((el) => el > 7).toList();

// grab a singular record
final record = df.record<int>(3, 'b');
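
How a list subclass can offer both row indexing and callable column access is sketched below in plain Dart. This is a minimal illustration only: `MiniFrame` and its members are hypothetical names, and nothing here is taken from koala's actual implementation.

```dart
import 'dart:collection';

/// Hypothetical stand-in for illustration; not koala's DataFrame.
class MiniFrame extends ListBase<List<Object?>> {
  MiniFrame(this.columnNames, List<List<Object?>> rows) : _rows = rows;

  final List<String> columnNames;
  final List<List<Object?>> _rows;

  // ListBase needs length, []= and [] to provide the remaining list API.
  @override
  int get length => _rows.length;

  @override
  set length(int newLength) => _rows.length = newLength;

  @override
  List<Object?> operator [](int index) => _rows[index];

  @override
  void operator []=(int index, List<Object?> value) => _rows[index] = value;

  // Overridden because the default add() grows the list via the length
  // setter, which fails for non-nullable element types.
  @override
  void add(List<Object?> element) => _rows.add(element);

  /// Defining call() makes instances invocable like a function.
  List<T> call<T>(String name) {
    final index = columnNames.indexOf(name);
    if (index == -1) {
      throw ArgumentError.value(name, 'name', 'unknown column');
    }
    return [for (final row in _rows) row[index] as T];
  }
}

void main() {
  final frame = MiniFrame(['a', 'b'], [
    [1, 2],
    [3, 4],
  ]);
  print(frame[1]); // row access via list indexing: [3, 4]
  print(frame<int>('b')); // column access via call(): [2, 4]
}
```

Extending `ListBase` gives the class the usual list methods for free, which is why rows can be added or removed with `add`, `removeAt` and friends.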

Manipulate rows & columns

// add and remove rows through the built-in list methods 
df.add([2, 5]);
df.removeAt(4);
df.removeLast();

// manipulate columns
df.addColumn('newColumn', [4, 8, 2]);
df.removeColumn('newColumn');
df.transformColumn('a', (record) => record * 2);

Copy or slice the DataFrame

final copy = df.copy();
final sliced = df.sliced(start: 30, end: 60);   
df.slice(start: 10, end: 15);  // in-place counterpart

Sort the DataFrame in-place or get a sorted copy of it

final sorted = df.sortedBy('a', ascending: true, nullFirst: false);
sorted.sortBy('b', ascending: false, compareRecords: (a, b) => Comparable.compare(a.toString().length, b.toString().length));
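
What the `ascending` and `nullFirst` options might amount to can be sketched with a plain Dart comparator. `nullAwareComparator` is a hypothetical helper that is not part of koala, and the assumption that null placement is independent of the sort direction is mine rather than something the library documents here.

```dart
/// Hypothetical helper, not koala API: builds a comparator that places
/// nulls first or last and orders the remaining values in either direction.
int Function(Object?, Object?) nullAwareComparator({
  bool ascending = true,
  bool nullFirst = true,
}) {
  return (a, b) {
    if (a == null && b == null) return 0;
    if (a == null) return nullFirst ? -1 : 1;
    if (b == null) return nullFirst ? 1 : -1;
    // Assumes the non-null values are mutually comparable.
    final order = Comparable.compare(a as Comparable, b as Comparable);
    return ascending ? order : -order;
  };
}

void main() {
  final column = <int?>[3, null, 1, 2];
  column.sort(nullAwareComparator(ascending: false, nullFirst: false));
  print(column); // [3, 2, 1, null]
}
```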

Obtain a readable representation of the DataFrame by simply passing it to the print function

DataFrame df = DataFrame.fromRowMaps([
  {'col1': 1, 'col2': 2},
  {'col1': 1, 'col2': 1},
  {'col1': null, 'col2': 8},
]);
print(df);

leads to the output:

    col1 col2
0 | 1    2   
1 | 1    1   
2 | null 8   

...and so on and so forth.

Contribution

I intend to actively maintain this repo, so feel free to create PRs. There is still a hell of a lot of functionality one may add to the DataFrame.

Acknowledgements

This repository started off as a fork of the now unmaintained and generally lackluster df. Ultimately, however, I wound up rewriting basically everything. Still, shout out boyz.

Author

C'est moi, w2sv
