dartframe 0.5.1

DartFrame is a Dart library inspired by Pandas and GeoPandas. It simplifies structured data handling (tables, CSVs, JSON) with functional tools for filtering, transforming, and analysis.

DartFrame #

DartFrame is a robust, lightweight Dart library designed for data manipulation and analysis. Inspired by popular data science tools like Pandas and GeoPandas, DartFrame provides a DataFrame-like structure for handling tabular data, making it easy to clean, analyze, and transform data directly in your Dart applications.

Key Features #

1. DataFrame Operations #

  • Creation: Create DataFrames from various sources such as CSV strings, JSON strings, or directly from lists and maps.
  • Data Exploration:
    • head(n): View the first n rows.
    • tail(n): View the last n rows.
    • limit(n, index): View n rows starting from a specified index.
    • describe(): Generate summary statistics.
    • structure(): Display the structure and data types of the DataFrame.
    • shape: Get the dimensions of the DataFrame.
    • columns: Access or modify column names.
    • rows: Access or modify row labels.
    • valueCounts(column): Get the frequency of each unique value in a column.
  • Data Cleaning:
    • Handle missing values using fillna(), replace(), and missing data indicators.
    • Rename columns with rename().
    • Drop unwanted columns with drop().
    • Filter rows based on condition functions with filter().
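
To make the semantics of a few of these operations concrete, here is a minimal plain-Dart sketch on a list of row maps. It does not call DartFrame and is not how the library implements anything. The helper names `headRows`, `tailRows`, `valueCountsOf`, and `fillNulls` are hypothetical stand-ins for `head(n)`, `tail(n)`, `valueCounts(column)`, and `fillna()`; consult the DataFrame documentation for the real signatures.

```dart
// Plain-Dart illustration only; none of these helpers belong to DartFrame.

/// First [n] rows (hypothetical stand-in for head(n)).
List<Map<String, Object?>> headRows(List<Map<String, Object?>> rows, int n) =>
    rows.take(n).toList();

/// Last [n] rows (hypothetical stand-in for tail(n)).
List<Map<String, Object?>> tailRows(List<Map<String, Object?>> rows, int n) =>
    rows.skip(rows.length > n ? rows.length - n : 0).toList();

/// Frequency of each distinct value found in [column].
Map<Object?, int> valueCountsOf(
    List<Map<String, Object?>> rows, String column) {
  final counts = <Object?, int>{};
  for (final row in rows) {
    counts.update(row[column], (c) => c + 1, ifAbsent: () => 1);
  }
  return counts;
}

/// Copies [rows], replacing null entries in [column] with [fill].
List<Map<String, Object?>> fillNulls(
    List<Map<String, Object?>> rows, String column, Object fill) {
  return [
    for (final row in rows) {...row, column: row[column] ?? fill},
  ];
}

void main() {
  final rows = <Map<String, Object?>>[
    {'city': 'Accra', 'temp': 31},
    {'city': 'Kumasi', 'temp': null},
    {'city': 'Accra', 'temp': 29},
  ];
  print(headRows(rows, 2));
  print(tailRows(rows, 1));
  print(valueCountsOf(rows, 'city'));
  print(fillNulls(rows, 'temp', 0));
}
```

Each helper returns a new collection instead of mutating its input, which keeps the sketch easy to reason about; whether DartFrame's own methods mutate in place is not stated here.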

2. Data Transformation #

  • Add calculated columns directly: df['new_column'] = df['existing_column'] > 30.
  • Group data with groupBy() for aggregated insights.
  • Concatenate DataFrames vertically or horizontally.
  • Add rows with addRow().
  • Add columns with addColumn().
  • Shuffle rows with shuffle().
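
As a rough picture of what a calculated column and a grouped aggregate compute, the following plain-Dart sketch works on a list of row maps. It is an illustration under assumptions, not DartFrame code: `withFlag` and `groupMean` are hypothetical helpers, and the library's `groupBy()` may return a different structure.

```dart
// Plain-Dart illustration only; these helpers are not part of DartFrame.

/// Adds a boolean column [target] that is true where row[source] > threshold.
List<Map<String, Object?>> withFlag(List<Map<String, Object?>> rows,
    String source, String target, num threshold) {
  return [
    for (final row in rows)
      {...row, target: (row[source] as num) > threshold},
  ];
}

/// Groups rows by [key] and averages the numeric column [value] per group.
Map<Object?, double> groupMean(
    List<Map<String, Object?>> rows, String key, String value) {
  final sums = <Object?, double>{};
  final counts = <Object?, int>{};
  for (final row in rows) {
    final k = row[key];
    sums[k] = (sums[k] ?? 0.0) + (row[value] as num).toDouble();
    counts[k] = (counts[k] ?? 0) + 1;
  }
  return {for (final k in sums.keys) k: sums[k]! / counts[k]!};
}

void main() {
  final rows = <Map<String, Object?>>[
    {'team': 'x', 'age': 25},
    {'team': 'x', 'age': 35},
    {'team': 'y', 'age': 40},
  ];
  print(withFlag(rows, 'age', 'over30', 30));
  print(groupMean(rows, 'team', 'age'));
}
```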

3. Analysis Tools #

  • Frequency counts of column values using valueCounts().
  • Count the number of zeros in a column using countZeros().
  • Count the number of null values in a column using countNulls().
  • Calculate mean, median, and other statistics directly on columns or grouped data.
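
The statistics listed above are simple to state precisely. The plain-Dart sketch below shows one reasonable definition of each on a nullable numeric column. It assumes that nulls are skipped when computing the mean and median; that is a choice made for this sketch, not documented DartFrame behavior, and the function names merely mirror the ones above.

```dart
// Plain-Dart illustration only; these are not DartFrame's implementations.

/// Number of entries equal to zero (nulls are not zeros).
int countZeros(List<num?> values) => values.where((v) => v == 0).length;

/// Number of null entries.
int countNulls(List<num?> values) => values.where((v) => v == null).length;

/// Arithmetic mean of the non-null entries, or NaN if there are none.
double mean(List<num?> values) {
  final present = values.whereType<num>().toList();
  if (present.isEmpty) return double.nan;
  return present.reduce((a, b) => a + b) / present.length;
}

/// Median of the non-null entries, or NaN if there are none.
double median(List<num?> values) {
  final sorted = values.whereType<num>().toList()..sort();
  if (sorted.isEmpty) return double.nan;
  final mid = sorted.length ~/ 2;
  return sorted.length.isOdd
      ? sorted[mid].toDouble()
      : (sorted[mid - 1] + sorted[mid]) / 2;
}

void main() {
  final column = <num?>[0, 3, null, 0, 5];
  print(countZeros(column));
  print(countNulls(column));
  print(mean(column));
  print(median(column));
}
```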

4. Series Operations #

  • Series objects for 1D data manipulation.
  • Perform element-wise operations, conditional updates, and concatenation.
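
For readers new to the Series idea, the sketch below shows what element-wise operations, conditional updates, and concatenation mean on plain Dart lists. It deliberately avoids the DartFrame `Series` class, whose constructor and operators are not described here; `addElementwise`, `replaceWhere`, and `concat` are hypothetical names used only for illustration.

```dart
// Plain-Dart illustration only; DartFrame's Series API may differ.

/// Adds two equally long lists position by position.
List<num> addElementwise(List<num> a, List<num> b) {
  if (a.length != b.length) {
    throw ArgumentError('Both lists must have the same length.');
  }
  return [for (var i = 0; i < a.length; i++) a[i] + b[i]];
}

/// Replaces every value matching [test] with [replacement].
List<num> replaceWhere(
        List<num> values, bool Function(num) test, num replacement) =>
    [for (final v in values) test(v) ? replacement : v];

/// Appends [b] after [a] in a new list.
List<num> concat(List<num> a, List<num> b) => [...a, ...b];

void main() {
  final a = <num>[1, 2, 3];
  final b = <num>[10, 20, 30];
  print(addElementwise(a, b));
  print(replaceWhere(a, (v) => v > 1, 0));
  print(concat(a, b));
}
```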

5. Data I/O #

  • Import data from CSV or JSON formats:
    • DataFrame.fromCSV()
    • DataFrame.fromJson()
  • Export data to JSON or CSV formats:
    • toJSON()
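
The parameters of `DataFrame.fromCSV()`, `DataFrame.fromJson()`, and `toJSON()` are not listed here, so the sketch below uses only `dart:convert` from the Dart SDK to show the shape of data such an import and export typically deals with: a header row plus records, turned into row maps and back into a JSON string. `parseSimpleCsv` and `parseJsonRows` are hypothetical helpers, and the CSV parser assumes no quoted fields or embedded commas.

```dart
import 'dart:convert';

// Plain-Dart illustration only; this does not call DartFrame.

/// Parses a comma-separated string whose first line is the header.
/// Assumes no quoting and the same number of fields on every line.
List<Map<String, String>> parseSimpleCsv(String csv) {
  final lines = const LineSplitter().convert(csv.trim());
  if (lines.isEmpty) return [];
  final header = lines.first.split(',');
  return [
    for (final line in lines.skip(1))
      Map<String, String>.fromIterables(header, line.split(',')),
  ];
}

/// Parses a JSON array of objects into row maps.
List<Map<String, dynamic>> parseJsonRows(String json) =>
    (jsonDecode(json) as List).cast<Map<String, dynamic>>();

void main() {
  final rows = parseSimpleCsv('id,name\n1,Ama\n2,Kofi');
  final json = jsonEncode(rows); // row maps back to a JSON string
  print(json);
  print(parseJsonRows(json).length);
}
```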

6. Customizable and Flexible #

  • Handle mixed data types with ease.
  • Optionally format and clean data on import.
  • Support for flexible column structures.

Documentation #

For comprehensive documentation on specific classes and their functionalities, please refer to the following:

  • DataFrame: Detailed guide on creating and manipulating DataFrames, including data loading, cleaning, transformation, and analysis.
  • Series: In-depth information on Series objects, covering creation, operations, statistical methods, and more.
  • GeoDataFrame: Documentation for working with geospatial data using GeoDataFrames.
  • GeoSeries: Details on GeoSeries, the geometry-aware counterpart to Series.

You can also find runnable examples in the example directory of the repository.


Installation #

To install DartFrame, add the following to your pubspec.yaml:

dependencies:
  dartframe: ^0.5.1

Then, run:

dart pub get

To get started, import the library:

import 'package:dartframe/dartframe.dart';

For detailed examples and usage, please refer to the documentation in the doc folder and the examples in the example folder.


Performance and Scalability #

DartFrame is optimized for small to medium-sized datasets. While not designed for big data processing, it can handle thousands of rows efficiently in memory. For larger datasets, consider integrating with distributed processing tools or databases.


Testing #

Tests are located in the test directory. To run tests, execute dart test in the project root.


Benchmarking #

Performance benchmarks are available in the benchmark directory. These benchmarks, built using the benchmark_harness package, help measure the performance of various operations on Series and DataFrame objects.

For detailed instructions on how to run these benchmarks and interpret their output, please see benchmark/BENCHMARKING.MD.

Reference (simulated) performance numbers can be found in benchmark/RESULTS.MD.


Contributing: Features and Bugs #

🍺 Pull requests are welcome #

Open source makes no sense without contributors. No matter how small your change is, it helps us a lot, even if it is a single line.

The docs may contain grammar issues. If you are fluent in English, fixing them is a big help.

Reporting bugs and issues is a contribution too. Feel free to fork the repository, raise issues, and submit pull requests.

Please file feature requests and bugs at the issue tracker.

Author #

Charles Gameti: gameticharles@GitHub.

License #

This library is provided under the Apache License - Version 2.0.
