jsonstreamreader 1.1.4 (pubspec dependency: jsonstreamreader: ^1.1.4)
This project processes local and remote JSON files/sources by splitting them into smaller chunks, with automatic garbage collection.
Json Stream Reader #
A Flutter JSON stream reader
This project processes a local or remote JSON source by splitting it into smaller chunks, with automatic garbage collection.
Classes #
Streamer #
This class defines the size of each chunk. Optional values include:
- chunksize
default is 100
- start
default is 0; defines where the Reader should start parsing the file
- end
default is the end of the file; defines where the Reader should stop reading the file
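To make the three options concrete, here is a minimal sketch of chunked reading with plain dart:io. It is not the package's implementation: `readChunks` is a hypothetical helper, and it assumes chunksize, start and end are byte offsets, which the text above does not state.

```dart
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper (not part of jsonstreamreader): reads [file] from byte
/// [start] up to, but not including, byte [end], yielding pieces of at most
/// [chunksize] bytes. Assumes all three values are byte counts.
Stream<List<int>> readChunks(File file,
    {int chunksize = 100, int start = 0, int? end}) async* {
  final raf = await file.open();
  try {
    final stop = end ?? await raf.length();
    var position = start;
    await raf.setPosition(position);
    while (position < stop) {
      final remaining = stop - position;
      final wanted = remaining < chunksize ? remaining : chunksize;
      final bytes = await raf.read(wanted);
      if (bytes.isEmpty) break; // file ended before `stop`
      position += bytes.length;
      yield bytes;
    }
  } finally {
    await raf.close();
  }
}

Future<void> main() async {
  final dir = await Directory.systemTemp.createTemp('chunks');
  final file = File('${dir.path}/sample.json');
  await file.writeAsString('[1,2,3,4,5,6,7,8,9]');
  // Skip the opening and closing brackets and read 8 bytes at a time.
  await for (final chunk
      in readChunks(file, chunksize: 8, start: 1, end: 18)) {
    print(utf8.decode(chunk));
  }
  await dir.delete(recursive: true);
}
```

Only one chunk is held in memory at a time, which is the property that makes a small chunksize useful for very large files.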
Readers #
All readers have optional values, which include:
- int delay
how long the reader should wait before reading the next chunk, in microseconds
- int combine
If combine is 1, value-only arrays are combined, i.e. [1,2,3] will result in {"0":1,"1":2,"2":3}. Object values in an array are not affected (e.g. [{},{}]), which means [1,2,{"third":3}] will result in {"0":1,"1":2,"2":3, {"third":3} }
If combine is 2, array values will be collapsed into an object closest to the root, e.g. {'first':{'second':{ "third":3 } } } will be collapsed to {'first':{'/second/third/':3 } }
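The two output shapes can be reproduced with a few lines of plain Dart. This is a sketch of the documented results only, not the package's code: `indexKeyed` and `collapse` are hypothetical names, and the sketch covers just the [1,2,3] case and the nested-object case quoted above.

```dart
import 'dart:convert';

/// Shape described for combine: 1 (hypothetical helper): a list of plain
/// values becomes an object keyed by index.
Map<String, dynamic> indexKeyed(List<dynamic> values) => {
      for (var i = 0; i < values.length; i++) '$i': values[i],
    };

/// Shape described for combine: 2 (hypothetical helper): everything nested
/// under a top-level key is collapsed into path keys such as '/second/third/'.
Map<String, dynamic> collapse(Map<String, dynamic> root) {
  final result = <String, dynamic>{};
  root.forEach((key, value) {
    if (value is Map<String, dynamic>) {
      final flat = <String, dynamic>{};
      _walk(value, '/', flat);
      result[key] = flat;
    } else {
      result[key] = value;
    }
  });
  return result;
}

/// Depth-first walk that records each leaf under its slash-delimited path.
void _walk(
    Map<String, dynamic> node, String prefix, Map<String, dynamic> out) {
  node.forEach((key, value) {
    final path = '$prefix$key/';
    if (value is Map<String, dynamic>) {
      _walk(value, path, out);
    } else {
      out[path] = value;
    }
  });
}

void main() {
  print(jsonEncode(indexKeyed([1, 2, 3])));
  // {"0":1,"1":2,"2":3}
  final nested =
      jsonDecode('{"first":{"second":{"third":3}}}') as Map<String, dynamic>;
  print(jsonEncode(collapse(nested)));
  // {"first":{"/second/third/":3}}
}
```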
-
Reader
Provides access to json data by using function callbacks.
Methods include
- fromState(ReaderState state)
- void hault() Not to be confused with pause(); this prevents the reader from starting immediately, e.g. new Reader(s, combine: 1)..hault();
- void begin() Starts the reading process after hault()
- Future<StreamReaderState> pause() Pauses the reader and returns the last known state of the reader
- void resume() Continues the reader after pause()
- onpause(ReaderState state) Called when pause is called
- progess(Function func(double progress))
- fail(Function func(dynamic err))
- done(Function func)
- trailing(Future func(dynamic value, String key))
- filter(String expression, Future<dynamic> func(dynamic value, String key))
-
StreamReader
Provides access to json data by using StreamController
Streams include
- fromState(StreamReaderState state)
- double progess
- dynamic fail
- Null done
- void hault() Not to be confused with pause(); this prevents the reader from starting immediately, e.g. new StreamReader(s, combine: 1)..hault();
- void begin() Starts the reading process after hault()
- Future<StreamReaderState> pause() Pauses the reader and returns the last known state of the reader
- void resume() Continues the reader after pause()
- Stream<StreamReaderState> onpause Called when pause is called
- Stream<StreamItem> pipe
- Stream<StreamItem> trailing
- Stream<StreamItem> filter
-
StreamItem
dynamic value
String key
-
ReaderState
- String toString() Returns a string whose size is at most double the chunksize (i.e. the one set on Streamer). I suggest saving this string in a file instead of local storage; problems may arise when using fromString
- fromString() This method creates a ReaderState from a string, to be used in conjunction with toString()
-
StreamReaderState
- String toString() Returns a string whose size is at most double the chunksize (i.e. the one set on Streamer). I suggest saving this string in a file instead of local storage; problems may arise when using fromString
- fromString() This method creates a StreamReaderState from a string, to be used in conjunction with toString()
-
GlobalReaderManager It is used to pause or resume all readers, or a single Reader or StreamReader.
- static Future<List<AbstractStateReader>> pause([AbstractReader reader])
- static void resume([AbstractReader reader]) There is no need to add or remove a state; this is done automatically
-
FileState It is used to determine when a file is no longer in use.
- Stream<dynamic> notinuse The return type is either Stream<RemoteSource> or Stream<File>
Sources #
A Streamer accepts either a File (dart:io) or a RemoteSource from this package.
- File
File file = new File(path.path + '/citylots.json');
Streamer s = new Streamer(file);
- RemoteSource Optional values include:
-
bool supportsRange
Whether or not Range headers can be sent. Please note that the ranges sent are based on the initial size of the file. Default is false.
-
bool compressedResponse
Whether or not responses are compressed; if not, accept-encoding=identity is sent. Default is true. If you intend to use Range headers but the server does not support ranges in the way this package expects, please set compressedResponse to false.
-
int offset When a file is resumed, the Range header starts from the length of the file plus (+) the offset. Default is 0.
-
bool trim In some cases your response might start with an unexpected string ie.
RemoteSource source = new RemoteSource('https://jsonplaceholder.typicode.com/posts');
Streamer s = new Streamer(source);
source.download();
or
RemoteSource source = new RemoteSource('https://raw.githubusercontent.com/zemirco/sf-city-lots-json/master/citylots.json');
Streamer s = new Streamer(source);
source.download();
NB. RemoteSource does not work with Content-Disposition: attachment.
The only difference from using a File is that you must call source.download(), preferably after creating the Streamer. The package sends the header Accept-Encoding: identity to determine the size of the response without downloading the file.
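The header rules described for supportsRange, compressedResponse and offset can be summarised in a small pure function. This is a sketch of the documented behaviour under stated assumptions, not code from the package: `resumeHeaders` and `localLength` are hypothetical names, and the open-ended `bytes=N-` form is the standard HTTP Range syntax rather than something the README specifies.

```dart
/// Hypothetical helper: builds the request headers for resuming a download,
/// following the option descriptions above.
Map<String, String> resumeHeaders({
  required int localLength, // bytes already saved locally (assumed meaning)
  int offset = 0,
  bool supportsRange = false,
  bool compressedResponse = true,
}) {
  final headers = <String, String>{};
  if (!compressedResponse) {
    // Ask for the raw bytes so sizes and ranges refer to the real file.
    headers['Accept-Encoding'] = 'identity';
  }
  if (supportsRange) {
    // Resume from the length of the local file plus the offset.
    headers['Range'] = 'bytes=${localLength + offset}-';
  }
  return headers;
}

void main() {
  print(resumeHeaders(
    localLength: 1024,
    offset: 2,
    supportsRange: true,
    compressedResponse: false,
  ));
}
```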
Other useful functions include:
- Future<int> length()
- Future<FileSystemEntity> delete()
- Future<void> pause()
- void setBody(Map<String, String> body)
- void addHeader(String property, String value)
Using JsonPath "Hack" in filter #
Expression | Path |
---|---|
$. | ^\/$ |
[0:9] | [0>= <=9]\/$ |
[:9]+ | [0>= <=9]+\/$ |
.. | \/.*\/.*\/$ |
... | \/.*\/.*\/.*\/$ |
$.* | /.* & / |
.* | .* |
[*] | .* |
{name} | any value with key "name" |
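The Path column reads as regular expressions applied to a slash-delimited path. The sketch below tries two of the listed patterns with Dart's `RegExp`; it assumes paths look like '/items/0/' (modelled on the '/second/third/' key shown for combine: 2), and `matchesPath` is a hypothetical helper, not a package function.

```dart
/// Hypothetical helper: true when a pattern from the Path column matches
/// the given slash-delimited path.
bool matchesPath(String pattern, String path) =>
    RegExp(pattern).hasMatch(path);

void main() {
  const rootOnly = r'^\/$'; // pattern listed for "$."
  const twoLevels = r'\/.*\/.*\/$'; // pattern listed for ".."

  print(matchesPath(rootOnly, '/')); // true
  print(matchesPath(rootOnly, '/items/')); // false
  print(matchesPath(twoLevels, '/items/0/')); // true
  print(matchesPath(twoLevels, '/items/')); // false
}
```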
Operators #
Symbol | Meaning |
---|---|
& | AND |
| | OR |
! | NOT |
Operations can be combined with path or JsonPath expressions. Examples:
- "{name} & /items/"
all direct descendants of items with at least one key name
- "{name} & $.items"
all direct descendants of items with at least one key name
- "{name} & !$.items"
all direct descendants of root, except items, with at least one key name
- "{name} | !/"
all objects with at least one key name that is not a direct descendant of root
- "({name} | !/) & $.items.*"
all objects that have at least one key name or are not direct descendants of root, and that are descendants of items
The order does not matter. (Operations were improved using https://github.com/riichard/boolean-parser-js)
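One way to picture the operators is as ordinary boolean combinators over per-entry tests. The sketch below is an illustration only: `Entry`, `hasKey`, `pathMatches` and the combinators are hypothetical names, the regular expression stands in for a path expression, and nothing here reflects how the package parses its expressions.

```dart
/// Hypothetical stand-in for one parsed entry: its path and its value.
class Entry {
  final String path;
  final dynamic value;
  const Entry(this.path, this.value);
}

typedef Predicate = bool Function(Entry e);

Predicate and(Predicate a, Predicate b) => (e) => a(e) && b(e); // &
Predicate or(Predicate a, Predicate b) => (e) => a(e) || b(e); // |
Predicate not(Predicate a) => (e) => !a(e); // !

/// Rough equivalent of `{name}`: the value is an object with that key.
Predicate hasKey(String key) =>
    (e) => e.value is Map && (e.value as Map).containsKey(key);

/// Path test; the regular expression is an assumption for illustration.
Predicate pathMatches(String pattern) {
  final re = RegExp(pattern);
  return (e) => re.hasMatch(e.path);
}

void main() {
  final entries = [
    Entry('/items/0/', {'name': 'a'}),
    Entry('/items/1/', {'id': 2}),
    Entry('/other/0/', {'name': 'b'}),
  ];
  final underItems = pathMatches(r'^\/items\/');

  // Roughly "{name} & /items/"; swapping the operands gives the same result.
  for (final e in entries.where(and(hasKey('name'), underItems))) {
    print(e.path); // /items/0/
  }
  // Roughly "{name} & !/items/".
  for (final e in entries.where(and(hasKey('name'), not(underItems)))) {
    print(e.path); // /other/0/
  }
}
```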
Future #
After some benchmarking I found that these functions are very expensive, so I added futures to the mix. These futures are optional, but must not be null.
What order are futures executed? #
- trailing
- filter
In most cases, not all parts of an object are present in the first value instance. In this case a transformer is handy for StreamReader, while other methods can be used for Reader.
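As an illustration of the transformer idea, the following sketch merges partial objects that arrive under the same key and emits the merged result when the stream ends. It uses only dart:async; `Item` is a hypothetical stand-in for StreamItem, `mergeByKey` is not part of the package, and the merge-on-done policy is one possible reading of the sentence above.

```dart
import 'dart:async';

/// Hypothetical stand-in for the package's StreamItem (value plus key).
class Item {
  final String key;
  final dynamic value;
  const Item(this.key, this.value);
}

/// Buffers object values per key and emits one merged object per key when
/// the source stream is done. Non-object values pass through immediately.
StreamTransformer<Item, Item> mergeByKey() {
  final merged = <String, Map<String, dynamic>>{};
  return StreamTransformer<Item, Item>.fromHandlers(
    handleData: (item, sink) {
      final value = item.value;
      if (value is Map<String, dynamic>) {
        merged.putIfAbsent(item.key, () => <String, dynamic>{}).addAll(value);
      } else {
        sink.add(item);
      }
    },
    handleDone: (sink) {
      merged.forEach((key, value) => sink.add(Item(key, value)));
      sink.close();
    },
  );
}

Future<void> main() async {
  final source = Stream.fromIterable([
    Item('first', {'id': 1}),
    Item('first', {'name': 'a'}),
    Item('second', 2),
  ]);
  await for (final item in source.transform(mergeByKey())) {
    print('${item.key} -> ${item.value}');
  }
}
```

Buffering until the end keeps the sketch short but holds every merged object in memory; a real transformer would more likely emit each object as soon as it is known to be complete.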
Benchmarks
- 02.json takes 2 milliseconds to execute.
- citylots.json takes 8 minutes to execute, mostly because of how complex the dataset is.
- www.carqueryapi.com takes about 2.3 milliseconds. Please note that the file had to be modified: I removed ?( from the beginning and ); from the end.
Execution time depends exclusively on the complexity of the file and the number of items to be processed, e.g. a file 3 times larger than citylots.json can take approximately five minutes to execute. Using combine: 2 can speed up processing considerably.
Assigning Multiple Readers #
// Assumes dart:io is imported and `path` is a Directory (for example from path_provider).
File file = new File(path.path + '/file.json');

// First reader: handles the first half of the file.
Stopwatch stopwatch = new Stopwatch()..start();
Streamer s = new Streamer(file, chunksize: 100);
Reader r = new Reader(s, delay: 10, start: 8, end: 6170007);
int item = 0;
r.filter('\$.*', (dynamic value, String key) {
  item++; // count the items seen by this reader
  // Drop the references so the values can be garbage collected.
  value = null;
  key = null;
  //print("value is ${value} and key is ${key} of stream 1");
}).done(() {
  print('${item} executed in ${stopwatch.elapsed}');
}).progress((double progress) {
  print('progress is ${progress}');
}).fail((err) {
  print(err);
  print(err.stackTrace);
});

// Second reader: handles the second half of the same file.
Stopwatch stopwatch2 = new Stopwatch()..start();
Streamer s2 =
    new Streamer(file, chunksize: 100, start: 6170016, end: 12340015);
Reader r2 = new Reader(s2, delay: 0);
r2.filter('\$.*', (dynamic value, String key) {
  value = null;
  key = null;
  //print("value is ${value} and key is ${key} of stream 2");
}).done(() {
  print('executed in ${stopwatch2.elapsed}');
}).progress((double progress) {
  print('progress is ${progress}');
}).fail((err) {
  print("error 2 is ${err}");
  print(err.stackTrace);
});
This project was tested using
- https://github.com/zemirco/sf-city-lots-json/blob/master/citylots.json
- https://github.com/thaiwsa/aws-speed/blob/master/JsonProcess/jsondata/02.json
- http://www.carqueryapi.com/api/0.3/?callback=?&cmd=getMakes&year=1970&sold_in_us=1&utm_medium=referral&utm_campaign=ZEEF&utm_source=https%3A%2F%2Fjson-datasets.zeef.com%2Fjdorfman
- https://catalogue.data.gov.bc.ca/dataset/children-and-family-development-cases-in-care-demographics