
Dataset

There are many different interpretations of what a dataset is. Some people call the feature set a dataset, others call the result set a dataset, and still others use the term for the feature set and result set together. So this application needs to state what it means by a dataset.

Dluid defines a dataset as a bundle of a feature set and a result set. Both the feature set and the result set are record sets.

DataSet : FeatureSet + ResultSet
FeatureSet is a RecordSet
ResultSet is a RecordSet
RecordSet is a set of Records
Record is one row in a table.
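
The relationships above can be sketched as a small class hierarchy. This is only an illustration in Python, not Dluid's actual implementation: the class names mirror the terms used here, while the `columns` and `records` attributes and the constructor signatures are assumptions made for the example.

```python
class RecordSet:
    """A collection of records, where each record is one row of a table."""

    def __init__(self, columns, records):
        self.columns = list(columns)
        self.records = [tuple(r) for r in records]
        # Every record must have one value per column.
        for record in self.records:
            if len(record) != len(self.columns):
                raise ValueError("record width does not match the column count")

    def __len__(self):
        return len(self.records)


class FeatureSet(RecordSet):
    """Record set holding only the input (feature) columns."""


class ResultSet(RecordSet):
    """Record set holding only the output (result) columns."""


class DataSet(RecordSet):
    """A feature set and a result set bundled row by row (hypothetical layout)."""

    def __init__(self, features, results):
        if len(features) != len(results):
            raise ValueError("feature set and result set need the same number of records")
        # Each data set record is the feature row followed by the result row.
        merged = [f + r for f, r in zip(features.records, results.records)]
        super().__init__(features.columns + results.columns, merged)
        self.features = features
        self.results = results
```

Making `DataSet` itself a `RecordSet` is one possible design choice; it keeps all three kinds of set usable wherever a plain record set is expected.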

For example, consider the XOR data set below.

|a|b|a xor b|
|:---:|:---:|:---:|
|0|0|0|
|0|1|1|
|1|0|1|
|1|1|0|

In this case, the feature set is:

|a|b|
|:---:|:---:|
|0|0|
|0|1|
|1|0|
|1|1|

And the result set is:

|a xor b|
|:---:|
|0|
|1|
|1|
|0|

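And each record is one row of the data set:

|a|b|a xor b|
|:---:|:---:|:---:|
|0|0|0|

|a|b|a xor b|
|:---:|:---:|:---:|
|0|1|1|

|a|b|a xor b|
|:---:|:---:|:---:|
|1|0|1|

|a|b|a xor b|
|:---:|:---:|:---:|
|1|1|0|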

A record set is a collection of records. So the data set, the feature set, and the result set are all subtypes of record set.
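
As a rough illustration of the XOR example, the snippet below builds the four records and splits them into a feature set and a result set. It uses plain Python tuples and lists; the helper `split_records` and its `n_features` parameter are hypothetical names chosen for this sketch and are not part of Dluid.

```python
# Each tuple is one record of the XOR data set: (a, b, a xor b).
xor_records = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]


def split_records(records, n_features):
    """Split every record into its feature columns and its result columns."""
    features = [record[:n_features] for record in records]
    results = [record[n_features:] for record in records]
    return features, results


# The first two columns (a, b) are features; the last column is the result.
feature_set, result_set = split_records(xor_records, n_features=2)

for features, result in zip(feature_set, result_set):
    print(features, "->", result)
```

Both halves keep the original row order, so the record at a given position in the feature set still corresponds to the record at the same position in the result set.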

* Origin of the sample data sources

| |link|
|:---:|:---|
|iris|https://github.com/deeplearning4j/dl4j-examples/tree/master/dl4j-examples/src/main/resources|
|housing|https://github.com/chendaniely/pandas_for_everyone/tree/master/data|
|wine|https://github.com/chendaniely/pandas_for_everyone/tree/master/data|
|stock|https://github.com/chendaniely/pandas_for_everyone/tree/master/data|
|mnist|dataset in dl4j.|
|Stock train data|extracted date from 2014-07-24 to 2016-08-03|
|Stock test data|extracted date from 2016-08-04 to 2016-08-25|