Dataset Management
Training data is the data used to train a model or algorithm. This training data can be uploaded to and managed in navio as datasets. A training dataset can either be packaged in the archive and uploaded together with the model, or it can be uploaded separately. Datasets are shared across an entire workspace.
A user can upload datasets separately to navio in two ways via the user interface on the Datasets page:
- Upload a dataset as a CSV file
- Import data from a relational database via a JDBC connection
Dataset Format
Each column must contain exactly one feature, and the target variable must be in the last column.
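As a minimal sketch of preparing data in this layout, the Python snippet below moves a named target column to the last position before the rows are written out as CSV. The helper names `move_target_last` and `to_csv_text` are hypothetical and not part of navio, and the presence of a header row with column names is an assumption made for illustration.

```python
import csv
import io


def move_target_last(rows, target):
    """Return a copy of `rows` with the `target` column moved to the end.

    `rows` is a list of lists; the first row is assumed to be a header
    holding the column names (an assumption of this sketch).
    """
    header = rows[0]
    if target not in header:
        raise ValueError(f"target column {target!r} not found in header")
    idx = header.index(target)
    # Keep all other columns in their original order, then append the target.
    return [row[:idx] + row[idx + 1:] + [row[idx]] for row in rows]


def to_csv_text(rows):
    """Serialize rows to CSV text using the standard library writer."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

For example, calling `move_target_last(rows, "price")` on a table whose header is `price, size, rooms` yields a table with the header `size, rooms, price`, which can then be serialized with `to_csv_text` and saved as a file for upload.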
Upload as CSV File
The training dataset CSV file can be at most 1GB in size. This limit is set through an environment variable and can be reconfigured if required.
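A quick local check can catch oversized files before an upload is attempted. The sketch below is not part of navio: the function name `fits_upload_limit` is hypothetical, and because the text does not say whether 1GB means 10^9 or 2^30 bytes, the more conservative 10^9 is assumed. If the limit has been reconfigured in your deployment, pass the configured value as `limit`.

```python
import os

# Assumption: 1GB is interpreted as 10**9 bytes (the conservative reading).
DEFAULT_LIMIT_BYTES = 10**9


def fits_upload_limit(path, limit=DEFAULT_LIMIT_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```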
Import Data via a Database
Tables from a relational database can be imported as datasets to navio using a JDBC connection. Enter your connection string and credentials, then select the table you want to import. Keep in mind that the table must follow the dataset format: each column contains exactly one feature, and the target variable is in the last column.
Tip: Check out this page for more information about JDBC.