
Dataset Management

Training data is the data used to train a model or algorithm. This training data can be uploaded to and managed in navio as datasets. A training dataset can either be uploaded together with the model, packaged in the same archive, or uploaded separately. Datasets are shared across an entire workspace.

A user can upload datasets separately to navio in two ways via the user interface on the Datasets page:

  1. Upload a dataset as a CSV file
  2. Import data from a relational database via a JDBC connection

[Image: upload datasets]

Dataset Format

Each column must contain exactly one feature, and the target variable must be in the last column.
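The layout rule above can be checked locally before uploading. The following is a minimal sketch, not part of navio: the function name `split_features_and_target` and the sample data are hypothetical, and it assumes the CSV has a header row, which the format description does not state.

```python
import csv
import io


def split_features_and_target(csv_text):
    """Split CSV text into feature names, target name, feature rows and targets.

    Assumptions (not confirmed by the navio docs): the first row is a header,
    and every row has the same number of columns as the header.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    width = len(header)
    for line_no, row in enumerate(data, start=2):
        if len(row) != width:
            raise ValueError(
                f"line {line_no}: expected {width} columns, got {len(row)}"
            )
    # One feature per column; the target variable is the last column.
    features = [row[:-1] for row in data]
    target = [row[-1] for row in data]
    return header[:-1], header[-1], features, target


# Hypothetical sample dataset: two feature columns, target in the last column.
SAMPLE = "sepal_length,sepal_width,species\n5.1,3.5,setosa\n6.2,2.9,versicolor\n"
feature_names, target_name, features, target = split_features_and_target(SAMPLE)
```

Here `target_name` is the last header entry and `features` holds every other column, which mirrors the layout navio expects.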

Upload as CSV File

The training dataset CSV file can be at most 1 GB in size. This limit is set through an environment variable, so it can be reconfigured if required.
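A file can be checked against the size limit before an upload is attempted. This is a minimal sketch under stated assumptions: the helper `fits_upload_limit` is hypothetical, and 1 GB is interpreted as 1024**3 bytes, which may differ from how a given navio deployment counts the limit.

```python
import os

# Assumption: 1 GB is taken as 1024**3 bytes. A deployment that reconfigures
# the limit would need a different value passed as max_bytes.
DEFAULT_MAX_BYTES = 1024 ** 3


def fits_upload_limit(path, max_bytes=DEFAULT_MAX_BYTES):
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return os.path.getsize(path) <= max_bytes
```

If the limit has been changed on the server, pass the configured value as `max_bytes` instead of relying on the default.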

Import Data via a Database

Tables from a relational database can be imported into navio as datasets using a JDBC connection. Enter your connection string and credentials, then select the table you want to import. The table must follow the dataset format described above: one feature per column, with the target variable in the last column.
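For orientation, a JDBC connection string for PostgreSQL has the form `jdbc:postgresql://host:port/database`. The sketch below assembles one. PostgreSQL is used only as an example, since this page does not list which databases navio supports; the helper name, host, and database name are hypothetical.

```python
def postgres_jdbc_url(host, database, port=5432):
    """Build a PostgreSQL JDBC connection string.

    5432 is the default PostgreSQL port. Credentials are deliberately left
    out of the URL, since they are entered separately.
    """
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Hypothetical host and database name, for illustration only.
url = postgres_jdbc_url("db.example.com", "training")
```

Other databases use their own prefix and URL layout, so check the documentation of the JDBC driver in use.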

Tip: Check out this page for more information about JDBC.

[Image: database import data]