Model Retraining
A navio MLflow model can be retrained by associating a Retrainer with it. A Retrainer is itself an MLflow model that implements the training and packaging steps for the retrained model in its predict() method. During retraining, the model_input frame passed to the Retrainer's predict() method will contain one row with the following fields:
dataPath
string path to the data parquet to be used for the model's retraining. The parquet is a folder readable via, e.g., pandas.read_parquet (with PyArrow installed).
destinationPath
string path specifying where to save the retrained model artifact as a zip file.
At the end of retraining, navio expects a response from the Retrainer containing the absolute path to a valid MLflow zip artifact of the retrained model. Therefore, the Retrainer's predict() method should return an object of the form {"prediction": ["absolute/path/to/zip"]}.
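The contract described above can be illustrated with a minimal sketch. It is not navio's own implementation: the class name Retrainer and the injected train_and_package callable are hypothetical, and the actual training and MLflow packaging logic is left to that callable. In a real deployment the class would presumably subclass mlflow.pyfunc.PythonModel, whose predict(context, model_input) signature it mirrors; the base class is omitted here so the example runs with the standard library alone.

```python
import os
from typing import Any, Callable


class Retrainer:
    """Sketch of a Retrainer (assumption: would subclass mlflow.pyfunc.PythonModel)."""

    def __init__(self, train_and_package: Callable[[str, str], None]):
        # Hypothetical user-supplied callable: trains on the parquet at
        # data_path and writes the retrained MLflow model as a zip file
        # to destination_path.
        self._train_and_package = train_and_package

    def predict(self, context: Any, model_input: Any) -> dict:
        # model_input holds exactly one row. list(...)[0] reads the first
        # value from a pandas column as well as from a plain list.
        data_path = str(list(model_input["dataPath"])[0])
        destination_path = str(list(model_input["destinationPath"])[0])

        self._train_and_package(data_path, destination_path)

        # The response must carry the absolute path of the zip artifact.
        zip_path = os.path.abspath(destination_path)
        if not os.path.isfile(zip_path):
            raise FileNotFoundError(f"no retrained model found at {zip_path}")
        return {"prediction": [zip_path]}
```

Inside train_and_package, the data could be loaded with pandas.read_parquet(data_path) before fitting and packaging the new model. The existence check only guards against returning a path to a file that was never written; it does not verify that the zip is a valid MLflow artifact.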
Thus, the Request Schema for all Retrainer models should look like:
{
    "featureColumns": [
        {
            "name": "dataPath",
            "sampleData": "/mnt/data.csv",
            "type": "string",
            "nullable": false
        },
        {
            "name": "destinationPath",
            "sampleData": "/mnt/model.zip",
            "type": "string",
            "nullable": false
        }
    ],
    "targetColumns": [
        {
            "name": "zipAbsPath",
            "sampleData": "/mnt/model.zip",
            "type": "string",
            "nullable": false
        }
    ]
}
Retrainer Request Schema
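Before uploading a Retrainer, it may help to check locally that its request schema declares the two expected feature columns. The helper below is a hypothetical convenience, not part of navio or MLflow; it assumes the schema has been loaded into a Python dict (for example with json.load) and treats a column as valid only if it is a non-nullable string, as in the schema shown above.

```python
REQUIRED_FEATURES = ("dataPath", "destinationPath")


def missing_retrainer_columns(schema: dict) -> list:
    """Return required feature columns that are absent or not non-nullable strings."""
    columns = {col.get("name"): col for col in schema.get("featureColumns", [])}
    problems = []
    for name in REQUIRED_FEATURES:
        col = columns.get(name)
        if col is None or col.get("type") != "string" or col.get("nullable") is not False:
            problems.append(name)
    return problems
```

An empty list means both columns are declared as expected; otherwise the returned names point to the entries that need fixing.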