Model Retraining

A navio MLflow model can be retrained by associating a Retrainer with it. A Retrainer is itself an MLflow model whose predict() method implements the training and packaging steps for the retrained model. During retraining, the model_input frame passed to the Retrainer's predict() method contains a single row with the following fields:

  • dataPath: string path to the parquet data to use for retraining the model. The parquet is a folder that can be read with, e.g., pandas.read_parquet (with PyArrow installed).

  • destinationPath: string path specifying where to save the retrained model artifact as a zip file.

At the end of retraining, navio expects a response from the Retrainer containing the absolute path to a valid MLflow zip artifact of the retrained model. The Retrainer's predict() method should therefore return an object of the form {"prediction": ["absolute/path/to/zip"]}.
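The contract described above can be sketched roughly as follows. This is a minimal illustration, not navio's reference implementation. The helper train_and_save_model is hypothetical and only writes a placeholder file, so the sketch runs with the standard library alone. A real Retrainer would presumably subclass mlflow.pyfunc.PythonModel, read the data with pandas.read_parquet, and save a real MLflow model before zipping it. The layout of the zip (model files at the archive root) is also an assumption.

```python
import shutil
import tempfile
from pathlib import Path


def train_and_save_model(data_path: str, model_dir: Path) -> None:
    """Hypothetical stand-in for the real training and packaging step.

    A real Retrainer would load the parquet at data_path, fit a model, and
    write it to model_dir as an MLflow model. Only a placeholder file is
    written here so that the sketch has no third-party dependencies.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "MLmodel").write_text(f"# placeholder, trained on {data_path}\n")


class Retrainer:
    # Assumption: in a real deployment this class would subclass
    # mlflow.pyfunc.PythonModel; the predict signature stays the same.
    def predict(self, context, model_input):
        # model_input holds exactly one row. Column access followed by [0]
        # works for a pandas DataFrame with the default index.
        data_path = str(model_input["dataPath"][0])
        destination = Path(str(model_input["destinationPath"][0])).resolve()

        with tempfile.TemporaryDirectory() as tmp:
            model_dir = Path(tmp) / "model"
            train_and_save_model(data_path, model_dir)
            # Zip the contents of model_dir, then move the archive to the
            # exact path navio asked for.
            archive = shutil.make_archive(
                str(Path(tmp) / "retrained"), "zip", root_dir=str(model_dir)
            )
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(archive, str(destination))

        # navio expects the absolute path of the zip under "prediction".
        return {"prediction": [str(destination)]}
```

Building the archive in a temporary directory and moving it afterwards keeps the returned path identical to destinationPath, whatever file extension that path carries.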

The Request Schema for every Retrainer model should therefore look like this:

{
  "featureColumns": [
    {
      "name": "dataPath",
      "sampleData": "/mnt/data.csv",
      "type": "string",
      "nullable": false
    },
    {
      "name": "destinationPath",
      "sampleData": "/mnt/model.zip",
      "type": "string",
      "nullable": false
    }
  ],
  "targetColumns": [
    {
      "name": "zipAbsPath",
      "sampleData": "/mnt/model.zip",
      "type": "string",
      "nullable": false
    }
  ]
}

Retrainer Request Schema
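As a rough illustration of how the schema relates to the single-row input, the sketch below loads the schema and checks a row against its featureColumns. The helper missing_or_invalid is hypothetical and is not part of navio or MLflow. It only covers the "string" type and the "nullable" flag that appear in this schema.

```python
import json

# The Retrainer Request Schema from above, parsed into a Python dict.
REQUEST_SCHEMA = json.loads("""
{
  "featureColumns": [
    {"name": "dataPath", "sampleData": "/mnt/data.csv",
     "type": "string", "nullable": false},
    {"name": "destinationPath", "sampleData": "/mnt/model.zip",
     "type": "string", "nullable": false}
  ],
  "targetColumns": [
    {"name": "zipAbsPath", "sampleData": "/mnt/model.zip",
     "type": "string", "nullable": false}
  ]
}
""")


def missing_or_invalid(row: dict, columns: list) -> list:
    """Return the names of columns that are absent or null although not
    nullable, or that are declared as strings but hold another type."""
    problems = []
    for col in columns:
        value = row.get(col["name"])
        if value is None:
            if not col["nullable"]:
                problems.append(col["name"])
        elif col["type"] == "string" and not isinstance(value, str):
            problems.append(col["name"])
    return problems


# Build a one-row input from the sampleData values in the schema.
sample_row = {c["name"]: c["sampleData"] for c in REQUEST_SCHEMA["featureColumns"]}
```

Calling missing_or_invalid(sample_row, REQUEST_SCHEMA["featureColumns"]) returns an empty list, while a row without destinationPath would be reported by name.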