Model Retraining

A navio MLflow model can be retrained by associating a Retrainer with it. A Retrainer is itself an MLflow model that implements the training and packaging steps for the retrained model in its predict() method. During retraining, the model_input frame passed to the Retrainer's predict() method contains one row with the following fields:

dataPath (string): path to the Parquet data to be used for retraining the model. The Parquet dataset is a folder readable via, e.g., pandas.read_parquet (with PyArrow installed).
destinationPath (string): path specifying where to save the retrained model artifact as a zip file.

At the end of retraining, navio expects a response from the Retrainer containing the absolute path to a valid MLflow zip artifact of the retrained model. Therefore, the Retrainer's predict() method should return an object in the form {"prediction": ["absolute/path/to/zip"]}.
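The packaging and response steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not navio's or MLflow's own implementation. The names zip_model_dir, retrain, and train_fn are hypothetical, and the sketch assumes that the files of the MLflow model directory (including MLmodel) belong at the root of the zip. In a real Retrainer, this logic would be called from predict(), with the row taken from model_input, for example via model_input.iloc[0].to_dict().

```python
import os
import zipfile


def zip_model_dir(model_dir, destination_path):
    """Zip the contents of model_dir to destination_path; return its absolute path."""
    destination_path = os.path.abspath(destination_path)
    os.makedirs(os.path.dirname(destination_path), exist_ok=True)
    with zipfile.ZipFile(destination_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(model_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to model_dir so its files sit at the zip root
                # (an assumption about the expected artifact layout).
                zf.write(full_path, arcname=os.path.relpath(full_path, model_dir))
    return destination_path


def retrain(row, train_fn):
    """Run one retraining request and build the response for navio.

    row      -- mapping with the "dataPath" and "destinationPath" fields
    train_fn -- hypothetical callable: trains on the Parquet data at the given
                path, saves an MLflow model directory, and returns that directory
    """
    model_dir = train_fn(row["dataPath"])
    zip_path = zip_model_dir(model_dir, row["destinationPath"])
    # Response shape from the text: {"prediction": ["absolute/path/to/zip"]}
    return {"prediction": [zip_path]}
```

Keeping the training step behind a separate callable leaves the packaging and response logic independent of any particular ML library, so it can be exercised without a trained model.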

The Request Schema for every Retrainer model should therefore look like this:

{
  "featureColumns": [
    {
      "name": "dataPath",
      "sampleData": "/mnt/data.csv",
      "type": "string",
      "nullable": false
    },
    {
      "name": "destinationPath",
      "sampleData": "/mnt/model.zip",
      "type": "string",
      "nullable": false
    }
  ],
  "targetColumns": [
    {
      "name": "zipAbsPath",
      "sampleData": "/mnt/model.zip",
      "type": "string",
      "nullable": false
    }
  ]
}

Retrainer Request Schema