# trainingClient
Performs OAuth2 authentication with oauthClientId and oauthClientSecret against oauthTokenUrl and sends the request to the server.
The training data received at the inport and the attached model script are uploaded to the jobAPI's internal storage. With the authentication token, a submit request built from jobName, env, image, completionTime, and resources is sent to the jobAPI.
The server responds with the job status.
If the job was scheduled as expected, training logs are continuously sent to the outport. If the job ends with status SUCCESSFUL, the SavedModel is retrieved and sent to the outport as well.
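The authentication step above can be sketched as follows. This is a minimal illustration assuming a standard OAuth2 client-credentials grant; the helper names (`fetch_token`, `auth_header`) are hypothetical and not part of the operator:

```python
import base64
import json
import urllib.parse
import urllib.request


def fetch_token(token_url, client_id, client_secret):
    """Request an OAuth2 access token via the client-credentials grant."""
    creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    req = urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": "Basic " + creds,
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["access_token"]


def auth_header(token):
    """Bearer header attached to subsequent jobAPI requests."""
    return {"Authorization": "Bearer " + token}
```

The token returned by the server is then sent with every request to the jobAPI.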
## Configuration Parameters

| Parameter | Type | Description |
|---|---|---|
| oauthClientId | string | Client ID used for the OAuth2 authentication. |
| oauthClientSecret | string | Client secret used for the OAuth2 authentication. |
| oauthTokenUrl | string | URL of the endpoint where the OAuth2 authentication is performed. |
| jobAPI | string | URL to which the job is submitted, from which the logs are retrieved, and from which the storage endpoint is acquired. |
| jobName | string | Internal name of the job. |
| env | list | Environment variables used in your model. Use a JSON array of name/value objects. The variables defined here are passed as parameters to your model script, e.g. `[{"name":"TEST_DATA","value":"..."}]` is passed as `--test_data=...`. |
| image | string | The Docker image that you want to run, e.g. `tensorflow/tensorflow:1.4.0`. |
| completionTime | string | The time you want to allocate for running the job. Currently this is a required but not yet implemented free-form string; the MLF team may later schedule based on this signal and enforce the time limit. |
| resources | object | Resources that your job needs. Use a JSON object that defines `gpus`, `cpus`, and `memory` (type integer), e.g. `{"cpus":1,"gpus":1,"memory":2048}`. |
| requirements | string | pip dependencies of your model script, e.g. `tensorflow==1.4.0`. |
| pythonMajorVersion | integer | Major Python version of your model script. Allowed values are 2 and 3. If set to 3, the requirements are installed with pip3 and the script is run with the python3 command. |
| version | string | API version. |
| script | string | The actual script that defines the machine learning model. An example can be found at https://github.com/tensorflow/models/blob/master/official/wide_deep/wide_deep.py |
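The env-to-argument translation described in the table can be illustrated with a small sketch. The helper `env_to_args` is hypothetical; it only mirrors the documented behavior (`TEST_DATA` becomes `--test_data=...`):

```python
def env_to_args(env):
    """Translate the `env` JSON array into the command-line flags the
    model script receives: each NAME becomes --name=value (lower-cased)."""
    return ["--{}={}".format(entry["name"].lower(), entry["value"])
            for entry in env]


env = [{"name": "TEST_DATA", "value": "/tmp/test.csv"}]
print(env_to_args(env))  # ['--test_data=/tmp/test.csv']
```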
## Input

| Input | Type | Description |
|---|---|---|
| config | message | Input to dynamically change the configuration. Only the message headers are considered. If a header field has the same name as a configuration parameter, the parameter is overridden. |
| data_input | message | The message header, which is a dictionary, is treated as the list of files to upload to the jobAPI storage. The dictionary is expected to have the form `{"<file name>": <data>}`. |
## Output

| Output | Type | Description |
|---|---|---|
| response | string | The logs returned by the training process, polled every second. This port is intended to be connected to a terminal. |
| trainingStatus | message | The current training status is in the message headers; if training has succeeded, the message body contains the SavedModel. A message is sent every minute. An example while the model is being trained: `{"containerID":"","finishTime":null,"job_id":"","name":"","startTime":"","status":"Running","submissionTime":""}`. After training has finished, the saved model is sent. |
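A downstream operator can inspect the trainingStatus headers to decide when the SavedModel has arrived. A minimal sketch, where `handle_training_status` is a hypothetical consumer rather than part of this operator:

```python
import json

# The example status headers from the table above, while the job is running.
RUNNING_EXAMPLE = ('{"containerID":"","finishTime":null,"job_id":"","name":"",'
                   '"startTime":"","status":"Running","submissionTime":""}')


def handle_training_status(headers, body):
    """Return the current status; keep the body only once training succeeded."""
    status = headers.get("status")
    saved_model = body if status == "SUCCESSFUL" else None
    return status, saved_model


headers = json.loads(RUNNING_EXAMPLE)
print(handle_training_status(headers, b""))  # ('Running', None)
```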