Machine Learning On OpenLab(MOO) aims to help the data experts to reproduce the machine learning environment. This project is based on OpenLab now.
The reason for MOO, as Machine Learning is hard to reproduce cross multiple environments, as ML lacks:
These should be defined declaratively, using vendor neutral tech, to simplify reuse regardless of the choice of environment.
So MOO will try to define the model metadata to describe the environment of trained model, such as using tensorflow, version 1.12.0 or 2.0.0, python runtime, python packages and some necessary model description. MOO will use the necessary model information to reproduce the training or inference environment, this includes:
The above is enabled via your pass to MOO.
MOO provides CPU machine learning environment. Then MOO will provide a environment to train and get training result to you, also if you want to access the environment to continue developing the model or providing some other new training data to push forward the train process, MOO will provide the related information to you.
OpenLab is a community program to test and improve support for the cloud-related SDKs/Tools as well as platforms like Kubernetes, Terraform, CloudFoundry and more. The goal is to improve the usability, reliability and resiliency of tools and applications for hybrid and multi-cloud environments. All of the Machine Learning environments are provided by OpenLab, OpenLab will manage the life cycle of the environment. If you want to know more, please visit OpenLab
MOO provides a web UI for simplify the whole things. But before start using MOO, you need to do the following things first.
Module ZIP file preparation
MOO will accept a ZIP file or URL which direct to a ZIP file. The ZIP file includes:
metadata.yaml
The YAML needs to try to descript the model enviroment and model itself as much as possible.
model python file(Currently only support tensorflow python script)
The python file is treated as the entrypoint to training the specific model. Notes: The model file provided must be compssed to ZIP format.
Decide the host place of the ZIP file
Host on the website, such as the Object Storage Sever in Huawei Public Cloud.
MOO supports fetching the ZIP file via URL, you can just provide a URL to MOO, it will download the ZIP file and begin the train process automaticly.
On your local enviroment.
MOO also supports uploading the ZIP file directly.
CPU
Right now CPU backend is supported
Here, we had prepared the model file. We can access the web UI then.
a. Click "WorkFlow" Sub-Tag:
b. Choose a way to upload your Model ZIP file:
c. Choose CPU:
d. Login the Environment or not(If running SBIR job, this section must be ticked):
e. Click Submit Button and Enjoy. You will see the Job process histroy here:
Now your model had been submited successful, if you want to see the your Job status, you can just Click "Get Result" Button or forward to "Status"
Or
Here example, we can see the Job name is moo-job-hqx9uo, which is a CPU version Job and we don't want to login the enviroment(we don't "√" Login the Environment or not). We will jump to Status page and see the job status. There are several columes:
Job Name
The submited job name.
Credential
The Username and Password to login the ML enviroment provided by MOO.
IP Address
The public ip address and external port through ssh.
ML Result Link
The process log or already trained model url. You can access the model after training by MOO enviroment and the training log.
Inference
The inference page after model training complete.
Status The MOO job status.
Then, let's search the moo-job-hqx9uo job.
In above shows, moo-job-hqx9uo job had been defined as a non login enviroment job. So we can see the Status of moo-job-hqx9uo job is SUCCESS, the ML Result Link and Inference had been generated. If you want to login the Environment, MOO will keep the enviroment for a while, so ML Result Link will be replaced by "Please login the ML enviroment instead!", and won't generate the Inference page.
f. Go to the Result Link
In "process.log", you will see the whole training histroy about your model. In "training_result" directory, we will see the new trained model.
g. Taste Online Inference
Now click Inference Go Button to access the Inference page. Upload a new picture or file for pdict, and see the pdiction result in the following Result block.
Model Metadata spec aims to describe the model train enviroment, module application and parameters during training process as much as possible.
We'd like to bring up a standard here to describe machine learning environment.The standard is machine learning metadata spec. Typically, operators can reproduce the environment easily with the metadata yaml file. The spec is now just in POC. More feedback is welcome.
Framework
Key | Type | Description |
---|---|---|
name | String | The framework name |
version | String | The framework version |
runtime | Object | The framework runtime object |
runtime
Key | Type | Description |
---|---|---|
name | String | The framework runtime name |
requirement(optional) | List | The third-part library that the job depends on |
Model
Key | Type | Description |
---|---|---|
name | String | The name of model |
version | String | The version of model |
source | String | The address of model file. It can be a remote or local one |
creator | String | The author of model |
time | Date | The creation time of model |
Dataset
Key | Type | Description |
---|---|---|
name | String | TBD |
version | String | TBD |
source | String | TBD |
data_process
Key | Type | Description |
---|---|---|
data_load | TBD | TBD |
data_split | TBD | TBD |
padding | TBD | TBD |
model_architecture
Key | Type | Description |
---|---|---|
TBD | TBD | TBD |
TBD | TBD | TBD |
TBD | TBD | TBD |
training_params
Key | Type | Description |
---|---|---|
learning_rate | Float | TBD |
loss | String | TBD |
batch_size | Int | TBD |
epoch | Int | TBD |
optimizer | String | TBD |
train_op | String | TBD |
framework:
name: tensorflow
version: 1.12.0
runtime: python2.7
model:
name: EMNIST
entry_point: tensorflow_test_mnist.py
output_folder: training_result