Overview of the Google Cloud Platform (GCP) Cloud ML Engine Product

Below is an overview of the Google Cloud Platform (GCP) Cloud ML Engine product.

Google Cloud Platform (GCP) Cloud ML Engine Product Overview

Google Cloud Platform (GCP)

A serverless way of running TensorFlow

TensorFlow

Features

Simplifies book-keeping of data pre-processing through to training and web application deployment

Development Workflow

1. Use TensorFlow to create a computational graph and training application

2. Package the trainer application

2.1 Package has 6 files

mypackage/PKG-INFO

mypackage/setup.cfg

mypackage/setup.py

mypackage/trainer/__init__.py - required in every folder

mypackage/trainer/task.py - entrypoint for the ML engine; parse command line; calls model.train_and_evaluate()

mypackage/trainer/model.py - fetch data; create feature columns and engineer; create serving input function; call tf.estimator.train_and_evaluate() loop

2.2 Test locally to ensure the Python code runs: python -m trainer.task --train_data_paths="..." --eval_data_paths="..." --output_dir=".../output" --train_steps=100 --job-dir="/tmp"

2.3 Test the package locally in ML Engine to ensure package is ok: gcloud ml-engine local train --module-name=trainer.task --package-path=".../mypackage/trainer" ... other task args

2.4 Submit to ML Engine: gcloud ml-engine jobs submit training $JOBNAME --region $REGION --module-name=trainer.task --job-dir=$OUTDIR --staging-bucket=gs://$BUCKET --scale-tier=BASIC ... other task args

3. Configure and start a Cloud ML Engine job
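The role of trainer/task.py described above (parse the command line, then hand off to the model code) can be sketched roughly as follows. This is a minimal illustration, not the course's actual file: the flag names are taken from the local test command in step 2.2, while the flag types, the defaults, and the assumption that model.train_and_evaluate() accepts the parsed arguments are guesses made for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that appear in the local test command (step 2.2)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_data_paths", required=True)
    parser.add_argument("--eval_data_paths", required=True)
    parser.add_argument("--output_dir", required=True)
    # Assumed type and default; the source only shows --train_steps=100.
    parser.add_argument("--train_steps", type=int, default=100)
    # The test command passes --job-dir, so accept it to keep parsing from failing.
    parser.add_argument("--job-dir", default="")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Imported lazily so the parser can be exercised without TensorFlow installed.
    from trainer import model  # trainer/model.py from the package layout
    # Assumption: train_and_evaluate takes the parsed arguments.
    model.train_and_evaluate(args)
```

In a real task.py, main() would be called under the usual `if __name__ == "__main__":` guard so that `python -m trainer.task ...` runs it.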

Deployment Workflow

1. Export the trained model - say to gs://$BUCKET

2. Deploy trained model to GCP as a microservice - gcloud or UI Console

Automatically scales up, and down to zero

2.1 gcloud ml-engine models create ${MODEL_NAME} --regions $REGION

2.2 gcloud ml-engine versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin "gs://${BUCKET}/${MODEL_NAME}/export/exporter/9876543"

3. Client code to make REST call

3.1 token = GoogleCredentials.get_application_default().get_access_token().access_token

3.2 api = 'https://ml.googleapis.com/v1/projects/{}/models/{}/versions/{}:predict'.format(PROJECT, MODEL_NAME, MODEL_VERSION)

3.3 headers = {'Authorization': 'Bearer ' + token }

3.4 response = requests.post(api, json=..., headers=headers)

3.5 print(response.content)
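The client steps above can be gathered into one small helper. The sketch below is illustrative only: the function names are made up for this example, the token is assumed to have been obtained already (step 3.1), and the request body shape {"instances": [...]} is an assumption about what goes in the elided json=... argument. The predict() function needs the third-party requests package, as in step 3.4.

```python
def build_predict_request(project, model_name, model_version, token, instances):
    """Return (url, headers, body) for an online prediction call."""
    url = "https://ml.googleapis.com/v1/projects/{}/models/{}/versions/{}:predict".format(
        project, model_name, model_version)
    headers = {"Authorization": "Bearer " + token}
    # Assumption: the inputs are wrapped in an "instances" list.
    body = {"instances": instances}
    return url, headers, body


def predict(project, model_name, model_version, token, instances):
    """Send the prediction request and return the decoded JSON response."""
    import requests  # imported lazily so the helper above has no dependencies
    url, headers, body = build_predict_request(
        project, model_name, model_version, token, instances)
    response = requests.post(url, json=body, headers=headers)
    response.raise_for_status()  # surface HTTP errors instead of printing them
    return response.json()
```

Keeping the URL and header construction separate from the network call makes it possible to check the request shape without contacting the service.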

Concepts

Serving input function - specifies what predict() will have to provide; similar to the input function used for training, but maps from the REST API JSON to the features

Scale Tiers

GCP scale tier documentation

BASIC - single machine

STANDARD_1 - small cluster - one master + 4 workers and 3 parameter servers

PREMIUM_1 - one master + 19 workers and 11 parameter servers

BASIC_GPU - one Tesla K80 GPU

BASIC_TPU - Master VM and a Cloud TPU with 8xTPU v2 cores

CUSTOM

Machine types for custom scale tiers

Hyper-parameter Tuning

Steps in Cloud ML Engine

1. Make the parameter a command-line argument

2. Ensure outputs contain something (like a trial number) so they do not overwrite each other

3. Supply hyper-parameters YAML file to the training job
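Steps 1 and 2 can be sketched in the trainer code as follows. This is a hedged example: --learning_rate is a hypothetical hyperparameter chosen for illustration, and the sketch assumes the service exposes the trial number to each trial through the task.trial field of the TF_CONFIG environment variable.

```python
import argparse
import json
import os


def parse_args(argv=None):
    """Step 1: expose the hyperparameter as a command-line argument."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_dir", required=True)
    # Hypothetical hyperparameter; the tuner would set it once per trial.
    parser.add_argument("--learning_rate", type=float, default=0.01)
    return parser.parse_args(argv)


def trial_output_dir(output_dir, tf_config=None):
    """Step 2: append the trial number so trials do not overwrite each other."""
    if tf_config is None:
        # Assumption: the trial number is provided in TF_CONFIG as task.trial.
        tf_config = os.environ.get("TF_CONFIG", "{}")
    trial = json.loads(tf_config).get("task", {}).get("trial", "")
    if trial == "":
        return output_dir  # not a tuning run; leave the path unchanged
    return output_dir.rstrip("/") + "/" + str(trial)
```

For example, with a trial number of 3 and an output directory of gs://b/out, the trial's outputs would go under gs://b/out/3.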