Overview of the Google Cloud Platform (GCP) Cloud Vision API

Below is an overview of the Google Cloud Platform (GCP) Cloud Vision API.

Google Cloud Platform (GCP)

Used for working with and recognising content in still images

GCP Cloud Vision documentation

Usage Pattern (versus AutoML Vision)

Objective - enable ML practitioners to harness the power of Google's ML for images

Primary use cases - face detection, OCR, object detection, etc.

Data requirements - just images (labelled or not)

Output format - JSON annotations for each requested feature

Customisation - none; the models are pretrained (training on custom labels is what AutoML Vision is for)

Effort - low; the pretrained models are called directly, so no end-to-end model development is needed
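As a rough illustration of the "just images" data requirement, the sketch below builds the JSON body for the REST `images:annotate` method: one base64-encoded image plus the list of feature types to run. The helper name `build_annotate_request` and the placeholder bytes are assumptions made for this example; authentication and the HTTP call itself are left out.

```python
import base64
import json

# Public REST endpoint for synchronous image annotation.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_types, max_results=10):
    """Return the JSON-serialisable body for one image and several features.

    Hypothetical helper for illustration; not part of any Google library.
    """
    return {
        "requests": [
            {
                # Raw image bytes travel as base64 text inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": t, "maxResults": max_results} for t in feature_types
                ],
            }
        ]
    }


# Placeholder bytes, not a real image; in practice read them from a file.
body = build_annotate_request(
    b"not-a-real-image", ["LABEL_DETECTION", "FACE_DETECTION"]
)
payload = json.dumps(body)  # the string that would be POSTed to ANNOTATE_URL
```

Sending `payload` to `ANNOTATE_URL` additionally needs an API key or an OAuth token, which depends on how the project is set up.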

Features

Label Detection - assigns labels drawn from a broad set of categories

Face Detection

Features

Multiple faces

Key facial attributes - returned in the "faceAnnotations" field of the response

Emotional state (surprise, anger, joy & sorrow likelihoods)

Headwear likelihood

Angles: roll, pan, tilt

Detection confidence

Underexposed likelihood

Blurred likelihood

Bounding polygon of face

Web Detection - web entities and similar images from the Internet; each web entity has an entityId, score and description

Optical Character Recognition (OCR) - text detection with support for multiple languages

Explicit Content Detection (SafeSearch) - adult or violent content

Landmark Detection - popular natural or man-made structures

Logo Detection - popular product logos
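To make the feature list above more concrete, here is a minimal sketch that maps each feature to the `type` value used in a REST request and the response field it fills, then condenses the face attributes listed under Face Detection. The function `summarise_faces`, the `FEATURES` table layout and the sample response are assumptions for illustration; the sample values are invented, not real API output.

```python
# Likelihood values the API reports, ordered from least to most likely.
LIKELIHOODS = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]

# Feature -> (request "type", field of the per-image response it fills).
FEATURES = {
    "Label Detection": ("LABEL_DETECTION", "labelAnnotations"),
    "Face Detection": ("FACE_DETECTION", "faceAnnotations"),
    "Web Detection": ("WEB_DETECTION", "webDetection"),
    "OCR": ("TEXT_DETECTION", "textAnnotations"),
    "Explicit Content Detection": ("SAFE_SEARCH_DETECTION", "safeSearchAnnotation"),
    "Landmark Detection": ("LANDMARK_DETECTION", "landmarkAnnotations"),
    "Logo Detection": ("LOGO_DETECTION", "logoAnnotations"),
}


def summarise_faces(response, threshold="LIKELY"):
    """Condense each face annotation; hypothetical helper, not a Google API."""
    cutoff = LIKELIHOODS.index(threshold)

    def at_least(face, field):
        # True when the reported likelihood reaches the chosen threshold.
        return LIKELIHOODS.index(face.get(field, "UNKNOWN")) >= cutoff

    summaries = []
    for face in response.get("faceAnnotations", []):
        summaries.append({
            "confidence": face.get("detectionConfidence"),
            "angles": (
                face.get("rollAngle"), face.get("panAngle"), face.get("tiltAngle"),
            ),
            "emotions": [
                name for name in ("joy", "sorrow", "anger", "surprise")
                if at_least(face, name + "Likelihood")
            ],
            "headwear": at_least(face, "headwearLikelihood"),
        })
    return summaries


# Invented sample shaped like one per-image response; values are made up.
sample_response = {
    "faceAnnotations": [{
        "detectionConfidence": 0.98,
        "rollAngle": 1.5, "panAngle": -4.0, "tiltAngle": 2.0,
        "joyLikelihood": "VERY_LIKELY", "sorrowLikelihood": "VERY_UNLIKELY",
        "angerLikelihood": "VERY_UNLIKELY", "surpriseLikelihood": "UNLIKELY",
        "headwearLikelihood": "VERY_UNLIKELY",
    }]
}
faces = summarise_faces(sample_response)
```

Comparing likelihoods by their position in `LIKELIHOODS` is a design choice of this sketch: the API reports coarse buckets rather than probabilities, so a threshold such as `LIKELY` is about as fine-grained as a filter can be.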