Google Cloud Platform (GCP) Cloud Storage Product Overview
Features
Object storage (replace only), NOT a file store (cannot open an object for writing)
Globally unique bucket names
Max size of a single object is 5 TB
Durability: 11 nines (99.999999999%)
Object metadata
Versioning: `gsutil ls -a` shows all files, including all old versions of a file
Lifecycle Management
Lifecycle rules and actions - e.g. based on age, creation date, storage class then do an action, such as set it to nearline, coldline, delete it
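A lifecycle rule pairs a condition with an action. The sketch below builds a small configuration in the JSON shape Cloud Storage accepts for lifecycle rules. The function name, the 30-day and 365-day thresholds and the choice of two rules are illustrative assumptions, not values from these notes.

```python
import json


def build_lifecycle_config(nearline_after_days, delete_after_days):
    """Return a lifecycle configuration dict with two rules (hypothetical helper)."""
    return {
        "rule": [
            {
                # Move Standard objects to Nearline once they reach the given age.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {
                    "age": nearline_after_days,
                    "matchesStorageClass": ["STANDARD"],
                },
            },
            {
                # Delete objects once they reach the second age threshold.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


# Thresholds are assumptions chosen for illustration only.
config = build_lifecycle_config(nearline_after_days=30, delete_after_days=365)
lifecycle_json = json.dumps(config, indent=2)
print(lifecycle_json)
```

Saved to a file such as lifecycle.json, this could be applied with `gsutil lifecycle set lifecycle.json gs://my-bucket` (the bucket name is a placeholder).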
Change notifications - e.g. use Cloud Functions to do something with the object when it is uploaded, etc.
Sharing files
Signed URLs provide temporary access to storage buckets
1. Create key: `gcloud iam service-accounts keys create ~/key.json --iam-account storage-admin-sa@my-project.iam.gserviceaccount.com`
2. Create signed URL: `gsutil signurl -d 10m ~/key.json gs://my-secure-bucket/blah.jpg`
Static content
Give 'allUsers' read access
Access URL: https://storage.googleapis.com/bucketname/filename
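As a small illustration of the access URL pattern above, the sketch below assembles the public URL for an object and percent-encodes the object name. The helper name, bucket name and object name are hypothetical.

```python
from urllib.parse import quote


def public_object_url(bucket_name, object_name):
    """Build the public URL for an object (hypothetical helper)."""
    # Percent-encode the object name, but keep "/" so nested paths stay readable.
    encoded_name = quote(object_name, safe="/")
    return f"https://storage.googleapis.com/{bucket_name}/{encoded_name}"


# Bucket and object names are placeholders.
url = public_object_url("my-static-site", "images/team photo.jpg")
print(url)
```

The URL only resolves for anonymous users if 'allUsers' has been given read access as described above.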
Map bucket to your domain
GCP doco on static websites with Cloud Storage
Name the bucket using your domain or subdomain name
Point a CNAME record for your subdomain to c.storage.googleapis.com
Prove ownership with a TXT record
GCP doco for domain name verification
Set option on the bucket "edit website configuration" to specify Main page and 404 page
To serve as HTTPS
Guidance on HTTPS serving
Google doco on Firebase hosting, HTTPS and setting up a custom domain
Create a load balancer (under Network Services)
Backend config: point to your storage bucket, enable Cloud CDN optionally
Frontend config: http or https protocol, provide a certificate (have Google create for free (Let's Encrypt) or upload own)
Create A record to point to load balancer
Requester Pays capability - the requester is charged instead of you
Bucket Lock
Retention Policy - set a minimum duration to prevent modification or deletion of an object after it has been uploaded
Default event-based hold - when enabled on the bucket, places a hold on each object as it is uploaded, preventing modification or deletion until you release the hold, which then starts the clock for the retention policy
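The interaction between a retention policy and an event-based hold can be modelled as a small date calculation. This is only a sketch of the behaviour described above, not an API call: the function and parameter names are hypothetical, and it assumes the retention clock starts at upload time when no hold is used and at hold release otherwise.

```python
from datetime import datetime, timedelta


def earliest_deletion_time(uploaded_at, retention, event_based_hold=False,
                           hold_released_at=None):
    """Return when the object may first be deleted, or None while a hold is active."""
    if event_based_hold:
        if hold_released_at is None:
            # Hold still in place: the retention clock has not started yet.
            return None
        # Releasing the hold starts the retention clock.
        return hold_released_at + retention
    # No hold: the retention clock starts at upload.
    return uploaded_at + retention


uploaded = datetime(2020, 1, 1)
retention = timedelta(days=30)

no_hold = earliest_deletion_time(uploaded, retention)
held = earliest_deletion_time(uploaded, retention, event_based_hold=True)
released = earliest_deletion_time(uploaded, retention, event_based_hold=True,
                                  hold_released_at=datetime(2020, 3, 1))
print(no_hold, held, released)
```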
Security
Permissions
Can add members - via email account or choose 'allUsers'
Can add a group, domain, user email, or project - and give read or owner permissions
IAM Roles
Storage Object Viewer - 4 permissions: get/list resourcemanager.projects and storage.objects
Creator - 3 permissions: storage.objects.create, get/list resourcemanager.projects
Admin - 9 permissions: storage.objects.create/delete/get/getIamPolicy/list/setIamPolicy/update, resourcemanager.projects.get/list
Settings to choose at creation of bucket
Access Control model (choose at bucket creation)
Set permissions uniformly at bucket-level (Bucket Policy Only)
Set object-level and bucket-level permissions (more granular)
Storage Classes
GCP documentation on Storage Classes
Standard Storage - no minimum storage duration
Nearline Storage - min duration 30 days; generally cheaper than Standard if you access it less than once per month
Coldline Storage - min duration 90 days; generally cheaper than Standard if you access it less than once per year
Encryption
Encrypted by default - Google managed key
Customer managed key via Google Cloud Key Management Service (KMS)
Customer supplied keys - created and managed by you and uploaded into GCP
Set a retention policy: enter a duration of seconds/days/months/years
Labels - key/value pairs
Cloud Data Transfer Services
GCP doco on Cloud Data Transfer Services
Estimated transfer times over a network
1 TB: 30hrs@100Mbps, 3hrs@1Gbps, 18min@10Gbps, 2min@100Gbps
10 TB: 12days@100Mbps, 30hrs@1Gbps, 3hrs@10Gbps, 18min@100Gbps
100 TB: 124days@100Mbps, 12days@1Gbps, 30hrs@10Gbps, 3hrs@100Gbps
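The figures above can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes 1 TB = 10^12 bytes and adds an `efficiency` factor for protocol overhead and link sharing; both the helper name and that factor are assumptions, not something stated in the table.

```python
def estimate_transfer_hours(size_tb, bandwidth_mbps, efficiency=1.0):
    """Estimate transfer time in hours (hypothetical helper).

    Assumes 1 TB = 10**12 bytes and that `efficiency` is the usable
    fraction of the nominal bandwidth.
    """
    bits_to_send = size_tb * 10**12 * 8
    usable_bits_per_second = bandwidth_mbps * 10**6 * efficiency
    return bits_to_send / usable_bits_per_second / 3600


ideal = estimate_transfer_hours(1, 100)            # link fully used
derated = estimate_transfer_hours(1, 100, 0.74)    # assumed 74% usable bandwidth
print(f"ideal: {ideal:.1f} h, derated: {derated:.1f} h")
```

At full utilisation 1 TB over 100 Mbps comes out at roughly 22 hours, so the 30 hr figure in the table presumably allows for overhead; an assumed efficiency of about 0.74 brings the estimate close to it.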
Online Transfer - for small amounts of data
Web console upload
gsutil: rsync multithreaded option; resumable in case of network interruption
JSON API
Cloud Storage Transfer Service
Bucket to bucket
Scheduled or ad hoc - create a transfer job, including from an S3 bucket
BigQuery Data Transfer Service
Import into BigQuery
Transfer Appliance
Transfer Appliance - physical hardware sent to you for offline data transfers (more than 20TB)
Need to request a transfer appliance from Google
You manage the encryption key
Disk sizes: 100TB, 480TB
Pricing
Overview
Pricing Tables GCP Doco
Charges are pro-rated to the sub-second for each object (except Nearline and Coldline)
Each object version is charged at same rate as the live version
At-Rest Data Storage Pricing
Regional: 99.9% availability; 2.3 cents / GB-month Standard, 1.6 cents Nearline, 0.6 cents Coldline
Multi-region: 99.95% availability (US or Asia or Europe) - about 5 out of 10,000 requests may fail; 2.6 cents / GB-month Standard, 1 cent Nearline, 0.7 cents Coldline
Dual-region: 99.95% availability (Iowa (middle US) and South Carolina (East US)); 3.6 cents / GB-month Standard, 2 cents Nearline, 0.9 cents Coldline
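As a worked example of the at-rest rates, the sketch below compares a month of Regional Standard and Regional Nearline storage using the per-GB-month figures listed above. The 500 GB volume and the helper name are illustrative assumptions, and the calculation ignores operations, retrieval and egress charges.

```python
def monthly_storage_cost_usd(size_gb, cents_per_gb_month):
    """At-rest cost in US dollars for one month (hypothetical helper)."""
    return size_gb * cents_per_gb_month / 100


# 500 GB is an arbitrary example volume; rates are the Regional figures above.
standard = monthly_storage_cost_usd(500, 2.3)
nearline = monthly_storage_cost_usd(500, 1.6)
print(f"Standard: ${standard:.2f}, Nearline: ${nearline:.2f}")
```

For 500 GB this works out to about $11.50 for Standard versus $8.00 for Nearline, before any other charges.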
Egress within GCP Pricing
Within the same location/zone, or between a region and a multi-region (in either direction) - free
Between different parts of continent - 1 cent/GB
Between worldwide locations - see General Network usage pricing
General Network Usage Pricing
Monthly 0-1 TB - Aust-SE1 to: World 19 cents, China 23 cents; US multi to: Australia 19c, World 12c, China 23c
Monthly 1-10 TB - Aust-SE1 to: World 18 cents, China 22 cents; US multi to: Australia 18c, World 11c, China 22c
Monthly 10+ TB - Aust-SE1 to: World 15 cents, China 20 cents; US multi to: Australia 15c, World 8c, China 20c
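The tiered rates above can be turned into a monthly estimate. The sketch below uses the Aust-SE1 to World rates and makes two assumptions that are not stated in these notes: tiers are marginal (each GB is billed at the rate of the tier it falls in), and 1 TB counts as 1024 GB. The names are hypothetical.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (tier upper bound in GB, cents per GB) for Aust-SE1 to World, from the list above.
TIERS = [
    (1 * GB_PER_TB, 19),
    (10 * GB_PER_TB, 18),
    (float("inf"), 15),
]


def egress_cost_cents(gb):
    """Monthly egress cost in cents, assuming marginal tiers (hypothetical helper)."""
    cost = 0
    lower = 0
    for upper, rate in TIERS:
        if gb <= lower:
            break
        # Only the portion of traffic inside this tier is billed at its rate.
        billable = min(gb, upper) - lower
        cost += billable * rate
        lower = upper
    return cost


print(egress_cost_cents(1500) / 100)  # dollars for 1500 GB in a month
```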
Operations Pricing
Operations Overview
Operation - an action that changes or retrieves information about buckets and objects in Cloud Storage
Class A - any insert/patch/update/setIamPolicy, bucket/object list, object compose/copy
Class B - any get/getIamPolicy, each object notification
Free Operations - bucket/object delete
Charged for a 404 when Website Configuration is enabled, but otherwise 307, 4xx or 5xx are free
Standard Storage - Class A 5 cents / 10k ops, Class B 0.4 cents / 10k ops
Nearline & Durable Reduced Availability (DRA) Storage - Class A 10 cents / 10k ops, Class B 1 cent / 10k ops
Coldline Storage - Class A 10 cents / 10k ops, Class B 5 cents / 10k ops
Object Lifecycle Management changes - as per original Storage Class of the object
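To show how the per-10k operation rates add up, here is a minimal sketch using the Class A and Class B figures listed above. The dictionary keys, the helper name and the example operation counts are assumptions for illustration; free operations and the Always Free allowance are ignored.

```python
# (Class A, Class B) rates in cents per 10,000 operations, from the list above.
OPERATION_RATES = {
    "standard": (5, 0.4),
    "nearline": (10, 1),
    "coldline": (10, 5),
}


def operations_cost_cents(storage_class, class_a_ops, class_b_ops):
    """Operations cost in cents for one storage class (hypothetical helper)."""
    rate_a, rate_b = OPERATION_RATES[storage_class]
    return (class_a_ops / 10_000) * rate_a + (class_b_ops / 10_000) * rate_b


# Example counts are arbitrary: 1 million Class A and 10 million Class B operations.
standard_ops = operations_cost_cents("standard", 1_000_000, 10_000_000)
print(f"${standard_ops / 100:.2f}")
```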
Retrieval Pricing
Retrieve Cost - when you read, copy, rewrite data or metadata in Nearline or Coldline Storage, in addition to other costs
Nearline Storage - 1c / GB
Coldline Storage - 5c / GB
Minimum Storage Duration Pricing - applies to Nearline and Coldline, if you delete the object before the min duration, you are charged at-rest storage for the min duration. Can happen when overwriting or moving objects!
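The minimum storage duration rule is easiest to see with numbers. The sketch below bills at-rest storage for whichever is longer, the actual days stored or the minimum duration, and computes retrieval separately. It assumes a 30-day billing month and uses the Regional Nearline rate of 1.6 cents / GB-month and the 1c / GB retrieval rate from these notes; the helper names and the 100 GB example are hypothetical.

```python
def billed_storage_cents(size_gb, cents_per_gb_month, days_stored, min_days,
                         days_per_month=30):
    """At-rest charge in cents, applying the minimum storage duration.

    Assumes a 30-day month for pro-rating (an illustrative simplification).
    """
    billable_days = max(days_stored, min_days)
    return size_gb * cents_per_gb_month * billable_days / days_per_month


def retrieval_cents(size_gb, cents_per_gb):
    """Retrieval charge in cents for reading the given volume."""
    return size_gb * cents_per_gb


# 100 GB of Nearline deleted after 10 days is still billed for 30 days.
early_delete = billed_storage_cents(100, 1.6, days_stored=10, min_days=30)
kept_longer = billed_storage_cents(100, 1.6, days_stored=45, min_days=30)
read_once = retrieval_cents(100, 1)
print(early_delete, kept_longer, read_once)
```

Overwriting or moving an object counts as a deletion of the old object here, which is why the early-deletion charge can appear unexpectedly.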
Always Free Limits
Cloud Storage Always Free usage limits
5 GB-months of Regional storage (US regions only - us-east1, us-west1, us-central1)
5000 Class A operations per month
50,000 Class B operations per month
Network Egress - 1 GB for North America
Access Pattern
Summary: Use for unstructured data (images, video, backups, archives and other blobs) and all static web content; NOT for transactional or row-level, low-latency workloads
Capacity: Petabytes
Access metaphor: Like files in a file system (objects addressed by name within a bucket)
Read: Get the object (copy it to local disk)
Write: Upload one whole object
Update granularity: An object (replace only)
Usage: Store blobs