https://cloud.google.com/tpu/docs/bfloat16
The Google hardware team chose bfloat16 for Cloud TPUs to improve hardware
efficiency while maintaining the ability to train deep learning models accurately, all with
minimal switching costs from float32.
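The low switching cost from float32 comes from the format itself: bfloat16 keeps float32's sign bit and 8 exponent bits and shortens the mantissa to 7 bits, so it covers the same range with less precision. A minimal sketch of that relationship follows. The function names are illustrative, and the conversion uses plain truncation as a simplifying assumption (real hardware typically rounds).

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern by truncating x's float32 encoding."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep the top 16 bits: 1 sign bit, 8 exponent bits, 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float via float32."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value


def round_trip(x: float) -> float:
    """Show what value survives a float32 -> bfloat16 -> float32 conversion."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    for sample in (1.0, 3.14159, 3.0e38):
        print(sample, "->", round_trip(sample))
```

Large magnitudes such as 3.0e38 stay representable because the exponent width is unchanged, while values such as 3.14159 lose their low-order digits.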
https://cloud.google.com/tpu/docs/preemptible
Preemptible TPUs cost much less than non-preemptible TPUs.
An ephemeral IP address is an IP address that doesn't persist beyond the life of the
resource. For example, when you create an instance or forwarding rule without
specifying an IP address, Google Cloud automatically assigns the resource an
ephemeral IP address.
Cloud Data Fusion is a fully managed, code-free data integration service that helps
users efficiently build and manage ETL/ELT data pipelines.
https://developers.google.com/machine-learning/crash-course/feature-crosses/crossing-one-hot-vectors
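As a rough illustration of the topic on the linked page, crossing two one-hot vectors can be sketched as a flattened outer product: the result is itself one-hot, with one slot per combination of the two categories. The helper names and the bucket counts (3 and 4) below are hypothetical and chosen only for the example.

```python
def one_hot(index: int, size: int) -> list:
    """Return a one-hot list of length `size` with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def cross_one_hot(a: list, b: list) -> list:
    """Cross two one-hot vectors by flattening their outer product.

    The result has len(a) * len(b) entries; the single 1 sits at
    position i * len(b) + j, where i and j are the hot indices of a and b.
    """
    return [x * y for x in a for y in b]


if __name__ == "__main__":
    # Hypothetical features: 3 latitude buckets, 4 longitude buckets.
    lat = one_hot(1, 3)
    lon = one_hot(2, 4)
    print(cross_one_hot(lat, lon))
```

The crossed vector lets a linear model learn a separate weight for each combination of buckets instead of one weight per bucket.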
BigQuery ML is designed for structured data, not images.
With Dataflow, you can run your highly parallel workloads in a single pipeline, improving
efficiency and making your workflow easier to manage. If you are creating a new
Dataflow template, we recommend creating it as a Flex template. With a Flex template,
the pipeline is packaged as a Docker image in Artifact Registry, along with a template
specification file in Cloud Storage. The template specification contains a pointer to the
Docker image.
Matrix Factorization is used for recommender systems.
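To make the idea concrete, here is a minimal sketch of matrix factorization trained with stochastic gradient descent on a toy ratings table. It is an illustration only, not how BigQuery ML or any Google service implements it; the function names, hyperparameters, and ratings are all made up for the example.

```python
import random


def predict(P, Q, user, item):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def mse(ratings, P, Q):
    """Mean squared error over the observed ratings only."""
    errors = [(r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items()]
    return sum(errors) / len(errors)


def factorize(ratings, n_factors=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user and item factors from {(user, item): rating} with SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    P = {u: [rng.uniform(0, 0.5) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(0, 0.5) for _ in range(n_factors)] for i in items}
    for _ in range(steps):
        for (u, i), r in ratings.items():
            err = r - predict(P, Q, u, i)
            for k in range(n_factors):
                p, q = P[u][k], Q[i][k]
                # Gradient step on squared error with L2 regularization.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


if __name__ == "__main__":
    # Hypothetical ratings: three users, three items, some pairs unobserved.
    toy = {("u1", "a"): 5, ("u1", "b"): 3, ("u2", "a"): 4,
           ("u2", "c"): 1, ("u3", "b"): 1, ("u3", "c"): 5}
    P, Q = factorize(toy)
    print("training MSE:", mse(toy, P, Q))
    print("predicted rating for unobserved (u1, c):", predict(P, Q, "u1", "c"))
```

The recommender use case comes from the last line: once factors are learned from observed ratings, the same dot product scores user and item pairs that were never rated.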
Important topics:
● Vertex AI memory store vs feature store
● Dataflow vs Dataproc
● Pub/Sub integration
● Cloud Functions vs Cloud Run vs Cloud Build
● Bias vs Variance
● Log loss and logistic regression
https://www.geeksforgeeks.org/anscombes-quartet/
https://www.geeksforgeeks.org/bias-vs-variance-in-machine-learning/
https://cloud.google.com/bigquery/docs/bqml-introduction
BQML supports unsupervised learning (k-means) but has no support for image data.
AutoML supports image and video data but has no support for audio or unsupervised
algorithms.
MODEL_TYPE = { 'LINEAR_REG' | 'LOGISTIC_REG' | 'KMEANS' | 'PCA' |
'MATRIX_FACTORIZATION' | 'AUTOENCODER' | 'TENSORFLOW' |
'AUTOML_REGRESSOR' | 'AUTOML_CLASSIFIER' |
'BOOSTED_TREE_CLASSIFIER' | 'BOOSTED_TREE_REGRESSOR' |
'DNN_CLASSIFIER' | 'DNN_REGRESSOR' | 'DNN_LINEAR_COMBINED_CLASSIFIER' |
'DNN_LINEAR_COMBINED_REGRESSOR' | 'ARIMA_PLUS' }
Decision tree models are explainable on their own, without any sophisticated explainability tooling.
https://cloud.google.com/vertex-ai/docs/predictions/online-prediction-logging
https://www.tensorflow.org/probability
https://blog.research.google/2017/04/federated-learning-collaborative.html
Machine learning models are created to optimize for a particular objective,
which determines how the model is built. "Others You May Like" and
"Recommended for You" recommendation models have click-through rate
(CTR) as the default optimization objective. Optimizing for CTR emphasizes
engagement, and you should optimize for CTR when you want to maximize
the likelihood that the user interacts with the recommendation. In contrast,
revenue per order is the default optimization objective for the "Frequently
Bought Together" recommendation model type, as "Frequently Bought
Together" focuses on cross-selling and increasing order values.
For "Others You May Like" and "Recommended for You" recommendation
models, we also support conversion rate (CVR) as the objective. Optimizing
for conversion rate maximizes the likelihood that the user adds the
recommended item to their cart. When CVR is specified as the objective for a
customer with sparse add-to-cart events, the multi-task learning mechanism
is automatically activated and transfers learning from detail-page-view
events, which are typically much denser than add-to-cart events.
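The three objectives above can be compared on raw event counts. The sketch below is only an illustration: the function names and numbers are hypothetical, and it assumes CTR and CVR are both measured per recommendation impression, which the text above does not specify.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR: share of recommendation impressions that were clicked."""
    return clicks / impressions if impressions else 0.0


def conversion_rate(add_to_carts: int, impressions: int) -> float:
    """CVR (assumed per impression): share of impressions that led to an add-to-cart."""
    return add_to_carts / impressions if impressions else 0.0


def revenue_per_order(total_revenue: float, orders: int) -> float:
    """Average order value, the quantity "Frequently Bought Together" targets."""
    return total_revenue / orders if orders else 0.0


if __name__ == "__main__":
    # Hypothetical counts: add-to-cart events are far sparser than clicks.
    print("CTR:", click_through_rate(clicks=50, impressions=1000))
    print("CVR:", conversion_rate(add_to_carts=5, impressions=1000))
    print("Revenue per order:", revenue_per_order(total_revenue=300.0, orders=4))
```

With counts like these, the CVR signal rests on ten times fewer events than the CTR signal, which is the sparsity problem that transfer learning from denser events is meant to address.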