Google Professional-Machine-Learning-Engineer Exam Dumps
Google Professional Machine Learning Engineer
572 Reviews
Exam Code
Professional-Machine-Learning-Engineer
Exam Name
Google Professional Machine Learning Engineer
Questions
296 Questions Answers With Explanation
Update Date
April 14, 2026
Price
Was: $81, Today: $45
Was: $99, Today: $55
Was: $117, Today: $65
Why Should You Prepare For Your Google Professional Machine Learning Engineer With MyCertsHub?
At MyCertsHub, we go beyond standard study material. Our platform provides authentic Google Professional-Machine-Learning-Engineer Exam Dumps, detailed exam guides, and reliable practice exams that mirror the actual Google Professional Machine Learning Engineer test. Whether you’re targeting Google certifications or expanding your professional portfolio, MyCertsHub gives you the tools to succeed on your first attempt.
Every set of exam dumps is carefully reviewed by certified experts to ensure accuracy. For the Professional-Machine-Learning-Engineer (Google Professional Machine Learning Engineer) exam, you’ll receive updated practice questions designed to reflect real-world exam conditions. This approach saves time, builds confidence, and focuses your preparation on the most important exam areas.
Realistic Test Prep For The Professional-Machine-Learning-Engineer
You can instantly access downloadable PDFs of Professional-Machine-Learning-Engineer practice exams with MyCertsHub. These include authentic practice questions paired with explanations, making our exam guide a complete preparation tool. By testing yourself before exam day, you’ll walk into the Google Exam with confidence.
Smart Learning With Exam Guides
Our structured Professional-Machine-Learning-Engineer exam guide focuses on the Google Professional Machine Learning Engineer exam's core topics and question patterns. You can concentrate on what really matters for passing the test rather than wasting time on irrelevant content.
Pass the Professional-Machine-Learning-Engineer Exam – Guaranteed
We Offer A 100% Money-Back Guarantee On Our Products.
If you prepare with MyCertsHub's exam dumps for the Google Professional Machine Learning Engineer exam and do not pass, we will issue a full refund. That’s how confident we are in the effectiveness of our study resources.
Try Before You Buy – Free Demo
Still undecided? See for yourself how MyCertsHub has helped thousands of candidates achieve success by downloading a free demo of the Professional-Machine-Learning-Engineer exam dumps.
MyCertsHub – Your Trusted Partner For Google Exams
Whether you’re preparing for Google Professional Machine Learning Engineer or any other professional credential, MyCertsHub provides everything you need: exam dumps, practice exams, practice questions, and exam guides. Passing your Professional-Machine-Learning-Engineer exam has never been easier thanks to our tried-and-true resources.
Google Professional-Machine-Learning-Engineer Sample Question Answers
Question # 1
You are working on a system log anomaly detection model for a cybersecurity organization. You have
developed the model using TensorFlow, and you plan to use it for real-time prediction. You need to
create a Dataflow pipeline to ingest data via Pub/Sub and write the results to BigQuery. You want to
minimize the serving latency as much as possible. What should you do?
A. Containerize the model prediction logic in Cloud Run, which is invoked by Dataflow.
B. Load the model directly into the Dataflow job as a dependency, and use it for prediction.
C. Deploy the model to a Vertex AI endpoint, and invoke this endpoint in the Dataflow job.
D. Deploy the model in a TFServing container on Google Kubernetes Engine, and invoke it in the Dataflow job.
Answer: B
Explanation:
The best option for creating a Dataflow pipeline for real-time anomaly detection is to load the model
directly into the Dataflow job as a dependency, and use it for prediction. This option has the
following advantages:
It minimizes the serving latency, as the model prediction logic is executed within the same Dataflow
pipeline that ingests and processes the data. There is no need to invoke external services or
containers, which can introduce network overhead and latency.
It simplifies the deployment and management of the model, as the model is packaged with the
Dataflow job and does not require a separate service or container. The model can be updated by
redeploying the Dataflow job with a new model version.
It leverages the scalability and reliability of Dataflow, as the model prediction logic can scale up or
down with the data volume and handle failures and retries automatically.
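The in-process pattern behind option B can be sketched as follows. This is a minimal, hypothetical illustration: in a real Dataflow job the class would subclass apache_beam.DoFn and setup() would load the trained TensorFlow model (e.g. with tf.keras.models.load_model); here the model is stubbed with a plain scoring function so the sketch runs standalone.

```python
# Minimal sketch of the "model as a pipeline dependency" pattern.
class PredictDoFn:
    """Loads the model once per worker, then reuses it for every element."""

    def setup(self):
        # Called once per worker before processing begins, so the
        # (potentially large) model is loaded only once, not per element.
        # Stand-in for: self.model = tf.keras.models.load_model(...)
        self.model = lambda record: {"input": record,
                                     "anomaly_score": float(len(str(record)) % 2)}

    def process(self, element):
        # Prediction runs in-process: no RPC to an external serving
        # endpoint, which is what keeps serving latency minimal.
        yield self.model(element)


fn = PredictDoFn()
fn.setup()
results = list(fn.process("log line 42"))
```

The same lifecycle (setup once, process many) is what Beam's DoFn contract provides, so the model-loading cost is amortized across all elements a worker handles before results are written to BigQuery.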
The other options are less optimal for the following reasons:
Option A: Containerizing the model prediction logic in Cloud Run, which is invoked by Dataflow,
introduces additional latency and complexity. Cloud Run is a serverless platform that runs stateless
containers, which means that the model prediction logic needs to be initialized and loaded every
time a request is made. This can increase the cold start latency and reduce the throughput.
Moreover, Cloud Run has a limit on the number of concurrent requests per container, which can
affect the scalability of the model prediction logic. Additionally, this option requires managing two
separate services: the Dataflow pipeline and the Cloud Run container.
Option C: Deploying the model to a Vertex AI endpoint, and invoking this endpoint in the Dataflow
job, also introduces additional latency and complexity. Vertex AI is a managed service that provides
various tools and features for machine learning, such as training, tuning, serving, and monitoring.
However, invoking a Vertex AI endpoint from a Dataflow job requires making an HTTP request, which
can incur network overhead and latency. Moreover, this option requires managing two separate
services: the Dataflow pipeline and the Vertex AI endpoint.
Option D: Deploying the model in a TFServing container on Google Kubernetes Engine, and invoking
it in the Dataflow job, also introduces additional latency and complexity. TFServing is a high-performance serving system for TensorFlow models, which can handle multiple versions and variants
of a model. However, invoking a TFServing container from a Dataflow job requires making a gRPC or
REST request, which can incur network overhead and latency. Moreover, this option requires
managing two separate services: the Dataflow pipeline and the Google Kubernetes Engine cluster.
Reference:
[Dataflow documentation]
[TensorFlow documentation]
[Cloud Run documentation]
[Vertex AI documentation]
[TFServing documentation]
Question # 2
You have created a Vertex AI pipeline that includes two steps. The first step preprocesses 10 TB of data, completes in about 1 hour, and saves the result in a Cloud Storage bucket. The second step uses the processed data to train a model. You need to update the model's code to allow you to test different algorithms. You want to reduce pipeline execution time and cost, while also minimizing pipeline changes. What should you do?
A. Add a pipeline parameter and an additional pipeline step. Depending on the parameter value, the pipeline step conducts or skips data preprocessing and starts model training.
B. Create another pipeline without the preprocessing step, and hardcode the preprocessed Cloud Storage file location for model training.
C. Configure a machine with more CPU and RAM from the compute-optimized machine family for the data preprocessing step.
D. Enable caching for the pipeline job, and disable caching for the model training step.
Answer: D
Explanation:
The best option for reducing pipeline execution time and cost, while also minimizing pipeline
changes, is to enable caching for the pipeline job, and disable caching for the model training step.
This option allows you to leverage the power and simplicity of Vertex AI Pipelines to reuse the output
of the data preprocessing step, and avoid unnecessary recomputation. Vertex AI Pipelines is a service
that can orchestrate machine learning workflows using Vertex AI. Vertex AI Pipelines can run
preprocessing and training steps on custom Docker images, and evaluate, deploy, and monitor the
machine learning model. Caching is a feature of Vertex AI Pipelines that can store and reuse the
output of a pipeline step, and skip the execution of the step if the input parameters and the code
have not changed. Caching can help you reduce the pipeline execution time and cost, as you do not
need to re-run the same step with the same input and code. Caching can also help you minimize the
pipeline changes, as you do not need to add or remove any pipeline steps or parameters. By enabling
caching for the pipeline job, and disabling caching for the model training step, you can create a
Vertex AI pipeline that includes two steps. The first step preprocesses 10 TB data, completes in about
1 hour, and saves the result in a Cloud Storage bucket. The second step uses the processed data to
train a model. You can update the model's code to allow you to test different algorithms, and run the
pipeline job with caching enabled. The pipeline job will reuse the output of the data preprocessing
step from the cache, and skip the execution of the step. The pipeline job will run the model training
step with the updated code, and disable the caching for the step. This way, you can reduce the
pipeline execution time and cost, while also minimizing pipeline changes.
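The caching decision described above can be sketched with plain Python. This is a conceptual model, not the Vertex AI implementation: a step's cache key covers its code version and inputs, so changing only the training code re-runs only the training step. In the actual SDK, caching is toggled roughly via `PipelineJob(..., enable_caching=True)` for the job and `task.set_caching_options(False)` on the training task (check the current KFP/Vertex AI docs for exact signatures).

```python
import hashlib
import json

def cache_key(step_name, code_version, inputs):
    # The key covers the step's code and inputs, so changing the training
    # code invalidates only the training step, not preprocessing.
    payload = json.dumps({"step": step_name, "code": code_version,
                          "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

cache = {}

def run_step(name, code_version, inputs, fn):
    key = cache_key(name, code_version, inputs)
    if key in cache:                      # cache hit: skip execution
        return cache[key], True
    result = fn(inputs)                   # cache miss: execute and store
    cache[key] = result
    return result, False

# First run: both steps execute.
data, hit1 = run_step("preprocess", "v1", "gs://bucket/raw",
                      lambda x: x + "/processed")
_, hit2 = run_step("train", "v1", data, lambda x: "model-a")
# Second run with new training code: preprocessing is reused from cache,
# only training re-runs.
data2, hit3 = run_step("preprocess", "v1", "gs://bucket/raw",
                       lambda x: x + "/processed")
_, hit4 = run_step("train", "v2", data2, lambda x: "model-b")
```

The second run skips the hour-long 10 TB preprocessing entirely, which is the cost and time saving option D relies on.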
The other options are not as good as option D, for the following reasons:
Option A: Adding a pipeline parameter and an additional pipeline step, depending on the parameter
value, the pipeline step conducts or skips data preprocessing and starts model training, would
require more skills and steps than enabling caching for the pipeline job, and disabling caching for the
model training step. A pipeline parameter is a variable that can be used to control the input or
output of a pipeline step. A pipeline parameter can help you customize the pipeline logic and
behavior, and experiment with different values. An additional pipeline step is a new instance of a
pipeline component that can perform a part of the pipeline workflow, such as data preprocessing or
model training. An additional pipeline step can help you extend the pipeline functionality and
complexity, and handle different scenarios. However, adding a pipeline parameter and an additional
pipeline step, depending on the parameter value, the pipeline step conducts or skips data
preprocessing and starts model training, would require more skills and steps than enabling caching
for the pipeline job, and disabling caching for the model training step. You would need to write code,
define the pipeline parameter, create the additional pipeline step, implement the conditional logic,
and compile and run the pipeline. Moreover, this option would not reuse the output of the data
preprocessing step from the cache, but rather from the Cloud Storage bucket, which can increase the
data transfer and access costs.
Option B: Creating another pipeline without the preprocessing step, and hardcoding the
preprocessed Cloud Storage file location for model training, would require more skills and steps than
enabling caching for the pipeline job, and disabling caching for the model training step. A pipeline
without the preprocessing step is a pipeline that only includes the model training step, and uses the
preprocessed data from the Cloud Storage bucket as the input. A pipeline without the preprocessing
step can help you avoid running the data preprocessing step every time, and reduce the pipeline
execution time and cost. However, creating another pipeline without the preprocessing step, and
hardcoding the preprocessed Cloud Storage file location for model training, would require more skills
and steps than enabling caching for the pipeline job, and disabling caching for the model training
step. You would need to write code, create a new pipeline, remove the preprocessing step, hardcode
the Cloud Storage file location, and compile and run the pipeline. Moreover, this option would not
reuse the output of the data preprocessing step from the cache, but rather from the Cloud Storage
bucket, which can increase the data transfer and access costs. Furthermore, this option would create
another pipeline, which can increase the maintenance and management costs.
Option C: Configuring a machine with more CPU and RAM from the compute-optimized machine
family for the data preprocessing step, would not reduce the pipeline execution time and cost, while
also minimizing pipeline changes, but rather increase the pipeline execution cost and complexity. A
machine with more CPU and RAM from the compute-optimized machine family is a virtual machine
that has a high ratio of CPU cores to memory, and can provide high performance and scalability for
compute-intensive workloads. A machine with more CPU and RAM from the compute-optimized
machine family can help you optimize the data preprocessing step, and reduce the pipeline execution
time. However, configuring a machine with more CPU and RAM from the compute-optimized
machine family for the data preprocessing step, would not reduce the pipeline execution time and
cost, while also minimizing pipeline changes, but rather increase the pipeline execution cost and
complexity. You would need to write code, configure the machine type parameters for the data
preprocessing step, and compile and run the pipeline. Moreover, this option would increase the
pipeline execution cost, as machines with more CPU and RAM from the compute-optimized machine
family are more expensive than machines with less CPU and RAM from other machine
families. Furthermore, this option would not reuse the output of the data preprocessing step from
the cache, but rather re-run the data preprocessing step every time, which can increase the pipeline
execution time and cost.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 3: MLOps
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.2 Automating ML workflows
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.4: Automating ML Workflows
Vertex AI Pipelines
Caching
Pipeline parameters
Machine types
Question # 3
You have been asked to productionize a proof-of-concept ML model built using Keras. The model was
trained in a Jupyter notebook on a data scientist's local machine. The notebook contains a cell that
performs data validation and a cell that performs model analysis. You need to orchestrate the steps
contained in the notebook and automate the execution of these steps for weekly retraining. You
expect much more training data in the future. You want your solution to take advantage of managed
services while minimizing cost. What should you do?
A. Move the Jupyter notebook to a Notebooks instance on the largest N2 machine type, and schedule the execution of the steps in the Notebooks instance using Cloud Scheduler.
B. Write the code as a TensorFlow Extended (TFX) pipeline orchestrated with Vertex AI Pipelines. Use standard TFX components for data validation and model analysis, and use Vertex AI Pipelines for model retraining.
C. Rewrite the steps in the Jupyter notebook as an Apache Spark job, and schedule the execution of the job on ephemeral Dataproc clusters using Cloud Scheduler.
D. Extract the steps contained in the Jupyter notebook as Python scripts, wrap each script in an Apache Airflow BashOperator, and run the resulting directed acyclic graph (DAG) in Cloud Composer.
Answer: B
Explanation:
The best option for productionizing a Keras model is to use TensorFlow Extended (TFX), a framework
for building end-to-end machine learning pipelines that can handle large-scale data and complex
workflows. TFX provides standard components for data ingestion, transformation, validation,
analysis, training, tuning, serving, and monitoring. TFX pipelines can be orchestrated with Vertex AI
Pipelines, a managed service that runs on Google Cloud Platform and leverages Kubernetes and
Argo. Vertex AI Pipelines allows you to automate the execution of your TFX pipeline steps, schedule
retraining jobs, and scale up or down the resources as needed. By using TFX and Vertex AI Pipelines,
you can take advantage of the following benefits:
You can reuse the existing code in your Jupyter notebook, as TFX supports Keras as a first-class
citizen. You can also use the Keras Tuner to optimize your model hyperparameters.
You can ensure data quality and consistency by using the TFX Data Validation component, which can
detect anomalies, drift, and skew in your data. You can also use the TFX SchemaGen component to
generate a schema for your data and enforce it throughout the pipeline.
You can analyze your model performance and fairness by using the TFX Model Analysis component,
which can produce various metrics and visualizations. You can also use the TFX Model Validation
component to compare your new model with a baseline model and set thresholds for deploying the
model to production.
You can deploy your model to various serving platforms by using the TFX Pusher component, which
can push your model to Vertex AI, Cloud AI Platform, TensorFlow Serving, or TensorFlow Lite. You can
also use the TFX Model Registry to manage the versions and metadata of your models.
You can monitor your model performance and health by using the TFX Model Monitor component,
which can detect data drift, concept drift, and prediction skew in your model. You can also use the
TFX Evaluator component to compute metrics and validate your model against a baseline or a slice of
data.
You can reduce the cost and complexity of managing your own infrastructure by using Vertex AI
Pipelines, which provides a serverless environment for running your TFX pipeline. You can also use
the Vertex AI Experiments and Vertex AI TensorBoard to track and visualize your pipeline runs.
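The notebook-to-pipeline refactor can be sketched as an ordered sequence of steps with declared roles, which is what lets an orchestrator run, cache, and schedule them independently. The component names below are standard TFX ones; the runner is a stub so the example is self-contained rather than a real TFX/Vertex AI Pipelines invocation.

```python
# Each notebook cell becomes a pipeline step with an explicit role.
PIPELINE = [
    ("ExampleGen", "ingest the weekly training data"),
    ("StatisticsGen", "compute dataset statistics"),
    ("SchemaGen", "infer and enforce the data schema"),
    ("ExampleValidator", "data validation (was: notebook validation cell)"),
    ("Transform", "feature engineering"),
    ("Trainer", "retrain the Keras model"),
    ("Evaluator", "model analysis (was: notebook analysis cell)"),
    ("Pusher", "deploy the blessed model"),
]

def run_pipeline(steps):
    """Stub runner: executes steps in dependency order, as the
    orchestrator would on the weekly schedule."""
    return [name for name, _ in steps]

executed = run_pipeline(PIPELINE)
```

The key property the stub preserves: validation gates training, and analysis gates deployment, so a bad weekly dataset or a regressed model never reaches production automatically.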
Reference:
[TensorFlow Extended (TFX)]
[Vertex AI Pipelines]
[TFX User Guide]
Question # 4
You work for a bank. You have created a custom model to predict whether a loan application should be flagged for human review. The input features are stored in a BigQuery table. The model is performing well, and you plan to deploy it to production. Due to compliance requirements, the model must provide explanations for each prediction. You want to add this functionality to your model code with minimal effort and provide explanations that are as accurate as possible. What should you do?
A. Create an AutoML tabular model by using the BigQuery data with integrated Vertex Explainable AI.
B. Create a BigQuery ML deep neural network model, and use the ML.EXPLAIN_PREDICT method with the num_integral_steps parameter.
C. Upload the custom model to Vertex AI Model Registry and configure feature-based attribution by using sampled Shapley with input baselines.
D. Update the custom serving container to include sampled Shapley-based explanations in the prediction outputs.
Answer: C
Explanation:
The best option for adding explanations to your model code with minimal effort and providing
explanations that are as accurate as possible is to upload the custom model to Vertex AI Model
Registry and configure feature-based attribution by using sampled Shapley with input baselines. This
option allows you to leverage the power and simplicity of Vertex Explainable AI to generate feature
attributions for each prediction, and understand how each feature contributes to the model output.
Vertex Explainable AI is a service that can help you understand and interpret predictions made by
your machine learning models, natively integrated with a number of Google's products and services.
Vertex Explainable AI can provide feature-based and example-based explanations to provide better
understanding of model decision making. Feature-based explanations are explanations that show
how much each feature in the input influenced the prediction. Feature-based explanations can help
you debug and improve model performance, build confidence in the predictions, and understand
when and why things go wrong. Vertex Explainable AI supports various feature attribution methods,
such as sampled Shapley, integrated gradients, and XRAI. Sampled Shapley is a feature attribution
method that is based on the Shapley value, which is a concept from game theory that measures how
much each player in a cooperative game contributes to the total payoff. Sampled Shapley
approximates the Shapley value for each feature by sampling different subsets of features, and
computing the marginal contribution of each feature to the prediction. Sampled Shapley can provide
accurate and consistent feature attributions, but it can also be computationally expensive. To reduce
the computation cost, you can use input baselines, which are reference inputs that are used to
compare with the actual inputs. Input baselines can help you define the starting point or the default
state of the features, and calculate the feature attributions relative to the input baselines. By
uploading the custom model to Vertex AI Model Registry and configuring feature-based attribution
by using sampled Shapley with input baselines, you can add explanations to your model code with
minimal effort and provide explanations that are as accurate as possible.
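The configuration described above can be sketched as plain dictionaries. The field names follow the general shape of the Vertex AI ExplanationSpec, but treat them as illustrative and verify against the current API reference; the feature names, baseline values, and tensor name below are hypothetical.

```python
# Sketch of the explanation configuration attached when uploading the
# model to Vertex AI Model Registry (field names approximate the
# ExplanationSpec; values are illustrative).

explanation_parameters = {
    "sampled_shapley_attribution": {
        # More sampled paths -> more accurate Shapley estimates,
        # at higher explanation latency and cost.
        "path_count": 10,
    }
}

explanation_metadata = {
    "inputs": {
        "loan_features": {
            # Input baseline: the reference point attributions are
            # measured against, e.g. the training-set median of each
            # feature (values here are made up).
            "input_baselines": [[0.0, 35.0, 52000.0]],
        }
    },
    "outputs": {"flag_for_review": {"output_tensor_name": "scores"}},
}
```

With this attached to the uploaded model, each online prediction can return per-feature attributions relative to the baseline, which is what the compliance requirement asks for.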
The other options are not as good as option C, for the following reasons:
Option A: Creating an AutoML tabular model by using the BigQuery data with integrated Vertex
Explainable AI would require more skills and steps than uploading the custom model to Vertex AI
Model Registry and configuring feature-based attribution by using sampled Shapley with input
baselines. AutoML tabular is a service that can automatically build and train machine learning
models for structured or tabular data. AutoML tabular can use BigQuery as the data source, and
provide feature-based explanations by using integrated gradients as the feature attribution method.
However, creating an AutoML tabular model by using the BigQuery data with integrated Vertex
Explainable AI would require more skills and steps than uploading the custom model to Vertex AI
Model Registry and configuring feature-based attribution by using sampled Shapley with input
baselines. You would need to create a new AutoML tabular model, import the BigQuery data,
configure the model settings, train and evaluate the model, and deploy the model. Moreover, this
option would not use your existing custom model, which is already performing well, but create a new
model, which may not have the same performance or behavior as your custom model.
Option B: Creating a BigQuery ML deep neural network model, and using the ML.EXPLAIN_PREDICT
method with the num_integral_steps parameter would not allow you to deploy the model to
production, and could provide less accurate explanations than using sampled Shapley with input
baselines. BigQuery ML is a service that can create and train machine learning models by using SQL
queries on BigQuery. BigQuery ML can create a deep neural network model, which is a type of
machine learning model that consists of multiple layers of neurons, and can learn complex patterns
and relationships from the data. BigQuery ML can also provide feature-based explanations by using
the ML.EXPLAIN_PREDICT method, which is a SQL function that returns the feature attributions for
each prediction. The ML.EXPLAIN_PREDICT method uses integrated gradients as the feature
attribution method, which is a method that calculates the average gradient of the prediction output
with respect to the feature values along the path from the input baseline to the input. The
num_integral_steps parameter is a parameter that determines the number of steps along the path
from the input baseline to the input. However, creating a BigQuery ML deep neural network model,
and using the ML.EXPLAIN_PREDICT method with the num_integral_steps parameter would not
allow you to deploy the model to production, and could provide less accurate explanations than
using sampled Shapley with input baselines. BigQuery ML does not support deploying the model to
Vertex AI Endpoints, which is a service that can provide low-latency predictions for individual
instances. BigQuery ML only supports batch prediction, which is a service that can provide high-throughput predictions for a large batch of instances. Moreover, integrated gradients can provide less
accurate and consistent explanations than sampled Shapley, as integrated gradients can be sensitive
to the choice of the input baseline and the num_integral_steps parameter.
Option D: Updating the custom serving container to include sampled Shapley-based explanations in
the prediction outputs would require more skills and steps than uploading the custom model to
Vertex AI Model Registry and configuring feature-based attribution by using sampled Shapley with
input baselines. A custom serving container is a container image that contains the model, the
dependencies, and a web server. A custom serving container can help you customize the prediction
behavior of your model, and handle complex or non-standard data formats. However, updating the
custom serving container to include sampled Shapley-based explanations in the prediction outputs
would require more skills and steps than uploading the custom model to Vertex AI Model Registry
and configuring feature-based attribution by using sampled Shapley with input baselines. You would
need to write code, implement the sampled Shapley algorithm, build and test the container image,
and upload and deploy the container image. Moreover, this option would not leverage the power
and simplicity of Vertex Explainable AI, which can provide feature-based explanations natively
integrated with Vertex AI services.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 4: Evaluation
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.3 Monitoring ML models in production
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.3: Monitoring ML Models
Vertex Explainable AI
AutoML Tables
BigQuery ML
Using custom containers for prediction
Question # 5
You recently used XGBoost to train a model in Python that will be used for online serving. Your model prediction service will be called by a backend service implemented in Golang running on a Google Kubernetes Engine (GKE) cluster. Your model requires pre- and postprocessing steps. You need to implement the processing steps so that they run at serving time. You want to minimize code changes and infrastructure maintenance, and deploy your model into production as quickly as possible. What should you do?
A. Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server, and deploy it on your organization's GKE cluster.
B. Use FastAPI to implement an HTTP server. Create a Docker image that runs your HTTP server. Upload the image to Vertex AI Model Registry and deploy it to a Vertex AI endpoint.
C. Use the Predictor interface to implement a custom prediction routine. Build the custom container, upload the container to Vertex AI Model Registry, and deploy it to a Vertex AI endpoint.
D. Use the XGBoost prebuilt serving container when importing the trained model into Vertex AI. Deploy the model to a Vertex AI endpoint. Work with the backend engineers to implement the pre- and postprocessing steps in the Golang backend service.
Answer: C
Explanation:
The best option for implementing the processing steps so that they run at serving time, minimizing
code changes and infrastructure maintenance, and deploying the model into production as quickly as
possible, is to use the Predictor interface to implement a custom prediction routine. Build the custom
container, upload the container to Vertex AI Model Registry, and deploy it to a Vertex AI endpoint.
This option allows you to leverage the power and simplicity of Vertex AI to serve your XGBoost model
with minimal effort and customization. Vertex AI is a unified platform for building and deploying
machine learning solutions on Google Cloud. Vertex AI can deploy a trained XGBoost model to an
online prediction endpoint, which can provide low-latency predictions for individual instances. A
custom prediction routine (CPR) is a Python script that defines the logic for preprocessing the input
data, running the prediction, and postprocessing the output data. A CPR can help you customize the
prediction behavior of your model, and handle complex or non-standard data formats. A CPR can also
help you minimize the code changes, as you only need to write a few functions to implement the
prediction logic. A Predictor interface is a class that inherits from the base class aiplatform.Predictor,
and implements the abstract methods predict() and preprocess(). A Predictor interface can help you
create a CPR by defining the preprocessing and prediction logic for your model. A container image is
a package that contains the model, the CPR, and the dependencies. A container image can help you
standardize and simplify the deployment process, as you only need to upload the container image to
Vertex AI Model Registry, and deploy it to Vertex AI Endpoints. By using the Predictor interface to
implement a CPR, building the custom container, uploading the container to Vertex AI Model
Registry, and deploying it to a Vertex AI endpoint, you can implement the processing steps so that
they run at serving time, minimize code changes and infrastructure maintenance, and deploy the
model into production as quickly as possible.
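The custom prediction routine described above can be sketched as follows. In the real SDK the class would subclass the Vertex AI CPR Predictor base class and load() would deserialize the XGBoost booster from the artifact URI; here the model, feature scaling, and threshold are all stand-ins so the sketch runs on its own.

```python
# Minimal sketch of a custom prediction routine (CPR) Predictor with
# serving-time pre- and postprocessing. All values are illustrative.

class XgbPredictor:
    def load(self, artifacts_uri):
        # Real code would do roughly:
        #   self._model = xgboost.Booster()
        #   self._model.load_model(<path under artifacts_uri>)
        self._model = lambda rows: [sum(r) for r in rows]

    def preprocess(self, prediction_input):
        # Serving-time preprocessing, e.g. scaling raw feature values,
        # so callers (the Golang backend) send raw features unchanged.
        return [[v / 10.0 for v in row]
                for row in prediction_input["instances"]]

    def predict(self, instances):
        return self._model(instances)

    def postprocess(self, prediction_results):
        # Serving-time postprocessing, e.g. thresholding scores into labels.
        return {"predictions": [{"score": s, "label": int(s > 0.5)}
                                for s in prediction_results]}


p = XgbPredictor()
p.load("gs://bucket/model/")
out = p.postprocess(p.predict(p.preprocess(
    {"instances": [[1, 2, 3], [0, 1, 0]]})))
```

Because the processing lives inside the Predictor, the Golang backend only makes a plain prediction call, and the whole routine ships as one container deployed to a Vertex AI endpoint.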
The other options are not as good as option C, for the following reasons:
Option A: Using FastAPI to implement an HTTP server, creating a Docker image that runs your HTTP
server, and deploying it on your organization's GKE cluster would require more skills and steps than
using the Predictor interface to implement a CPR, building the custom container, uploading the
container to Vertex AI Model Registry, and deploying it to a Vertex AI endpoint. FastAPI is a
framework for building web applications and APIs in Python. FastAPI can help you implement an
HTTP server that can handle prediction requests and responses, and perform data preprocessing and
postprocessing. A Docker image is a package that contains the model, the HTTP server, and the
dependencies. A Docker image can help you standardize and simplify the deployment process, as you
only need to build and run the Docker image. GKE is a service that can create and manage
Kubernetes clusters on Google Cloud. GKE can help you deploy and scale your Docker image on
Google Cloud, and provide high availability and performance. However, using FastAPI to implement
an HTTP server, creating a Docker image that runs your HTTP server, and deploying it on your
organization's GKE cluster would require more skills and steps than using the Predictor interface to
implement a CPR, building the custom container, uploading the container to Vertex AI Model
Registry, and deploying it to a Vertex AI endpoint. You would need to write code, create and
configure the HTTP server, build and test the Docker image, create and manage the GKE cluster, and
deploy and monitor the Docker image. Moreover, this option would not leverage the power and
simplicity of Vertex AI, which can provide online prediction natively integrated with Google Cloud
services.
Option B: Using FastAPI to implement an HTTP server, creating a Docker image that runs your HTTP
server, uploading the image to Vertex AI Model Registry, and deploying it to a Vertex AI endpoint
would require more skills and steps than using the Predictor interface to implement a CPR, building
the custom container, uploading the container to Vertex AI Model Registry, and deploying it to a
Vertex AI endpoint. FastAPI is a framework for building web applications and APIs in Python. FastAPI
can help you implement an HTTP server that can handle prediction requests and responses, and
perform data preprocessing and postprocessing. A Docker image is a package that contains the
model, the HTTP server, and the dependencies. A Docker image can help you standardize and
simplify the deployment process, as you only need to build and run the Docker image. Vertex AI
Model Registry is a service that can store and manage your machine learning models on Google
Cloud. Vertex AI Model Registry can help you upload and organize your Docker image, and track the
model versions and metadata. Vertex AI Endpoints is a service that can provide online prediction for
your machine learning models on Google Cloud. Vertex AI Endpoints can help you deploy your
Docker image to an online prediction endpoint, which can provide low-latency predictions for
individual instances. However, this approach again requires more skills and steps than using the
Predictor interface to implement a CPR: you would need to write code, create and configure the
HTTP server, build and test the Docker image, upload it to Vertex AI Model Registry, and deploy it to
a Vertex AI endpoint. Moreover, this option would not leverage the power and simplicity of Vertex
AI's natively integrated online prediction.
Option D: Using the XGBoost prebuilt serving container when importing the trained model into
Vertex AI, deploying the model to a Vertex AI endpoint, and working with the backend engineers to
implement the pre- and postprocessing steps in the Golang backend service would not allow the
processing steps to run at serving time, and could increase the code
changes and infrastructure maintenance. An XGBoost prebuilt serving container is a container image
provided by Google Cloud that contains the XGBoost framework and its dependencies. It can help
you deploy an XGBoost model without writing any code, but it also limits your customization
options: it can only handle standard data formats, such as JSON or CSV, and cannot perform any
preprocessing or postprocessing on the input or output data. If your input data requires any
transformation or normalization before prediction, you cannot use a prebuilt serving container
alone. A Golang backend
service is a service that is implemented in Golang, a programming language that can be used for web
development and system programming. A Golang backend service can help you handle the
prediction requests and responses from the frontend, and communicate with the Vertex AI endpoint.
However, this approach would not run the pre- and postprocessing steps at serving time, and could
increase the code changes and infrastructure maintenance: you would need to import the trained
model into Vertex AI, deploy it to a Vertex AI endpoint, implement the pre- and postprocessing steps
in the Golang backend service, and test and monitor that service. Moreover, this option would not
leverage the power and simplicity of Vertex AI's natively integrated online prediction.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 2: Serving ML Predictions
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.1 Deploying ML models to production
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.2: Serving ML Predictions
Custom prediction routines
Using pre-built containers for prediction
Using custom containers for prediction
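The Predictor contract recommended by the correct answer can be sketched in plain Python. This is a minimal stand-alone illustration of the load/preprocess/predict/postprocess flow; in the real SDK you would subclass google.cloud.aiplatform.prediction.predictor.Predictor, and the scaling constant, artifact URI, and stand-in model below are hypothetical.

```python
# Stand-alone sketch of a Vertex AI custom prediction routine (CPR)
# Predictor. The real base class lives in the google-cloud-aiplatform SDK;
# it is omitted here so the sketch runs anywhere.

class MyCprPredictor:
    def load(self, artifacts_uri: str) -> None:
        # In a real CPR, deserialize the trained model from artifacts_uri
        # (a Cloud Storage path). A trivial stand-in model is used here.
        self._model = lambda rows: [sum(row) for row in rows]

    def preprocess(self, prediction_input: dict) -> list:
        # Example serving-time transformation: scale raw feature values.
        instances = prediction_input["instances"]
        return [[v / 10.0 for v in row] for row in instances]

    def predict(self, instances: list) -> list:
        return self._model(instances)

    def postprocess(self, prediction_results: list) -> dict:
        # Wrap raw scores in the response shape the endpoint returns.
        return {"predictions": [round(r, 3) for r in prediction_results]}
```

A request flows through the three methods in order, so pre- and postprocessing run at serving time without any changes to the calling application:

```python
predictor = MyCprPredictor()
predictor.load("gs://my-bucket/model/")  # hypothetical artifact URI
response = predictor.postprocess(
    predictor.predict(predictor.preprocess({"instances": [[10, 20], [30, 40]]}))
)
```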
Question # 6
You are an ML engineer on an agricultural research team working on a crop disease detection tool to
detect leaf rust spots in images of crops to determine the presence of a disease. These spots, which
can vary in shape and size, are correlated to the severity of the disease. You want to develop a
solution that predicts the presence and severity of the disease with high accuracy. What should you
do?
A. Create an object detection model that can localize the rust spots.
B. Develop an image segmentation ML model to locate the boundaries of the rust spots.
C. Develop a template matching algorithm using traditional computer vision libraries.
D. Develop an image classification ML model to predict the presence of the disease.
Answer: B
Explanation:
The best option for developing a solution that predicts the presence and severity of the disease with
high accuracy is to develop an image segmentation ML model to locate the boundaries of the rust
spots. Image segmentation is a technique that partitions an image into multiple regions, each
corresponding to a different object or semantic category. Image segmentation can be used to detect
and localize the rust spots in the images of crops, and measure their shape and size. This information
can then be used to determine the presence and severity of the disease, as the rust spots are
correlated to the disease symptoms. Image segmentation can also handle the variability of the rust
spots, as it does not rely on predefined templates or thresholds. Image segmentation can be
implemented using deep learning models, such as U-Net, Mask R-CNN, or DeepLab, which can learn
from large-scale datasets and achieve high accuracy and robustness. The other options are not as
suitable for developing a solution that predicts the presence and severity of the disease with high
accuracy, because:
Creating an object detection model that can localize the rust spots would only provide the bounding
boxes of the rust spots, not their exact boundaries. This would result in less precise measurements of
the shape and size of the rust spots, and might affect the accuracy of the disease prediction. Object
detection models are also more complex and computationally expensive than image segmentation
models, as they have to perform both classification and localization tasks.
Developing a template matching algorithm using traditional computer vision libraries would require
manually designing and selecting the templates for the rust spots, which might not capture the
diversity and variability of the rust spots. Template matching algorithms are also sensitive to noise,
occlusion, rotation, and scale changes, and might fail to detect the rust spots in different scenarios.
Template matching algorithms are also less accurate and robust than deep learning models, as they
do not learn from data.
Developing an image classification ML model to predict the presence of the disease would only
provide a binary or categorical output, not the location or severity of the disease. Image
classification models are also less informative and interpretable than image segmentation models, as
they do not provide any spatial information or visual explanation for the prediction. Image
classification models might also suffer from class imbalance or mislabeling issues, as the presence of
the disease might not be consistent or clear across the images.
Reference:
Image Segmentation | Computer Vision | Google Developers
Crop diseases and pests detection based on deep learning: a review | Plant Methods | Full Text
Using Deep Learning for Image-Based Plant Disease Detection
Computer Vision, IoT and Data Fusion for Crop Disease Detection Using ¦
On Using Artificial Intelligence and the Internet of Things for Crop ¦
Crop Disease Detection Using Machine Learning and Computer Vision
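Once a segmentation model such as U-Net has produced a binary mask of rust-spot pixels, the spot area can be measured directly, which is what makes segmentation suitable for severity estimation. The sketch below is illustrative: the severity thresholds and the function name are assumptions, not values from the source.

```python
import numpy as np

def severity_from_mask(mask: np.ndarray) -> dict:
    """Derive presence and a coarse severity level from a binary rust-spot mask.

    mask: 2-D array where 1 marks a rust-spot pixel (hypothetical model output).
    """
    affected = float(mask.sum()) / mask.size  # fraction of the leaf covered
    present = bool(mask.any())
    # Illustrative thresholds; a real tool would calibrate these on labeled data.
    if affected >= 0.20:
        level = "high"
    elif affected >= 0.05:
        level = "medium"
    elif present:
        level = "low"
    else:
        level = "none"
    return {"present": present, "affected_fraction": affected, "severity": level}
```

A bounding-box detector could only approximate `affected_fraction`, since boxes overestimate the area of irregular spots; per-pixel masks give it exactly.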
Question # 7
You recently deployed a pipeline in Vertex AI Pipelines that trains and pushes a model to a Vertex AI endpoint to serve real-time traffic. You need to continue experimenting and iterating on your pipeline to improve model performance. You plan to use Cloud Build for CI/CD. You want to quickly and easily deploy new pipelines into production, and you want to minimize the chance that the new pipeline implementations will break in production. What should you do?
A. Set up a CI/CD pipeline that builds and tests your source code. If the tests are successful, use the Google Cloud console to upload the built container to Artifact Registry and upload the compiled pipeline to Vertex AI Pipelines.
B. Set up a CI/CD pipeline that builds your source code and then deploys built artifacts into a pre-production environment. Run unit tests in the pre-production environment. If the tests are successful, deploy the pipeline to production.
C. Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, deploy the pipeline to production.
D. Set up a CI/CD pipeline that builds and tests your source code and then deploys built artifacts into a pre-production environment. After a successful pipeline run in the pre-production environment, rebuild the source code, and deploy the artifacts to production.
Answer: C
Explanation:
The best option for continuing experimenting and iterating on your pipeline to improve model
performance, using Cloud Build for CI/CD, and deploying new pipelines into production quickly and
easily, is to set up a CI/CD pipeline that builds and tests your source code and then deploys built
artifacts into a pre-production environment. After a successful pipeline run in the pre-production
environment, deploy the pipeline to production. This option allows you to leverage the power and
simplicity of Cloud Build to automate, monitor, and manage your pipeline development and
deployment workflow. Cloud Build is a service that can create and run continuous integration and
continuous delivery (CI/CD) pipelines on Google Cloud. Cloud Build can build your source code, run
unit tests, and deploy built artifacts to various Google Cloud services, such as Vertex AI Pipelines,
Vertex AI Endpoints, and Artifact Registry. A CI/CD pipeline is a workflow that can automate the
process of building, testing, and deploying software. A CI/CD pipeline can help you improve the
quality and reliability of your software, accelerate the development and delivery cycle, and reduce
the manual effort and errors. A pre-production environment is an environment that can simulate the
production environment, but is isolated from the real users and data. A pre-production environment
can help you test and validate your software before deploying it to production, and catch any bugs or
issues that may affect the user experience or the system performance. By setting up a CI/CD pipeline
that builds and tests your source code and then deploys built artifacts into a pre-production
environment, you can ensure that your pipeline code is consistent and error-free, and that your
pipeline artifacts are compatible and functional. After a successful pipeline run in the pre-production
environment, you can deploy the pipeline to production, which is the environment where your
software is accessible and usable by the real users and data. By deploying the pipeline to production
after a successful pipeline run in the pre-production environment, you can minimize the chance that
the new pipeline implementations will break in production, and ensure that your software meets the
user expectations and requirements1.
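The flow described above can be sketched as a single Cloud Build configuration. This is an illustrative cloudbuild.yaml only: the Artifact Registry path, the run_pipeline.py script, and its flags are hypothetical, but the structure shows the key property of option C, namely that Cloud Build stops at the first failing step, so production is reached only after the pre-production pipeline run succeeds.

```yaml
# Hypothetical cloudbuild.yaml sketching option C.
steps:
  # 1. Build the pipeline container and run unit tests inside it.
  - name: gcr.io/cloud-builders/docker
    args: ["build", "-t", "us-docker.pkg.dev/$PROJECT_ID/ml/pipeline:$SHORT_SHA", "."]
  - name: us-docker.pkg.dev/$PROJECT_ID/ml/pipeline:$SHORT_SHA
    entrypoint: pytest
    args: ["tests/"]
  # 2. Push the tested artifact, then run the compiled pipeline end to end
  #    in the pre-production environment.
  - name: gcr.io/cloud-builders/docker
    args: ["push", "us-docker.pkg.dev/$PROJECT_ID/ml/pipeline:$SHORT_SHA"]
  - name: gcr.io/google.com/cloudsdktool/cloud-sdk
    entrypoint: python
    args: ["run_pipeline.py", "--project=$PROJECT_ID", "--env=preprod"]
  # 3. Promote the SAME built artifact to production only after the
  #    pre-production run succeeded (earlier step failures abort the build).
  - name: gcr.io/google.com/cloudsdktool/cloud-sdk
    entrypoint: python
    args: ["run_pipeline.py", "--project=$PROJECT_ID", "--env=prod"]
images:
  - us-docker.pkg.dev/$PROJECT_ID/ml/pipeline:$SHORT_SHA
```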
The other options are not as good as option C, for the following reasons:
Option A: Setting up a CI/CD pipeline that builds and tests your source code, and if the tests are
successful, using the Google Cloud console to upload the built container to Artifact Registry and
upload the compiled pipeline to Vertex AI Pipelines would not allow you to deploy new pipelines into
production quickly and easily, and could increase the manual effort and errors. The Google Cloud
console is a web-based user interface that can help you access and manage various Google Cloud
services, such as Artifact Registry and Vertex AI Pipelines. Artifact Registry is a service that can store
and manage your container images and other artifacts on Google Cloud. Artifact Registry can help
you upload and organize your container images, and track the image versions and metadata. Vertex
AI Pipelines is a service that can orchestrate machine learning workflows using Vertex AI. Vertex AI
Pipelines can run preprocessing and training steps on custom Docker images, and evaluate, deploy,
and monitor the machine learning model. However, this approach relies on manual uploads through
the Google Cloud console, which slows deployment and increases the opportunity for manual errors.
Moreover, this option would not use a pre-production environment to test and validate your pipeline
before deploying it to production, which could increase the chance that the new pipeline
implementations will break in production.
Option B: Setting up a CI/CD pipeline that builds your source code and then deploys built artifacts
into a pre-production environment, running unit tests in the pre-production environment, and if the
tests are successful, deploying the pipeline to production would not allow you to test and validate
your pipeline before deploying it to production, and could cause errors or poor performance. A unit
test is a type of test that can verify the functionality and correctness of a small and isolated unit of
code, such as a function or a class. A unit test can help you debug and improve your code quality, and
catch any bugs or issues that may affect the code logic or output. However, running only unit tests in
the pre-production environment does not validate the pipeline itself before deploying it to
production, and could therefore cause errors or poor performance. Because this option never runs
the full pipeline in the pre-production environment, it cannot verify the pipeline's end-to-end
functionality and compatibility, or catch bugs or issues that affect the pipeline workflow or output.
Option D: Setting up a CI/CD pipeline that builds and tests your source code and then deploys built
artifacts into a pre-production environment, after a successful pipeline run in the pre-production
environment, rebuilding the source code, and deploying the artifacts to production would not allow
you to deploy new pipelines into production quickly and easily, and could increase the complexity
and cost of the pipeline development and deployment. Rebuilding the source code is a process that
can recompile and repackage the source code into executable artifacts, such as container images and
pipeline files. Rebuilding the source code can help you incorporate any changes or updates that may
have occurred in the source code, and ensure that the artifacts are consistent and up-to-date.
However, rebuilding the source code after the pre-production run adds complexity and cost without
adding safety: the artifacts deployed to production are no longer the exact artifacts that were
validated in pre-production, and the rebuild itself can be a time-consuming and resource-intensive
process that slows pipeline development and deployment.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 3: MLOps
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.2 Automating ML workflows
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.4: Automating ML Workflows
Cloud Build
Vertex AI Pipelines
Artifact Registry
Pre-production environment
Question # 8
While performing exploratory data analysis on a dataset, you find that an important categorical
feature has 5% null values. You want to minimize the bias that could result from the missing values.
How should you handle the missing values?
A. Remove the rows with missing values, and upsample your dataset by 5%.
B. Replace the missing values with the feature's mean.
C. Replace the missing values with a placeholder category indicating a missing value.
D. Move the rows with missing values to your validation dataset.
Answer: C
Explanation:
The best option for handling missing values in a categorical feature is to replace them with a
placeholder category indicating a missing value. This is a type of imputation, which is a method of
estimating the missing values based on the observed data. Imputing the missing values with a
placeholder category preserves the information that the data is missing, and avoids introducing bias
or distortion in the feature distribution. It also allows the machine learning model to learn from the
missingness pattern, and potentially use it as a predictor for the target variable. The other options
are not suitable for handling missing values in a categorical feature, because:
Removing the rows with missing values and upsampling the dataset by 5% would reduce the size of
the dataset and potentially lose important information. It would also introduce sampling bias and
overfitting, as the upsampling process would create duplicate or synthetic observations that do not
reflect the true population.
Replacing the missing values with the feature's mean would not make sense for a categorical feature,
as the mean is a numerical measure that does not capture the mode or frequency of the categories.
It would also create a new category that does not exist in the original data, and might confuse the
machine learning model.
Moving the rows with missing values to the validation dataset would compromise the validity and
reliability of the model evaluation, as the validation dataset would not be representative of the test
or production data. It would also reduce the amount of data available for training the model, and
might introduce leakage or inconsistency between the training and validation datasets.
Reference:
Imputation of missing values
Effective Strategies to Handle Missing Values in Data Analysis
How to Handle Missing Values of Categorical Variables?
Google Cloud launches machine learning engineer certification
Google Professional Machine Learning Engineer Certification
Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
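The placeholder-category imputation from option C is a one-liner in pandas. The column and category names below are illustrative, not from the question.

```python
import pandas as pd

# Hypothetical categorical feature with ~5% nulls.
df = pd.DataFrame({"soil_type": ["clay", None, "loam", "sand", None]})

# Option C: replace nulls with an explicit "missing" category, preserving
# the missingness signal for the model instead of dropping or averaging it.
df["soil_type"] = df["soil_type"].fillna("missing")
```

The model can now treat "missing" as its own level, so a missingness pattern that correlates with the target remains learnable.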
Question # 9
You work for a bank with strict data governance requirements. You recently implemented a custom model to detect fraudulent transactions. You want your training code to download internal data by using an API endpoint hosted in your project's network. You need the data to be accessed in the most secure way, while mitigating the risk of data exfiltration. What should you do?
A. Enable VPC Service Controls for peerings, and add Vertex AI to a service perimeter.
B. Create a Cloud Run endpoint as a proxy to the data. Use Identity and Access Management (IAM) authentication to secure access to the endpoint from the training job.
C. Configure VPC Peering with Vertex AI and specify the network of the training job.
D. Download the data to a Cloud Storage bucket before calling the training job.
Answer: A
Explanation:
The best option for accessing internal data in the most secure way, while mitigating the risk of data
exfiltration, is to enable VPC Service Controls for peerings, and add Vertex AI to a service perimeter.
This option allows you to leverage the power and simplicity of VPC Service Controls to isolate and
protect your data and services on Google Cloud. VPC Service Controls is a service that can create a
secure perimeter around your Google Cloud resources, such as BigQuery, Cloud Storage, and Vertex
AI. VPC Service Controls can help you prevent unauthorized access and data exfiltration from your
perimeter, and enforce fine-grained access policies based on context and identity. Peerings are
connections that can allow traffic to flow between different networks. Peerings can help you connect
your Google Cloud network with other Google Cloud networks or external networks, and enable
communication between your resources and services. By enabling VPC Service Controls for peerings,
you can allow your training code to download internal data by using an API endpoint hosted in your
project's network, and restrict the data transfer to only authorized networks and services. Vertex AI
is a unified platform for building and deploying machine learning solutions on Google Cloud. Vertex
AI can support various types of models, such as linear regression, logistic regression, k-means
clustering, matrix factorization, and deep neural networks. Vertex AI can also provide various tools
and services for data analysis, model development, model deployment, model monitoring, and
model governance. By adding Vertex AI to a service perimeter, you can isolate and protect your
Vertex AI resources, such as models, endpoints, pipelines, and feature store, and prevent data
exfiltration from your perimeter1.
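Adding Vertex AI to a service perimeter can be sketched with one gcloud command. This is a configuration sketch, not a full recipe: the perimeter name, project number, and policy ID are hypothetical, and creating a perimeter also requires an existing Access Context Manager policy and organization-level permissions.

```shell
# Hypothetical names; restricts the Vertex AI API (aiplatform.googleapis.com)
# inside a VPC Service Controls perimeter so data cannot leave the boundary.
gcloud access-context-manager perimeters create ml_perimeter \
  --title="ml-perimeter" \
  --resources=projects/123456789 \
  --restricted-services=aiplatform.googleapis.com \
  --policy=POLICY_ID
```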
The other options are not as good as option A, for the following reasons:
Option B: Creating a Cloud Run endpoint as a proxy to the data, and using Identity and Access
Management (IAM) authentication to secure access to the endpoint from the training job would
require more skills and steps than enabling VPC Service Controls for peerings, and adding Vertex AI
to a service perimeter. Cloud Run is a service that can run your stateless containers on a fully
managed environment or on your own Google Kubernetes Engine cluster. Cloud Run can help you
deploy and scale your containerized applications quickly and easily, and pay only for the resources
you use. A Cloud Run endpoint is a URL that can expose your containerized application to the
internet or to other Google Cloud services. A Cloud Run endpoint can help you access and invoke
your application from anywhere, and handle the load balancing and traffic routing. A proxy is a server
that can act as an intermediary between a client and a target server. A proxy can help you modify,
filter, or redirect the requests and responses between the client and the target server, and provide
additional functionality or security. IAM is a service that can manage access control for Google Cloud
resources. IAM can help you define who (identity) has what access (role) to which resource, and
enforce the access policies. By creating a Cloud Run endpoint as a proxy to the data, and using IAM
authentication to secure access to the endpoint from the training job, you can access internal data by
using an API endpoint hosted in your projects network, and restrict the data access to only
authorized identities and roles. However, this approach requires more skills and steps than enabling
VPC Service Controls: you would need to write the proxy logic, then create, configure, deploy, and
monitor the Cloud Run endpoint, and set up the IAM policies. Moreover, it would not prevent data
exfiltration from your network, as the Cloud Run endpoint can be reached from outside your
network.
Option C: Configuring VPC Peering with Vertex AI and specifying the network of the training job
would not allow you to access internal data by using an API endpoint hosted in your project's
network, and could cause errors or poor performance. VPC Peering is a service that can create a
peering connection between two VPC networks. VPC Peering can help you connect your Google
Cloud network with another Google Cloud network or an external network, and enable
communication between your resources and services. By configuring VPC Peering with Vertex AI and
specifying the network of the training job, you can allow your training code to access Vertex AI
resources, such as models, endpoints, pipelines, and feature store, and use the same network for the
training job. However, VPC Peering alone does not secure access to an API endpoint hosted in your
project's network, and could cause errors or poor performance. Moreover, it would not isolate and
protect your data and services on Google Cloud, as a peering connection can expose your network to
other networks and services.
Option D: Downloading the data to a Cloud Storage bucket before calling the training job would not
allow you to access internal data by using an API endpoint hosted in your project's network, and
could increase the complexity and cost of the data access. Cloud Storage is a service that can store
and manage your data on Google Cloud. Cloud Storage can help you upload and organize your data,
and track the data versions and metadata. A Cloud Storage bucket is a container that can hold your
data on Cloud Storage. A Cloud Storage bucket can help you store and access your data from
anywhere, and provide various storage classes and options. By downloading the data to a Cloud
Storage bucket before calling the training job, you can access the data from Cloud Storage, and use it
as the input for the training job. However, this approach bypasses the API endpoint hosted in your
project's network, and increases the complexity and cost of data access. It also creates an
intermediate copy of the data on Cloud Storage, which adds storage and transfer costs and widens
the surface for unauthorized access or data exfiltration.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 1: Data Engineering
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 1: Framing ML problems,
1.2 Defining data needs
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 2: Data
Engineering, Section 2.2: Defining Data Needs
VPC Service Controls
Cloud Run
VPC Peering
Cloud Storage
Question # 10
You are training an object detection model using a Cloud TPU v2. Training time is taking longer than
expected. Based on this simplified trace obtained with a Cloud TPU profile, what action should you
take to decrease training time in a cost-efficient way?
A. Move from Cloud TPU v2 to Cloud TPU v3 and increase batch size.
B. Move from Cloud TPU v2 to 8 NVIDIA V100 GPUs and increase batch size.
C. Rewrite your input function to resize and reshape the input images.
D. Rewrite your input function using parallel reads, parallel processing, and prefetch.
Answer: D
Explanation:
The trace in the question shows that training time is taking longer than expected, most likely
because the input function is not optimized and the TPU sits idle waiting for data. To decrease
training time in a cost-efficient way, the best option is to rewrite the input function using parallel
reads, parallel processing, and prefetch. This overlaps data loading and preprocessing with
accelerator computation, so the model processes data more efficiently and training time decreases
without moving to more expensive hardware.
Reference:
[Cloud TPU Performance Guide]
[Data input pipeline performance guide]
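The three optimizations from option D map directly onto tf.data transformations. The sketch below uses an in-memory range as a stand-in for TFRecord shards so it runs anywhere; with real data you would start from something like `tf.data.TFRecordDataset(filenames, num_parallel_reads=tf.data.AUTOTUNE)` for parallel reads.

```python
import tensorflow as tf

def make_dataset():
    # Stand-in source; a real job would read TFRecord shards with
    # num_parallel_reads=tf.data.AUTOTUNE (parallel reads).
    ds = tf.data.Dataset.range(8)
    # Parallel processing: decode/augment records on multiple threads.
    ds = ds.map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(4)
    # Prefetch: overlap host-side preprocessing with TPU computation,
    # so the accelerator is not left idle between steps.
    ds = ds.prefetch(tf.data.AUTOTUNE)
    return ds
```

With sharded files, `interleave(..., num_parallel_calls=tf.data.AUTOTUNE)` across shards gives the same parallel-read benefit.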
Question # 11
You are deploying a new version of a model to a production Vertex AI endpoint that is serving traffic. You plan to direct all user traffic to the new model. You need to deploy the model with minimal disruption to your application. What should you do?
A. 1. Create a new endpoint. 2. Create a new model, set it as the default version, and upload the model to Vertex AI Model Registry. 3. Deploy the new model to the new endpoint. 4. Update Cloud DNS to point to the new endpoint.
B. 1. Create a new endpoint. 2. Create a new model, set the parentModel parameter to the model ID of the currently deployed model, set it as the default version, and upload the model to Vertex AI Model Registry. 3. Deploy the new model to the new endpoint and set the new model to 100% of the traffic.
C. 1. Create a new model, set the parentModel parameter to the model ID of the currently deployed model, and upload the model to Vertex AI Model Registry. 2. Deploy the new model to the existing endpoint and set the new model to 100% of the traffic.
D. 1. Create a new model, set it as the default version, and upload the model to Vertex AI Model Registry. 2. Deploy the new model to the existing endpoint.
Answer: C
Explanation:
The best option is C: create a new model with the parentModel parameter set to the model ID of the currently deployed model, upload it to Vertex AI Model Registry, deploy it to the existing endpoint, and route 100% of the traffic to it. This updates the model version and keeps serving online predictions with low latency and minimal disruption. Vertex AI is a unified platform for building and deploying machine learning solutions on Google Cloud; it can deploy a trained model to an online prediction endpoint that provides low-latency predictions for individual instances. A model is a resource that represents a machine learning model you can use for prediction. A model can have one or more versions, which are different implementations of the same model with different parameters, code, or data; versions let you experiment, iterate, and improve performance and accuracy. The parentModel parameter specifies the model ID that the new version is based on, so the new version inherits the settings and metadata of the existing model and you avoid duplicating the model configuration. Vertex AI Model Registry stores and manages your models and tracks their versions and metadata. An endpoint is a resource that provides the service URL you use to request predictions; it can host one or more deployed models, which are instances of model versions backed by physical resources that serve online predictions with low latency and scale with traffic. Because the new version is deployed to the existing endpoint and then given 100% of the traffic, users are switched to the new model without any change to the endpoint URL, so the application is not disrupted1.
The other options are not as good as option C, for the following reasons:
Option A: Creating a new endpoint, deploying the new model there, and updating Cloud DNS to point to it involves more steps and more moving parts than deploying to the existing endpoint. Cloud DNS is a reliable, scalable Domain Name System (DNS) service on Google Cloud that manages DNS records and resolves domain names to IP addresses, so updating it would redirect user traffic to the new endpoint without breaking the existing application. However, you would still need to create and configure the new endpoint and the new model, upload the model to Vertex AI Model Registry, deploy it, and update the DNS records, and DNS changes take time to propagate, during which some clients may still reach the old endpoint. This option also leaves you with an extra endpoint to maintain, which increases management costs2.
Option B: Setting the parentModel parameter is correct here, because the new model then inherits the settings and metadata of the existing model, and marking it as the default version means it is used for prediction whenever no version is specified, which simplifies prediction requests. However, this option still deploys the new model to a new endpoint, so you would need to create and configure that endpoint and repoint your application at it. Compared with deploying to the existing endpoint and shifting 100% of the traffic, this adds steps, risks disrupting the application, and leaves an extra endpoint to maintain, which increases management costs2.
Option D: Creating a new model without setting the parentModel parameter registers it as an unrelated model rather than as a new version of the existing one, so it does not inherit the settings and metadata of the currently deployed model and can cause inconsistencies or conflicts between model versions. Setting it as the default version only removes the need to specify a version in prediction requests; it does not link the new model to the old one. This option also omits the explicit step of routing 100% of the traffic to the new model, which risks leaving user traffic on the old deployment2.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 2: Serving ML Predictions
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production, 3.1 Deploying ML models to production
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6:
Production ML Systems, Section 6.2: Serving ML Predictions
Vertex AI
Cloud DNS
Question # 12
You manage a team of data scientists who use a cloud-based backend system to submit training jobs.
This system has become very difficult to administer, and you want to use a managed service instead.
The data scientists you work with use many different frameworks, including Keras, PyTorch, Theano,
scikit-learn, and custom libraries. What should you do?
A. Use Vertex AI Training to submit training jobs using any framework.
B. Configure Kubeflow to run on Google Kubernetes Engine and submit training jobs through TFJob.
C. Create a library of VM images on Compute Engine, and publish these images on a centralized repository.
D. Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.
Answer: A
Explanation:
The best option for using a managed service to submit training jobs with different frameworks is to
use Vertex AI Training. Vertex AI Training is a fully managed service that allows you to train custom
models on Google Cloud using any framework, such as TensorFlow, PyTorch, scikit-learn, XGBoost,
etc. You can also use custom containers to run your own libraries and dependencies. Vertex AI
Training handles the infrastructure provisioning, scaling, and monitoring for you, so you can focus on
your model development and optimization. Vertex AI Training also integrates with other Vertex AI
services, such as Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Prediction. The other
options are not as suitable for using a managed service to submit training jobs with different
frameworks, because:
Configuring Kubeflow to run on Google Kubernetes Engine and submit training jobs through TFJob
would require more infrastructure maintenance, as Kubeflow is not a fully managed service, and you
would have to provision and manage your own Kubernetes cluster. This would also incur more costs,
as you would have to pay for the cluster resources, regardless of the training job usage. TFJob is also
mainly designed for TensorFlow models, and might not support other frameworks as well as Vertex
AI Training.
Creating a library of VM images on Compute Engine, and publishing these images on a centralized
repository would require more development time and effort, as you would have to create and
maintain different VM images for different frameworks and libraries. You would also have to
manually configure and launch the VMs for each training job, and handle the scaling and monitoring
yourself. This would not leverage the benefits of a managed service, such as Vertex AI Training.
Setting up Slurm workload manager to receive jobs that can be scheduled to run on your cloud
infrastructure would require more configuration and administration, as Slurm is not a native Google
Cloud service, and you would have to install and manage it on your own VMs or clusters. Slurm is
also a general-purpose workload manager, and might not have the same level of integration and
optimization for ML frameworks and libraries as Vertex AI Training.
Reference:
Vertex AI Training | Google Cloud
Kubeflow on Google Cloud | Google Cloud
TFJob for training TensorFlow models with Kubernetes | Kubeflow
Compute Engine | Google Cloud
Slurm Workload Manager
Question # 13
You are training an ML model on a large dataset. You are using a TPU to accelerate the training process. You notice that the training process is taking longer than expected. You discover that the TPU is not reaching its full capacity. What should you do?
A. Increase the learning rate
B. Increase the number of epochs
C. Decrease the learning rate
D. Increase the batch size
Answer: D
Explanation:
The best option for training an ML model on a large dataset, using a TPU to accelerate the training
process, and discovering that the TPU is not reaching its full capacity, is to increase the batch size.
This option allows you to leverage the power and simplicity of TPUs to train your model faster and
more efficiently. A TPU is a custom-developed application-specific integrated circuit (ASIC) that can
accelerate machine learning workloads. A TPU can provide high performance and scalability for
various types of models, such as linear regression, logistic regression, k-means clustering, matrix
factorization, and deep neural networks. A TPU can also support various tools and frameworks, such
as TensorFlow, PyTorch, and JAX. A batch size is a parameter that specifies the number of training
examples in one forward/backward pass. A batch size can affect the speed and accuracy of the
training process. A larger batch size can help you utilize the parallel processing power of the TPU, and
reduce the communication overhead between the TPU and the host CPU. A larger batch size also
reduces the variance of the gradient updates, which makes each training step more stable. By
increasing the batch size, you can train your model on a large dataset faster and more efficiently,
and make full use of the TPU capacity1.
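The relationship between batch size, step count, and accelerator utilization can be sketched with a toy cost model. All timing numbers below are invented for illustration; real TPU profiling should use the Cloud TPU performance tools rather than this arithmetic.

```python
# Illustrative sketch (not a real TPU benchmark): model each training step as a
# fixed host<->accelerator overhead plus compute time proportional to batch size,
# and see how batch size changes utilization. All numbers are invented.

def epoch_stats(num_examples, batch_size, overhead_s=0.010, compute_per_example_s=0.0001):
    """Return (steps_per_epoch, utilization) under the toy cost model."""
    steps = num_examples // batch_size               # fewer steps with larger batches
    compute = batch_size * compute_per_example_s     # useful work per step
    utilization = compute / (compute + overhead_s)   # fraction of step doing compute
    return steps, utilization

for bs in (64, 512, 4096):
    steps, util = epoch_stats(num_examples=1_000_000, batch_size=bs)
    print(f"batch={bs:5d}  steps/epoch={steps:6d}  utilization={util:.0%}")
```

Under this model, quadrupling the batch size shrinks the number of steps and raises the fraction of each step spent on useful compute, which is why an underutilized TPU often benefits from a larger batch.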
The other options are not as good as option D, for the following reasons:
Option A: Increasing the learning rate would not help you utilize the parallel processing power of the
TPU, and could cause errors or poor performance. A learning rate is a parameter that controls how
much the model is updated in each iteration. A learning rate can affect the speed and accuracy of the
training process. A larger learning rate can help you converge faster, but it can also cause instability,
divergence, or oscillation. By increasing the learning rate, you may not be able to find the optimal
solution, and your model may perform poorly on the validation or test data2.
Option B: Increasing the number of epochs would not help you utilize the parallel processing power
of the TPU, and could increase the complexity and cost of the training process. An epoch is a measure
of the number of times all of the training examples are used once in the training process. An epoch
can affect the speed and accuracy of the training process. A larger number of epochs can help you
learn more from the data, but it can also cause overfitting, underfitting, or diminishing returns. By
increasing the number of epochs, you may not be able to improve the model performance
significantly, and your training process may take longer and consume more resources3.
Option C: Decreasing the learning rate would not help you utilize the parallel processing power of the
TPU, and could slow down the training process. A learning rate is a parameter that controls how
much the model is updated in each iteration. A learning rate can affect the speed and accuracy of the
training process. A smaller learning rate can help you find a more precise solution, but it can also
cause slow convergence or local minima. By decreasing the learning rate, you may not be able to
reach the optimal solution in a reasonable time, and your training process may take longer2.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 2: ML Models and
Architectures, Week 1: Introduction to ML Models and Architectures
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 2: Architecting ML
solutions, 2.1 Designing ML models
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 4: ML
Models and Architectures, Section 4.1: Designing ML Models
Use TPUs
Cloud TPU performance guide
Google TPU: Architecture and Performance Best Practices - Run
Question # 14
You are an ML engineer responsible for designing and implementing training pipelines for ML
models. You need to create an end-to-end training pipeline for a TensorFlow model. The TensorFlow
model will be trained on several terabytes of structured data. You need the pipeline to include data
quality checks before training and model quality checks after training but prior to deployment. You
want to minimize development time and the need for infrastructure maintenance. How should you
build and orchestrate your training pipeline?
A. Create the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined Google Cloud components. Orchestrate the pipeline using Vertex AI Pipelines.
B. Create the pipeline using TensorFlow Extended (TFX) and standard TFX components. Orchestrate the pipeline using Vertex AI Pipelines.
C. Create the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined Google Cloud components. Orchestrate the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine.
D. Create the pipeline using TensorFlow Extended (TFX) and standard TFX components. Orchestrate the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine.
Answer: B
Explanation:
The best option for creating and orchestrating an end-to-end training pipeline for a TensorFlow
model is to use TensorFlow Extended (TFX) and standard TFX components, and deploy the pipeline to
Vertex AI Pipelines. TFX is an end-to-end platform for deploying production ML pipelines, which
consists of several built-in components that cover the entire ML lifecycle, from data ingestion and
validation, to model training and evaluation, to model deployment and monitoring. TFX also
supports custom components and integrations with other Google Cloud services, such as BigQuery,
Dataflow, and Cloud Storage. Vertex AI Pipelines is a fully managed service that allows you to run TFX
pipelines on Google Cloud, without having to worry about infrastructure provisioning, scaling, or
maintenance. Vertex AI Pipelines also provides a user-friendly interface to monitor and manage your
pipelines, as well as tools to track and compare experiments. The other options are not as suitable
for creating and orchestrating an end-to-end training pipeline for a TensorFlow model, because:
Creating the pipeline using Kubeflow Pipelines domain-specific language (DSL) and predefined
Google Cloud components would require more development time and effort, as Kubeflow Pipelines
DSL is not as expressive or compatible with TensorFlow as TFX. Predefined Google Cloud components
might not cover all the stages of the ML lifecycle, and might not be optimized for TensorFlow models.
Orchestrating the pipeline using Kubeflow Pipelines deployed on Google Kubernetes Engine would
require more infrastructure maintenance, as Kubeflow Pipelines is not a fully managed service, and
you would have to provision and manage your own Kubernetes cluster. This would also incur more
costs, as you would have to pay for the cluster resources, regardless of the pipeline usage.
Reference:
TFX | ML Production Pipelines | TensorFlow
Vertex AI Pipelines | Google Cloud
Kubeflow Pipelines | Google Cloud
Google Cloud launches machine learning engineer certification
Google Professional Machine Learning Engineer Certification
Professional ML Engineer Exam Guide
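The pipeline shape the answer describes (data quality gate, training, model quality gate, then deployment) maps onto standard TFX components such as ExampleValidator and Evaluator. As a language-agnostic illustration of just the gating logic, here is a toy pure-Python sketch; the function names, thresholds, and "model" are invented stand-ins, not TFX APIs.

```python
# Toy sketch of the gated pipeline structure (NOT TFX): a data quality check blocks
# training, and a model quality check blocks deployment. Thresholds are invented.

def data_quality_check(rows):
    """Gate before training: reject the run if too many rows are malformed."""
    bad = sum(1 for r in rows if r.get("label") is None)
    return bad / len(rows) <= 0.05

def train(rows):
    # Stand-in for model training: "learn" the majority label.
    labels = [r["label"] for r in rows if r["label"] is not None]
    return max(set(labels), key=labels.count)

def model_quality_check(model, rows):
    """Gate before deployment: require a minimum accuracy."""
    correct = sum(1 for r in rows if r["label"] == model)
    return correct / len(rows) >= 0.6

def run_pipeline(rows):
    if not data_quality_check(rows):
        return "blocked: data quality"
    model = train(rows)
    if not model_quality_check(model, rows):
        return "blocked: model quality"
    return f"deployed model predicting {model!r}"

rows = [{"label": "churn"}] * 7 + [{"label": "stay"}] * 3
print(run_pipeline(rows))
```

In the real answer, TFX provides these gates as components and Vertex AI Pipelines runs them without any cluster for you to maintain.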
Question # 15
You are developing an ML model to predict house prices. While preparing the data, you discover that
an important predictor variable, distance from the closest school, is often missing and does not have
high variance. Every instance (row) in your data is important. How should you handle the missing
data?
A. Delete the rows that have missing values.
B. Apply feature crossing with another column that does not have missing values.
C. Predict the missing values using linear regression.
D. Replace the missing values with zeros.
Answer: C
Explanation:
The best option for handling missing data in this case is to predict the missing values using linear
regression. Linear regression is a supervised learning technique that can be used to estimate the
relationship between a continuous target variable and one or more predictor variables. In this case,
the target variable is the distance from the closest school, and the predictor variables are the other
features in the dataset, such as house size, location, number of rooms, etc. By fitting a linear
regression model on the data that has no missing values, we can then use the model to predict the
missing values for the distance from the closest school feature. This way, we can preserve all the
instances in the dataset and avoid introducing bias or reducing variance. The other options are not
suitable for handling missing data in this case, because:
Deleting the rows that have missing values would reduce the size of the dataset and potentially lose
important information. Since every instance is important, we want to keep as much data as possible.
Applying feature crossing with another column that does not have missing values would create a
new feature that combines the values of two existing features. This might increase the complexity of
the model and introduce noise or multicollinearity. It would not solve the problem of missing values,
as the new feature would still have missing values whenever the distance from the closest school
feature is missing.
Replacing the missing values with zeros would distort the distribution of the feature and introduce
bias. It would also imply that the houses with missing values are located at the same distance from
the closest school, which is unlikely to be true. A zero value might also be outside the range of the
feature, as the distance from the closest school is unlikely to be exactly zero for any
house.
Reference:
Linear Regression
Imputation of missing values
Google Cloud launches machine learning engineer certification
Google Professional Machine Learning Engineer Certification
Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
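The imputation approach in option C can be sketched in a few lines. This is a minimal single-predictor example with invented data (house size predicting school distance); a real pipeline would fit on all available predictors, for instance with scikit-learn.

```python
# Minimal sketch of option C: fit a linear regression on rows where the feature is
# present, then predict it for rows where it is missing. One predictor for brevity.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# (house_size_sqm, distance_to_school_km); None marks a missing distance.
rows = [(50, 1.0), (80, 1.6), (100, 2.0), (120, None), (150, 3.0)]

known = [(x, y) for x, y in rows if y is not None]
a, b = fit_line([x for x, _ in known], [y for _, y in known])
imputed = [(x, y if y is not None else a * x + b) for x, y in rows]
print(imputed)  # every row kept; the missing value is replaced by the estimate
```

Every instance survives, and the filled-in value follows the relationship learned from the complete rows instead of distorting the feature's distribution the way a constant zero would.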
Question # 16
You recently built the first version of an image segmentation model for a self-driving car. After
deploying the model, you observe a decrease in the area under the curve (AUC) metric. When
analyzing the video recordings, you also discover that the model fails in highly congested traffic but
works as expected when there is less traffic. What is the most likely reason for this result?
A. The model is overfitting in areas with less traffic and underfitting in areas with more traffic.
B. AUC is not the correct metric to evaluate this classification model.
C. Too much data representing congested areas was used for model training.
D. Gradients become small and vanish while backpropagating from the output to input nodes.
Answer: A
Explanation:
The most likely reason for the observed result is that the model is overfitting in areas with less traffic
and underfitting in areas with more traffic. Overfitting means that the model learns the specific
patterns and noise in the training data, but fails to generalize well to new and unseen data.
Underfitting means that the model is not able to capture the complexity and variability of the data,
and performs poorly on both training and test data. In this case, the model might have learned to
segment the images well when there is less traffic, but it might not have enough data or features to
handle the more challenging scenarios when there is more traffic. This could lead to a decrease in
the AUC metric, which measures the ability of the model to distinguish between different classes.
AUC is a suitable metric for this classification model, as it is not affected by class imbalance or
threshold selection. The other options are not likely to be the reason for the result, as they are not
related to the traffic density. Too much data representing congested areas would not cause the
model to fail in those areas, but rather help the model learn better. Gradients vanishing or exploding
is a problem that occurs during the training process, not after the deployment, and it affects the
whole model, not specific scenarios.
Reference:
Image Segmentation: U-Net For Self Driving Cars
Intelligent Semantic Segmentation for Self-Driving Vehicles Using Deep Learning
Sharing Pixelopolis, a self-driving car demo from Google I/O built with TensorFlow Lite
Google Cloud launches machine learning engineer certification
Google Professional Machine Learning Engineer Certification
Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
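What AUC measures can be made concrete: it is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The scores below are invented; real pipelines would use a library routine such as scikit-learn's roc_auc_score.

```python
# AUC as a pairwise ranking probability: the chance that a random positive
# outranks a random negative (ties count as half). Scores are invented.

def auc(scores, labels):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    pairs = [(p, n) for p in pos for n in neg]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   0]
print(auc(scores, labels))
```

A drop in AUC after deployment, as in this question, means the model's scores separate the classes less cleanly on live data, which is consistent with underfitting the congested-traffic scenarios it saw too little of.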
Question # 17
You work for a company that is developing a new video streaming platform. You have been asked to
create a recommendation system that will suggest the next video for a user to watch. After a review
by an AI Ethics team, you are approved to start development. Each video asset in your company's
catalog has useful metadata (e.g., content type, release date, country), but you do not have any
historical user event data. How should you build the recommendation system for the first version of
the product?
A. Launch the product without machine learning. Present videos to users alphabetically, and start collecting user event data so you can develop a recommender model in the future.
B. Launch the product without machine learning. Use simple heuristics based on content metadata to recommend similar videos to users, and start collecting user event data so you can develop a recommender model in the future.
C. Launch the product with machine learning. Use a publicly available dataset such as MovieLens to train a model using the Recommendations AI, and then apply this trained model to your data.
D. Launch the product with machine learning. Generate embeddings for each video by training an autoencoder on the content metadata using TensorFlow. Cluster content based on the similarity of these embeddings, and then recommend videos from the same cluster.
Answer: B
Explanation:
The best option for building a recommendation system without any user event data is to use simple
heuristics based on content metadata. This is a type of content-based filtering, which recommends
items that are similar to the ones that the user has interacted with or selected, based on their
attributes. For example, if a user selects a comedy movie from the US released in 2020, the system
can recommend other comedy movies from the US released in 2020 or nearby years. This approach
does not require any machine learning, but it can leverage the existing metadata of the videos to
provide relevant recommendations. It also allows the system to start collecting user event data, such
as views, likes, ratings, etc., which can be used to train a more sophisticated machine learning model
in the future, such as a collaborative filtering model or a hybrid model that combines content and
collaborative information.
Reference:
Recommendation Systems
Content-Based Filtering
Collaborative Filtering
Hybrid Recommender Systems: A Systematic Literature Review
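A metadata-only heuristic like the one option B describes can be very small. The catalog, field names, and similarity rule below are invented for illustration; any scoring over content metadata works, since no user event data is required.

```python
# Minimal sketch of option B: a metadata-only heuristic recommender.
# Similarity is just the number of matching metadata fields. Catalog is invented.

CATALOG = {
    "v1": {"content_type": "comedy", "country": "US", "release_year": 2020},
    "v2": {"content_type": "comedy", "country": "US", "release_year": 2019},
    "v3": {"content_type": "drama",  "country": "US", "release_year": 2020},
    "v4": {"content_type": "comedy", "country": "FR", "release_year": 2018},
}

def similarity(a, b):
    return sum(1 for k in a if a[k] == b.get(k))

def recommend(just_watched, k=2):
    ref = CATALOG[just_watched]
    others = [v for v in CATALOG if v != just_watched]
    # Rank the rest of the catalog by metadata overlap with the watched video.
    return sorted(others, key=lambda v: similarity(ref, CATALOG[v]), reverse=True)[:k]

print(recommend("v1"))
```

Once the product is live, the views and ratings it collects become the training data for a later collaborative-filtering or hybrid model.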
Question # 18
One of your models is trained using data provided by a third-party data broker. The data broker does
not reliably notify you of formatting changes in the data. You want to make your model training
pipeline more robust to issues like this. What should you do?
A. Use TensorFlow Data Validation to detect and flag schema anomalies.
B. Use TensorFlow Transform to create a preprocessing component that will normalize data to the expected distribution, and replace values that don't match the schema with 0.
C. Use tf.math to analyze the data, compute summary statistics, and flag statistical anomalies.
D. Use custom TensorFlow functions at the start of your model training to detect and flag known formatting errors.
Answer: A
Explanation:
TensorFlow Data Validation (TFDV) is a library that helps you understand, validate, and monitor your
data for machine learning. It can automatically detect and report schema anomalies, such as missing
features, new features, or different data types, in your data. It can also generate descriptive statistics
and data visualizations to help you explore and debug your data. TFDV can be integrated with your
model training pipeline to ensure data quality and consistency throughout the machine learning
lifecycle.
Reference:
TensorFlow Data Validation
Data Validation | TensorFlow
Data Validation | Machine Learning Crash Course | Google Developers
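The kind of schema anomaly TFDV reports can be illustrated in miniature. This toy sketch is not the TFDV API; it only shows the concept of inferring a schema from known-good data and flagging records that deviate from it (missing features, new features, type changes). The record structure below is invented.

```python
# Toy illustration of the schema-anomaly idea behind TensorFlow Data Validation:
# infer feature names and types from good data, then flag deviating records.

def infer_schema(records):
    return {name: type(value) for name, value in records[0].items()}

def find_anomalies(record, schema):
    anomalies = []
    for name, expected in schema.items():
        if name not in record:
            anomalies.append(f"missing feature: {name}")
        elif not isinstance(record[name], expected):
            anomalies.append(f"type change: {name}")
    for name in record:
        if name not in schema:
            anomalies.append(f"new feature: {name}")
    return anomalies

schema = infer_schema([{"age": 34, "country": "US"}])
print(find_anomalies({"age": "34", "zip": "94040"}, schema))
```

TFDV does this at dataset scale with rich descriptive statistics and visualizations, which is why it is the right tool when an upstream broker silently changes formats.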
Question # 19
You have developed a BigQuery ML model that predicts customer churn and deployed the model to Vertex AI Endpoints. You want to automate the retraining of your model by using minimal additional code when model feature values change. You also want to minimize the number of times that your model is retrained to reduce training costs. What should you do?
A. 1. Enable request-response logging on Vertex AI Endpoints. 2. Schedule a TensorFlow Data Validation job to monitor prediction drift. 3. Execute model retraining if there is significant distance between the distributions.
B. 1. Enable request-response logging on Vertex AI Endpoints. 2. Schedule a TensorFlow Data Validation job to monitor training/serving skew. 3. Execute model retraining if there is significant distance between the distributions.
C. 1. Create a Vertex AI Model Monitoring job configured to monitor prediction drift. 2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected. 3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.
D. 1. Create a Vertex AI Model Monitoring job configured to monitor training/serving skew. 2. Configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected. 3. Use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery.
Answer: C
Explanation:
The best option for automating the retraining of your model by using minimal additional code when
model feature values change, and minimizing the number of times that your model is retrained to
reduce training costs, is to create a Vertex AI Model Monitoring job configured to monitor prediction
drift, configure alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is
detected, and use a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in
BigQuery. This option allows you to leverage the power and simplicity of Vertex AI, Pub/Sub, and
Cloud Functions to monitor your model performance and retrain your model when needed. Vertex AI
is a unified platform for building and deploying machine learning solutions on Google Cloud. Vertex
AI can deploy a trained model to an online prediction endpoint, which can provide low-latency
predictions for individual instances. Vertex AI can also provide various tools and services for data
analysis, model development, model deployment, model monitoring, and model governance. A
Vertex AI Model Monitoring job is a resource that can monitor the performance and quality of your
deployed models on Vertex AI. A Vertex AI Model Monitoring job can help you detect and diagnose
issues with your models, such as data drift, prediction drift, training/serving skew, or model
staleness. Prediction drift is a type of model monitoring metric that measures the difference
between the distributions of the predictions generated by the model on the training data and the predictions generated by the model on the online data. Prediction drift can indicate that the model performance is degrading, or that the online data is changing over time. By creating a Vertex AI Model Monitoring job configured to monitor prediction drift, you can track the changes in the model predictions, and compare them with the expected predictions.

Alert monitoring is a feature of Vertex AI Model Monitoring that can notify you when a monitoring metric exceeds a predefined threshold. Alert monitoring can help you set up rules and conditions for triggering alerts, and choose the notification channel for receiving alerts.

Pub/Sub is a service that can provide reliable and scalable messaging and event streaming on Google Cloud. Pub/Sub can help you publish and subscribe to messages, and deliver them to various Google Cloud services, such as Cloud Functions. A Pub/Sub queue is a resource that can hold messages that are published to a Pub/Sub topic. A Pub/Sub queue can help you store and manage messages, and ensure that they are delivered to the subscribers. By configuring alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is detected, you can send a notification to a Pub/Sub topic, and trigger a downstream action based on the alert.

Cloud Functions is a service that can run your stateless code in response to events on Google Cloud. Cloud Functions can help you create and execute functions without provisioning or managing servers, and pay only for the resources you use. A Cloud Function is a resource that can execute a piece of code in response to an event, such as a Pub/Sub message. A Cloud Function can help you perform various tasks, such as data processing, data transformation, or data analysis.

BigQuery is a service that can store and query large-scale data on Google Cloud. BigQuery can help you analyze your data by using SQL queries, and perform various tasks, such as data exploration, data transformation, or data visualization. BigQuery ML is a feature of BigQuery that can create and execute machine learning models in BigQuery by using SQL queries. BigQuery ML can help you build and train various types of models, such as linear regression, logistic regression, k-means clustering, matrix factorization, and deep neural networks.

By using a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in BigQuery, you can automate the retraining of your model by using minimal additional code when model feature values change. You can write a Cloud Function that listens to the Pub/Sub queue, and executes a SQL query to retrain your model in BigQuery ML when a prediction drift alert is received. By retraining your model in BigQuery ML, you can update your model parameters and improve your model performance and accuracy.
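As a sketch of the glue code involved, the Cloud Function below decodes the Pub/Sub alert message and issues a BigQuery ML `CREATE OR REPLACE MODEL` statement to retrain the model. All project, dataset, table, and model names are hypothetical, and the exact alert payload and model options will vary with your setup, so treat this as a minimal illustration rather than a drop-in implementation:

```python
import base64
import json

# Hypothetical names for illustration -- substitute your own project/dataset.
BQML_MODEL = "my_project.my_dataset.churn_model"
TRAINING_TABLE = "my_project.my_dataset.training_data"


def build_retrain_query(model: str, table: str) -> str:
    """Build the BigQuery ML statement that retrains the model in place."""
    return (
        f"CREATE OR REPLACE MODEL `{model}` "
        "OPTIONS (model_type='logistic_reg', input_label_cols=['label']) AS "
        f"SELECT * FROM `{table}`"
    )


def on_drift_alert(event, context):
    """Cloud Function (Pub/Sub trigger) entry point.

    Vertex AI Model Monitoring publishes the alert to the configured
    Pub/Sub topic; this function decodes it and kicks off retraining
    in BigQuery ML.
    """
    alert = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    print(f"Received monitoring alert: {alert}")

    # Imported here so the module can be inspected without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job = client.query(build_retrain_query(BQML_MODEL, TRAINING_TABLE))
    job.result()  # block until the retraining query completes
```

Deployed with a Pub/Sub trigger on the alert topic, this closes the loop: drift alert in, retrained model out, with no manual steps in between.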
The other options are not as good as option C, for the following reasons:
Option A: Enabling request-response logging on Vertex AI Endpoints, scheduling a TensorFlow Data
Validation job to monitor prediction drift, and executing model retraining if there is significant
distance between the distributions would require more skills and steps than creating a Vertex AI
Model Monitoring job configured to monitor prediction drift, configuring alert monitoring to publish
a message to a Pub/Sub queue when a monitoring alert is detected, and using a Cloud Function to
monitor the Pub/Sub queue, and trigger retraining in BigQuery. Request-response logging is a
feature of Vertex AI Endpoints that can record the requests and responses that are sent to and from
the online prediction endpoint. Request-response logging can help you collect and analyze the online
prediction data, and troubleshoot any issues with your model. TensorFlow Data Validation is a tool
that can analyze and validate your data for machine learning. TensorFlow Data Validation can help
you explore, understand, and clean your data, and detect various data issues, such as data drift, data
skew, or data anomalies. Prediction drift is a type of data issue that measures the difference between
the distributions of the predictions generated by the model on the training data and the predictions
generated by the model on the online data. Prediction drift can indicate that the model performance
is degrading, or that the online data is changing over time. By enabling request-response logging on
Vertex AI Endpoints, and scheduling a TensorFlow Data Validation job to monitor prediction drift, you
can collect and analyze the online prediction data, and compare the distributions of the predictions.
However, enabling request-response logging on Vertex AI Endpoints, scheduling a TensorFlow Data
Validation job to monitor prediction drift, and executing model retraining if there is significant
distance between the distributions would require more skills and steps than creating a Vertex AI
Model Monitoring job configured to monitor prediction drift, configuring alert monitoring to publish
a message to a Pub/Sub queue when a monitoring alert is detected, and using a Cloud Function to
monitor the Pub/Sub queue, and trigger retraining in BigQuery. You would need to write code,
enable and configure the request-response logging, create and run the TensorFlow Data Validation
job, define and measure the distance between the distributions, and execute the model
retraining. Moreover, this option would not automate the retraining of your model, as you would
need to manually check the prediction drift and trigger the retraining2.
Option B: Enabling request-response logging on Vertex AI Endpoints, scheduling a TensorFlow Data
Validation job to monitor training/serving skew, and executing model retraining if there is significant
distance between the distributions would not help you monitor the changes in the model feature
values, and could cause errors or poor performance. Training/serving skew is a type of data issue that
measures the difference between the distributions of the features used to train the model and the
features used to serve the model. Training/serving skew can indicate that the model is not trained on
the representative data, or that the data is changing over time. By enabling request-response logging
on Vertex AI Endpoints, and scheduling a TensorFlow Data Validation job to monitor training/serving
skew, you can collect and analyze the online prediction data, and compare the distributions of the
features. However, enabling request-response logging on Vertex AI Endpoints, scheduling a
TensorFlow Data Validation job to monitor training/serving skew, and executing model retraining if
there is significant distance between the distributions would not help you monitor the changes in the
model feature values, and could cause errors or poor performance. You would need to write code,
enable and configure the request-response logging, create and run the TensorFlow Data Validation
job, define and measure the distance between the distributions, and execute the model
retraining. Moreover, this option would not monitor the prediction drift, which is a more direct and
relevant metric for measuring the model performance and quality2.
Option D: Creating a Vertex AI Model Monitoring job configured to monitor training/serving skew,
configuring alert monitoring to publish a message to a Pub/Sub queue when a monitoring alert is
detected, and using a Cloud Function to monitor the Pub/Sub queue, and trigger retraining in
BigQuery would not help you monitor the changes in the model feature values, and could cause
errors or poor performance. Training/serving skew is a type of data issue that measures the
difference between the distributions of the features used to train the model and the features used to
serve the model. Training/serving skew can indicate that the model is not trained on the
representative data, or that the data is changing over time. By creating a Vertex AI Model Monitoring
job configured to monitor training/serving skew, you can track the changes in the model features,
and compare them with the expected features. However, creating a Vertex AI Model Monitoring job
configured to monitor training/serving skew, configuring alert monitoring to publish a message to a
Pub/Sub queue when a monitoring alert is detected, and using a Cloud Function to monitor the
Pub/Sub queue, and trigger retraining in BigQuery would not help you monitor the changes in the
model feature values, and could cause errors or poor performance. You would need to write code,
create and configure the Vertex AI Model Monitoring job, configure the alert monitoring, create and
configure the Pub/Sub queue, and write a Cloud Function to trigger the retraining. Moreover, this
option would not monitor the prediction drift, which is a more direct and relevant metric for
measuring the model performance and quality1.
Reference:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML
Systems, Week 4: ML Governance
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in
production
Question # 20
You work for a company that provides an anti-spam service that flags and hides spam posts on social
media platforms. Your company currently uses a list of 200,000 keywords to identify suspected spam
posts. If a post contains more than a few of these keywords, the post is identified as spam. You want
to start using machine learning to flag spam posts for human review. What is the main advantage of
implementing machine learning for this business case?
A. Posts can be compared to the keyword list much more quickly.
B. New problematic phrases can be identified in spam posts.
C. A much longer keyword list can be used to flag spam posts.
D. Spam posts can be flagged using far fewer keywords.
Answer: B
Explanation:
The main advantage of implementing machine learning for this business case is that new
problematic phrases can be identified in spam posts. This is because machine learning can learn from
the data and the feedback, and adapt to the changing patterns and trends of spam posts. Machine
learning can also capture the semantic and contextual meaning of the posts, and not just rely on the
presence or absence of keywords. By using machine learning, you can improve the accuracy and
coverage of your anti-spam service, and detect new and emerging types of spam posts that may not
be captured by the keyword list.
The other options are not advantages of implementing machine learning for this business case for
the following reasons:
A) Posts can be compared to the keyword list much more quickly is not an advantage, as it does not
improve the quality or effectiveness of the anti-spam service. It only improves the efficiency of the
service, which is not the primary objective. Moreover, machine learning may not necessarily be
faster than the keyword list, depending on the complexity and size of the model and the data.
C) A much longer keyword list can be used to flag spam posts is not an advantage, as it does not
address the limitations or challenges of the keyword list approach. It only increases the size and
complexity of the keyword list, which can make it harder to maintain and update. Moreover, a longer
keyword list may not improve the accuracy or coverage of the anti-spam service, as it may introduce
more false positives or false negatives, or miss new and emerging types of spam posts.
D) Spam posts can be flagged using far fewer keywords is not an advantage, as it does not reflect the
capabilities or benefits of machine learning. It only reduces the size and complexity of the keyword
list, which can make it easier to maintain and update. However, using fewer keywords may not
improve the accuracy or coverage of the anti-spam service, as it may lose some information or
meaning of the posts, or miss some types of spam posts.
Reference:
Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
Google Cloud launches machine learning engineer certification
Machine Learning for Spam Detection
Spam Detection Using Machine Learning
Question # 21
You have been tasked with deploying prototype code to production. The feature engineering code is in PySpark and runs on Dataproc Serverless. The model training is executed by using a Vertex AI custom training job. The two steps are not connected, and the model training must currently be run manually after the feature engineering step finishes. You need to create a scalable and maintainable production process that runs end-to-end and tracks the connections between steps. What should you do?
A. Create a Vertex AI Workbench notebook. Use the notebook to submit the Dataproc Serverless feature engineering job. Use the same notebook to submit the custom model training job. Run the notebook cells sequentially to tie the steps together end-to-end.
B. Create a Vertex AI Workbench notebook. Initiate an Apache Spark context in the notebook, and run the PySpark feature engineering code. Use the same notebook to run the custom model training job in TensorFlow. Run the notebook cells sequentially to tie the steps together end-to-end.
C. Use the Kubeflow pipelines SDK to write code that specifies two components: the first is a Dataproc Serverless component that launches the feature engineering job; the second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job. Create a Vertex AI Pipelines job to link and run both components.
D. Use the Kubeflow pipelines SDK to write code that specifies two components: the first component initiates an Apache Spark context that runs the PySpark feature engineering code; the second component runs the TensorFlow custom model training code. Create a Vertex AI Pipelines job to link and run both components.
Answer: C
Explanation:
The best option for creating a scalable and maintainable production process that runs end-to-end and tracks the connections between steps, when moving prototype code to production with feature engineering code in PySpark that runs on Dataproc Serverless and model training that is executed by using a Vertex AI custom training job, is to use the Kubeflow pipelines SDK to write code that specifies two components. The first is a Dataproc Serverless component that launches the feature engineering job. The second is a custom component wrapped in the create_custom_training_job_from_component utility that launches the custom model training job.

This option allows you to leverage the power and simplicity of Kubeflow pipelines to orchestrate and automate your machine learning workflows on Vertex AI. Kubeflow pipelines is a platform that can build, deploy, and manage machine learning pipelines on Kubernetes. Kubeflow pipelines can help you create reusable and scalable pipelines, experiment with different pipeline versions and parameters, and monitor and debug your pipelines. The Kubeflow pipelines SDK is a set of Python packages that can help you build and run Kubeflow pipelines. The SDK can help you define pipeline components, specify pipeline parameters and inputs, and create pipeline steps and tasks.

A component is a self-contained set of code that performs one step in a pipeline, such as data preprocessing, model training, or model evaluation. A component can be created from a Python function, a container image, or a prebuilt component. A custom component is a component that is not provided by Kubeflow pipelines, but is created by the user to perform a specific task. A custom component can be wrapped in a utility function that can help you create a Vertex AI custom training job from the component. A custom training job is a resource that can run your custom training code on Vertex AI. A custom training job can help you train various types of models, such as linear regression, logistic regression, k-means clustering, matrix factorization, and deep neural networks.

By using the Kubeflow pipelines SDK to write code that specifies these two components, you can create a scalable and maintainable production process that runs end-to-end and tracks the connections between steps. You can write code that defines the two components, their inputs and outputs, and their dependencies. You can then use the Kubeflow pipelines SDK to create a pipeline that runs the two components in sequence, and submit the pipeline to Vertex AI Pipelines for execution. By using the Dataproc Serverless component, you can run your PySpark feature engineering code on Dataproc Serverless, which is a service that can run Spark batch workloads without provisioning and managing your own cluster. By using the custom component wrapped in the create_custom_training_job_from_component utility, you can run your custom model training code on Vertex AI, which is a unified platform for building and deploying machine learning solutions on Google Cloud.
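A minimal sketch of such a pipeline is shown below. It assumes the `kfp` and `google-cloud-pipeline-components` packages are installed; the project, region, bucket, image, and batch names are all hypothetical, and the component parameter lists are abbreviated (check the component reference for required arguments in your SDK version):

```python
# Sketch only: orchestration config for a two-step Vertex AI pipeline.
from kfp import compiler, dsl
from google_cloud_pipeline_components.v1.dataproc import DataprocPySparkBatchOp
from google_cloud_pipeline_components.v1.custom_job import (
    create_custom_training_job_from_component,
)

PROJECT, REGION = "my-project", "us-central1"  # hypothetical


@dsl.component(base_image="python:3.10")  # hypothetical training image
def train_model(features_uri: str):
    # Placeholder for the real custom training code.
    print(f"Training on {features_uri}")


# Wrap the component so it executes as a Vertex AI custom training job.
train_job = create_custom_training_job_from_component(
    train_model,
    display_name="model-training",
    machine_type="n1-standard-8",  # hypothetical
)


@dsl.pipeline(name="feature-eng-and-training")
def pipeline(features_uri: str = "gs://my-bucket/features/"):
    feature_eng = DataprocPySparkBatchOp(
        project=PROJECT,
        location=REGION,
        batch_id="feature-eng-batch",  # hypothetical
        main_python_file_uri="gs://my-bucket/feature_engineering.py",
    )
    # .after() records the dependency, so Vertex AI Pipelines tracks the
    # connection (lineage) between the two steps.
    train_job(
        project=PROJECT, location=REGION, features_uri=features_uri
    ).after(feature_eng)


if __name__ == "__main__":
    compiler.Compiler().compile(pipeline, "pipeline.json")
```

The compiled `pipeline.json` would then be submitted for execution as a Vertex AI Pipelines job (for example via `aiplatform.PipelineJob`), which runs the two steps in order and records their lineage.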
The other options are not as good as option C, for the following reasons:
Option A: Creating a Vertex AI Workbench notebook, using the notebook to submit the Dataproc
Serverless feature engineering job, using the same notebook to submit the custom model training
job, and running the notebook cells sequentially to tie the steps together end-to-end would require
more skills and steps than using the Kubeflow pipelines SDK to write code that specifies two
components, the first is a Dataproc Serverless component that launches the feature engineering job,
and the second is a custom component wrapped in the
create_custom_training_job_from_component utility that launches the custom model training job.
Vertex AI Workbench is a service that can provide managed notebooks for machine learning
development and experimentation. Vertex AI Workbench can help you create and run JupyterLab
notebooks, and access various tools and frameworks, such as TensorFlow, PyTorch, and JAX. By
creating a Vertex AI Workbench notebook, using the notebook to submit the Dataproc Serverless
feature engineering job, using the same notebook to submit the custom model training job, and
running the notebook cells sequentially to tie the steps together end-to-end, you can create a
production process that runs end-to-end and tracks the connections between steps. You can write
code that submits the Dataproc Serverless feature engineering job and the custom model training job
to Vertex AI, and run the code in the notebook cells. However, creating a Vertex AI Workbench
notebook, using the notebook to submit the Dataproc Serverless feature engineering job, using the
same notebook to submit the custom model training job, and running the notebook cells
sequentially to tie the steps together end-to-end would require more skills and steps than using the
Kubeflow pipelines SDK to write code that specifies two components, the first is a Dataproc
Serverless component that launches the feature engineering job, and the second is a custom
component wrapped in the create_custom_training_job_from_component utility that launches the
custom model training job. You would need to write code, create and configure the Vertex AI
Workbench notebook, submit the Dataproc Serverless feature engineering job and the custom model
training job, and run the notebook cells. Moreover, this option would not use the Kubeflow pipelines
SDK, which can simplify the pipeline creation and execution process, and provide various features,
such as pipeline parameters, pipeline metrics, and pipeline visualization2.
Option B: Creating a Vertex AI Workbench notebook, initiating an Apache Spark context in the
notebook, and running the PySpark feature engineering code, using the same notebook to run the
custom model training job in TensorFlow, and running the notebook cells sequentially to tie the steps
together end-to-end would not allow you to use Dataproc Serverless to run the feature engineering
job, and could increase the complexity and cost of the production process. Apache Spark is a
framework that can perform large-scale data processing and machine learning. Apache Spark can
help you run various tasks, such as data ingestion, data transformation, data analysis, and data
visualization. PySpark is a Python API for Apache Spark. PySpark can help you write and run Spark
code in Python. An Apache Spark context is a resource that can initialize and configure the Spark
environment. An Apache Spark context can help you create and manage Spark objects, such as
SparkSession, SparkConf, and SparkContext. By creating a Vertex AI Workbench notebook, initiating
an Apache Spark context in the notebook, and running the PySpark feature engineering code, using
the same notebook to run the custom model training job in TensorFlow, and running the notebook
cells sequentially to tie the steps together end-to-end, you can create a production process that runs
end-to-end and tracks the connections between steps. You can write code that initiates an Apache
Spark context and runs the PySpark feature engineering code, and runs the custom model training
job in TensorFlow, and run the code in the notebook cells. However, creating a Vertex AI Workbench
notebook, initiating an Apache Spark context in the notebook, and running the PySpark feature
engineering code, using the same notebook to run the custom model training job in TensorFlow, and
running the notebook cells sequentially to tie the steps together end-to-end would not allow you to
use Dataproc Serverless to run the feature engineering job, and could increase the complexity and
cost of the production process. You would need to write code, create and configure the Vertex AI
Workbench notebook, initiate and configure the Apache Spark context, run the PySpark feature
engineering code, and run the custom model training job in TensorFlow. Moreover, this option would
not use Dataproc Serverless, which is a service that can run Spark batch workloads without
provisioning and managing your own cluster, and provide various benefits, such as autoscaling,
dynamic resource allocation, and serverless billing2.
Option D: Creating a Vertex AI Pipelines job to link and run both components, using the Kubeflow
pipelines SDK to write code that specifies two components, the first component initiates an Apache
Spark context that runs the PySpark feature engineering code, and the second component runs the
TensorFlow custom model training code, would not allow you to use Dataproc Serverless to run the
feature engineering job, and could increase the complexity and cost of the production process.
Vertex AI Pipelines is a service that can run Kubeflow pipelines on Vertex AI. Vertex AI Pipelines can
help you create and manage machine learning pipelines, and integrate with various Vertex AI
services, such as Vertex AI Workbench, Vertex AI Training, and Vertex AI Prediction. A Vertex AI
Pipelines job is a resource that can execute a pipeline on Vertex AI Pipelines. A Vertex AI Pipelines
job can help you run your pipeline steps and tasks, and monitor and debug your pipeline execution.
By creating a Vertex AI Pipelines job to link and run both components, using the Kubeflow pipelines
SDK to write code that specifies two components, the first component initiates an Apache Spark
context that runs the PySpark feature engineering code, and the second component runs the
TensorFlow custom model training code, you can create a scalable and maintainable production
process that runs end-to-end and tracks the connections between steps. You can write code that
defines the two components, their inputs and outputs, and their dependencies. You can then use the
Kubeflow pipelines SDK to create a pipeline that runs the two components in sequence, and submit
the pipeline to Vertex AI Pipelines for execution. However, this approach initiates the Apache Spark context inside a pipeline component rather than using Dataproc Serverless to run the feature engineering job, so you would need to provision and manage the Spark environment yourself, which could increase the complexity and cost of the production process.
Question # 22
You are building a TensorFlow model for a financial institution that predicts the impact of consumer
spending on inflation globally. Due to the size and nature of the data, your model is long-running
across all types of hardware, and you have built frequent checkpointing into the training process.
Your organization has asked you to minimize cost. What hardware should you choose?
A. A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with 4 NVIDIA P100 GPUs
B. A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with an NVIDIA P100 GPU
C. A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with a non-preemptible v3-8 TPU
D. A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with a preemptible v3-8 TPU
Answer: D
Explanation:
The best hardware to choose for your model while minimizing cost is a Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with a preemptible v3-8 TPU. This
hardware configuration can provide you with high performance, scalability, and efficiency for your
TensorFlow model, as well as low cost and flexibility for your long-running and checkpointing
process. The v3-8 TPU is a cloud tensor processing unit (TPU) device, which is a custom ASIC chip
designed by Google to accelerate ML workloads. It can handle large and complex models and
datasets, and offer fast and stable training and inference. The n1-standard-16 is a general-purpose
VM that can support the CPU and memory requirements of your model, as well as the data
preprocessing and postprocessing tasks. By choosing a preemptible v3-8 TPU, you can take advantage
of the lower price and availability of the TPU devices, as long as you can tolerate the possibility of the
device being reclaimed by Google at any time. However, since you have built frequent checkpointing
into your training process, you can resume your model from the last saved state, and avoid losing any
progress or data. Moreover, you can use the Vertex AI Workbench user-managed notebooks to
create and manage your notebooks instances, and leverage the integration with Vertex AI and other
Google Cloud services.
The other options are not optimal for the following reasons:
A) A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with 4
NVIDIA P100 GPUs is not a good option, as it has higher cost and lower performance than the v3-8
TPU. The NVIDIA P100 GPUs are an older generation of NVIDIA GPUs, which have lower
performance, scalability, and efficiency than the latest NVIDIA A100 GPUs or the TPUs. They also
have higher price and lower availability than the preemptible TPUs, which can increase the cost and
complexity of your solution.
B) A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with an
NVIDIA P100 GPU is not a good option, as it has higher cost and lower performance than the v3-8
TPU. It also has less GPU memory and compute power than the option with 4 NVIDIA P100 GPUs,
which can limit the size and complexity of your model, and affect the training and inference speed
and quality.
C) A Vertex AI Workbench user-managed notebooks instance running on an n1-standard-16 with a
non-preemptible v3-8 TPU is not a good option, as it has higher cost and lower flexibility than the
preemptible v3-8 TPU. The non-preemptible v3-8 TPU has the same performance, scalability, and efficiency as the preemptible v3-8 TPU, but it has a higher price, as it is reserved for your exclusive use and cannot be reclaimed. Moreover, since your model is long-running and checkpoints frequently, you do not
need the guarantee of the device not being reclaimed by Google, and you can benefit from the lower
cost and higher availability of the preemptible v3-8 TPU.
Reference:
Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
Google Cloud launches machine learning engineer certification
Cloud TPU
Vertex AI Workbench user-managed notebooks
Preemptible VMs
NVIDIA Tesla P100 GPU
Question # 23
You recently deployed a scikit-learn model to a Vertex AI endpoint. You are now testing the model on live production traffic. While monitoring the endpoint, you discover twice as many requests per hour than expected throughout the day. You want the endpoint to efficiently scale when the demand increases in the future to prevent users from experiencing high latency. What should you do?
A. Deploy two models to the same endpoint and distribute requests among them evenly.
B. Configure an appropriate minReplicaCount value based on expected baseline traffic.
C. Set the target utilization percentage in the autoscalingMetricSpecs configuration to a higher value.
D. Change the model's machine type to one that utilizes GPUs.
Answer: B
Explanation:
The best option for scaling a Vertex AI endpoint efficiently when the demand increases in the future,
using a scikit-learn model that is deployed to a Vertex AI endpoint and tested on live production
traffic, is to configure an appropriate minReplicaCount value based on expected baseline traffic. This
option allows you to leverage the power and simplicity of Vertex AI to automatically scale your
endpoint resources according to the traffic patterns. Vertex AI is a unified platform for building and
deploying machine learning solutions on Google Cloud. Vertex AI can deploy a trained model to an
online prediction endpoint, which can provide low-latency predictions for individual instances.
Vertex AI can also provide various tools and services for data analysis, model development, model
deployment, model monitoring, and model governance. A minReplicaCount value is a parameter
that specifies the minimum number of replicas that the endpoint must always have, regardless of the
load. A minReplicaCount value can help you ensure that the endpoint has enough resources to
handle the expected baseline traffic, and avoid high latency or errors. By configuring an appropriate
minReplicaCount value based on expected baseline traffic, you can scale your endpoint efficiently
when the demand increases in the future. You can set the minReplicaCount value when you deploy
the model to the endpoint, or update it later. Vertex AI will automatically scale up or down the
number of replicas within the range of the minReplicaCount and maxReplicaCount values, based on
the target utilization percentage and the autoscaling metric1.
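As an illustration, one simple way to derive the bounds is to size minReplicaCount from the observed baseline throughput and leave headroom in maxReplicaCount for spikes. The helper and deploy call below are a sketch: the QPS figures, machine type, and peak multiplier are assumptions, and the deploy call requires the google-cloud-aiplatform SDK:

```python
import math
from typing import Tuple


def replica_bounds(baseline_qps: float, qps_per_replica: float,
                   peak_multiplier: float = 3.0) -> Tuple[int, int]:
    """Size min/max replicas from observed baseline traffic.

    minReplicaCount covers the steady baseline so requests never wait
    for a cold scale-up; maxReplicaCount leaves headroom for spikes.
    """
    min_replicas = max(1, math.ceil(baseline_qps / qps_per_replica))
    max_replicas = max(min_replicas, math.ceil(min_replicas * peak_multiplier))
    return min_replicas, max_replicas


def deploy(model, endpoint):
    """Deploy with autoscaling bounds (requires google-cloud-aiplatform)."""
    # Observed traffic was double the estimate: e.g. 40 QPS baseline,
    # with each replica assumed to handle ~10 QPS (hypothetical figures).
    lo, hi = replica_bounds(baseline_qps=40, qps_per_replica=10)
    model.deploy(
        endpoint=endpoint,
        machine_type="n1-standard-4",  # hypothetical
        min_replica_count=lo,
        max_replica_count=hi,
    )
```

With these bounds, Vertex AI keeps at least `lo` replicas warm for the baseline load and autoscales up to `hi` replicas as traffic grows.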
The other options are not as good as option B, for the following reasons:
Option A: Deploying two models to the same endpoint and distributing requests among them evenly
would not allow you to scale your endpoint efficiently when the demand increases in the future, and
could increase the complexity and cost of the deployment process. A model is a resource that
represents a machine learning model that you can use for prediction. A model can have one or more
versions, which are different implementations of the same model. A model version can help you
experiment and iterate on your model, and improve the model performance and accuracy. An
endpoint is a resource that provides the service endpoint (URL) you use to request the prediction. An
endpoint can have one or more deployed models, which are instances of model versions that are
associated with physical resources. A deployed model can help you serve online predictions with low
latency, and scale up or down based on the traffic. By deploying two models to the same endpoint
and distributing requests among them evenly, you can create a load balancing mechanism that can
distribute the traffic across the models, and reduce the load on each model. However, deploying two
models to the same endpoint and distributing requests among them evenly would not allow you to
scale your endpoint efficiently when the demand increases in the future, and could increase the
complexity and cost of the deployment process. You would need to write code, create and configure
the two models, deploy the models to the same endpoint, and distribute the requests among them
evenly. Moreover, this option would not use the autoscaling feature of Vertex AI, which can
automatically adjust the number of replicas based on the traffic patterns, and provide various
benefits, such as optimal resource utilization, cost savings, and performance improvement2.
Option C: Setting the target utilization percentage in the autoscalingMetricSpecs configuration to a
higher value would not allow you to scale your endpoint efficiently when the demand increases in
the future, and could cause errors or poor performance. A target utilization percentage is a
parameter that specifies the desired utilization level of each replica. A target utilization percentage
can affect the speed and accuracy of the autoscaling process. A higher target utilization percentage
can help you reduce the number of replicas, but it can also cause high latency, low throughput, or
resource exhaustion. By setting the target utilization percentage in the autoscalingMetricSpecs
configuration to a higher value, you can increase the utilization level of each replica, and save some
resources. However, setting the target utilization percentage in the autoscalingMetricSpecs
configuration to a higher value would not allow you to scale your endpoint efficiently when the
demand increases in the future, and could cause errors or poor performance. You would need to
write code, create and configure the autoscalingMetricSpecs, and set the target utilization
percentage to a higher value. Moreover, this option would not ensure that the endpoint has enough
resources to handle the expected baseline traffic, which could cause high latency or errors1.
Option D: Changing the model's machine type to one that utilizes GPUs would not let you scale the
endpoint efficiently as demand grows, and would add complexity and cost to the deployment process.
The machine type specifies the virtual machine that the prediction service uses for the deployed
model, and a GPU-equipped machine type can accelerate prediction computation and serve more
requests concurrently, improving per-replica performance. However, a faster replica is still a fixed
amount of capacity: it does not adapt to future demand, and GPUs raise cost. This option also
bypasses the autoscaling feature of Vertex AI, which automatically adjusts the number of replicas to
match traffic patterns and provides benefits such as optimal resource utilization, cost savings, and
better performance [2].
Reference:
Configure compute resources for prediction | Vertex AI | Google Cloud
Deploy a model to an endpoint | Vertex AI | Google Cloud
Question # 24
You are an ML engineer at an ecommerce company and have been tasked with building a model that
predicts how much inventory the logistics team should order each month. Which approach should
you take?
A. Use a clustering algorithm to group popular items together. Give the list to the logistics team so they can increase inventory of the popular items.
B. Use a regression model to predict how much additional inventory should be purchased each month. Give the results to the logistics team at the beginning of the month so they can increase inventory by the amount predicted by the model.
C. Use a time series forecasting model to predict each item's monthly sales. Give the results to the logistics team so they can base inventory on the amount predicted by the model.
D. Use a classification model to classify inventory levels as UNDER_STOCKED, OVER_STOCKED, and CORRECTLY_STOCKED. Give the report to the logistics team each month so they can fine-tune inventory levels.
Answer: C
Explanation:
The best approach to build a model that predicts how much inventory the logistics team should order
each month is to use a time series forecasting model to predict each item's monthly sales. This
approach can capture the temporal patterns and trends in the sales data, such as seasonality,
cyclicality, and autocorrelation. It can also account for the variability and uncertainty in the demand,
and provide confidence intervals and error metrics for the predictions. By using a time series
forecasting model, you can provide the logistics team with accurate and reliable estimates of the
future sales for each item, which can help them optimize the inventory levels and avoid overstocking
or understocking. You can use various methods and tools to build a time series forecasting model,
such as ARIMA, LSTM, Prophet, or BigQuery ML.
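To make the idea concrete, here is a minimal, dependency-free sketch of one of the simplest time series baselines, a seasonal-naive forecast (predict next month's sales from the same month one season earlier). A production system would use ARIMA, Prophet, or BigQuery ML instead; the item's sales figures below are invented for illustration:

```python
def seasonal_naive_forecast(monthly_sales: list, season_length: int = 12) -> int:
    """Forecast the next period as the observed value one full season ago."""
    if len(monthly_sales) < season_length:
        raise ValueError("need at least one full season of history")
    return monthly_sales[-season_length]

# 24 months of (invented) unit sales for one SKU, Jan year 1 .. Dec year 2.
# Note the December spike repeating each year -- the seasonality a plain
# regression model would miss.
sales = [30, 28, 35, 40, 55, 70, 90, 88, 60, 45, 38, 120,
         33, 30, 37, 44, 58, 75, 95, 92, 64, 48, 41, 130]

# Forecast for January of year 3 = January of year 2 -> 33 units
print(seasonal_naive_forecast(sales))  # 33
```

Even this crude baseline is per-item and time-aware, which is exactly what distinguishes option C from the aggregate regression in option B.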
The other options are not optimal for the following reasons:
A) Using a clustering algorithm to group popular items together is not a good approach, as it does
not provide any quantitative or temporal information about the sales or the inventory. It only
provides a qualitative and static categorization of the items based on their similarity or dissimilarity.
Moreover, clustering is an unsupervised learning technique, which does not use any target variable
or feedback to guide the learning process. This can result in arbitrary and inconsistent clusters, which
may not reflect the true demand or preferences of the customers.
B) Using a regression model to predict how much additional inventory should be purchased each
month is not a good approach, as it does not account for the individual differences and dynamics of
each item. It only provides a single aggregated value for the whole inventory, which can be
misleading and inaccurate. Moreover, a regression model is not well-suited for handling time series
data, as it assumes that the data points are independent and identically distributed, which is not the
case for sales data. A regression model can also suffer from overfitting or underfitting, depending on
the choice and complexity of the features and the model.
D) Using a classification model to classify inventory levels as UNDER_STOCKED, OVER_STOCKED, and
CORRECTLY_STOCKED is not a good approach, as it does not provide any numerical or predictive
information about the sales or the inventory. It only provides a discrete and subjective label for the
inventory levels, which can be vague and ambiguous. Moreover, a classification model is not well-suited
for handling time series data, as it assumes that the data points are independent and
identically distributed, which is not the case for sales data. A classification model can also suffer from
class imbalance, misclassification, or overfitting, depending on the choice and complexity of the
features, the model, and the threshold.
Reference:
Professional ML Engineer Exam Guide
Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate
Google Cloud launches machine learning engineer certification
Time Series Forecasting: Principles and Practice
BigQuery ML: Time series analysis
Question # 25
You work at a bank. You have a custom tabular ML model that was provided by the bank's vendor. The training data is not available due to its sensitivity. The model is packaged as a Vertex AI Model serving container which accepts a string as input for each prediction instance. In each string the feature values are separated by commas. You want to deploy this model to production for online predictions, and monitor the feature distribution over time with minimal effort. What should you do?
A. 1. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint. 2. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective, and provide an instance schema.
B. 1. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint. 2. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective, and provide an instance schema.
C. 1. Refactor the serving container to accept key-value pairs as input format. 2. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint. 3. Create a Vertex AI Model Monitoring job with feature drift detection as the monitoring objective.
D. 1. Refactor the serving container to accept key-value pairs as input format. 2. Upload the model to Vertex AI Model Registry and deploy the model to a Vertex AI endpoint. 3. Create a Vertex AI Model Monitoring job with feature skew detection as the monitoring objective.
Answer: A
Explanation:
The best option for deploying this vendor-provided tabular model for online predictions, and
monitoring the feature distribution over time with minimal effort, is to upload the model to Vertex AI
Model Registry, deploy it to a Vertex AI endpoint, and create a Vertex AI Model Monitoring job with
feature drift detection as the monitoring objective, providing an instance schema. This lets you serve
and monitor the model with minimal code and configuration.
Vertex AI is a unified platform for building and deploying machine learning solutions on Google
Cloud. It can deploy a trained model to an online prediction endpoint that serves low-latency
predictions for individual instances, and it provides tools for data analysis, model development,
deployment, monitoring, and governance. The Vertex AI Model Registry stores and manages your
models, tracking information such as the model name, description, and labels. A Vertex AI Model
serving container packages the model code and dependencies into a container image that can be
deployed to an online prediction endpoint; it can accept various input formats, including a plain
string in which the feature values are encoded and separated by commas, as in this scenario. You can
upload and deploy the model with the Vertex AI API or the gcloud command-line tool, supplying the
model and endpoint names, descriptions, labels, and compute resources.
A Vertex AI Model Monitoring job monitors the performance and quality of deployed models and
helps you detect issues such as data drift, prediction drift, training/serving skew, or model staleness.
Feature drift measures how the distribution of the serving features changes over time; growing drift
indicates that the online data is shifting and that model performance may be degrading. Because the
training data is unavailable, drift detection, which compares serving data against earlier serving data,
is the appropriate objective. When you create the monitoring job, you specify the objective, the
monitoring frequency, the alerting threshold, and the notification channel. Because the serving
container accepts an unnamed comma-separated string rather than key-value pairs, you must also
provide an instance schema: a JSON file describing the features and their types, which lets Model
Monitoring parse the string input and compute the feature distributions and distance scores [1].
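To see why the schema matters, the sketch below mimics in plain Python (not the actual Model Monitoring implementation) how a comma-separated instance string can only be turned into named, typed features once a schema supplies the column names and types. The feature names and values are invented for illustration:

```python
# Hypothetical instance schema: ordered feature names and types, standing in
# for what a Model Monitoring instance schema file would declare.
SCHEMA = [("age", float), ("balance", float), ("product_code", str)]

def parse_instance(raw: str) -> dict:
    """Map a comma-separated prediction instance onto the schema."""
    values = raw.split(",")
    if len(values) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} features, got {len(values)}")
    return {name: cast(value) for (name, cast), value in zip(SCHEMA, values)}

print(parse_instance("42,1350.75,PLAT"))
# {'age': 42.0, 'balance': 1350.75, 'product_code': 'PLAT'}
```

Without the schema, the monitoring service would see only anonymous comma-separated tokens and could not attribute drift to any particular feature.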
The other options are not as good as option A, for the following reasons:
Option B: Feature skew measures the difference between the distribution of the features used to
train the model and the distribution of the features used to serve it at a given point in time; it
indicates that the model was not trained on representative data, or that the data has shifted since
training. Skew detection therefore needs access to the training data distribution, which is not
available in this scenario, and in any case it compares serving data against a fixed training baseline
rather than tracking how the online data changes over time. Feature drift is the more direct and
relevant metric for measuring those changes and the resulting model performance and quality [1].
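Conceptually, both drift and skew reduce to a distance between two feature distributions; the difference is the baseline (an earlier serving window for drift, the training data for skew). The sketch below computes the L-infinity distance between two categorical distributions in plain Python, one of the distance measures Model Monitoring reports for categorical features; the proportions are invented for illustration:

```python
def linf_distance(p: dict, q: dict) -> float:
    """L-infinity distance between two categorical distributions:
    the largest absolute difference in any single category's probability."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)

# Baseline serving window vs. current serving window for one categorical
# feature (proportions invented).
baseline = {"PLAT": 0.50, "GOLD": 0.30, "BASIC": 0.20}
current  = {"PLAT": 0.35, "GOLD": 0.30, "BASIC": 0.35}

print(round(linf_distance(baseline, current), 6))  # 0.15
```

If this score exceeds the alerting threshold configured on the monitoring job, an alert fires on the configured notification channel.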
Option C: Refactoring the serving container to accept key-value pairs as input format would achieve
the same monitoring objective, but with more skills and steps than simply providing an instance
schema. A key-value pair input format specifies the feature names and values in a JSON object, which
Model Monitoring can parse without a schema. However, it requires modifying and rebuilding the
vendor's container before uploading and deploying the model, whereas supplying an instance
schema lets Model Monitoring parse and analyze the existing comma-separated string input as-is,
with no changes to the container, and still compute the feature distributions and distance scores [1].
Option D: This option combines the drawbacks of options B and C. Refactoring the serving container
to accept key-value pairs requires more skills and steps than providing an instance schema, and
feature skew detection compares serving data against the training distribution, which is unavailable
here, at a single point in time rather than tracking changes in the online data over time. Feature drift
remains the more direct and relevant metric for measuring those changes and the model
performance and quality [1].
Reference:
Using Model Monitoring | Vertex AI | Google Cloud