Verified Professional-Machine-Learning-Engineer dumps Q&As 100% Pass in First Attempt Guaranteed Updated Dump from DumpsQuestion
Pass Google Cloud Certified Professional-Machine-Learning-Engineer Exam With 150 Questions
To be eligible to take the exam, candidates must have a strong understanding of machine learning concepts, including supervised and unsupervised learning, deep learning, and reinforcement learning. They must also have experience working with Google Cloud's machine learning services, such as AutoML, AI Platform, and TensorFlow.
NEW QUESTION # 59
Machine Learning Specialist is training a model to identify the make and model of vehicles in images. The Specialist wants to use transfer learning and an existing model trained on images of general objects. The Specialist collated a large custom dataset of pictures containing different vehicle makes and models.
What should the Specialist do to initialize the model to re-train it with the custom data?
- A. Initialize the model with random weights in all layers including the last fully connected layer.
- B. Initialize the model with random weights in all layers and replace the last fully connected layer.
- C. Initialize the model with pre-trained weights in all layers including the last fully connected layer.
- D. Initialize the model with pre-trained weights in all layers and replace the last fully connected layer.
Answer: D
Explanation:
Explanation/Reference:
NEW QUESTION # 60
You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on Al Platform for high-throughput online prediction. Which architecture should you use?
- A. * Validate the accuracy of the model that you trained on preprocessed data
* Create a new model that uses the raw data and is available in real time
* Deploy the new model onto Al Platform for online prediction - B. * Stream incoming prediction request data into Cloud Spanner
* Create a view to abstract your preprocessing logic.
* Query the view every second for new records
* Submit a prediction request to Al Platform using the transformed data
* Write the predictions to an outbound Pub/Sub queue. - C. * Send incoming prediction requests to a Pub/Sub topic
* Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic.
* Implement your preprocessing logic in the Cloud Function
* Submit a prediction request to Al Platform using the transformed data
* Write the predictions to an outbound Pub/Sub queue - D. * Send incoming prediction requests to a Pub/Sub topic
* Transform the incoming data using a Dataflow job
* Submit a prediction request to Al Platform using the transformed data
* Write the predictions to an outbound Pub/Sub queue
Answer: D
Explanation:
https://cloud.google.com/architecture/data-preprocessing-for-ml-with-tf-transform-pt1#where_to_do_preprocessing
NEW QUESTION # 61
A company uses a long short-term memory (LSTM) model to evaluate the risk factors of a particular energy sector. The model reviews multi-page text documents to analyze each sentence of the text and categorize it as either a potential risk or no risk. The model is not performing well, even though the Data Scientist has experimented with many different network structures and tuned the corresponding hyperparameters.
Which approach will provide the MAXIMUM performance boost?
- A. Initialize the words by word2vec embeddings pretrained on a large collection of news articles related to the energy sector.
- B. Use gated recurrent units (GRUs) instead of LSTM and run the training process until the validation loss stops decreasing.
- C. Initialize the words by term frequency-inverse document frequency (TF-IDF) vectors pretrained on a large collection of news articles related to the energy sector.
- D. Reduce the learning rate and run the training process until the training loss stops decreasing.
Answer: D
NEW QUESTION # 62
A Machine Learning Specialist at a company sensitive to security is preparing a dataset for model training. The dataset is stored in Amazon S3 and contains Personally Identifiable Information (PII).
The dataset:
* Must be accessible from a VPC only.
* Must not traverse the public internet.
How can these requirements be satisfied?
- A. Create a VPC endpoint and use Network Access Control Lists (NACLs) to allow traffic between only the given VPC endpoint and an Amazon EC2 instance.
- B. Create a VPC endpoint and use security groups to restrict access to the given VPC endpoint and an Amazon EC2 instance
- C. Create a VPC endpoint and apply a bucket access policy that restricts access to the given VPC endpoint and the VPC.
- D. Create a VPC endpoint and apply a bucket access policy that allows access from the given VPC endpoint and an Amazon EC2 instance.
Answer: C
NEW QUESTION # 63
This graph shows the training and validation loss against the epochs for a neural network.
The network being trained is as follows:
* Two dense layers, one output neuron
* 100 neurons in each layer
* 100 epochs
* Random initialization of weights
Which technique can be used to improve model performance in terms of accuracy in the validation set?
- A. Early stopping
- B. Random initialization of weights with appropriate seed
- C. Adding another layer with the 100 neurons
- D. Increasing the number of epochs
Answer: D
NEW QUESTION # 64
You are training an LSTM-based model on Al Platform to summarize text using the following job submission script:
You want to ensure that training time is minimized without significantly compromising the accuracy of your model. What should you do?
- A. Modify the 'epochs' parameter
- B. Modify the batch size' parameter
- C. Modify the 'learning rate' parameter
- D. Modify the 'scale-tier' parameter
Answer: A
NEW QUESTION # 65
You are building a real-time prediction engine that streams files which may contain Personally Identifiable Information (Pll) to Google Cloud. You want to use the Cloud Data Loss Prevention (DLP) API to scan the files. How should you ensure that the Pll is not accessible by unauthorized individuals?
- A. Stream all files to Google CloudT and then write the data to BigQuery Periodically conduct a bulk scan of the table using the DLP API.
- B. Create three buckets of data: Quarantine, Sensitive, and Non-sensitive Write all data to the Quarantine bucket. Periodically conduct a bulk scan of that bucket using the DLP API, and move the data to either the Sensitive or Non-Sensitive bucket
- C. Create two buckets of data Sensitive and Non-sensitive Write all data to the Non-sensitive bucket Periodically conduct a bulk scan of that bucket using the DLP API, and move the sensitive data to the Sensitive bucket
- D. Stream all files to Google Cloud, and write batches of the data to BigQuery While the data is being written to BigQuery conduct a bulk scan of the data using the DLP API.
Answer: A
NEW QUESTION # 66
Your data science team needs to rapidly experiment with various features, model architectures, and hyperparameters. They need to track the accuracy metrics for various experiments and use an API to query the metrics over time. What should they use to track and report their experiments while minimizing manual effort?
- A. Use Kubeflow Pipelines to execute the experiments Export the metrics file, and query the results using the Kubeflow Pipelines API.
- B. Use Al Platform Training to execute the experiments Write the accuracy metrics to Cloud Monitoring, and query the results using the Monitoring API.
- C. Use Al Platform Notebooks to execute the experiments. Collect the results in a shared Google Sheets file, and query the results using the Google Sheets API
- D. Use Al Platform Training to execute the experiments Write the accuracy metrics to BigQuery, and query the results using the BigQueryAPI.
Answer: A
NEW QUESTION # 67
You are training a Resnet model on Al Platform using TPUs to visually categorize types of defects in automobile engines. You capture the training profile using the Cloud TPU profiler plugin and observe that it is highly input-bound. You want to reduce the bottleneck and speed up your model training process. Which modifications should you make to the tf .data dataset?
Choose 2 answers
- A. Use the interleave option for reading data
- B. Increase the buffer size for the shuffle option.
- C. Decrease the batch size argument in your transformation
- D. Reduce the value of the repeat parameter
- E. Set the prefetch option equal to the training batch size
Answer: A,E
NEW QUESTION # 68
A company that promotes healthy sleep patterns by providing cloud-connected devices currently hosts a sleep tracking application on AWS. The application collects device usage information from device users. The company's Data Science team is building a machine learning model to predict if and when a user will stop utilizing the company's devices. Predictions from this model are used by a downstream application that determines the best approach for contacting users.
The Data Science team is building multiple versions of the machine learning model to evaluate each version against the company's business goals. To measure long-term effectiveness, the team wants to run multiple versions of the model in parallel for long periods of time, with the ability to control the portion of inferences served by the models.
Which solution satisfies these requirements with MINIMAL effort?
- A. Build and host multiple models in Amazon SageMaker. Create an Amazon SageMaker endpoint configuration with multiple production variants. Programmatically control the portion of the inferences served by the multiple models by updating the endpoint configuration.
- B. Build and host multiple models in Amazon SageMaker. Create multiple Amazon SageMaker endpoints, one for each model. Programmatically control invoking different models for inference at the application layer.
- C. Build and host multiple models in Amazon SageMaker Neo to take into account different types of medical devices. Programmatically control which model is invoked for inference based on the medical device type.
- D. Build and host multiple models in Amazon SageMaker. Create a single endpoint that accesses multiple models. Use Amazon SageMaker batch transform to control invoking the different models through the single endpoint.
Answer: D
NEW QUESTION # 69
You need to build classification workflows over several structured datasets currently stored in BigQuery. Because you will be performing the classification several times, you want to complete the following steps without writing code: exploratory data analysis, feature selection, model building, training, and hyperparameter tuning and serving. What should you do?
- A. Use Al Platform Notebooks to run the classification model with pandas library
- B. Use Al Platform to run the classification model job configured for hyperparameter tuning
- C. Run a BigQuery ML task to perform logistic regression for the classification
- D. Configure AutoML Tables to perform the classification task
Answer: D
Explanation:
https://cloud.google.com/automl-tables/docs/beginners-guide
NEW QUESTION # 70
A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-based ensemble model using Amazon SageMaker with Area Under the ROC Curve (AUC) as the objective metric. This workflow will eventually be deployed in a pipeline that retrains and tunes hyperparameters each night to model click-through on data that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and ultimately to decrease costs, the Specialist wants to reconfigure the input hyperparameter range(s).
Which visualization will accomplish this?
- A. A histogram showing whether the most important input feature is Gaussian.
- B. A scatter plot showing the correlation between maximum tree depth and the objective metric.
- C. A scatter plot showing the performance of the objective metric over each training iteration.
- D. A scatter plot with points colored by target variable that uses t-Distributed Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input variables in an easier-to-read dimension.
Answer: D
NEW QUESTION # 71
A retail company is using Amazon Personalize to provide personalized product recommendations for its customers during a marketing campaign. The company sees a significant increase in sales of recommended items to existing customers immediately after deploying a new solution version, but these sales decrease a short time after deployment. Only historical data from before the marketing campaign is available for training.
How should a data scientist adjust the solution?
- A. Use the event tracker in Amazon Personalize to include real-time user interactions.
- B. Add user metadata and use the HRNN-Metadata recipe in Amazon Personalize.
- C. Implement a new solution using the built-in factorization machines (FM) algorithm in Amazon SageMaker.
- D. Add event type and event value fields to the interactions dataset in Amazon Personalize.
Answer: D
NEW QUESTION # 72
A trucking company is collecting live image data from its fleet of trucks across the globe. The data is growing rapidly and approximately 100 GB of new data is generated every day. The company wants to explore machine learning uses cases while ensuring the data is only accessible to specific IAM users.
Which storage option provides the most processing flexibility and will allow access control with IAM?
- A. Use an Amazon S3-backed data lake to store the raw images, and set up the permissions using bucket policies.
- B. Configure Amazon EFS with IAM policies to make the data available to Amazon EC2 instances owned by the IAM users.
- C. Use a database, such as Amazon DynamoDB, to store the images, and set the IAM policies to restrict access to only the desired IAM users.
- D. Setup up Amazon EMR with Hadoop Distributed File System (HDFS) to store the files, and restrict access to the EMR instances using IAM policies.
Answer: D
Explanation:
Explanation
NEW QUESTION # 73
A machine learning (ML) specialist wants to secure calls to the Amazon SageMaker Service API. The specialist has configured Amazon VPC with a VPC interface endpoint for the Amazon SageMaker Service API and is attempting to secure traffic from specific sets of instances and IAM users. The VPC is configured with a single public subnet.
Which combination of steps should the ML specialist take to secure the traffic? (Choose two.)
- A. Add a SageMaker Runtime VPC endpoint interface to the VPC.
- B. Modify the security group on the endpoint network interface to restrict access to the instances.
- C. Modify the ACL on the endpoint network interface to restrict access to the instances.
- D. Add a VPC endpoint policy to allow access to the IAM users.
- E. Modify the users' IAM policy to allow access to Amazon SageMaker Service API calls only.
Answer: B,D
Explanation:
Explanation/Reference: https://aws.amazon.com/blogs/machine-learning/private-package-installation-in-amazon- sagemaker-running-in-internet-free-mode/
NEW QUESTION # 74
Your team is working on an NLP research project to predict political affiliation of authors based on articles they have written. You have a large training dataset that is structured like this:
You followed the standard 80%-10%-10% data distribution across the training, testing, and evaluation subsets. How should you distribute the training examples across the train-test-eval subsets while maintaining the 80-10-10 proportion?
A)
B)
C)
D)
- A. Option D
- B. Option C
- C. Option B
- D. Option A
Answer: C
Explanation:
If we just put inside the Training set , Validation set and Test set , randomly Text, Paragraph or sentences the model will have the ability to learn specific qualities about The Author's use of language beyond just his own articles. Therefore the model will mixed up different opinions. Rather if we divided things up a the author level, so that given authors were only on the training data, or only in the test data or only in the validation data. The model will find more difficult to get a high accuracy on the test validation (What is correct and have more sense!). Because it will need to really focus in author by author articles rather than get a single political affiliation based on a bunch of mixed articles from different authors. https://developers.google.com/machine-learning/crash-course/18th-century-literature For example, suppose you are training a model with purchase data from a number of stores. You know, however, that the model will be used primarily to make predictions for stores that are not in the training data. To ensure that the model can generalize to unseen stores, you should segregate your data sets by stores. In other words, your test set should include only stores different from the evaluation set, and the evaluation set should include only stores different from the training set. https://cloud.google.com/automl-tables/docs/prepare#ml-use
NEW QUESTION # 75
You work for a magazine publisher and have been tasked with predicting whether customers will cancel their annual subscription. In your exploratory data analysis, you find that 90% of individuals renew their subscription every year, and only 10% of individuals cancel their subscription. After training a NN Classifier, your model predicts those who cancel their subscription with 99% accuracy and predicts those who renew their subscription with 82% accuracy. How should you interpret these results?
- A. This is not a good result because the model should have a higher accuracy for those who renew their subscription than for those who cancel their subscription.
- B. This is a good result because the accuracy across both groups is greater than 80%.
- C. This is not a good result because the model is performing worse than predicting that people will always renew their subscription.
- D. This is a good result because predicting those who cancel their subscription is more difficult, since there is less data for this group.
Answer: D
NEW QUESTION # 76
Your task is classify if a company logo is present on an image. You found out that 96% of a data does not include a logo. You are dealing with data imbalance problem. Which metric do you use to evaluate to model?
- A. F1 Score
- B. F Score with higher recall weighted than precision
- C. RMSE
- D. F Score with higher precision weighting than recall
Answer: B
NEW QUESTION # 77
You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at a low latency. You discover that your input data does not fit in memory. How should you create a dataset following Google-recommended best practices?
- A. Convert the images to tf .Tensor Objects, and then run Dataset. from_tensor_slices{).
- B. Convert the images to tf .Tensor Objects, and then run tf. data. Dataset. from_tensors ().
- C. Convert the images Into TFRecords, store the images in Cloud Storage, and then use the tf. data API to read the images for training
- D. Create a tf.data.Dataset.prefetch transformation
Answer: C
NEW QUESTION # 78
......
To be eligible for the Google Professional Machine Learning Engineer Certification Exam, you must have a strong background in software engineering, data modeling, and statistics. You must also have hands-on experience working with machine learning frameworks such as TensorFlow or PyTorch, and be familiar with cloud computing platforms such as Google Cloud Platform.
Ultimate Guide to Prepare Free Professional-Machine-Learning-Engineer Exam Questions and Answer: https://drive.google.com/open?id=1mg9mfikW4VjfMl8NmcoyvAjDRpIAZtkD
Pass Professional-Machine-Learning-Engineer Tests Engine pdf - All Free Dumps: https://www.dumpsquestion.com/Professional-Machine-Learning-Engineer-exam-dumps-collection.html