
Practice Test 4 | Google Cloud Certified Professional Data Engineer | Dumps | Mock Test


The data engineers at an e-commerce company want to build and train an ML model to weigh customer feedback on a product, but management does not want to disclose who submitted the feedback. Some of the information, such as delivery address and purchase history, is critical for training the ML model. Once the data is available, your data exploration team needs to query it, so the sensitive data fields must be protected. The data is an unstructured, text-based dataset. Identify the best solution that can be deployed quickly:

A. Analyze the data using Cloud Dataprep, identify the sensitive data, and remove it from the dataset. Create a recipe for these steps. Once the recipe is done, a Cloud Dataflow job is triggered that stores the data in BigQuery for further analysis.
B. Use the Google Cloud Data Loss Prevention (DLP) API to identify sensitive information and mask it before storing the data for analysis.
C. Analyze the data using Cloud Dataprep, identify the sensitive data, and mask it in the dataset. Create a recipe for these steps. Once the recipe is done, a Cloud Dataflow job is triggered that stores the data in BigQuery for further analysis.
D. Create a machine learning model that identifies sensitive information based on past training. This model analyzes the data and masks the sensitive information before the data is supplied to the data exploration team.

Correct answer is B.

Option A is incorrect. Building a solution with Cloud Dataprep requires manual intervention and analysis. Identifying sensitive data hidden in unstructured text is tricky and error-prone. In addition, this option removes the sensitive data, which is not appropriate here: some of those fields, such as delivery address and purchase history, are needed for training. The solution should encrypt or mask sensitive data rather than remove it from the dataset entirely.

Option B is correct. Cloud DLP identifies sensitive data using more than 90 predefined detectors (infoTypes) that match on patterns, formats, and checksums. With Cloud DLP, sensitive data is detected automatically, and the service can then mask it according to the configuration the user supplies.

This solution can be deployed quickly and is likely to be the most accurate of the four options.
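
To make Option B concrete, the following is a minimal sketch of masking identity-related values in a feedback string with the Cloud DLP `deidentify_content` call from the `google-cloud-dlp` Python client. The project ID, the sample text, the helper names `build_deidentify_request` and `mask_feedback`, and the choice of infoTypes are all assumptions made for illustration and do not come from the question. Only identity fields are listed, on the assumption that delivery address and purchase history should stay readable for training.

```python
from typing import Any, Dict, List, Optional

# Assumed set of identity-related infoTypes to mask. Address-type detectors
# are deliberately left out, since the scenario says delivery address is
# needed for model training.
DEFAULT_INFO_TYPES = ["PERSON_NAME", "EMAIL_ADDRESS", "PHONE_NUMBER"]


def build_deidentify_request(
    project_id: str,
    text: str,
    info_types: Optional[List[str]] = None,
    masking_character: str = "#",
) -> Dict[str, Any]:
    """Build the request dict for a DLP deidentify_content call (hypothetical helper)."""
    names = info_types or DEFAULT_INFO_TYPES
    return {
        "parent": f"projects/{project_id}/locations/global",
        # Which detectors to run on the text.
        "inspect_config": {"info_types": [{"name": n} for n in names]},
        # What to do with each finding: replace its characters with a mask.
        # With no number_to_mask set, every character of a match is masked.
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "primitive_transformation": {
                            "character_mask_config": {
                                "masking_character": masking_character
                            }
                        }
                    }
                ]
            }
        },
        "item": {"value": text},
    }


def mask_feedback(project_id: str, text: str) -> str:
    """Send the text to Cloud DLP and return the masked version.

    Requires the google-cloud-dlp package and valid Google Cloud credentials.
    """
    from google.cloud import dlp_v2  # imported lazily so the builder works offline

    client = dlp_v2.DlpServiceClient()
    response = client.deidentify_content(
        request=build_deidentify_request(project_id, text)
    )
    return response.item.value
```

Calling `mask_feedback("my-project", "Jane Doe (jane@example.com) said the blender broke.")` would return the same sentence with the detected name and email address replaced by `#` characters, which could then be written to the analysis store. The exact findings depend on the DLP detectors, so the masked output is not shown here.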

Option C is incorrect. Building a solution with Cloud Dataprep requires manual intervention and analysis. Identifying sensitive data hidden in unstructured text is tricky and error-prone.

Option D is incorrect. Creating a custom machine learning model takes a significant amount of time, and training it requires a large amount of input data. Detecting sensitive data is tricky and would need extensive training of the model, so it cannot be deployed quickly. Cloud DLP is the more effective solution.
