Practice Test 2 | Google Cloud Certified Professional Data Engineer | Dumps | Mock Test

A company decides to migrate its on-premises data infrastructure to the cloud, mainly for the high availability of cloud services and to lower the high cost of storing data on-premises. The infrastructure stores data in HDFS and processes and transforms it using Apache Hive and Spark. The DevOps team still wants to administer the infrastructure in the cloud. As a data architect, which of the following approaches does Google recommend?

A. Use Dataproc to process the data. Store the data in Cloud Storage.
B. Build a Dataflow pipeline. Store the data in Cloud Storage. Use Compute Engine to launch instances and install the required dependencies for processing the data.
C. Use Dataproc to process the data. Store the data in Dataproc’s HDFS.
D. Build a Dataflow pipeline. Store the data in HDFS on persistent disks. Run the code in the Spark framework provided by Dataflow.

Answer: A.

Description: Dataproc is Google's fully managed, cloud-native service for running Apache Hadoop and Apache Spark clusters. Dataflow is a simplified stream and batch data processing service. With Apache Beam, it provides a rich set of windowing and session analysis primitives as well as an ecosystem of source and sink connectors.

Answer B is incorrect: Dataflow is serverless, which may not suit the DevOps team's requirement to administer the infrastructure themselves, and it is unnecessary to launch Compute Engine instances just to install dependencies. Answer C is incorrect: Dataproc’s HDFS is volatile, meaning its data is removed when the cluster is deleted. A Dataproc cluster can be kept running indefinitely, but this may lead to high costs, which defeats the purpose of the migration.

Answer D is incorrect: in addition to the points discussed for answer B, data stored on persistent disks is accessible only from Compute Engine instances, and persistent disks are more expensive than Cloud Storage.

Answer A fulfills the requirements for migrating the on-premises infrastructure to the cloud: high availability, minimum cost, and full control by the DevOps team.
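To illustrate what answer A implies for existing Hive and Spark jobs: Dataproc clusters include the Cloud Storage connector, so jobs can read and write gs:// paths where they previously used hdfs:// paths. The sketch below is only a hypothetical helper for rewriting paths during such a migration. The function name hdfs_to_gcs, the bucket name example-datalake, and the sample namenode address are all assumptions made for illustration and do not come from the question.

```python
from urllib.parse import urlparse


def hdfs_to_gcs(uri: str, bucket: str) -> str:
    """Map an hdfs:// URI (or a bare absolute path) to a gs:// URI.

    Hypothetical helper: the bucket name is supplied by the caller and
    the directory layout under the bucket mirrors the HDFS layout.
    """
    parsed = urlparse(uri)
    # Accept only HDFS URIs or scheme-less paths; reject anything else.
    if parsed.scheme not in ("hdfs", ""):
        raise ValueError(f"not an HDFS path: {uri!r}")
    # Drop the namenode host:port and keep only the path component.
    path = parsed.path.lstrip("/")
    return f"gs://{bucket}/{path}"


if __name__ == "__main__":
    # Assumed example values, not taken from the question.
    print(hdfs_to_gcs("hdfs://namenode:8020/warehouse/sales/2023", "example-datalake"))
```

Because the data then lives in Cloud Storage rather than on the cluster's disks, the cluster can be deleted when idle without losing data, which is the cost argument made against answer C.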

Source(s):

Cloud Dataproc: https://cloud.google.com/dataproc/

Cloud Dataflow & Dataflow vs. Dataproc:
