Practice Test 1 | Google Cloud Certified Professional Data Engineer | Dumps | Mock Test

A FinTech company has over 20 TB of data in ORC format stored on on-premises disks. The CTO decides to migrate the current infrastructure to Google Cloud. The current data pipeline uses Apache Hive and Spark to cleanse and transform raw data for reporting, further analysis, and prediction.

Which of the following Google Cloud products should you use?

A. Dataproc for processing and Cloud Storage for storing data.
B. Dataproc for processing, BigQuery for storage, Dataflow for data pipeline.
C. App Engine for processing, Google Storage for storing data, Dataflow for data pipeline.
D. Dataproc for processing, Dataproc local HDFS for storage, Dataflow for data pipeline.

Answer: A

When you want to move Hadoop and Spark workloads from an on-premises environment to Google Cloud Platform (GCP), it's recommended to use Dataproc to run Apache Spark and Hadoop clusters.

Cloud Storage is a good option if:

  • Your data in ORC, Parquet, Avro, or any other format will be used by different clusters or jobs, and you need data persistence if the cluster terminates.
  • You need high throughput and your data is stored in files larger than 128 MB.
  • You need cross-zone durability for your data.
  • You need data to be highly available—for example, you want to eliminate HDFS NameNode as a single point of failure.
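As a rough illustration of what moving the data from HDFS to Cloud Storage means for existing Hive and Spark jobs, the sketch below rewrites an hdfs:// path into a gs:// path. The helper name to_gcs_uri, the bucket name example-bucket, and the sample path are hypothetical and not taken from the question; the sketch assumes the job reads data through the Cloud Storage connector available on Dataproc, so only the URI prefix needs to change.

```python
from urllib.parse import urlparse


def to_gcs_uri(hdfs_uri: str, bucket: str) -> str:
    """Map an hdfs:// URI to a gs:// URI under `bucket` (hypothetical helper)."""
    parsed = urlparse(hdfs_uri)
    if parsed.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got: {hdfs_uri}")
    # Drop the NameNode host:port (the single point of failure mentioned
    # above) and keep only the file path inside the bucket.
    path = parsed.path.lstrip("/")
    return f"gs://{bucket}/{path}"


# Assumed example values: the bucket and path are placeholders.
old_path = "hdfs://namenode:8020/warehouse/raw/events.orc"
new_path = to_gcs_uri(old_path, "example-bucket")
print(new_path)
```

Because the data then lives in the bucket rather than on cluster disks, it persists after the cluster is deleted and can be read by other clusters or jobs.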

Source(s):

Migrating Apache Spark Jobs to Cloud Dataproc
