Published in Google Cloud - Community: Parallel & Serverless CSV Ingestion to CloudSQL Using Cloud Dataflow. This blog post explores how to solve this problem efficiently using a Dataflow pipeline powered by Apache Beam. (Aug 31, 2024)
Published in Dev Genius: Spark Configuration Calculator. One aspect that has always posed a challenge for me is memory management. To delve deeper into the nuances of Spark and foster a practical… (Dec 18, 2023)
Published in Google Cloud - Community: Using Spark on Dataproc & Apache Iceberg To Build an Open Lakehouse. In this article, our primary focus is using Spark on Dataproc in GCP for reading and writing from a Lakehouse. (Dec 6, 2023)
Published in Google Cloud - Community: A guide to RAID multiple Local SSDs & mount it to Dataproc. (Nov 28, 2023)
Published in Google Cloud - Community: Understanding Driver Pools in Dataproc. Let's learn about driver pools in Dataproc, a mechanism to scale application concurrency in Dataproc clusters. (Aug 16, 2023)
Published in Google Cloud - Community: A Custom AMA App Using VertexAI. In this blog, we'll walk through the process of creating a custom search engine using VertexAI, Streamlit and Langchain. (Jul 26, 2023)
Hadoop — Understanding Splits, Blocks & Everything In Between. We will try to answer a simple question through the length of this article: How are mapper and reducer task counts determined? (Apr 21, 2023)
Published in Better Programming: Understanding CPU Oversubscription in Dataproc/Hadoop. When does it make sense to oversubscribe the CPU cores? (Apr 7, 2023)
Published in Google Cloud - Community: GCP Cloud Logging : How to Enable Data Access Audit For Selected Buckets. In this post, we will look at how to enable data access audit for selected GCS buckets while excluding other buckets within the same… (Oct 4, 2022)
Published in Google Cloud - Community: Dataproc — Why is my cluster not scaling? In this post, we will address 3 questions with one common answer that customers ask while using autoscaling. (Aug 16, 2022)