The version of Apache Hive can be retrieved from the command line. You don’t have to navigate through configuration files or browse the user interface. Two commands can be used from the command line to obtain the version of Apache Hive. COMMAND #1 This command follows the popular convention used by other […]
This is Big Data
Did you know that every minute:
50,000 photos are posted on Instagram,
500,000 photos are shared on Snapchat,
1,000,000 swipes are done on Tinder and
4,00,000 videos are watched on YouTube
What is Cloud Computing?
We hear the term “Cloud Computing” a lot in the media, advertisements, news and memes. Cloud Computing has been a trending term throughout the last decade. But what does cloud computing mean? Cloud computing is the offering of computing as a service. Consumers can pay the cloud computing service for on-demand use of […]
Amazon EMR and Google Cloud Dataproc: Top 10 Common Features
Amazon Web Services and Google Cloud Platform are two of the three market leaders in cloud computing. They both offer similar kinds of cloud-native big data platforms to filter, transform, aggregate and process data at scale. Amazon EMR and Google Cloud Dataproc are Amazon Web Services’ and Google Cloud Platform’s managed big data platforms […]
My Path To AWS Certified Big Data Specialty
Amazon Web Services certifications are some of the most reputed in the field of Software Engineering. I successfully completed the AWS Big Data Specialty certification on Nov 25, 2019. This certification tests the candidate on two of the most wanted skills right now – Cloud and Big Data technologies. Prior to taking this certification, I […]
Deep Learning Specialization – Neural Networks and Deep Learning
Deep Learning is one of the most sought-after skills in tech right now. On November 14, 2019, I completed the Neural Networks and Deep Learning course offered by deeplearning.ai on coursera.org. Besides Cloud Computing and Big Data technologies, I have a strong interest in Machine Learning and Deep Learning. I did my Master’s in Computer […]
Amazon QuickSight – Visual Types Demystified
Amazon QuickSight is a managed business analytics service that’s part of the Amazon Web Services suite. Amazon QuickSight offers capabilities to create dashboards with visualizations and perform ad hoc analysis to obtain insights from the data. Amazon QuickSight works with several AWS data sources such as RDS, Aurora and Redshift, and also other data sources […]
Execute Linux Commands from Spark Shell and PySpark Shell
Linux commands can be executed from Spark Shell and PySpark Shell. This comes in handy during development for running Linux commands such as listing the contents of an HDFS directory or a local directory. These methods are provided by the native libraries of the Scala and Python languages. Hence, we can even use these methods within […]
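As a rough illustration of the Python side of this idea, here is a minimal sketch that uses the standard-library `subprocess` module. It may differ from the method the full post describes; the helper name `run_command` and the example path are assumptions made for illustration only.

```python
import subprocess


def run_command(args):
    """Run a command given as a list of arguments.

    Returns a tuple of (exit code, captured standard output as text).
    run_command is a hypothetical helper name, not part of PySpark.
    """
    completed = subprocess.run(args, capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # List a local directory. Inside a PySpark shell the same helper could
    # wrap an HDFS listing such as ["hdfs", "dfs", "-ls", "/some/path"],
    # where the path is a placeholder and the hdfs client must be installed.
    code, output = run_command(["ls", "/"])
    print(code)
    print(output)
```

Passing the command as a list avoids shell interpretation of the arguments, which is usually the safer default when the arguments come from variables.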
Course Review – Machine Learning A-Z: Hands-On Python & R In Data Science
I completed the Machine Learning A-Z: Hands-On Python & R In Data Science course on Udemy on Aug 1, 2019. I would say “Machine Learning A-Z for Programmers” is a more apt title for the course. It’s a beginner-friendly course aimed at programmers that covers a wide range of topics with hands-on programming with Python […]
Lean Six Sigma White Belt
I received my Lean Six Sigma White Belt on July 25, 2019 through my employer, CME Group. White Belt was a great way to get my feet wet with Lean Six Sigma. In this post, I provide a gist of what Lean Six Sigma is and share my experience. Lean Six Sigma Lean Six Sigma […]