Cloudera CCA Spark and Hadoop Developer (CCA175) Certification

Cloudera’s CCA Spark and Hadoop Developer (CCA175) exam validates a candidate’s ability to use Big Data tools such as Hadoop, Spark, Hive, Impala, Sqoop, Flume, and Kafka to solve hands-on problems. I passed the CCA175 certification exam on May 13, 2019, and wanted to share my experience. This article covers the exam format, the topics, and what to remember before, during, and after the exam.

Cloudera CCA175

The CCA175 exam gives you 2 hours to solve 8-12 hands-on tasks on a Cloudera Enterprise cluster. Each task has to be solved using Big Data tools such as Hadoop, Spark, Hive, Sqoop, Flume, and Kafka. The passing score is 70% and the exam costs USD 295. There are no prerequisites for this exam. The exam can be taken remotely. All you need is a computer with a webcam and a good internet connection.
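To make the 70% passing score concrete, here is a minimal Python sketch of how many tasks you would need to get right for each possible exam length. It assumes every task carries equal weight, which this article does not confirm, and `tasks_needed` is a name made up for illustration.

```python
# Assumption (not confirmed by the article): every task carries equal weight.

PASS_PERCENT = 70  # passing score quoted in the article


def tasks_needed(total_tasks: int, pass_percent: int = PASS_PERCENT) -> int:
    """Smallest number of fully correct tasks that reaches pass_percent."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (pass_percent * total_tasks + 99) // 100


for total in range(8, 13):  # the article mentions 8-12 tasks
    needed = tasks_needed(total)
    print(f"{total} tasks: solve {needed}, can miss {total - needed}")
```

Under that assumption, an 8-task exam leaves room for two missed tasks and a 12-task exam for three.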

Official certification page – https://www.cloudera.com/about/training/certification/cca-spark.html
Register for CCA175 – https://university.cloudera.com/content/cca175

Why Cloudera CCA175?

CCA175 requires good knowledge of and hands-on experience with technologies such as Hadoop, HDFS, Spark, Scala, PySpark, Sqoop, Hive, Flume, Kafka, and Avro. I enjoy setting goals and working towards them, and I wanted to force myself to properly learn and practice these technologies. Normally, I tend to look through the user guides and documentation only when I face issues during development. Preparing for a certification exam forces me to learn the topics formally and read through the documentation within a tight deadline. Unlike other Spark certification exams, CCA175 tests not just Spark but also other Big Data technologies. Furthermore, certifications help showcase that you possess the required knowledge in the domain.

Exam Topics

One should be familiar with the following technologies to pass the exam.

  • Apache Hadoop – Hadoop, HDFS, YARN
  • Apache Spark – Spark RDD, Spark Datasets, Spark SQL, Spark Streaming using both Scala and Python
  • Apache Sqoop – import, import-all-tables, export, job, eval, list-tables, list-databases, create-hive-table, merge, codegen
  • Apache Hive – DDL, DML, Partitioning, Windowing and Analytical functions
  • Cloudera Impala
  • Apache Avro
  • Apache Flume
  • Apache Kafka

Preparation Plan

I believe the main objective should be to learn the topics thoroughly instead of learning the bare minimum to pass the certification. I strongly recommend practicing all the topics from the below URLs. This plan not only helps you complete the certification but also makes you proficient in these technologies.

Things to remember before taking the CCA175 Exam

  • Have a computer with a webcam and a good internet connection.
  • Make sure you have Google Chrome installed along with ExamsLocal’s add-on. Verify that your computer is compatible with the exam by using the self-check – https://www.examslocal.com/ScheduleExam/Home/CompatibilityCheck.
  • Keep an identification card such as a driver’s license or passport at hand to verify your identity to the proctor.
  • Keep the desk and room clear of any electronics and papers. The proctor will ask you to show the desk and room with your webcam to verify this.
  • If you’re planning to take the exam on a laptop, connect it to an external monitor as the laptop screen may be too small to view the remote desktop.
  • Ensure no one else is in the room before starting the exam. Keep the doors locked if possible to prevent any disturbances.
  • If you’re taking the exam from your workplace or a library, make sure the firewall is configured to allow connections to ExamsLocal.
  • Drink water and eat food before the exam as you’re not allowed any drinks or food during the exam.
  • Use the restroom just before the exam starts as you’re not allowed any breaks during the exam.

Things to remember during the CCA175 Exam

  • Be patient and remain calm. There’s no need to panic.
  • Read all the questions before starting on the solutions. Start with the easy ones.
  • Verify each solution right after you finish it. Check the output location and the output format. You may not have time at the end of the exam to verify again.
  • Be cognizant of the time. Skip the problem and come back later if you’re stuck.
  • Keep in mind that you don’t need to score 100% as the passing score is 70%. It’s okay to miss a problem. Don’t let one hard problem impact your ability to solve other problems.
  • Don’t sit idle while a program runs to generate its output. Let it run in the background and start working on the next problem.
  • Always look towards the monitor and do not chew, talk, or cover your mouth during the exam. The proctor may disconnect you from the exam if they become suspicious of your activities.
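As one way to act on the "check the format of the output" tip above, the following Python sketch flags lines whose field count does not match what a task asks for. It is only an illustration: `find_bad_lines` and the sample data are hypothetical, the check does not handle quoted fields, and it assumes you have already pulled a few lines of your result into a list (for example by piping `hdfs dfs -cat` on your output path through `head`).

```python
from typing import Iterable, List, Tuple


def find_bad_lines(
    lines: Iterable[str], delimiter: str, expected_fields: int
) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs whose field count differs from expected_fields."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line:
            continue  # ignore blank lines such as a trailing newline
        if len(line.split(delimiter)) != expected_fields:
            bad.append((number, line))
    return bad


# Hypothetical sample: a tab-delimited result where 3 columns are expected.
sample = ["1\tAlice\t100.0\n", "2\tBob\t250.5\n", "3,Carol,75.0\n"]
print(find_bad_lines(sample, "\t", 3))
```

Here the third sample line uses commas instead of tabs, so it is the only one reported. An empty result suggests the delimiter and column count are what the task asked for.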

Things to remember after taking the CCA175 Exam

  • Make a note of the things that you found challenging during the exam. You can come back to this list later and close your gaps.
  • Relax and be patient. You will receive the exam results within 24 hours. I received mine within 2 hours after the completion of the exam.
  • If you pass the exam, you’ll receive your digital certificate and license within 48 hours. I received mine after 40 hours.
  • If you didn’t pass the exam, remind yourself that this is not an easy exam and it’s okay if you didn’t make it. Practice the topics that you found challenging and come back stronger. DO NOT GIVE UP!

Final Words

CCA175 is not an easy exam. Preparation requires at least a couple of months if your intention is to learn the topics thoroughly during the process. Keep reading and practice every scenario. If you follow the above plan, you’ll not only complete your certification but will also become proficient in these topics.

Please feel free to post your questions/thoughts below and share your success stories. All the best!

22 Comments

    • Rahul, all the libraries were already included. I didn’t have to import them explicitly.

      • Hi Ashwin,

        When you said the Avro libraries were included, does that mean you don’t have to mention “--packages org.apache.spark:spark-avro_xxxxx” when you start spark-shell?
        Can we read Avro files without specifying that package option?

        • Sumit, that’s correct. The Avro package would already be installed in Spark’s lib directory, so we don’t have to specify it with the spark-shell command.

    • Hey Radhika. No, I didn’t get any questions on Flume. But I would recommend being familiar with the basics.

  • Ashwin – Great blog, with nice tips and suggestions for the preparation. I’m not able to find the Cloudera sandbox VM download link. It looks like Cloudera discontinued it in Feb 2020; correct me if I’m wrong.

    I spent a lot of time trying to download and install the Cloudera QuickStart VM, but no luck.
    Link I’ve tried: https://www.cloudera.com/downloads/quickstart_vms/5-13.html

    I’d appreciate it if you could point me to a sandbox with all the required technologies installed.

  • Hi Ashwin,
    I read your blogs on AWS Big Data Specialty and Cloudera CCA175. Both are very precise and would be helpful in exam preparation. However, I wanted to know which of the two would help more in getting a better job as a Data Engineer. I understand skills from both certifications are required to become a data engineer. I intend to prepare for both areas but wish to take only one certification exam. In this case, which one would you recommend?

    I am currently working as a Data Analyst and have 6-7 months of AWS experience.

    • Hi Sukh,

      Thank you for your feedback!

      I would recommend pursuing the AWS certification and doing some side projects using Cloudera/Big Data technologies.

      In my opinion, AWS Certification is valued more than Cloudera Certification. AWS certification has 3 years of validity while Cloudera’s has just 2 years of validity. Most importantly, with big data technologies, side projects demonstrate your skillset and knowledge more than certification.

      Best wishes,

      Ashwin

  • Hi Ashwin

    Regarding Spark, do you still need to learn RDDs for this certification or will Datasets suffice?

    Thanks

    • Hi Pavi. The exam doesn’t require any specific API. You can solve the problems using RDDs, Datasets, or Spark SQL based on your preference.

  • Hi Ashwin,
    Do we need to know Scala/Java for the exam, or is it fine if we are good at Python?

    • Hi Neelima. The exam doesn’t require us to use any particular language. We can use either Scala or Python. I would recommend being familiar with both because they are very similar and it’s easy to pick up the other if you already know one.

  • Hi Ashwin
    Thanks for the post and congrats on getting certified. I am preparing for the Cloudera Spark exam and wondering if I explicitly need to learn Scala for it, as I am fairly comfortable writing Python code.
    This is specifically in regard to Datasets, which aren’t available in Python.
    Secondly, I read the exam requirements on the Cloudera website, and they don’t mention anything about Flume, Sqoop, or Kafka. They mention only 3 broader skill sets to be tested, i.e. data analysis, configuration, and ETL. I was wondering if I need to prepare for Flume, Sqoop, and Kafka too, or if I can simply skip them.
    Thanks

    • Hi Khurram. Thank you. You can use either Python or Scala. The exam just cares about the output. Sqoop, Flume, and Kafka are not included in the new syllabus and can be skipped. Good luck!

  • Hello Ashwin,
    Do you need both Python and Scala, or is it up to the exam taker to decide which programming language to use? Can you, for example, solve one problem with Scala and another with Python?

    Thanks.

    • Hi Nikola. The exam doesn’t require us to use any particular language. We can use either Scala or Python. I would recommend being familiar with both because they are very similar and it’s easy to pick up the other if you already know one.

  • Hi,

    I just want to get a feel for what the real CCA175 exam environment looks like. Is that possible?

    Right now, I am practicing on CloudxLab, but I have heard that the actual environment looks like the Cloudera VM. Do I need to install the Cloudera VM? Any suggestions?

