Python, Spark and Hadoop for Big Data Certificate for HOREA MUSTEA
Certificate ID:
717343
Authentication Code:
b5d03
Certified Person Name:
HOREA MUSTEA
Trainer Name:
Anna K.
Duration Days:
3
Duration Hours:
21
Course Name:
Python, Spark and Hadoop for Big Data
Course Date:
2023-04-25 10:00 to 2023-04-27 17:00
Course Outline:
Introduction
- Overview of Spark and Hadoop features and architecture
- Understanding big data
- Python programming basics
Getting Started
- Setting up Python, Spark, and Hadoop
- Understanding data structures in Python
- Understanding the PySpark API
- Understanding HDFS and MapReduce
Integrating Spark and Hadoop with Python
- Implementing Spark RDD in Python
- Processing data using MapReduce
- Creating distributed datasets in HDFS
Machine Learning with Spark MLlib
Processing Big Data with Spark Streaming
Working with Recommender Systems
Working with Kafka, Sqoop, and Flume
Apache Mahout with Spark and Hadoop
Troubleshooting
Summary and Next Steps