Part 1 of the workshop focused on setting up the data lake and the development environment, and then creating a job to process the data. Part 2 focuses on PySpark, covering different methods of data transformation and processing; a brief preview sketch follows the agenda below. The agenda for the Part 2 workshop is:
- Check Schema of the Data
- Query the Data from the Source
- Update the Data
- Aggregation Functions
- Merge & Split Data Sets
- Write / Load Data at the Destination
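As a quick preview of these steps, here is a minimal PySpark sketch touching each agenda item. This is not the workshop's own code: the S3 paths, file format, and column names (`order_id`, `amount`, `region`) are placeholders, and a real Glue job would obtain its session through a `GlueContext` rather than building one directly.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Create a SparkSession (in a Glue job this comes from the GlueContext).
spark = SparkSession.builder.appName("pyspark-preview").getOrCreate()

# Read a sample data set; the path and columns are placeholders.
df = spark.read.csv("s3://my-bucket/sales/", header=True, inferSchema=True)

# Check schema of the data.
df.printSchema()

# Query the data from the source.
df.select("order_id", "amount").filter(F.col("amount") > 100).show(5)

# Update the data by deriving a new column.
df = df.withColumn("amount_with_tax", F.col("amount") * 1.1)

# Aggregation functions.
df.groupBy("region").agg(F.sum("amount").alias("total_amount")).show()

# Merge (union) and split data sets; `other` stands in for a second
# DataFrame with the same schema.
other = df
merged = df.unionByName(other)
high = df.filter(F.col("amount") > 100)
low = df.filter(F.col("amount") <= 100)

# Write / load the result at the destination.
merged.write.mode("overwrite").parquet("s3://my-bucket/output/")
```

Part 2 of the workshop walks through each of these operations in detail inside an AWS Glue job.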
Click on Building AWS Glue Job using PySpark - Part:2(of 2) to continue to Part 2 of the workshop.