Building AWS Glue Job using PySpark - Part:1(of 2)

12: What will you learn in Part-2?

Part-1 of this workshop focused on setting up the data lake and the development environment, and then on creating a job to process the data. Part-2 will focus on PySpark, covering different methods of data transformation and processing. The following is the agenda for the Part-2 workshop; a short PySpark sketch after the list previews these operations:

  1. Check Schema of the Data
  2. Query the Data from the Source
  3. Update the Data
  4. Aggregation Functions
  5. Merge & Split Data Sets
  6. Write / Load Data at the Destination
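
As a preview of these six topics, here is a minimal PySpark sketch that touches each one. It is illustrative only: the S3 paths, column names, and values (`sales.csv`, `amount`, `region`, and so on) are hypothetical placeholders and do not come from the workshop data set.

```python
# Preview sketch of the Part-2 topics. All paths, columns, and values
# below are hypothetical placeholders, not the workshop's actual data.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("part2-preview").getOrCreate()

# 1. Check the schema of the data
df = spark.read.csv("s3://example-bucket/raw/sales.csv",
                    header=True, inferSchema=True)
df.printSchema()

# 2. Query the data from the source
df.select("order_id", "amount").filter(F.col("amount") > 100).show(5)

# 3. Update the data (derive a new column from an existing one)
df = df.withColumn("amount_with_tax", F.col("amount") * 1.1)

# 4. Aggregation functions
df.groupBy("region").agg(
    F.sum("amount").alias("total_amount"),
    F.avg("amount").alias("avg_amount"),
).show()

# 5. Merge & split data sets
other = spark.read.csv("s3://example-bucket/raw/sales_2023.csv",
                       header=True, inferSchema=True)
merged = df.unionByName(other, allowMissingColumns=True)
high = merged.filter(F.col("amount") > 100)
low = merged.filter(F.col("amount") <= 100)

# 6. Write / load the data at the destination
high.write.mode("overwrite").parquet("s3://example-bucket/processed/high_value/")
```

Part-2 walks through each of these operations in detail; the sketch above only shows the general shape of the API calls involved.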

Click on Building AWS Glue Job using PySpark - Part:2(of 2) to continue to Part-2 of the workshop.