Building AWS Glue Job using PySpark - Part 1 (of 2)

8: Create Developer Endpoint

A developer endpoint provides a development environment for building Glue jobs with languages and frameworks such as PySpark. In this task, you create a developer endpoint that you will use to write PySpark code.
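For readers who prefer scripting, the same endpoint can in principle be created with boto3 instead of the console. The following is a minimal sketch, not the workshop's own method: the helper names build_endpoint_request and create_endpoint are hypothetical, and the role ARN must be looked up for your own account. It passes no subnet, security group, or SSH key, which is assumed to correspond to skipping the networking and SSH key screens.

```python
def build_endpoint_request(role_arn, name="dojoendpoint"):
    """Return keyword arguments for the Glue create_dev_endpoint call.

    Hypothetical helper. No SubnetId, SecurityGroupIds, or PublicKey is
    set, which is assumed to match skipping networking and the SSH key.
    """
    return {"EndpointName": name, "RoleArn": role_arn}


def create_endpoint(glue_client, role_arn, name="dojoendpoint"):
    """Start endpoint creation and return the status reported by Glue."""
    response = glue_client.create_dev_endpoint(
        **build_endpoint_request(role_arn, name)
    )
    return response.get("Status")


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3
#   role_arn = boto3.client("iam").get_role(RoleName="dojogluerole")["Role"]["Arn"]
#   print(create_endpoint(boto3.client("glue"), role_arn))
```

The client is passed in as a parameter so the helper can be exercised with a stub before it is pointed at a real account.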

  1. Go to the AWS Glue console, click on the Dev endpoints option in the left menu, and then click on the Add endpoint button.

  2. On the next screen, type in dojoendpoint as the name. Select dojogluerole as the IAM role. Then click on the Next button.

  3. On the next screen, select Skip networking information as the option and click on the Next button.

  4. On the next Add an SSH public key (Optional) screen, click on the Next button.

  5. On the next Review screen, click on the Finish button. The endpoint creation will start.

  6. The developer endpoint takes about 8-10 minutes to become ready. Wait until the status changes to READY.

  7. Once the developer endpoint is ready, select it and click on Create SageMaker notebook under the Action dropdown menu.

  8. On the next screen, enter dojonotebook as the notebook name, select Create an IAM role as the option, enter dojosagemakerrole as the IAM role name, and then click on the Create notebook button.

  9. The notebook creation will start.

  10. Wait until the notebook status changes to Ready.

  11. The development environment is ready. Next, you write PySpark code in the notebook, which you will later use to create a Glue job.
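The wait in step 6 can also be scripted rather than watched in the console. The sketch below polls the Glue get_dev_endpoint call until the endpoint reports READY. The function name wait_for_endpoint, the 30-second polling interval, and the 20-minute timeout are illustrative assumptions, and treating FAILED as a terminal status is also an assumption rather than something stated in this workshop.

```python
import time


def wait_for_endpoint(glue_client, name="dojoendpoint",
                      poll_seconds=30, timeout_seconds=1200,
                      sleep=time.sleep):
    """Poll the developer endpoint until its status is READY.

    Hypothetical helper. Raises RuntimeError if the endpoint reports
    FAILED (assumed terminal) and TimeoutError if the time budget runs out.
    """
    waited = 0
    while True:
        endpoint = glue_client.get_dev_endpoint(EndpointName=name)["DevEndpoint"]
        status = endpoint["Status"]
        if status == "READY":
            return status
        if status == "FAILED":
            raise RuntimeError(f"Endpoint {name} failed to provision")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint {name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3
#   wait_for_endpoint(boto3.client("glue"))
```

The sleep function is injectable so the loop can be checked quickly with a stub client before running it against a real endpoint.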