Using AWS Glue ETL Job with Streaming Data

3: Create Kinesis Data Stream

In this task, you create an Amazon Kinesis data stream that ingests data from a simulated IoT device. In a later task, an AWS Glue ETL job reads the data from this stream, transforms it, and persists the result to an Amazon S3 bucket.

  1. Log in to the AWS Console and choose an AWS region. Make sure you create all AWS resources for the workshop in this same region. The workshop uses Europe (Paris), eu-west-3.

  2. Go to the Kinesis Management Console. Select Kinesis Data Streams as the option and click the Create data stream button.

  3. On the next screen, enter dojostream as the stream name and set the number of open shards to 1. Then click the Create data stream button.

  4. Creating the stream takes a couple of minutes. Wait until the status changes to Active.

  5. The Kinesis data stream is now ready. The next task is to create the S3 bucket where the transformed data is stored.
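
If you prefer scripting to the console, steps 2 to 4 can be approximated with the AWS SDK for Python (boto3). The sketch below is not part of the workshop itself. It assumes boto3 is installed, that AWS credentials are already configured, and that the stream lives in eu-west-3 (Paris). The helper name create_stream_and_wait is hypothetical; the Kinesis client calls it uses (create_stream, the stream_exists waiter, and describe_stream_summary) are standard boto3 operations.

```python
def create_stream_and_wait(kinesis, stream_name="dojostream", shard_count=1):
    """Create a Kinesis data stream and block until it is usable.

    `kinesis` is assumed to be a boto3 Kinesis client, for example:
        import boto3
        kinesis = boto3.client("kinesis", region_name="eu-west-3")
    """
    # Step 3 of the console walkthrough: stream name and number of shards.
    kinesis.create_stream(StreamName=stream_name, ShardCount=shard_count)

    # Step 4: the "stream_exists" waiter polls DescribeStream until the
    # stream status becomes ACTIVE (or the waiter times out with an error).
    kinesis.get_waiter("stream_exists").wait(StreamName=stream_name)

    # Read back the status so the caller can confirm the result.
    summary = kinesis.describe_stream_summary(StreamName=stream_name)
    return summary["StreamDescriptionSummary"]["StreamStatus"]
```

Calling create_stream_and_wait(kinesis) with a real client should return the stream status once the stream becomes active. Passing the client in as an argument, rather than creating it inside the function, keeps the region choice in one place and makes the helper easy to exercise with a stub.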