AWS AI Services Programming Series - Part6 (Forecast)

4. Create Dataset Group

In this step, you create a dataset group in Amazon Forecast from the training data. Amazon Forecast imports the training data from the S3 bucket and uses it to train the model.

  1. Go to the Amazon Forecast console and click on the Create dataset group button.

  2. On the next screen, type in dojo_dataset_group as the dataset group name. Select the Custom option for the forecasting domain. Click on the Next button.

  3. On the next screen, type in dojo_dataset as the dataset name. Keep the frequency of the data at 1 day, because the house prices are recorded on a daily basis. Update the data schema with the JSON provided below.

{
  "Attributes": [
    {
      "AttributeName": "timestamp",
      "AttributeType": "timestamp"
    },
    {
      "AttributeName": "item_id",
      "AttributeType": "string"
    },
    {
      "AttributeName": "target_value",
      "AttributeType": "integer"
    }
  ]
}
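
The console choices so far (dataset group name, Custom domain, daily frequency, and the schema) can also be expressed as API requests. The following is a minimal sketch, assuming the client passed in is created with boto3.client("forecast"); the helper names are hypothetical and only illustrate how the values from this step map onto the CreateDataset and CreateDatasetGroup calls. Note that the API expects the dataset to exist before it is attached to a group, so the order differs from the console.

```python
import json

# Schema from this step, written as a Python dict.
SCHEMA = {
    "Attributes": [
        {"AttributeName": "timestamp", "AttributeType": "timestamp"},
        {"AttributeName": "item_id", "AttributeType": "string"},
        {"AttributeName": "target_value", "AttributeType": "integer"},
    ]
}


def build_dataset_request(name="dojo_dataset"):
    """Keyword arguments for the Forecast CreateDataset call (hypothetical helper)."""
    return {
        "DatasetName": name,
        "Domain": "CUSTOM",                   # the Custom forecasting domain
        "DatasetType": "TARGET_TIME_SERIES",  # the dataset that holds target_value
        "DataFrequency": "D",                 # one record per day
        "Schema": SCHEMA,
    }


def build_dataset_group_request(dataset_arn, name="dojo_dataset_group"):
    """Keyword arguments for the Forecast CreateDatasetGroup call (hypothetical helper)."""
    return {
        "DatasetGroupName": name,
        "Domain": "CUSTOM",
        "DatasetArns": [dataset_arn],
    }


def create_dataset_and_group(client):
    """Create the dataset, then a dataset group that contains it.

    `client` is assumed to be boto3.client("forecast").
    """
    dataset_arn = client.create_dataset(**build_dataset_request())["DatasetArn"]
    group_arn = client.create_dataset_group(
        **build_dataset_group_request(dataset_arn)
    )["DatasetGroupArn"]
    return dataset_arn, group_arn


if __name__ == "__main__":
    # Print the request without calling AWS, to inspect what would be sent.
    print(json.dumps(build_dataset_request(), indent=2))
```

Running the file on its own only prints the request, so it can be inspected without AWS credentials. To actually create the resources, pass a real Forecast client to create_dataset_and_group.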
  1. The data schema matches the format of the sample data. Recall the fields of the sample data discussed earlier:

    1st field - datetime - the date of the house sale. In Amazon Forecast, this field is called timestamp.

    2nd field - string - the zip code where the house was sold. In Amazon Forecast, this field is called item_id.

    3rd field - integer - the price at which the house was sold. In Amazon Forecast, this field is called target_value.

  2. With this configuration in place, click on the Next button.

  3. On the next screen, type in dojo_dataset_import as the import name. Set the timestamp format to yyyy-MM-dd, because the house prices are recorded daily and each timestamp represents a day. For the IAM Role field, choose Enter a custom IAM role ARN, then enter the role ARN you noted in one of the previous steps in the Custom IAM role ARN field. In the data location field, enter the S3 location of the sample data; if you created the bucket with a different name, use that name. Finally, click on the Start Import button.

  4. The data import starts. You can follow its progress in the Forecast dashboard. The import takes a while; wait until the status changes to Active.

  5. Once the data import status changes to Active, follow the next step to train the predictor model using the sample training data.
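
The import and the wait for the Active status can be scripted as well. The following is a rough sketch, assuming the client is created with boto3.client("forecast"); the helper names, the polling interval, and the ARN and S3 path arguments are placeholders to be filled in with your own values. It maps the import name, timestamp format, role ARN, and data location onto the CreateDatasetImportJob call, then polls DescribeDatasetImportJob until the job reports ACTIVE.

```python
import time


def build_import_request(dataset_arn, role_arn, s3_path,
                         name="dojo_dataset_import"):
    """Keyword arguments for CreateDatasetImportJob (hypothetical helper)."""
    return {
        "DatasetImportJobName": name,
        "DatasetArn": dataset_arn,
        # S3 location of the sample data and the role Forecast assumes to read it.
        "DataSource": {"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
        "TimestampFormat": "yyyy-MM-dd",  # daily data, so no time component
    }


def start_import(client, dataset_arn, role_arn, s3_path):
    """Start the import job and return its ARN."""
    response = client.create_dataset_import_job(
        **build_import_request(dataset_arn, role_arn, s3_path)
    )
    return response["DatasetImportJobArn"]


def wait_until_active(client, import_job_arn, poll_seconds=30, max_polls=120,
                      sleep=time.sleep):
    """Poll the import job until it is ACTIVE; raise if it fails or times out."""
    for _ in range(max_polls):
        status = client.describe_dataset_import_job(
            DatasetImportJobArn=import_job_arn
        )["Status"]
        if status == "ACTIVE":
            return status
        if status.endswith("FAILED"):
            raise RuntimeError(f"Import job ended with status {status}")
        sleep(poll_seconds)
    raise TimeoutError("Import job did not become ACTIVE in time")
```

A typical call sequence would be start_import(client, dataset_arn, role_arn, s3_path) followed by wait_until_active(client, job_arn), where dataset_arn, role_arn, and s3_path come from your own account and bucket.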