AWS Dojo - Workshop - Custom Text Classification using Amazon Comprehend

You use the sample data loaded in the S3 bucket to train a model for text classification. The model can predict whether a news title text is Real or Fake.

Goto the Amazon Comprehend console, click on the Custom classification menu in the left and then click on the Train classifier button.
On the next screen, type in dojotextclassifier for the name. Select English for the language. Select Using Multi-class mode option for the Classifier mode.
On the same screen, select CSV file for the Training data format. Select s3://dojo-text-records/news_test.csv for the S3 location. If you created bucket with a different name, then use that bucket.
On the same screen, select Create an IAM role option. Type in dojoclassifierrole for the name suffix. Finally, click on the Train classifier button.
It will start training the model. Wait till the status changes to Trained. It might take upto an hour to train the model.
Once the model is trained, on the classifier details screen, click on the Create endpoint button.
On the next screen, type in dojotextclassifierep as the endpoint name. Type in 1 for Inference units. Finally, click on the Create endpoint button.
It will throw pop up and ask for the confirmation, click on the Confirm button in the pop up. The endpoint creation will start. Wait till the status changes to Ready.
Once the endpoint is ready, make note of the endpoint ARN as you need it later in the workshop.
The model and endpoint are ready. Create SageMaker Notebook Instance in the next step which you use to call the endpoint.