Custom Text Classification using Amazon Comprehend

5: Create Client

The notebook instance is ready. You will now write code that calls the classifier endpoint to detect whether a text is Real or Fake.

  1. In the Amazon SageMaker console, select the dojonotebook instance and click the Open Jupyter link.

  2. Jupyter opens in a new browser tab or window. Select the conda_python3 option under the New menu to start a Python 3 notebook. This notebook environment comes with the Python Boto3 SDK preinstalled, which you will use to call the endpoint.

  3. It will open a notebook in a new browser tab or window.

  4. Copy-paste and run the following code in the notebook to import the boto3 module and create an Amazon Comprehend client. Replace {ENDPOINT_ARN} with the classifier endpoint ARN you noted in the previous step.

    import boto3
    client = boto3.client('comprehend')
    endpointarn = "{ENDPOINT_ARN}"
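
A quick way to catch a forgotten placeholder before calling the endpoint is to check that the value looks like an ARN. The sketch below is only a sanity check under an assumed ARN layout (arn:aws:comprehend:region:account-id:document-classifier-endpoint/name). The helper name looks_like_endpoint_arn and the sample endpoint name are hypothetical and not part of the workshop.

```python
import re

# Assumed general layout of a Comprehend classifier endpoint ARN:
# arn:<partition>:comprehend:<region>:<12-digit account>:document-classifier-endpoint/<name>
ENDPOINT_ARN_PATTERN = re.compile(
    r"^arn:aws[a-z-]*:comprehend:[a-z0-9-]+:\d{12}:"
    r"document-classifier-endpoint/[A-Za-z0-9-]+$"
)


def looks_like_endpoint_arn(value):
    """Return True if value resembles a classifier endpoint ARN (hypothetical helper)."""
    return bool(ENDPOINT_ARN_PATTERN.match(value))


# The unreplaced placeholder from the step above is rejected.
print(looks_like_endpoint_arn("{ENDPOINT_ARN}"))
# A made-up ARN with the assumed layout is accepted.
print(looks_like_endpoint_arn(
    "arn:aws:comprehend:us-east-1:123456789012:document-classifier-endpoint/sample-endpoint"
))
```

In the notebook you could run looks_like_endpoint_arn(endpointarn) once after setting the variable. A passing check only means the string has the expected shape, not that the endpoint exists.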
    


  5. Copy-paste and run the following code in the notebook to classify a news title. It calls the classify_document method, passing the text and the endpoint ARN as parameters, and then displays the returned classes. The model predicts the news title to be Fake with 52% confidence and Real with 48% confidence, which is not a very confident prediction. Let’s try another example.

    txt = "fantastic trumps  point plan to reform healthcare begins with a bombshell  percentfedupcom"
    response = client.classify_document(Text=txt,EndpointArn=endpointarn)
    response['Classes']
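
The Classes value is a list of dictionaries, each with a Name and a Score. As a minimal sketch, the snippet below picks the highest-scoring label from such a list. The helper name top_class is hypothetical, and sample_classes is an illustrative stand-in that mirrors the 52% / 48% prediction described in this step rather than a captured response.

```python
def top_class(classes):
    """Return (name, score) for the highest-scoring entry in a 'Classes' list."""
    best = max(classes, key=lambda c: c["Score"])
    return best["Name"], best["Score"]


# Illustrative stand-in for response['Classes'], using the scores quoted in this step.
sample_classes = [
    {"Name": "Fake", "Score": 0.52},
    {"Name": "Real", "Score": 0.48},
]

name, score = top_class(sample_classes)
print(f"{name}: {score:.0%}")  # prints "Fake: 52%"
```

In the notebook, top_class(response['Classes']) would apply the same logic to the live response.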
    


  6. Copy-paste and run the following code in the notebook to classify another news title. You can see the model predicts this title to be Fake with 73% confidence.

    txt = "fbi redux whats behind new probe into hillary clinton emails"
    response = client.classify_document(Text=txt,EndpointArn=endpointarn)
    response['Classes']
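
If you want to try several titles without repeating the call by hand, the two steps above can be wrapped in a small loop. This is a sketch, not part of the workshop: classify_titles is a hypothetical helper that assumes the client object and endpoint ARN created earlier and makes one classify_document call per title.

```python
def classify_titles(client, endpoint_arn, titles):
    """Classify each title and return (title, top_label, top_score) tuples.

    Hypothetical helper: 'client' is expected to be a boto3 Comprehend client
    and 'endpoint_arn' the classifier endpoint ARN from the earlier step.
    """
    results = []
    for title in titles:
        # One real-time request per title against the custom classifier endpoint.
        response = client.classify_document(Text=title, EndpointArn=endpoint_arn)
        # Keep only the label with the highest confidence score.
        best = max(response["Classes"], key=lambda c: c["Score"])
        results.append((title, best["Name"], best["Score"]))
    return results
```

In the notebook, a call such as classify_titles(client, endpointarn, [txt]) would return one tuple per title. Each title is a separate request to the endpoint.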
    


  7. The workshop finishes here. Go to the next task to clean up the resources so that you don’t incur any cost after the workshop.