The first important step is to build a fraud detection model from the available training data. The model predicts whether a given event is fraudulent. You can also use it to assess the level of fraud risk associated with an event.
-
Go to the Amazon Fraud Detector console and click on the Create event button.
-
On the next screen, type in dojo-event as the name. Select the Create new entity… option for the entity.
-
This opens the create entity popup. Type in dojoentity as the entity name and click on the Create entity button.
-
In the event variables section, select Select variables from a training dataset as the option for the event variables field. Select Create IAM Role as the option for the IAM Role.
-
This opens the create IAM role popup. Type in dojo-fraud-records as the bucket name. If you created the bucket with a different name, type in that name instead. Click on the Create role button.
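For readers curious about what such a role involves, here is a minimal sketch of the two policy documents a role like this typically needs, assuming the bucket name dojo-fraud-records used in this step. The function names and the exact list of S3 actions are illustrative assumptions; the role that the console generates may be scoped differently.

```python
import json


def fraud_detector_trust_policy():
    """Trust policy that lets the Amazon Fraud Detector service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "frauddetector.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def bucket_read_policy(bucket_name):
    """Read-only access to the training bucket (action list is an assumption)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",      # the bucket itself (for listing)
                    f"arn:aws:s3:::{bucket_name}/*",    # the objects inside it
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(fraud_detector_trust_policy(), indent=2))
    print(json.dumps(bucket_read_policy("dojo-fraud-records"), indent=2))
```

If you used a different bucket name, pass that name to bucket_read_policy instead.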
-
The role is created. In the data location field, type in s3://dojo-fraud-records/sample-data.csv as the location of the training file. If you created the bucket with a different name, use that name in the location. Click on the Upload button.
-
The variables from the training data are loaded. Match each one to its variable type: ip_address to IP Address and email_address to Email Address.
-
Next, in the Labels section, select the Create new labels… option.
-
On the create label popup, type legit as the label name and click on the Create label button. The legit label represents the legitimate events in the training dataset.
-
Repeat the previous two steps to create another label named fraud. The fraud label represents the fraudulent events in the training dataset.
-
Finally, click on the Create event type button at the bottom of the screen. The event type is created, and a pop-up offers to build the model. Click on the Build a model button.
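The console steps for the entity, variables, labels and event type can also be expressed through the AWS SDK. The following is a minimal sketch using the boto3 frauddetector operations put_entity_type, put_label, create_variable and put_event_type. The helper names event_type_requests and send_requests, the STRING data type, and the "<unknown>" default value are assumptions made for illustration; they are not taken from this workshop.

```python
def event_type_requests(event_name="dojo-event", entity_name="dojoentity"):
    """Return (operation, kwargs) pairs in the order they would be sent."""
    # Variable name -> Fraud Detector variable type, as matched in the console.
    variables = {"ip_address": "IP_ADDRESS", "email_address": "EMAIL_ADDRESS"}
    labels = ["legit", "fraud"]

    requests = [("put_entity_type", {"name": entity_name})]
    requests += [("put_label", {"name": label}) for label in labels]
    requests += [
        (
            "create_variable",
            {
                "name": name,
                "dataType": "STRING",          # assumption: both fields are text
                "dataSource": "EVENT",
                "defaultValue": "<unknown>",   # assumption: placeholder default
                "variableType": variable_type,
            },
        )
        for name, variable_type in variables.items()
    ]
    requests.append(
        (
            "put_event_type",
            {
                "name": event_name,
                "eventVariables": list(variables),
                "labels": labels,
                "entityTypes": [entity_name],
            },
        )
    )
    return requests


def send_requests(client, requests):
    """Send each request through a boto3 frauddetector client."""
    for operation, kwargs in requests:
        getattr(client, operation)(**kwargs)


# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   send_requests(boto3.client("frauddetector"), event_type_requests())
```

Building the requests as plain data first makes it possible to inspect them before anything is sent to AWS.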
-
On the next screen, type in dojo_fraud_detection_model as the model name. Select Online Fraud Insights as the model type. Select dojo-event as the event type.
-
In the historical event data section, select the role that was created automatically in the previous steps. Type in s3://dojo-fraud-records/sample-data.csv as the training data location. If you created the bucket with a different name, use that name in the location. Click on the Next button.
-
On the next screen, select fraud for the fraud labels and legit for the legitimate labels. Keep the rest of the configuration at the defaults. Click on the Next button.
-
On the Review and create screen, click on the Create and train model button. This starts training a new model with version number 1.0. Wait until the status of the model version changes to Ready to deploy. Training takes a while, typically more than 50 minutes, so this is a good time for a break. Once the status is Ready to deploy, click on the model version to see the details.
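The model creation and training steps can be sketched with boto3 as well, using the frauddetector operations create_model, create_model_version and get_model_version. This is a sketch under stated assumptions: the helper names are hypothetical, the role ARN must be supplied by you, the model variables are assumed to be the two variables mapped earlier, and the API status TRAINING_COMPLETE is assumed to correspond to Ready to deploy in the console.

```python
import time

MODEL_ID = "dojo_fraud_detection_model"
MODEL_TYPE = "ONLINE_FRAUD_INSIGHTS"


def model_version_request(role_arn, data_location="s3://dojo-fraud-records/sample-data.csv"):
    """Build the create_model_version arguments for training from a file in S3."""
    return {
        "modelId": MODEL_ID,
        "modelType": MODEL_TYPE,
        "trainingDataSource": "EXTERNAL_EVENTS",
        "trainingDataSchema": {
            # Assumption: the model uses the two variables mapped earlier.
            "modelVariables": ["ip_address", "email_address"],
            # Map the model's FRAUD/LEGIT classes to the event type labels.
            "labelSchema": {"labelMapper": {"FRAUD": ["fraud"], "LEGIT": ["legit"]}},
        },
        "externalEventsDetail": {
            "dataLocation": data_location,
            "dataAccessRoleArn": role_arn,
        },
    }


def train_model(client, role_arn, poll_seconds=60):
    """Create the model, start training, and poll until training stops."""
    client.create_model(modelId=MODEL_ID, modelType=MODEL_TYPE, eventTypeName="dojo-event")
    response = client.create_model_version(**model_version_request(role_arn))
    version = response["modelVersionNumber"]
    finished = {"TRAINING_COMPLETE", "TRAINING_CANCELLED", "ERROR"}
    while True:
        status = client.get_model_version(
            modelId=MODEL_ID, modelType=MODEL_TYPE, modelVersionNumber=version
        )["status"]
        if status in finished:
            return version, status
        time.sleep(poll_seconds)  # training can take close to an hour


# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   train_model(boto3.client("frauddetector"), "<ARN of the role created earlier>")
```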
-
On the next screen, check the Score distribution in the model performance section. After model training is complete, Amazon Fraud Detector validates model performance using 15% of the data that was not used to train the model. When you evaluate an event with this model, it returns a score. According to the score distribution shown here, a score threshold of 500 catches 92.9% of the fraud events (the true positive rate) while also flagging 13.6% of the legitimate events as fraud (the false positive rate). Later, when you create rules for detection, you use the score either to decide whether an event is fraudulent or to assign a fraud risk level to the event. We will deal with that in the next step.
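One way to read the percentages on a score distribution chart is as a true positive rate and a false positive rate at a chosen threshold. The sketch below computes both rates from a list of scored, labeled events. The function name, the sample events, and the choice to treat scores at or above the threshold as flagged are all assumptions for illustration; the sample values are invented and are not the workshop's validation data.

```python
def rates_at_threshold(scored_events, threshold):
    """Return (true_positive_rate, false_positive_rate) for a score threshold.

    scored_events is an iterable of (score, is_fraud) pairs. An event counts
    as flagged when its score is at or above the threshold (an assumption).
    """
    flagged_fraud = flagged_legit = fraud_total = legit_total = 0
    for score, is_fraud in scored_events:
        flagged = score >= threshold
        if is_fraud:
            fraud_total += 1
            flagged_fraud += flagged
        else:
            legit_total += 1
            flagged_legit += flagged
    tpr = flagged_fraud / fraud_total if fraud_total else 0.0
    fpr = flagged_legit / legit_total if legit_total else 0.0
    return tpr, fpr


# Hypothetical validation events: (model score, whether the event was fraud).
sample_events = [
    (950, True), (700, True), (300, True),
    (600, False), (200, False), (100, False), (50, False),
]

if __name__ == "__main__":
    tpr, fpr = rates_at_threshold(sample_events, 500)
    print(f"threshold 500: catches {tpr:.1%} of fraud, flags {fpr:.1%} of legitimate events")
```

Raising the threshold lowers both rates, which is the trade-off to weigh when choosing score cut-offs for rules.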
-
Time to deploy the model. Go to the Actions menu at the top and click on the Deploy model version option. On the confirmation popup, click on the Deploy version button.
-
This starts the model deployment, which takes a while. Wait until the status changes to Active.
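Deployment can be sketched in code with the boto3 frauddetector operations update_model_version_status and get_model_version. The helper name deploy_model_version, the polling interval, and the polling limit are assumptions for illustration, and the API status ACTIVE is assumed to correspond to Active in the console.

```python
import time


def deploy_model_version(client, version="1.0", poll_seconds=30, max_polls=120):
    """Request activation of a model version and poll until it is active."""
    key = {
        "modelId": "dojo_fraud_detection_model",
        "modelType": "ONLINE_FRAUD_INSIGHTS",
        "modelVersionNumber": version,
    }
    client.update_model_version_status(status="ACTIVE", **key)

    status = "UNKNOWN"
    for _ in range(max_polls):
        status = client.get_model_version(**key)["status"]
        if status in ("ACTIVE", "ERROR"):
            break
        time.sleep(poll_seconds)
    # The last observed status is returned even if polling gave up early.
    return status


# Usage (requires AWS credentials and the boto3 package):
#   import boto3
#   deploy_model_version(boto3.client("frauddetector"))
```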
-
The model is now trained and deployed. Next, let’s create a fraud detector, which is used to decide whether an event is fraudulent or legitimate.