You will use sample data to train the model to detect fraudulent events. This workshop uses a sample data set provided by AWS. Download the data from the link. It is a CSV file with the following fields:
ip_address - an event variable in the data (the IP address recorded for the event).
email_address - an event variable in the data (the email address recorded for the event).
EVENT_TIMESTAMP - mandatory field. Represents the timestamp of the event.
EVENT_LABEL - mandatory field. Marks the event as fraudulent or legitimate. The value legit means the event is legitimate, and the value fraud means the event is fraudulent.
You can use your own data set as long as it has more than 10,000 records and includes the EVENT_TIMESTAMP and EVENT_LABEL fields.
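If you bring your own data set, a quick local check can catch a missing field or a file that is too small before you upload it. The following is a minimal sketch using only the Python standard library. The function name check_training_csv and the constants are illustrative, not part of the workshop, and the check covers only the requirements stated above.

```python
import csv
from collections import Counter

# Requirements taken from the text above; the names are illustrative.
REQUIRED_FIELDS = {"EVENT_TIMESTAMP", "EVENT_LABEL"}
MIN_RECORDS = 10000  # the data set needs more than this many records


def check_training_csv(lines):
    """Return (problems, label_counts) for an iterable of CSV lines."""
    reader = csv.DictReader(lines)
    header = set(reader.fieldnames or [])

    # Stop early if a mandatory column is absent from the header row.
    missing = REQUIRED_FIELDS - header
    if missing:
        return ["missing fields: " + ", ".join(sorted(missing))], Counter()

    # Count records per label value (for example legit and fraud).
    labels = Counter(row["EVENT_LABEL"] for row in reader)

    problems = []
    total = sum(labels.values())
    if total <= MIN_RECORDS:
        problems.append(f"only {total} records; need more than {MIN_RECORDS}")
    return problems, labels
```

To run it against a local file, open the file with open("sample-data.csv", newline="") and pass the file object to check_training_csv. An empty problems list means the file meets the two requirements checked here, and the returned counts show how many records carry each label.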
You will upload the sample data file to an S3 bucket.
Log in to the AWS Console and choose Ireland (eu-west-1) as the region.
Create a bucket named dojo-fraud-records. If this name is not available, choose another name that is available and use it in place of dojo-fraud-records in the rest of the workshop.
Upload the sample-data.csv file to the bucket you created.
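The workshop performs these steps in the AWS Console. If you prefer to script them, the sketch below shows one possible equivalent using the boto3 S3 client. It assumes boto3 is installed and AWS credentials are already configured. The helper names upload_training_data and make_s3_client are illustrative, and the object key is assumed to be the same as the file name.

```python
def upload_training_data(s3_client, bucket, local_path, key, region="eu-west-1"):
    """Create the bucket in the given region and upload the CSV into it."""
    # Outside us-east-1, S3 needs the region passed as a LocationConstraint.
    # This call raises an error if the bucket name is already taken.
    s3_client.create_bucket(
        Bucket=bucket,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
    # upload_file(Filename, Bucket, Key) copies the local file to the bucket.
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


def make_s3_client(region="eu-west-1"):
    """Build an S3 client for the Ireland region (assumes boto3 is installed)."""
    import boto3  # imported here so the helper above has no hard dependency

    return boto3.client("s3", region_name=region)
```

A typical call would be upload_training_data(make_s3_client(), "dojo-fraud-records", "sample-data.csv", "sample-data.csv"). If the bucket name is not available, pass the alternative name you chose instead.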
The training data is ready. The next step is to build and train a model using this training data.