Title: Comparing Different Models for Credit Card Fraud Detection


Authors:

Bhupendra Singh

bbhupendra007@gmail.com
,

Mehul Mahrishi

mehul@skit.ac.in
Department of Information Technology, Swami Keshvanand Institute of Technology, Management & Gramothan, Jaipur-302017 (INDIA)


Abstract:

This paper applies several credit card fraud detection models to distinguish legitimate transactions from fraudulent ones. The aim is to recognize fraudulent transactions while avoiding incorrect fraud classifications. The dataset used in the proposed work (Credit Card Fraud Detection) is provided by Kaggle and is available at https://www.kaggle.com/mlg-ulb/creditcardfraud. Before the dataset was uploaded to Kaggle, its features were transformed using PCA (Principal Component Analysis) and renamed. Of the features, 28 are named V1 through V28 (all numeric). The remaining three record the time, the transaction amount, and whether the transaction was fraudulent. The response variable is 1 for a fraudulent transaction and 0 for a legitimate one. The dataset contains no missing values. It contains 284,807 transactions; most transaction amounts are very small, and very few come close to the maximum amount.

Several algorithms are implemented in this study using Python machine learning libraries. The models studied are K-Nearest Neighbour, Logistic Regression, Random Forest, and XGBoost. Because XGBoost shows higher accuracy than the other models, it is preferable to the Random Forest and Logistic Regression models.

Keywords: