Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Replication Data for: Credit scoring of thin file consumers |
Identification Number: |
doi:10.7910/DVN/6MLVVI |
Distributor: |
Harvard Dataverse |
Date of Distribution: |
2024-05-12 |
Version: |
1 |
Bibliographic Citation: |
Deepa Shukla, 2024, "Replication Data for: Credit scoring of thin file consumers", https://doi.org/10.7910/DVN/6MLVVI, Harvard Dataverse, V1, UNF:6:tIIKPwlCPuBzLVc1RbTvlQ== [fileUNF] |
Citation |
|
Title: |
Replication Data for: Credit scoring of thin file consumers |
Subtitle: |
Non-Traditional Data Sources to Enhance Creditworthiness |
Alternative Title: |
Innovative Data Approaches for Assessing Credit Risk in Limited Credit History Consumers |
Identification Number: |
doi:10.7910/DVN/6MLVVI |
Identification Number: |
API |
Authoring Entity: |
Deepa Shukla (Jaipur National University) |
Other identifications and acknowledgements: |
Shukla, Deepa |
Other identifications and acknowledgements: |
Gupta, Sunil |
Producer: |
Gupta, Sunil |
Date of Production: |
2024-05-13 |
Software used in Production: |
Python |
Distributor: |
Harvard Dataverse |
Access Authority: |
Sunil Gupta |
Depositor: |
Shukla, Deepa |
Date of Deposit: |
2024-05-12 |
Holdings Information: |
https://doi.org/10.7910/DVN/6MLVVI |
Study Scope |
|
Keywords: |
Business and Management, Computer and Information Science, Machine Learning Algorithms, Credit Score, Thin File, Behavioural Finance |
Topic Classification: |
Digital Credit Scoring |
Abstract: |
Machine learning (ML) has the potential to change credit scoring, particularly for "thin-file" consumers who lack substantial credit histories. Traditional credit scoring models often cannot assess these consumers accurately because the available data are insufficient, which can exclude them from essential credit services. This research uses a synthetic dataset, generated with the Python libraries Pandas, NumPy, and Faker, to develop and refine ML algorithms for evaluating such underserved consumer segments. Because the dataset is synthetic, it supports compliance with privacy norms while still simulating the range of consumer behaviors, from stable to erratic financial activity, typical of thin-file profiles. The work supports both innovation in algorithmic credit scoring and the broader goal of financial inclusion, giving the financial industry tools to evaluate creditworthiness fairly across all consumer segments. The dataset is intended as a foundation for research that improves both technical capability and financial inclusion. |
Kind of Data: |
Synthetic Data |
Notes: |
The dataset is designed to support the development of machine learning algorithms tailored to credit scoring of "thin-file" consumers. Thin-file consumers are individuals with little or no credit history, which makes traditional credit scoring models less effective or entirely inapplicable. They often have difficulty accessing credit products because standard credit risk evaluation methods cannot readily assess them. |
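The abstract describes a synthetic dataset that simulates thin-file consumers. The following is a minimal sketch of that idea, not the depositor's generation script. It uses only Python's standard `random` module as a stand-in for the Pandas, NumPy, and Faker tooling named in the abstract. Every field name (`consumer_id`, `age`, `count_a`, `count_b`, `profile`) and the two behaviour labels are hypothetical, because this record does not list the variable names. The numeric ranges are chosen only to resemble the ranges in the variable summary.

```python
import random


def make_thin_file_records(n, seed=0):
    """Return n synthetic consumer records as a list of dicts.

    All field names and categories are illustrative assumptions;
    they are not taken from the deposited dataset.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    profiles = ["stable", "erratic"]  # hypothetical behaviour labels
    records = []
    for _ in range(n):
        records.append({
            "consumer_id": rng.randint(1, 999_999),  # hypothetical identifier
            "age": rng.randint(18, 64),              # resembles the 18-64 range
            "count_a": rng.randint(0, 29),           # placeholder 0-29 counter
            "count_b": rng.randint(0, 29),           # placeholder 0-29 counter
            "profile": rng.choice(profiles),
        })
    return records


records = make_thin_file_records(500)
```

Seeding the generator keeps repeated runs identical, which matters when a synthetic dataset is used as replication data.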
Methodology and Processing |
|
Sources Statement |
|
Data Sources: |
https://github.com/Deezpa/credit-score |
Documentation and Access to Sources: |
https://github.com/Deezpa/credit-score |
Data Access |
|
Other Study Description Materials |
|
Related Publications |
|
Citation |
|
Title: |
synthetic credit score of thin-file consumers |
Identification Number: |
10.34740/KAGGLE/DSV/8378342 |
Bibliographic Citation: |
Citation Type: BibTeX @misc{deepa_shukla_2024, title={synthetic credit score of thin-file consumers}, url={https://www.kaggle.com/dsv/8378342}, DOI={10.34740/KAGGLE/DSV/8378342}, publisher={Kaggle}, author={Deepa Shukla}, year={2024} } |
File Description--f10198032 |
|
File: 500Credit_Score_Dataset.tab |
|
|
|
Notes: |
UNF:6:tIIKPwlCPuBzLVc1RbTvlQ== |
List of Variables: |
|
Variables |
|
f10198032 Location: |
Summary Statistics: Valid 500.0; Min. 85.0; Max. 997174.0; Mean 499567.554; StDev 285229.799921329 Variable Format: numeric Notes: UNF:6:/9UBTIvgNc+VEsmfHjxBxw== |
f10198032 Location: |
Summary Statistics: Valid 500.0; Min. 18.0; Max. 64.0; Mean 40.94; StDev 13.561571902435569 Variable Format: numeric Notes: UNF:6:LLZbWCypgm9Xtxk3GQkrBg== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:kUOD9uw3QuK64xUz45mY6Q== |
f10198032 Location: |
Summary Statistics: Valid 500.0; Min. 0.0; Max. 29.0; Mean 14.941999999999997; StDev 8.702994900635124 Variable Format: numeric Notes: UNF:6:JmpVvSopeTUW88TzNg9VhA== |
f10198032 Location: |
Summary Statistics: Valid 500.0; Min. 0.0; Max. 29.0; Mean 14.394; StDev 8.700008292443012 Variable Format: numeric Notes: UNF:6:KRHpC8XFv8ViypfiKa9M6w== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:KOhZ6cvB2pYirYjIVI/ZlQ== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:vSON07EeW2tn4tY7/2rJYg== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:MmpnqfR1Q/YLzXo3AGsOdA== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:dDlk1xOYJfzCNzMDkxD63w== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:iOdGFM3Ve5UURmNN/PStZg== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:P/nz2LEbqBombIOZ384Wqg== |
f10198032 Location: |
Variable Format: character Notes: UNF:6:hrUi0qsPXCQ+Z2c6fX1WTA== |
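The numeric entries in the variable list report Valid, Min., Max., Mean, and StDev. The sketch below shows one way such a summary could be computed for a single column with Python's standard library. It is an illustration under stated assumptions, not the procedure Dataverse uses: the record does not say whether StDev is the sample or the population standard deviation, so the sample form (`statistics.stdev`) is assumed here. The input values are made up and are not taken from the dataset.

```python
import statistics


def summarize(values):
    """Summarise one numeric column in the style of the variable list."""
    valid = [v for v in values if v is not None]  # drop missing entries
    return {
        "Valid": float(len(valid)),
        "Min.": float(min(valid)),
        "Max.": float(max(valid)),
        "Mean": statistics.mean(valid),
        "StDev": statistics.stdev(valid),  # sample SD; this is an assumption
    }


ages = [18, 25, 40, 64, None]  # made-up values with one missing entry
summary = summarize(ages)
```

Running the same function over a column of the downloaded file and comparing against the listed figures would show which standard deviation convention the record follows.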