Historical Data Preparation

1 Historical data should include a set of characteristics and a target variable. All scorecard development methods quantify the relationship between the characteristics (input columns) and the “Good/Bad” performance (target column).

2 Example of borrower characteristics. Scorecard characteristics are similar to those used in subjective expert judgment.





3 Characteristics that are not reasonable to use are excluded. For example, the picture shows that the “Good/Bad” distribution does not depend on the Home Ownership characteristic, so this characteristic does not help separate “Good” borrowers from “Bad” ones.
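One way to check whether the “Good/Bad” distribution depends on a characteristic is to compare the share of “Bad” borrowers for each value of that characteristic. The following is a minimal sketch, not the method of any particular scoring tool. The field names `home_ownership` and `target`, the helper `bad_rate_by_value`, and the sample records are all hypothetical.

```python
from collections import defaultdict

def bad_rate_by_value(records, characteristic, target="target"):
    """Return the share of "Bad" borrowers for each value of a characteristic."""
    counts = defaultdict(lambda: [0, 0])  # value -> [bad count, total count]
    for row in records:
        counts[row[characteristic]][1] += 1
        if row[target] == "Bad":
            counts[row[characteristic]][0] += 1
    return {value: bad / total for value, (bad, total) in counts.items()}

# Hypothetical sample: the bad rate is identical for owners and renters.
sample = [
    {"home_ownership": "Own", "target": "Good"},
    {"home_ownership": "Own", "target": "Bad"},
    {"home_ownership": "Rent", "target": "Good"},
    {"home_ownership": "Rent", "target": "Bad"},
]

rates = bad_rate_by_value(sample, "home_ownership")
# A spread close to zero suggests the characteristic does not separate
# "Good" from "Bad" borrowers and is a candidate for exclusion.
spread = max(rates.values()) - min(rates.values())
print(rates, spread)
```

What counts as a spread "close to zero" is a judgment call that depends on the portfolio and sample size.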

4 Every borrower should be marked in the target column as “Good” or “Bad” according to a defined rule. For example, borrowers who pay within 30 days are marked as “Good”, while borrowers with a delay of more than 90 days are marked as “Bad”.
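The example rule can be sketched as a small labeling function. This is an illustration under stated assumptions: the function name `label_borrower` and the input (maximum days past due) are hypothetical, and the example rule does not say how to treat delays between 31 and 90 days, so the sketch leaves those borrowers unlabeled rather than guessing.

```python
from typing import Optional

GOOD_MAX_DAYS = 30  # from the example rule: paid within 30 days
BAD_MIN_DAYS = 90   # from the example rule: delay of more than 90 days

def label_borrower(max_days_past_due: int) -> Optional[str]:
    """Return "Good", "Bad", or None when the example rule assigns no label."""
    if max_days_past_due <= GOOD_MAX_DAYS:
        return "Good"
    if max_days_past_due > BAD_MIN_DAYS:
        return "Bad"
    # Assumption: delays of 31 to 90 days are not covered by the example rule.
    return None

labels = [label_borrower(days) for days in (0, 30, 45, 90, 91, 120)]
print(labels)
```

A real labeling rule would need to decide explicitly what happens to the borrowers in the middle band.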



Exclusions

Certain types of accounts need to be excluded from the dataset. For example, records of bank employees or VIP clients could be excluded.
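A minimal sketch of such a filter follows. It assumes each record carries a hypothetical `segment` field; the segment names `employee` and `vip` are illustrative, not taken from any real system.

```python
# Hypothetical segment codes for the account types to exclude.
EXCLUDED_SEGMENTS = {"employee", "vip"}

def apply_exclusions(records, segment_field="segment"):
    """Return the kept records and the number of records that were dropped."""
    kept = [r for r in records if r.get(segment_field) not in EXCLUDED_SEGMENTS]
    return kept, len(records) - len(kept)

accounts = [
    {"id": 1, "segment": "retail"},
    {"id": 2, "segment": "employee"},
    {"id": 3, "segment": "vip"},
    {"id": 4, "segment": "retail"},
]

kept, dropped = apply_exclusions(accounts)
print([r["id"] for r in kept], dropped)
```

Returning the count of dropped records makes it easy to document how many accounts each exclusion removed.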





Data Cleansing

Borrower portfolio data can contain the following anomalies, which should be replaced or deleted:
  • Outliers - values that lie far outside the main body of the data
  • Data entry errors
  • Missing values
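The three anomaly types above can be handled in one pass over a numeric column. The sketch below is only one possible treatment, and every choice in it is an assumption rather than a prescribed method: values below a plausible minimum are treated as data entry errors, outliers are capped at the common 1.5 × IQR fences, and missing values are replaced with the median. The function name `clean_column` and the sample incomes are hypothetical.

```python
import statistics

def clean_column(values, valid_min=0):
    """Clean one numeric column: entry errors, outliers, missing values."""
    # Data entry errors: treat values below the plausible minimum as missing.
    checked = [v if v is not None and v >= valid_min else None for v in values]
    valid = [v for v in checked if v is not None]

    # Outliers: compute 1.5 * IQR fences from the valid values.
    q1, _, q3 = statistics.quantiles(valid, n=4)
    low = q1 - 1.5 * (q3 - q1)
    high = q3 + 1.5 * (q3 - q1)

    # Missing values: replace with the median of the valid values.
    median = statistics.median(valid)
    return [median if v is None else min(max(v, low), high) for v in checked]

# Hypothetical incomes: one missing value, one negative entry, one outlier.
incomes = [30, 32, 35, 31, None, -5, 34, 33, 500]
cleaned = clean_column(incomes)
print(cleaned)
```

Deleting the affected records instead of replacing values is the other option named above; which one is appropriate depends on how many records are affected.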






