Forecasting loan status

Photo by Sharon McCutcheon on Unsplash

Presentation of the problem

Model and assumptions

Proposed solution

# Load the loan data directly from the GitHub repository
sourceFile <- "https://raw.githubusercontent.com/bot13956/Monte_Carlo_Simulation_Loan_Status/master/loan_timing.csv"
rawdata <- read.csv(file = sourceFile, header = TRUE, sep = ",")
head(rawdata)
Head of the original data.
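# Drop rows with missing values, then count charge-offs per elapsed day
data <- na.omit(rawdata)
elapsedDaysChargeOff <- data[2]
dataTable <- data.frame(table(elapsedDaysChargeOff))
colnames(dataTable) <- c("days", "persons")
# table() yields factor levels; convert them back to numeric day counts
dataTable$days <- as.numeric(as.character(dataTable$days))
head(dataTable)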
Head of the data frame dataTable.
Histogram of the dataTable.
Formula for the expected value of a sampling distribution.
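A standard way to write the expected value referred to here is sketched below. The symbols are assumptions chosen to match the code that follows: x_i is a possible number of charge-offs in one day (xi), f_i is the number of days on which that value was observed (frequency), and N is the total number of days, which the code takes to be max(dataTable$days).

```latex
E[X] = \sum_{i} x_i \, p(x_i), \qquad p(x_i) = \frac{f_i}{N}
\quad\Longrightarrow\quad
E[X] = \frac{1}{N} \sum_{i} x_i f_i
```

Under these assumptions the sum over x_i f_i is exactly the dot product of xi and frequency.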
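# 'output' is assumed to be the ggplot object of the histogram above
library(ggplot2)
library(geometry)
# Bin counts of the histogram
frequency <- ggplot_build(output)$data[[1]]$count
xi <- 1:max(dataTable$persons)
# Expected number of charge-offs per day
meanValDay <- dot(xi, frequency) / max(dataTable$days)
meanValDay
# Expected number of charge-offs over three years
meanVal3Yrs <- meanValDay * 365 * 3
meanVal3Yrs
# Fraction of all loans expected to charge off within three years
frac3Yrs <- meanVal3Yrs / length(rawdata$days.from.origination.to.chargeoff)
frac3Yrs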
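As a sanity check on the idea of computing a mean from a frequency table, here is a minimal base-R sketch on toy data. The vector chargeOffsPerDay and its values are hypothetical and are not taken from the loan data set. The sketch only illustrates that the weighted sum over the frequency table matches the plain sample mean when days with zero events are counted too.

```r
# Toy data (hypothetical): number of charge-offs observed on each of 10 days.
chargeOffsPerDay <- c(0, 1, 0, 2, 1, 0, 0, 3, 1, 0)

# Frequency table: how many days saw 0, 1, 2, ... charge-offs.
freqTable <- table(chargeOffsPerDay)
values <- as.numeric(names(freqTable))  # distinct values x_i
counts <- as.numeric(freqTable)         # number of days f_i with value x_i

# Expected value: sum of x_i * f_i divided by the total number of days.
expectedPerDay <- sum(values * counts) / sum(counts)

# Same quantity computed directly from the raw observations.
directMean <- mean(chargeOffsPerDay)

print(expectedPerDay)
print(directMean)
```

Both printed values agree, which is the property the dot-product calculation relies on.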

Alternative solution

# Table covering every day from 0 to 725, with zero counts on days without a charge-off
newdataTable <- data.frame(0:725, rep(0, 726))
colnames(newdataTable) <- c("days", "persons")
newdataTable$persons[dataTable$days + 1] <- dataTable$persons
library(fitdistrplus)
# Fit a Poisson distribution to the daily counts by maximum likelihood
pois <- fitdist(newdataTable$persons, "pois", method = "mle")
lambdaFit <- pois$estimate
lambdaFit
meanVal3YrsFit <- lambdaFit * 365 * 3
meanVal3YrsFit
frac3YrsFit <- meanVal3YrsFit / length(rawdata$days.from.origination.to.chargeoff)
frac3YrsFit
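One way to see why the fitted rate should land close to the mean number of events per day: for a Poisson model, the maximum-likelihood estimate of lambda is the sample mean. The sketch below checks this numerically in base R on synthetic counts. The data, the seed, and the names dailyCounts, negLogLik and lambdaHat are all hypothetical, and optimize() stands in for fitdist() so that the example has no package dependencies.

```r
# Synthetic daily counts (hypothetical data, not the loan data set).
set.seed(42)
dailyCounts <- rpois(n = 730, lambda = 5)

# Negative log-likelihood of a Poisson model with rate lambda.
negLogLik <- function(lambda) -sum(dpois(dailyCounts, lambda, log = TRUE))

# Numerical maximum-likelihood estimate over a plausible range of rates.
lambdaHat <- optimize(negLogLik, interval = c(0.01, 50))$minimum

# For a Poisson model the MLE coincides with the sample mean,
# so the two printed values should agree up to optimizer tolerance.
print(lambdaHat)
print(mean(dailyCounts))
```

If the two approaches in this article give different per-day rates, the difference therefore comes from how the daily counts are built, not from the fitting step itself.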

I am a physicist who decided to move professionally into the fascinating world of Data Science.