Logistic Regression Analysis of Work Resume during Job Interview

There are not many creative ways to leave a candidate for a quant role dumbfounded by an interview question tied directly to his own resume. Above all, we want to test his quant skill set: his ability to read code, analyse a model, and judge the correctness of its assumptions and execution, as well as his choice of proposed modifications. This post is based on actual events. Here is the story of Mike$^*$.

While preparing for an interview with a candidate for a leading position in the risk department, I spent quite some time analysing his 5+ page resume, which listed a number of professional positions he had held around the world. The most striking feature was that he had stayed at each of his last two financial institutions for nearly two years, and two years only. Since we were looking for someone who would stay in the proposed position much longer than that, I said:

Mike, from your resume I learned that you started your positions in Jan 2009, Jun 2012, Jun 2014, and May 2016, respectively. Now you are applying for a new role. Let’s assume we hire you in May 2018; that would be another two-year-long position. I can see a pattern, and that pattern worries me. I have quickly applied a Logistic Regression model to your data and, with the use of Machine Learning, I asked the computer: “What is the probability that Mike will leave us in May 2020?!” Here is what the computer predicted…

Next, I handed him the following Python code with the outputs:

import pandas as pd
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import warnings
warnings.filterwarnings("ignore")

# Data taken from Mike's Resume
# Mike did not work at his new workplace more than 24 months
y = np.array([False, True, True, True, True])
# Timestamps when he started his new roles [years]
X = np.array([2008.083, 2012.500, 2014.500, 2016.417, 2018.417])

# Logistic Regression Model
logit_model = sm.Logit(y, X)
result = logit_model.fit()
print(result.summary2())

Optimization terminated successfully.
         Current function value: 0.499590
         Iterations 5
                        Results: Logit
==============================================================
Model:              Logit            No. Iterations:   5.0000 
Dependent Variable: y                Pseudo R-squared: 0.002  
Date:               2018-04-17 06:29 AIC:              6.9959 
No. Observations:   5                BIC:              6.6053 
Df Model:           0                Log-Likelihood:   -2.4979
Df Residuals:       4                LL-Null:          -2.5020
Converged:          1.0000           Scale:            1.0000 
-----------------------------------------------------------------
      Coef.     Std.Err.      z       P>|z|      [0.025    0.975]
-----------------------------------------------------------------
x1    0.0007      0.0006    1.2418    0.2143    -0.0004    0.0018
==============================================================
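A side note on the inputs: the timestamps in X appear to encode each start date as year plus month/12 (for example, May gives 5/12 ≈ 0.417). The sketch below rebuilds the array under that assumption; `decimal_year` is a hypothetical helper that is not part of the original listing, and the (year, month) pairs are read off X as it appears in the code.

```python
import numpy as np

def decimal_year(year, month):
    """Hypothetical helper: encode a start date as year + month/12,
    rounded to three decimals as in the listing."""
    return round(year + month / 12.0, 3)

# (year, month) pairs read off the array X in the listing (assumption)
starts = [(2008, 1), (2012, 6), (2014, 6), (2016, 5), (2018, 5)]
X_rebuilt = np.array([decimal_year(y, m) for y, m in starts])
print(X_rebuilt)
```

The same helper gives the May 2020 value used later as the test point, `decimal_year(2020, 5)`.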

# Prediction for Mike leaving a new role in May 2020

# we add May 2020 with a flag True, meaning that Mike
# will not work more than 24 months, as an out-of-sample test set
X_train, X_test, y_train, y_test = X.reshape(-1, 1), np.array([2020.417]), \
                                   y.reshape(-1, 1), np.array([True])
    
logreg = LogisticRegression()
logreg.fit(X_train, y_train)   # train the computer to recognize the patterns

# prediction
y_pred = logreg.predict(X_test.reshape(-1, 1))
print("\nAccuracy of logistic regression classifier on test set: \
      {:.2f}%".format(100*logreg.score(X_test.reshape(-1, 1), y_test)))

Accuracy of logistic regression classifier on test set:       100.00%
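For readers who want to relate the summary table to the question that was asked, here is a minimal sketch of how a logistic model with a single coefficient and no constant term (the form fitted by `sm.Logit(y, X)` in the listing) maps an input to a probability, $p = 1 / (1 + e^{-\beta x})$. It assumes the rounded coefficient 0.0007 printed for x1, so the result is only approximate; the names `logistic`, `beta`, and `x_new` are illustrative and do not come from the original code.

```python
import numpy as np

def logistic(z):
    """Standard logistic (sigmoid) function."""
    return 1.0 / (1.0 + np.exp(-z))

beta = 0.0007      # rounded x1 coefficient from the summary table (assumption)
x_new = 2020.417   # May 2020 expressed as a decimal year

# Probability implied by the one-coefficient, no-intercept model
p = logistic(beta * x_new)
print("Implied probability: {:.3f}".format(p))
```

Because the printed coefficient is rounded to four decimals, a value taken from `result.params` would give a slightly different number.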

I continued:

The computer tells me you are going to leave our bank in May 2020, and it is 100% sure. Given the model and the code, what would you do to change the outcome? Can you tell me what is wrong with this model? Is it implemented correctly? If yes, why do you think so? If not, where is the catch?

At first, Mike laughed, as the question took him completely by surprise. Approaching a candidate in such a fancy way is the sort of challenge I like most. And you know, it’s always good to break the boredom and try something new.

Try to find all the problems with this interview question. There are a few. Drop your answers in the comments. Good luck!

$^*$ The name has been changed intentionally.

Dr. Pawel Lachowicz
