Computation of the Loss Distribution in Python

In operational risk management, given a set of risk types and business line combinations, the task is to provide the risk management board with an estimate of the losses the bank (or any other financial institution, hedge fund, etc.) can suffer. Together, these losses form a loss distribution. If you think about it for a second, the spectrum of things that might go wrong is wide: the failure of a computer system, internal or external fraud, problems with clients, products, and business practices, damage to physical assets, and so on. Combined with the business lines (corporate finance, trading and sales, retail banking, commercial banking, payment and settlement, agency services, asset management, and retail brokerage), these risk types yield over 50 combinations of operational risk factors one needs to consider. Separately and carefully. And it’s a tough one.

Why? A good question! There are two main reasons. First, for an operational risk manager the sample of data describing the risk is usually insufficient (statistically speaking, the sample is small over the lifetime of the financial organisation). Second, when something goes wrong, the next event of the same kind may take place in the not-too-distant future or in the far-distant future. The biggest problem the operational risk manager faces in his/her daily work is predicting the losses due to operational failures. Therefore, the time of the next event enters that equation as an independent variable, described by the loss frequency distribution. The second player in the game is the loss severity distribution, i.e. if the worst strikes, how much might the bank, financial body, investor, or trader lose?

From a trader’s perspective, we know well that Value-at-Risk (VaR) and Expected Shortfall are two quantitative risk measures that address similar questions. But from the viewpoint of operational risk, the estimation of losses requires a different approach.

In this post, following Hull (2015), we present an algorithm in Python for computing the loss distribution given the best estimates of the loss frequency and loss severity distributions. Though designed with operational risk analysts in mind, in the end we argue its usefulness for any algo-trader and/or portfolio risk manager.

1. Operational Losses: Case Study of the Vanderloo Bank

Access to operational loss data is much harder to obtain than access to data on exchange-traded stocks. The data usually stay within the walls of the bank, available internally only. A recommended practice for operational risk managers around the world is to share those unique data despite confidentiality concerns. Only then can we build a broader knowledge and understanding of risks and incurred losses due to operational activities.

Let’s consider a case study of a hypothetical Vanderloo Bank. The bank was founded in 1988 in the Netherlands and its main line of business concentrated on building unique customer relationships and providing loans for small local businesses. Despite a vivid vision and firmly set goals for the future, Vanderloo Bank could not avoid a number of operational roadblocks that led to substantial operational losses:

	Year	Month	Day	Business Line	Risk Category	Loss ($M)
0	1989.0	1.0	13.0	Trading and Sales	Internal Fraud	0.530597
1	1989.0	2.0	9.0	Retail Brokerage	Process Failure	0.726702
2	1989.0	4.0	14.0	Trading and Sales	System Failure	1.261619
3	1989.0	6.0	11.0	Asset Managment	Process Failure	1.642279
4	1989.0	7.0	23.0	Corporate Finance	Process Failure	1.094545
5	1990.0	10.0	21.0	Trading and Sales	Employment Practices	0.562122
6	1990.0	12.0	24.0	Payment and Settlement	Process Failure	4.009160
7	1991.0	8.0	23.0	Asset Managment	Business Practices	0.495025
8	1992.0	1.0	28.0	Asset Managment	Business Practices	0.857785
9	1992.0	3.0	14.0	Commercial Banking	Damage to Assets	1.257536
10	1992.0	5.0	26.0	Retail Banking	Internal Fraud	1.591007
11	1992.0	8.0	9.0	Corporate Finance	Employment Practices	0.847832
12	1993.0	1.0	11.0	Corporate Finance	System Failure	1.314225
13	1993.0	1.0	19.0	Retail Banking	Internal Fraud	0.882371
14	1993.0	2.0	24.0	Retail Banking	Internal Fraud	1.213686
15	1993.0	6.0	12.0	Commercial Banking	System Failure	1.231784
16	1993.0	6.0	16.0	Agency Services	Damage to Assets	1.316528
17	1993.0	7.0	11.0	Retail Banking	Process Failure	0.834648
18	1993.0	9.0	21.0	Retail Brokerage	Process Failure	0.541243
19	1993.0	11.0	11.0	Asset Managment	Internal Fraud	1.380636
20	1994.0	11.0	22.0	Retail Banking	External Fraud	1.426433
21	1995.0	2.0	14.0	Commercial Banking	Process Failure	1.051281
22	1995.0	11.0	21.0	Commercial Banking	External Fraud	2.654861
23	1996.0	8.0	17.0	Agency Services	Process Failure	0.837237
24	1997.0	7.0	13.0	Retail Brokerage	Internal Fraud	1.107019
25	1997.0	7.0	24.0	Agency Services	External Fraud	1.513146
26	1997.0	8.0	8.0	Retail Banking	Process Failure	1.002040
27	1997.0	9.0	2.0	Agency Services	Damage to Assets	0.646596
28	1997.0	9.0	12.0	Retail Banking	Employment Practices	0.966086
29	1998.0	1.0	8.0	Retail Banking	Internal Fraud	0.938803
30	1998.0	1.0	12.0	Retail Banking	System Failure	0.922069
31	1998.0	2.0	5.0	Asset Managment	Process Failure	1.042259
32	1998.0	4.0	18.0	Commercial Banking	External Fraud	0.969562
33	1998.0	5.0	12.0	Retail Banking	External Fraud	0.683715
34	1999.0	1.0	3.0	Trading and Sales	Internal Fraud	2.035785
35	1999.0	4.0	27.0	Retail Brokerage	Business Practices	1.074277
36	1999.0	5.0	8.0	Retail Banking	Employment Practices	0.667655
37	1999.0	7.0	10.0	Agency Services	System Failure	0.499982
38	1999.0	7.0	17.0	Retail Brokerage	Process Failure	0.803826
39	2000.0	1.0	26.0	Commercial Banking	Business Practices	0.714091
40	2000.0	7.0	23.0	Trading and Sales	System Failure	1.479367
41	2001.0	6.0	16.0	Retail Brokerage	System Failure	1.233686
42	2001.0	11.0	5.0	Agency Services	Process Failure	0.926593
43	2002.0	5.0	14.0	Payment and Settlement	Damage to Assets	1.321291
44	2002.0	11.0	11.0	Retail Banking	External Fraud	1.830254
45	2003.0	1.0	14.0	Corporate Finance	System Failure	1.056228
46	2003.0	1.0	28.0	Asset Managment	System Failure	1.684986
47	2003.0	2.0	28.0	Commercial Banking	Damage to Assets	0.680675
48	2004.0	1.0	11.0	Asset Managment	Process Failure	0.559822
49	2004.0	6.0	19.0	Commercial Banking	Internal Fraud	1.388681
50	2004.0	7.0	3.0	Retail Banking	Internal Fraud	0.886769
51	2004.0	7.0	21.0	Retail Brokerage	Employment Practices	0.606049
52	2004.0	7.0	27.0	Asset Managment	Employment Practices	1.634348
53	2004.0	11.0	26.0	Asset Managment	Damage to Assets	0.983355
54	2005.0	1.0	9.0	Corporate Finance	Damage to Assets	0.969710
55	2005.0	9.0	17.0	Commercial Banking	System Failure	0.634609
56	2006.0	2.0	24.0	Agency Services	Business Practices	0.637760
57	2006.0	3.0	21.0	Retail Banking	Employment Practices	1.072489
58	2006.0	6.0	25.0	Payment and Settlement	System Failure	0.896459
59	2006.0	12.0	25.0	Trading and Sales	Process Failure	0.731953
60	2007.0	6.0	9.0	Commercial Banking	System Failure	0.918233
61	2008.0	1.0	5.0	Corporate Finance	External Fraud	0.929702
62	2008.0	2.0	14.0	Retail Brokerage	System Failure	0.640201
63	2008.0	2.0	14.0	Commercial Banking	Internal Fraud	1.580574
64	2008.0	3.0	18.0	Corporate Finance	Process Failure	0.731046
65	2009.0	2.0	1.0	Agency Services	System Failure	0.630870
66	2009.0	2.0	6.0	Retail Banking	External Fraud	0.639761
67	2009.0	4.0	14.0	Payment and Settlement	Internal Fraud	1.022987
68	2009.0	5.0	25.0	Retail Banking	Business Practices	1.415880
69	2009.0	7.0	8.0	Retail Banking	Business Practices	0.906526
70	2009.0	12.0	26.0	Agency Services	System Failure	1.463529
71	2010.0	2.0	13.0	Asset Managment	Damage to Assets	0.664935
72	2010.0	3.0	24.0	Payment and Settlement	Process Failure	1.848318
73	2010.0	10.0	16.0	Commercial Banking	External Fraud	1.020736
74	2010.0	12.0	27.0	Retail Banking	Employment Practices	1.126265
75	2011.0	2.0	5.0	Retail Brokerage	Process Failure	1.549890
76	2011.0	6.0	24.0	Corporate Finance	Damage to Assets	2.153238
77	2011.0	11.0	6.0	Asset Managment	System Failure	0.601332
78	2011.0	12.0	1.0	Payment and Settlement	External Fraud	0.551183
79	2012.0	2.0	21.0	Corporate Finance	External Fraud	1.866740
80	2013.0	4.0	22.0	Retail Brokerage	External Fraud	0.672756
81	2013.0	6.0	27.0	Payment and Settlement	Employment Practices	1.119233
82	2013.0	8.0	17.0	Commercial Banking	System Failure	1.034078
83	2014.0	3.0	1.0	Asset Managment	Employment Practices	2.099957
84	2014.0	4.0	4.0	Retail Brokerage	External Fraud	0.929928
85	2014.0	6.0	5.0	Retail Banking	System Failure	1.399936
86	2014.0	11.0	17.0	Asset Managment	Process Failure	1.299063
87	2014.0	12.0	3.0	Agency Services	System Failure	1.787205
88	2015.0	2.0	2.0	Payment and Settlement	System Failure	0.742544
89	2015.0	6.0	23.0	Commercial Banking	Employment Practices	2.139426
90	2015.0	7.0	18.0	Trading and Sales	System Failure	0.499308
91	2015.0	9.0	9.0	Retail Banking	Employment Practices	1.320201
92	2015.0	9.0	18.0	Corporate Finance	Business Practices	2.901466
93	2015.0	10.0	21.0	Commercial Banking	Internal Fraud	0.808329
94	2016.0	1.0	9.0	Retail Banking	Internal Fraud	1.314893
95	2016.0	3.0	28.0	Asset Managment	Business Practices	0.702811
96	2016.0	3.0	25.0	Payment and Settlement	Internal Fraud	0.840262
97	2016.0	4.0	6.0	Retail Banking	Process Failure	0.465896

Having a record of 98 events (rows 0 to 97), we can now begin building a statistical picture of the loss frequency and loss severity distributions.

2. Loss Frequency Distribution

For loss frequency, the natural probability distribution to use is a Poisson distribution. It assumes that losses happen randomly through time so that in any short period of time $\Delta t$ there is a probability of $\lambda \Delta t$ of a loss occurring. The probability of $n$ losses in time $T$ [years] is:
$$
\mbox{Pr}(n) = \exp{(-\lambda T)} \frac{(\lambda T)^n}{n!}
$$ where the parameter $\lambda$ can be estimated as the average number of losses per year (Hull 2015). Given our table in the Python pandas’ DataFrame format, df, we code:

# Computation of the Loss Distribution not only for Operational Risk Managers
# (c) 2016 QuantAtRisk.com, Pawel Lachowicz

from scipy.stats import lognorm, norm, poisson
from matplotlib  import pyplot as plt
import numpy as np
import pandas as pd

# reading Vanderloo Bank operational loss data
df = pd.read_hdf('vanderloo.h5', 'df')

# count the number of loss events in given year
fre = df.groupby("Year").size()
print(fre)

where the last operation groups and displays the number of losses in each year:

Year
1989.0    5
1990.0    2
1991.0    1
1992.0    4
1993.0    8
1994.0    1
1995.0    2
1996.0    1
1997.0    5
1998.0    5
1999.0    5
2000.0    2
2001.0    2
2002.0    2
2003.0    3
2004.0    6
2005.0    2
2006.0    4
2007.0    1
2008.0    4
2009.0    6
2010.0    4
2011.0    4
2012.0    1
2013.0    3
2014.0    5
2015.0    6
2016.0    4
dtype: int64
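
A quick sanity check of the Poisson assumption is that the mean and the variance of the yearly counts should be close to each other. The sketch below re-types the counts printed above into a NumPy array (an assumption made so that the snippet runs without vanderloo.h5) and computes their ratio; the variable names are illustrative only.

```python
import numpy as np

# yearly loss counts for 1989-2016, copied from the groupby output above
counts = np.array([5, 2, 1, 4, 8, 1, 2, 1, 5, 5, 5, 2, 2, 2,
                   3, 6, 2, 4, 1, 4, 6, 4, 4, 1, 3, 5, 6, 4])

mean_count = counts.mean()           # average number of losses per calendar year
var_count = counts.var(ddof=1)       # unbiased sample variance of the counts
dispersion = var_count / mean_count  # close to 1 for Poisson-like counts

print(counts.sum(), mean_count, var_count, dispersion)
```

A ratio near one is consistent with a Poisson process but does not prove it. Note also that groupby would silently skip any year without a loss; here every year between 1989 and 2016 has at least one event, so all 28 calendar years are represented.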

The estimation of Poisson’s $\lambda$ requires solely the computation of:

# estimate lambda parameter
lam = np.sum(fre.values) / (df.Year[df.shape[0]-1] - df.Year[0])
print(lam)

3.62962962963

which tells us that during the 1989–2016 period, i.e. over the past 27 years, there were on average $\lambda \approx 3.6$ losses per year. Assuming the Poisson distribution is the best descriptor of the loss frequency, we model the probability of operational losses of the Vanderloo Bank in the following way:

# draw random variables from a Poisson distribution with lambda=lam
prvs = poisson.rvs(lam, size=10000)

# plot the simulated probabilities (loss frequency distribution);
# bin edges 0, 1, ..., 11 give every count from 0 to 10 its own bin
h = plt.hist(prvs, bins=range(0, 12))
plt.close("all")
y = h[0]/np.sum(h[0])
x = h[1]

plt.figure(figsize=(10, 6))
plt.bar(x[:-1], y, width=0.7, align='center', color="#2c97f1")
plt.xlim([-1, 11])
plt.ylim([0, 0.25])
plt.ylabel("Probability", fontsize=12)
plt.title("Loss Frequency Distribution", fontsize=14)
plt.savefig("f01.png")

revealing the simulated loss frequency distribution (saved as f01.png).
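
The simulated histogram can be cross-checked against the analytic Poisson probabilities from the formula above with $T = 1$ year. The following is a minimal sketch, assuming $\lambda = 98/27$ (the value printed earlier); the names lam_hat, n, and pmf are illustrative.

```python
import numpy as np
from scipy.stats import poisson

lam_hat = 98 / 27            # assumed: 98 events over the 27-year span
n = np.arange(0, 11)         # number of losses per year: 0, 1, ..., 10

# analytic probability of n losses in one year
pmf = poisson.pmf(n, lam_hat)

for k, p in zip(n, pmf):
    print(f"Pr(n={k:2d}) = {p:.4f}")

# probability mass left in the tail above 10 losses per year
print("Pr(n > 10) =", poisson.sf(10, lam_hat))
```

With 10,000 random draws the bar heights of the simulated histogram should agree with these values to within about one percentage point.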

3. Loss Severity Distribution

The data collected in the last column of df allow us to plot the loss severity distribution and estimate its best fit. In the practice of operational risk managers, the lognormal distribution is a common choice:

c = .7, .7, .7  # define grey color

plt.figure(figsize=(10, 6))
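# density=True replaces the normed=True keyword, which newer matplotlib no longer accepts
plt.hist(df["Loss ($M)"], bins=25, color=c, density=True)
plt.xlabel("Incurred Loss ($M)", fontsize=12)
plt.ylabel("Probability density", fontsize=12)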
plt.title("Loss Severity Distribution", fontsize=14)

x = np.arange(0, 5, 0.01)
sig, loc, scale = lognorm.fit(df["Loss ($M)"])
pdf = lognorm.pdf(x, sig, loc=loc, scale=scale)
plt.plot(x, pdf, 'r')
plt.savefig("f02.png")

print(sig, loc, scale)  # lognormal pdf's parameters

0.661153638163 0.328566816132 0.647817560825

where the lognormal probability density function (pdf) we use is given by:
$$
p(y; \sigma, loc, scale) = \frac{1}{scale} \, \frac{1}{x\sigma\sqrt{2\pi}} \exp{ \left[ -\frac{1}{2} \left(\frac{\log{x}}{\sigma} \right)^2 \right] }
$$
where $x = (y - loc)/scale$ and $y$ denotes the loss. The fit of the pdf to the data is drawn as the red curve over the histogram (saved as f02.png).
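
To get a feeling for what the fitted parameters mean in money terms, one can evaluate a few summary statistics of the fitted distribution. This is only a sketch, assuming the three parameter values printed above; the quantile levels are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import lognorm

# fitted parameters as printed above (shape, location, scale)
sig, loc, scale = 0.661153638163, 0.328566816132, 0.647817560825

# frozen distribution with the same parametrisation as lognorm.fit
severity = lognorm(sig, loc=loc, scale=scale)

print("median loss ($M):", severity.median())   # equals loc + scale
print("mean loss ($M):  ", severity.mean())     # loc + scale*exp(sig**2/2)
for q in (0.95, 0.99):
    print(f"{q:.0%} quantile ($M):", severity.ppf(q))
```

The location parameter shifts the whole distribution, so under this fit no loss smaller than loc (about $0.33M) can occur.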

4. Loss Distribution

The loss frequency distribution must be combined with the loss severity distribution for each risk type/business line combination in order to determine a loss distribution. The most common assumption here is that loss severity is independent of loss frequency. Hull (2015) suggests the following steps to be taken in building the Monte Carlo simulation leading to modelling of the loss distribution:

1. Sample from the frequency distribution to determine the number of loss events ($n$)
2. Sample $n$ times from the loss severity distribution to determine the loss experienced for each loss event ($L_1, L_2, \ldots, L_n$)
3. Determine the total loss experienced ($=L_1 + L_2 + \ldots + L_n$)
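
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it takes $\lambda = 98/27$ and the lognormal parameters printed in the previous section, a one-year horizon, independence of frequency and severity, and an arbitrary number of simulated years (n_sims). All variable names are hypothetical.

```python
import numpy as np
from scipy.stats import lognorm, poisson

rng = np.random.RandomState(2016)   # fixed seed for reproducibility

lam = 98 / 27                       # assumed loss frequency per year
sig, loc, scale = 0.661153638163, 0.328566816132, 0.647817560825  # assumed fit
n_sims = 100000                     # number of simulated years

# Step 1: number of loss events in each simulated year
n_events = poisson.rvs(lam, size=n_sims, random_state=rng)

# Step 2: one severity draw per loss event, for all years at once
losses = lognorm.rvs(sig, loc=loc, scale=scale,
                     size=n_events.sum(), random_state=rng)

# Step 3: total loss per year; year_id maps every event to its year
year_id = np.repeat(np.arange(n_sims), n_events)
total_loss = np.bincount(year_id, weights=losses, minlength=n_sims)

print("mean annual loss ($M):", total_loss.mean())
print("99.9th percentile ($M):", np.percentile(total_loss, 99.9))
```

Years with no events receive a total loss of zero, and a histogram of total_loss approximates the loss distribution. Drawing all severities in one call and summing them with np.bincount avoids a Python loop over the simulated years.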

Dr. Pawel Lachowicz
