CS2 Risk Modelling & Survival Analysis

WHAT IS A COUNTING PROCESS?
A counting process is a stochastic process X, in discrete or continuous time, whose state space is the set of natural numbers {0, 1, 2, …} and for which X(t) is a non-decreasing function of t.
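As an illustration, a discrete-time counting process can be sketched by accumulating 0/1 arrivals. The arrival probability and the function name below are assumptions made for this example and do not come from the notes.

```python
import random

def simulate_counting_process(n_steps, p_arrival, seed=0):
    """Return one path X_0, X_1, ..., X_n of a simple counting process.

    At each step one event arrives with probability p_arrival (an
    illustrative assumption), so the path stays level or rises by 1.
    """
    rng = random.Random(seed)
    path = [0]
    for _ in range(n_steps):
        arrival = 1 if rng.random() < p_arrival else 0
        path.append(path[-1] + arrival)
    return path

path = simulate_counting_process(n_steps=10, p_arrival=0.3)
print(path)
```

Each run with a different seed gives a different path, but every path takes values in {0, 1, 2, …} and never decreases.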

WHAT IS MEANT BY SAMPLE PATH OF A PROCESS?
A sample path is the sequence of outcomes of a particular set of experiments, i.e. a joint realisation of the random variables X_t for all t in the time set.

STATE HOW A MARKOV CHAIN IS USEFUL?
A Markov process with a discrete state space and a discrete time set is called a Markov chain. It is used by insurance companies, for example, to model a no-claims discount (NCD) system.
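A minimal sketch of an NCD system as a Markov chain follows. The three discount levels, the rule "move up one level after a claim-free year, down one level after a claim" and the claim probability of 0.2 are all hypothetical choices for illustration.

```python
def step(dist, P):
    """One step of a Markov chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

p_claim = 0.2  # assumed probability of at least one claim in a year

# States: 0 = no discount, 1 = middle discount, 2 = maximum discount.
P = [
    [p_claim, 1 - p_claim, 0.0],
    [p_claim, 0.0, 1 - p_claim],
    [0.0, p_claim, 1 - p_claim],
]

dist = [1.0, 0.0, 0.0]  # a new policyholder starts with no discount
for year in range(1, 4):
    dist = step(dist, P)
    print(year, [round(x, 4) for x in dist])
```

Each row of P sums to 1, and the distribution over discount levels after n years is the starting distribution multiplied by P n times.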

HOW IS A MARKOV CHAIN DIFFERENT FROM A MARKOV JUMP PROCESS?
Both have a discrete state space. A Markov process with a discrete time set is called a Markov chain, whereas one with a continuous time set is called a Markov jump process.
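To make the continuous-time case concrete, here is a rough sketch of a two-state Markov jump process with constant transition rates. The rates, the state labels and the function name are assumptions for illustration only.

```python
import random

def simulate_jump_process(rates, start, horizon, seed=0):
    """Simulate a two-state Markov jump process up to time `horizon`.

    rates[s] is the (assumed) constant rate of leaving state s. The holding
    time in state s is exponential with that rate and, with only two
    states, every jump goes to the other state.
    Returns a list of (jump_time, new_state) pairs starting at (0.0, start).
    """
    rng = random.Random(seed)
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        t += rng.expovariate(rates[state])
        if t > horizon:
            break
        state = 1 - state
        path.append((t, state))
    return path

# State 0 = healthy, state 1 = sick (hypothetical labels and rates).
path = simulate_jump_process(rates=[0.1, 0.5], start=0, horizon=20.0)
print(path)
```

Unlike a Markov chain, the jumps here can occur at any real-valued time, not only at integer steps.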

WHAT IS THE DIFFERENCE BETWEEN CURTATE FUTURE LIFETIME AND COMPLETE FUTURE LIFETIME?
The curtate future lifetime K_x of a life aged exactly x is the whole number of years lived after age x, so it takes only the discrete values 0, 1, 2, …. The complete future lifetime T_x is the exact time lived after age x, which is a continuous random variable. K_x is the integer part of T_x.
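In standard actuarial notation, the relationship and the two expected lifetimes can be written as follows. The symbols T_x, e_x and the survival probability {}_t p_x are the usual conventions and are not defined in these notes.

```latex
K_x = \lfloor T_x \rfloor, \qquad
e_x = E[K_x] = \sum_{k=1}^{\infty} {}_k p_x, \qquad
\mathring{e}_x = E[T_x] = \int_0^{\infty} {}_t p_x \, dt
```

Here e_x is the curtate expectation of life and \mathring{e}_x is the complete expectation of life.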

WHAT IS MEANT BY FORCE OF MORTALITY? HOW IS IT USEFUL?
The force of mortality is the instantaneous rate of mortality at a given age, and is also known as the hazard rate. It is useful for deriving survival probabilities, and hence quantities such as expected future lifetime and death rates.
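The link between the force of mortality and survival is that the probability of surviving t years from age x is exp(-integral of the force of mortality from age x to age x + t). A small sketch, assuming a Gompertz form B * c^x with made-up parameter values:

```python
import math

B, C = 0.0001, 1.1  # hypothetical Gompertz parameters

def force_of_mortality(age):
    """Gompertz force of mortality mu_x = B * C**x (an assumed form)."""
    return B * C ** age

def survival_probability(x, t, n_steps=1000):
    """t_p_x = exp(-integral_0^t mu_{x+s} ds), using the trapezium rule."""
    h = t / n_steps
    total = 0.5 * (force_of_mortality(x) + force_of_mortality(x + t))
    for i in range(1, n_steps):
        total += force_of_mortality(x + i * h)
    return math.exp(-total * h)

def survival_probability_closed_form(x, t):
    """Exact integral for the Gompertz form, used as a check."""
    return math.exp(-B * C ** x * (C ** t - 1) / math.log(C))

print(survival_probability(60, 10))
print(survival_probability_closed_form(60, 10))
```

The numerical and closed-form answers should agree closely, which is a useful check when the force of mortality has no convenient integral.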

WHAT ARE COVARIATES?
Covariates are the factors, such as age and sex, that are used to split a population into subgroups.

HOW ARE PROPORTIONAL HAZARDS MODELS DEVELOPED?
In a proportional hazards model, the hazard rate is calculated from a formula that incorporates an adjustment reflecting the characteristics (covariates) of each individual in the population.
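One common form of this adjustment is the Cox model, in which a baseline hazard is multiplied by exp(beta . z) for an individual with covariate vector z. The sketch below assumes that form. The baseline hazard, the covariates and the parameter values are hypothetical.

```python
import math

def baseline_hazard(t):
    """Hypothetical baseline hazard for a life with all covariates zero."""
    return 0.01 + 0.002 * t

def hazard(t, covariates, betas):
    """Cox-type hazard: baseline(t) * exp(sum_i beta_i * z_i)."""
    linear_predictor = sum(b * z for b, z in zip(betas, covariates))
    return baseline_hazard(t) * math.exp(linear_predictor)

betas = [0.5, 0.3]       # assumed parameters for (smoker, male)
smoker_male = [1, 1]
nonsmoker_male = [0, 1]

for t in (1.0, 5.0, 10.0):
    ratio = hazard(t, smoker_male, betas) / hazard(t, nonsmoker_male, betas)
    print(t, round(ratio, 4))
```

The ratio of the two hazards is the same at every t, which is what "proportional" refers to: covariates scale the baseline hazard but do not change its shape.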

STATE DIFFERENT TYPES OF PARAMETRIC MODELS?
The most common parametric models are based on distributions such as the exponential (constant hazard), Weibull (monotonic hazard), Gompertz-Makeham (exponential hazard) and log-logistic (‘humped’ hazard).
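The hazard shapes named above can be sketched as simple functions. The parametrisations below are one common choice and are an assumption here, since the notes do not specify them. Other texts use different parameter conventions.

```python
def exponential_hazard(t, lam):
    """Constant hazard."""
    return lam

def weibull_hazard(t, alpha, beta):
    """Survival exp(-alpha * t**beta) gives hazard alpha*beta*t**(beta-1).

    Increasing in t if beta > 1, decreasing if beta < 1 (for t > 0).
    """
    return alpha * beta * t ** (beta - 1)

def gompertz_makeham_hazard(t, a, b, c):
    """Constant term plus an exponentially growing term."""
    return a + b * c ** t

def log_logistic_hazard(t, alpha, beta):
    """Survival 1 / (1 + (t/alpha)**beta); 'humped' when beta > 1."""
    ratio = (t / alpha) ** beta
    return (beta / alpha) * (t / alpha) ** (beta - 1) / (1 + ratio)

for t in (0.5, 1.0, 2.0):
    print(t, weibull_hazard(t, 0.1, 2.0), log_logistic_hazard(t, 1.0, 2.0))
```

With these illustrative parameters the Weibull hazard rises steadily, while the log-logistic hazard rises to a peak and then falls.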

STATE SOME FACTORS BY WHICH LIFE INSURANCE MORTALITY DATA ARE GROUPED?
(a) Sex (b) Age (c) Type of policy (which often reflects the reason for insuring) (d) Smoker/non-smoker status (e) Level of underwriting (e.g. whether the life has undergone a medical examination) (f) Duration in force.

WHAT ARE GRADUATED MORTALITY RATES? HOW IS GRADUATION DONE?
Crude estimates of mortality rates usually progress erratically from age to age. We therefore smooth the crude estimates to produce a set of graduated estimates that progress smoothly with age. Smoothing also reduces the sampling error at each age. Common methods of graduation are fitting a parametric formula, reference to a standard table, and spline functions.

WHAT ARE THE DIFFERENT TYPES OF CENSORING? GIVE A FEW EXAMPLES.
Censoring is present when we do not observe the exact length of a lifetime, but observe only that its length falls within some interval.
Right censoring: a mortality investigation ends before all the lives being observed have died.
Left censoring: in medical studies, patients are subject to regular examinations. Discovery of a condition tells us only that the onset fell in the period since the previous examination, so the time elapsed since onset is left censored.
Interval censoring: in actuarial investigations, we might know only the calendar year of death.
Random censoring: policyholders withdraw from a life insurance investigation at times that are not known in advance.
Type I censoring: lives are censored at the end of an investigation period that is fixed in advance.
Type II censoring: a medical trial is ended after a predetermined number of deaths, e.g. after 100 lives on a particular course of treatment have died.
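As a small illustration of right censoring at a fixed end date (Type I), the sketch below turns true lifetimes into what the investigator would actually observe. The lifetimes and the five-year study length are invented for the example.

```python
def apply_type_i_censoring(lifetimes, end_of_study):
    """Return (observed_time, died) pairs under a fixed end-of-study time.

    A life still alive at end_of_study is right censored: we record
    end_of_study as the observed time and died=False.
    """
    return [(min(t, end_of_study), t <= end_of_study) for t in lifetimes]

lifetimes = [2.5, 7.0, 4.1, 9.3]  # hypothetical true lifetimes in years
observed = apply_type_i_censoring(lifetimes, end_of_study=5.0)
print(observed)
# [(2.5, True), (5.0, False), (4.1, True), (5.0, False)]
```

For the censored lives we know only that the lifetime exceeds 5 years, which is exactly the "falls within some interval" information described above.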

WHAT IS A PURELY INDETERMINISTIC TIME SERIES?
A process X is called purely indeterministic if knowledge of the values X_1, …, X_n is progressively less useful for predicting the value of X_N as N tends to infinity.

WHAT ARE INTEGRATED TIME SERIES?
A time series process X is said to be ‘integrated of order 0’, written I(0), if it is stationary. X is I(1) if X itself is not stationary but the increments X_t - X_(t-1) form a stationary process.
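A random walk is a standard example of an I(1) process: it is not stationary, but its increments are white noise. A minimal sketch, with the sample size and seed chosen arbitrarily:

```python
import random

def difference(series):
    """Return the increments X_t - X_(t-1)."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(500)]

# Build the random walk as the running sum of the white noise terms.
random_walk = [0.0]
for e in white_noise:
    random_walk.append(random_walk[-1] + e)

increments = difference(random_walk)
print(increments[:3])
print(white_noise[:3])
```

Differencing once recovers the stationary white noise (up to floating-point rounding), which is why the random walk is described as integrated of order 1.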

HOW ARE COPULAS USED IN THE INSURANCE INDUSTRY?
A copula is a function that takes marginal CDFs as inputs and outputs a joint CDF. For example, suppose an insurer wants to work out the joint probability that annual losses on its household portfolio will be less than or equal to £5m and that annual losses on its motor portfolio will be less than or equal to £3m. If, for simplicity, we assume that the two portfolios give rise to losses independently, then the independence (product) copula applies and the joint CDF is the product of the marginal CDFs.
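The independence case can be sketched in a couple of lines. The two marginal probabilities below are hypothetical values chosen for the illustration and are not given in the notes.

```python
def independence_copula(u, v):
    """Product copula C(u, v) = u * v, valid when the two risks are independent."""
    return u * v

# Hypothetical marginal probabilities:
p_household = 0.90  # P(household losses <= 5m)
p_motor = 0.80      # P(motor losses <= 3m)

joint = independence_copula(p_household, p_motor)
print(round(joint, 4))  # 0.72
```

If the portfolios were dependent (for example, both exposed to the same storm), a different copula would be needed and the joint probability would generally differ from this product.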

WHY IS REINSURANCE NECESSARY FOR AN INSURANCE COMPANY?
The claims on an insurance company must be met in full, but, to protect itself from large claims, the company itself may take out an insurance policy; such a policy is called a reinsurance policy.
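As one possible illustration, consider an excess of loss arrangement, in which the insurer pays each claim up to a retention limit and the reinsurer pays any excess. This arrangement, the retention level and the function name are assumptions for the example. The notes do not specify a type of reinsurance.

```python
def split_claim(claim, retention):
    """Split a claim under (assumed) excess of loss reinsurance.

    The insurer pays min(claim, retention); the reinsurer pays the rest.
    """
    insurer = min(claim, retention)
    reinsurer = max(claim - retention, 0.0)
    return insurer, reinsurer

print(split_claim(20000.0, 100000.0))   # (20000.0, 0.0)
print(split_claim(150000.0, 100000.0))  # (100000.0, 50000.0)
```

The policyholder still receives the claim in full. The split only determines how the cost is shared between insurer and reinsurer.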

WHAT IS MACHINE LEARNING?
Machine learning is a method of training computers where computer algorithms are developed and applied to data to generate information. This information can consist simply of hidden patterns in the data, but often the information is applied to solve a specific problem.

WHAT ARE VARIOUS MACHINE LEARNING METHODS?
The main types are supervised, unsupervised, semi-supervised and reinforcement learning.

ARE DECISION TREES AND RANDOM FORESTS THE SAME?
No. A decision tree makes a prediction using a single tree, in which each internal node represents a single input variable (and a split point on it) and the leaf nodes contain an output variable. A random forest averages the predictions of a number of randomly generated decision trees.
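The averaging idea can be sketched with very small trees. The code below fits one-split regression trees ("stumps") to bootstrap samples and averages their predictions. This is bagging of stumps, a simplified stand-in for a random forest, which would normally also randomise the input variables considered at each split. The data and all function names are invented for the example.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split tree: pick the threshold minimising squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values equal: predict the overall mean
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Fit each stump to a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Average the predictions of all the trees."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)

xs = list(range(10))                       # hypothetical input variable
ys = [0.0 if x < 5 else 1.0 for x in xs]   # hypothetical output variable
forest = fit_forest(xs, ys)
print(predict_forest(forest, 0), predict_forest(forest, 9))
```

A single stump gives one fixed split, whereas the averaged prediction smooths over the different splits found on different resamples of the data.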

WHAT IS MEANT BY PRUNING OF DECISION TREES?
Pruning is used, alongside the stopping criterion, to improve the performance of a tree. For example, a hold-out test set can be used to evaluate the effect of removing each leaf node, so that only the key leaf nodes are kept.

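HOW ARE PRECISION AND ACCURACY DIFFERENT?
Accuracy measures how close a result is to the true value, whereas precision measures how close repeated results are to one another. Accuracy can be assessed from a single result, but precision can only be assessed from repeated results. For a classification model, accuracy is the proportion of all cases classified correctly, whereas precision is the proportion of cases classified as positive that are truly positive.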
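In a classification setting, the two measures are usually computed from the counts of true positives (tp), false positives (fp), true negatives (tn) and false negatives (fn). The counts below are hypothetical.

```python
def accuracy(tp, fp, tn, fn):
    """Proportion of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """Proportion of predicted positives that are truly positive."""
    return tp / (tp + fp)

tp, fp, tn, fn = 40, 10, 45, 5  # hypothetical confusion-matrix counts
print(accuracy(tp, fp, tn, fn))  # 0.85
print(precision(tp, fp))         # 0.8
```

The two can differ noticeably: a model can be accurate overall yet have low precision if positives are rare and many of its positive predictions are wrong.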

HOW IS DATA DIVIDED INTO TRAINING AND TEST DATA? WHY IS IT NECESSARY?
In machine learning, the ‘training’ data is often split into a part used to estimate the parameters of the model and a part used to validate the model. In addition, a test data set is held back: a sample of data used to provide an unbiased evaluation of the final model fit. The split is necessary because a model evaluated on the same data that was used to fit it will tend to appear better than it really is on new data.
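A simple way to perform such a split is to shuffle the records and cut the shuffled list into three parts. The 60/20/20 proportions and the function name are assumptions for this sketch. The notes do not prescribe particular proportions.

```python
import random

def train_validation_test_split(data, train_frac=0.6, validation_frac=0.2, seed=0):
    """Shuffle the data, then cut it into training, validation and test parts."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, validation, test

records = list(range(100))  # stand-in for 100 data records
train, validation, test = train_validation_test_split(records)
print(len(train), len(validation), len(test))
```

Shuffling before cutting avoids any ordering in the original data (for example, by date) leaking into one part only. The test part should be left untouched until the final model has been chosen.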

WHAT IS OVER- AND UNDER-GRADUATION?
If the graduation process results in rates that are smooth but show little adherence to the data, we say that the rates are overgraduated. If instead the rates adhere closely to the data but progress erratically, we say that they are undergraduated.

HOW IS THE GINI INDEX A USEFUL TOOL IN MACHINE LEARNING?
The Gini index measures the purity of a decision tree by indicating how ‘pure’ the leaf nodes are, i.e. how mixed the classes are within each node. When a split is evaluated, the Gini index of each node is weighted by its share of the total number of instances in the parent node.
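A sketch of the calculation follows, using one common definition of the Gini index for a node: 1 minus the sum of squared class proportions. The class labels and the candidate split are invented for the example.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini index 1 - sum_k p_k^2; 0 means the node is perfectly pure."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(children):
    """Gini of a split: each child's index weighted by its share of the parent."""
    n_parent = sum(len(child) for child in children)
    return sum(len(child) / n_parent * gini_impurity(child) for child in children)

left = ["claim", "claim", "claim", "no claim"]
right = ["no claim", "no claim", "no claim", "no claim"]
print(gini_impurity(left))            # 0.375
print(weighted_gini([left, right]))   # 0.1875
```

When growing a tree, the split with the lowest weighted Gini index is preferred, since it produces the purest child nodes.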