This post summarizes some very basic concepts in survival analysis from a tutorial [1] that helped me a lot.
Chapter 1: Basic Knowledge and Survival Data
Basic Definitions
Failure/event time random variable $T$: a non-negative random variable giving the time until the event. $T$ can be either discrete or continuous.
Censoring random variable $C$: a non-negative random variable indicating the censoring time.
Censored failure/event time random variable: defined as $X = \min(T, C)$.
Issues with Survival Data
- Staggered entry: people do not enter the study at the same time.
- Censoring:
  - people may experience no event before the end of the study
  - people may drop out in the middle of the study
Censoring is the main problem we care about.
Types of Censoring
There are two ways of categorizing censoring.
The first way is
- Right censoring
- Left censoring
- Interval censoring
The second way is
- Independent censoring
- Informative censoring
Right Censoring
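This is the most common type of censoring.
Only $X = \min(T, C)$ is observed, where $T$ is the event time and $C$ is the censoring time.
In addition to $X$, we also know the failure indicator $\delta = 1(T \le C)$, which is 1 if the event is observed and 0 if the observation is censored.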
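As a minimal sketch of right censoring, the snippet below simulates event and censoring times and builds what we actually get to observe: the smaller of the two times, plus the failure indicator (1 if the event is observed, 0 if censored). The exponential distributions, the sample size, and the variable names (`event_time`, `censor_time`, `observed_time`, `delta`) are assumptions made for illustration and do not come from the tutorial.

```python
import random

random.seed(0)

n = 5
# Hypothetical latent times: event times T and censoring times C.
event_time = [random.expovariate(1.0) for _ in range(n)]
censor_time = [random.expovariate(0.5) for _ in range(n)]

# Under right censoring we only see the smaller of the two times ...
observed_time = [min(t, c) for t, c in zip(event_time, censor_time)]
# ... plus an indicator: 1 if that time is an event, 0 if it is a censoring.
delta = [1 if t <= c else 0 for t, c in zip(event_time, censor_time)]

for x, d in zip(observed_time, delta):
    print(f"observed time = {x:.3f}, failure indicator = {d}")
```

In real data the latent `event_time` and `censor_time` are never both available; only `observed_time` and `delta` are.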
Left Censoring
Only $\max(T, C)$ is observed, where $T$ is the event time and $C$ is the censoring time, and the failure indicator is defined as $\delta = 1(C \le T)$.
That is, we know the event happens before a certain time point, but have no idea of the exact event time.
Interval Censoring
$(L, R)$ is observed, where $T \in (L, R)$. That is, we only know that the event time $T$ lies between two time points.
Independent Censoring
Censoring is independent if the censoring time $C$ is independent of the event time $T$.
Informative Censoring
The distribution of the censoring time $C$ contains information about the parameters characterizing the distribution of the event time $T$.
Chapter 2: Survival Distribution
Definitions
The four functions below are the key quantities in survival analysis.
- density function: $f(t)$
- survival function: $S(t)$
- hazard function: $\lambda(t)$
- cumulative hazard function: $\Lambda(t)$
Density Function
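For Discrete R.V.
Suppose $T$ takes values in $a_1 < a_2 < \cdots < a_n$. Then
$$f(t) = P(T = t) = \begin{cases} f_j & \text{if } t = a_j,\ j = 1, \dots, n \\ 0 & \text{otherwise} \end{cases}$$
For Continuous R.V.
$$f(t) = \lim_{\Delta t \to 0^+} \frac{1}{\Delta t} P(t \le T < t + \Delta t)$$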
Survival Function
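$S(t) = P(T \ge t)$.
For Discrete R.V.
$$S(t) = \sum_{j: a_j \ge t} f_j$$
where $T$ takes values $a_j$ with probabilities $f_j = P(T = a_j)$.
For Continuous R.V.
$$S(t) = \int_t^\infty f(u)\,du$$
We have $S(t) = 1 - F(t)$, where $F(t) = P(T \le t)$ is the cumulative distribution function.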
Hazard Function
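The hazard function $\lambda(t)$ is sometimes called the instantaneous failure rate. It describes the risk that the event happens right now, given that it has not happened yet: a conditional probability in the discrete case, and a conditional rate (probability per unit time) in the continuous case. The hazard function is restricted to be non-negative, $\lambda(t) \ge 0$.
It is defined as follows.
For Continuous R.V.
$$\lambda(t) = \lim_{\Delta t \to 0^+} \frac{1}{\Delta t} P(t \le T < t + \Delta t \mid T \ge t) = \frac{f(t)}{S(t)}$$
For Discrete R.V.
$$\lambda_j = P(T = a_j \mid T \ge a_j) = \frac{f_j}{S(a_j)}$$
where $T$ takes values $a_j$ with probabilities $f_j = P(T = a_j)$.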
Some example hazard shapes
- Increasing: age after 65
- Decreasing: after effective treatment
- Bathtub: age-specific mortality
- Constant: patients with advanced chronic disease
Cumulative Hazard Function
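As time $t \to \infty$, $\Lambda(t) \to \infty$.
For Continuous R.V.
$$\Lambda(t) = \int_0^t \lambda(u)\,du$$
For Discrete R.V.
$$\Lambda(t) = \sum_{k: a_k < t} \lambda_k$$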
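To make the discrete definitions concrete, here is a small sketch that computes the survival, hazard, and cumulative hazard functions from a probability mass function. It assumes the conventions $S(t) = P(T \ge t)$, $\lambda_j = f_j / S(a_j)$, and a cumulative hazard that sums $\lambda_k$ over $a_k < t$. The support points `a` and probabilities `f` are made-up numbers for illustration only.

```python
# Hypothetical discrete distribution: T takes values a_j with probabilities f_j.
a = [1, 2, 3, 4]
f = [0.1, 0.2, 0.3, 0.4]

def survival(t):
    """S(t) = P(T >= t): total mass at support points a_j >= t."""
    return sum(fj for aj, fj in zip(a, f) if aj >= t)

# Discrete hazard: lambda_j = P(T = a_j | T >= a_j) = f_j / S(a_j).
hazard = [fj / survival(aj) for aj, fj in zip(a, f)]

def cumulative_hazard(t):
    """Lambda(t): sum of lambda_k over support points a_k < t."""
    return sum(lk for ak, lk in zip(a, hazard) if ak < t)

for aj, lj in zip(a, hazard):
    print(f"a_j = {aj}: S(a_j) = {survival(aj):.3f}, hazard = {lj:.3f}, "
          f"cumulative hazard just before a_j = {cumulative_hazard(aj):.3f}")
```

Note that the hazard at the last support point equals 1: if the event has not happened before the final possible time, it must happen then.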
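Relationship: $S(t)$, $\lambda(t)$ and $\Lambda(t)$
$S(t)$ and $\lambda(t)$
For a continuous r.v., we know that
- $\lambda(t) = f(t) / S(t)$.
- for a left-continuous survival function $S(t)$, $f(t) = -S'(t)$.
Then it is easy to show that
$$-\frac{d}{dt} \log S(t) = -\frac{S'(t)}{S(t)} = \frac{f(t)}{S(t)}$$
That is,
$$\lambda(t) = -\frac{d}{dt} \log S(t)$$
For a discrete r.v. taking values $a_1 < a_2 < \cdots$ with hazards $\lambda_k = P(T = a_k \mid T \ge a_k)$,
$$S(t) = \prod_{k: a_k < t} (1 - \lambda_k)$$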
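$S(t)$ and $\Lambda(t)$
Continuous case
$$\Lambda(t) = \int_0^t \lambda(u)\,du = \int_0^t -\frac{d}{du} \log S(u)\,du = -\log S(t) + \log S(0) = -\log S(t)$$
since $S(0) = 1$. That is
$S(t) = e^{-\Lambda(t)}$.
Remember that we require $\Lambda(\infty) = \infty$, which guarantees that $S(\infty) = 0$. This makes sense, as people cannot live forever.
In summary, the more hazard accumulates, the lower the chance of survival.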
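The continuous-case identity $S(t) = e^{-\Lambda(t)}$ can be checked numerically. The sketch below assumes a Weibull distribution (an illustrative choice, not taken from the tutorial) with hypothetical shape `k` and scale `b`, integrates its hazard with the trapezoidal rule, and compares the result with the closed-form survival function.

```python
import math

k, b = 1.5, 2.0  # hypothetical Weibull shape and scale

def hazard(t):
    """Weibull hazard: (k / b) * (t / b)^(k - 1)."""
    return (k / b) * (t / b) ** (k - 1)

def survival(t):
    """Closed-form Weibull survival function: exp(-(t / b)^k)."""
    return math.exp(-((t / b) ** k))

def cumulative_hazard(t, steps=10_000):
    """Trapezoidal approximation of the integral of hazard(u) over [0, t]."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    total += sum(hazard(i * h) for i in range(1, steps))
    return total * h

for t in (0.5, 1.0, 3.0):
    print(f"t = {t}: exp(-Lambda(t)) = {math.exp(-cumulative_hazard(t)):.6f}, "
          f"S(t) = {survival(t):.6f}")
```

The two columns should agree up to the numerical integration error, and both shrink toward 0 as `t` grows.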
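Discrete case
Suppose that $a_j < t \le a_{j+1}$. Then, since $P(T \ge a_1) = 1$,
$$S(t) = P(T \ge a_{j+1}) = \prod_{k=1}^{j} P(T \ge a_{k+1} \mid T \ge a_k) = \prod_{k: a_k < t} (1 - \lambda_k)$$
The equations for the discrete case and the continuous case are different here. Therefore, instead of using the definition $\Lambda(t) = \sum_{k: a_k < t} \lambda_k$, Cox defines
$$\Lambda(t) = -\sum_{k: a_k < t} \log(1 - \lambda_k)$$
so that $S(t) = e^{-\Lambda(t)}$ holds for the discrete case too.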
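A small numerical check of the discrete case is sketched below. It assumes $S(t) = P(T \ge t)$ and $\lambda_k = P(T = a_k \mid T \ge a_k)$, and compares three quantities: the survival function computed directly, the product of $(1 - \lambda_k)$ over $a_k < t$, and $e^{-\Lambda(t)}$ under both the summed-hazard definition and the log-based definition attributed to Cox. The support points `a` and probabilities `f` are made-up numbers for illustration only.

```python
import math

# Hypothetical discrete distribution: T takes values a_j with probabilities f_j.
a = [1, 2, 3, 4]
f = [0.1, 0.2, 0.3, 0.4]

def survival(t):
    """S(t) = P(T >= t)."""
    return sum(fj for aj, fj in zip(a, f) if aj >= t)

# Discrete hazard: lambda_j = f_j / S(a_j).
hazard = [fj / survival(aj) for aj, fj in zip(a, f)]

def survival_from_hazard(t):
    """Product of (1 - lambda_k) over support points a_k < t."""
    return math.prod(1 - lk for ak, lk in zip(a, hazard) if ak < t)

def cox_cumulative_hazard(t):
    """Log-based definition: minus the sum of log(1 - lambda_k) over a_k < t."""
    return -sum(math.log(1 - lk) for ak, lk in zip(a, hazard) if ak < t)

def summed_hazard(t):
    """Plain sum of lambda_k over a_k < t, for comparison."""
    return sum(lk for ak, lk in zip(a, hazard) if ak < t)

for t in a:
    print(f"t = {t}: S(t) = {survival(t):.4f}, "
          f"product = {survival_from_hazard(t):.4f}, "
          f"exp(-Cox) = {math.exp(-cox_cumulative_hazard(t)):.4f}, "
          f"exp(-sum) = {math.exp(-summed_hazard(t)):.4f}")
```

The product and the log-based version reproduce the survival function, while exponentiating the plain sum of hazards only approximates it. The check stops at the last support point because the hazard there is 1, which would make the logarithm undefined for later times.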
Reference
[1] http://www.amstat.org/chapters/northeasternillinois/pastevents/presentations/summer05_Ibrahim_J.pdf
[2] http://en.wikipedia.org/wiki/Failure_rate