Deadlock

Definition

A set of two or more processes are deadlocked if they are blocked (i.e., in the waiting state) each holding a resource and waiting to acquire a resource held by another process in the set.

A process is deadlocked if it is waiting for an event which is never going to happen.

Example:

- a system has two tape drives
- two processes are deadlocked if each holds one tape drive and has requested the other

Example: semaphores A and B, each initialized to 1:

P_0 P_1 --- --- A.wait(); B.wait(); B.wait(); A.wait(); A.signal(); B.signal(); B.signal(); A.signal();

Deadlock depends on the dynamics of the execution.

Illustrates that it is difficult to identify and test for deadlocks which may occur only under certain circumstances.

System model:

- resource types: R₁, R₂, ..., R_n
- each resource R has W_i instances
- each process utilizes a resource as follows:

// request (e.g., open() system call) // use // release (e.g., close() system call)

Any instance of a resource type can be used to satisfy a request of that resource.

Conditions Necessary for Deadlock

All of the following four necessary conditions must hold simultaneously for deadlock to occur:

- mutual exclusion: only one process can use a resource at a time.
- hold and wait: a process holding at least one resource is waiting to acquire additional resources which are currently held by other processes.
- no preemption: a resource can only be released voluntarily by the process holding it.
- circular wait: a cycle of process requests exists (i.e., P₀ is waiting for a resource hold by P₁ who is waiting for a resource held by P_j ... who is waiting for a resource held by P₍_n_-1) which is waiting for a resource held by P_n which is waiting for a resource held by P₀).

Circular wait implies the hold and wait condition. Therefore, these conditions are not completely independent.

Resource Allocation Graph Syntax

A resource allocation graph contains a set of vertices V and a set of edges E.

V is partitioned into two types:

- P = {P₁, P₂, ..., P_n} is the set of all processes.
- R = {R₁, R₂, ..., R_m} is the set of all resources.

A request is represented by a directed edge from P_i to R_j.

An assignment is represented by a directed edge from R_j to P_i.

- resource type with four instances:

- P_i requests an instance of R_j

- P_i is holding an instance of R_j

Sample Resource Allocation Graphs

- resource allocation graph without deadlock:
  - - P₁ wants a resource held by P₂
    - no process is requesting an instance of R₄

- - (regenerated from [OSC] Fig. 7.2 on p. 288)

resource allocation graph with a cycle and deadlock:

- (regenerated from [OSC] Fig. 7.3 on p. 289)
- resource allocation graph with a cycle but no deadlock:

- (regenerated from [OSC] Fig. 7.4 on p. 289)

Possibility of Deadlock

(regenerated from [OSC] Fig. 7.5 on p. 295)

If a resource allocation graph contains no cycles, then no process is deadlocked.

If a resource allocation graph contains a cycle, then a deadlock may exist.

Therefore, a cycle means deadlock is possible, but not necessarily present.

A cycle is not sufficient proof of the presence of deadlock. A cycle is a necessary condition for deadlock, but not a sufficient condition for deadlock.

difference between necessary and sufficient

- getting a 4.0 GPA is sufficient to graduate, but it is not necessary
- passing CPS 346 (OS) is necessary, but not sufficient

Resource Allocation Graph Summary

- if a resource allocation graph does not contain a cycle, then there is absolutely no possibility of deadlock
- if a resource allocation graph contains a cycle, then there is the possibility of deadlock
- if each resource type has exactly one instance, then a cycle implies that deadlock has occurred
- if the cycle involves only a set of resource types, each of which has only a single instance, then a deadlock has occurred
- if all instances of a resource are allocated to a process in a cycle, then there is deadlock

Methods for Handling Deadlock

The following are methods for addressing the possibility of deadlock:

- ensure that the system never enters a deadlocked state:
  - - deadlock prevention
    - deadlock avoidance
- deadlock detection and recovery: allow the system to enter a deadlocked state, then deal with and eliminate the problem
- ignore the problem: approached used by many operating systems including UNIX and Windows, and the Java VM

Deadlock Prevention

Restrain the ways resource requests are made so to prevent one of the four conditions necessary for deadlock.

prevent mutual exclusion
- - use only sharable resources (e.g., a read-only file)
  - impossible for practical systems
prevent hold and wait
- - methods
    - - preallocate
        do not pick up one chopstick if you cannot pick up the other
        for a process that copies data from DVD drive to a file on disk and then prints it from there:
        request DVD drive
        request disk file
        request printer
        all system calls requesting resources must proceed all other system calls
      - a process can request resources only when it has none

- - - - request DVD drive and disk file
        release DVD drive and disk file
        request disk file and printer (no guarantee data will still be there)
        release disk file and printer

- - inefficient
  - starvation possible

- prevent no preemption (i.e., allow preemption, and permit the OS to take away resources from a process)
  - - when a process must wait, it must release its resources
    - some resources cannot be feasibly preempted (e.g., printers, tape drives)

prevent circular wait
- - impose a total ordering on resources
  - only allow requests in an increasing order

Usually a deadlock prevention approach is simply unreasonable.

Deadlock Avoidance

This requires that the system has some information available up front. Each process declares the maximum number of resources of each type which it may need. Dynamically examine the resource allocation state to ensure that there can never be a circular-wait condition.

The system's resource-allocation state is defined by the number of available and allocated resources, and the maximum possible demands of the processes. When a process requests an available resource, the system must decide if immediate allocation leaves the system in a safe state.

The system is in a safe state if there exists a safe sequence of all processes:

Sequence < P₁, P₂, ... P_n > is safe for the current allocation state if, for each P_i, the resources which P_i can still request can be satisfied by

- the currently available resources plus
- the resources held by all of the P_j's, where j < i.

If the system is in a safe state, there can be no deadlock. If the system is in an unsafe state, there is the possibility of deadlock.

Example: consider a system with 12 magnetic tapes and 3 processes (P₀, P₁, and P₂):

Is the system in a safe state? If so, which sequence satisfies the safety criteria?

In this scheme, a process which requests a resource that is currently available, may still have to wait. Thus, resource utilization may be lower than it would otherwise be.

Deadlock Avoidance Algorithms

Two deadlock avoidance algorithms:

- resource-allocation graph algorithm
- Banker's algorithm

Resource-allocation graph algorithm

- only applicable when we only have 1 instance of each resource type
- claim edge (dotted edge), like a future request edge
- when a process requests a resource, the claim edge is converted to a request edge
- when a process releases a resource, the assignment edge is converted to a claim edge
- cycle detection: O(n²)

Banker's Algorithm

- a classic deadlock avoidance algorithm
- more general than resource-allocation graph algorithm (handles multiple instances of each resource type), but
- is less efficient

Resource-allocations graphs for deadlock avoidance

(regenerated from [OSC] Fig. 7.6 on p. 297)

(regenerated from [OSC] Fig. 7.7 on p. 297)

Banker's Algorithm

We call Banker's algorithm when a request for R is made. Let n be the number of processes in the system, and m be the number of resource types.

Define:

- available[m]: the number of units of R currently unallocated (e.g., available[3] = 2)
- max[n][m]: describes the maximum demands of each process (e.g., max[3][1] = 2)
- allocation[n][m]: describes the current allocation status ( e.g., allocation[5][1] = 3)
- need[n][m]: describes the remaining possible need (i.e., need[i][j] = max[i][j] - allocation[i][j])

Resource-request algorithm:

Define:

- request[n][m]: describes the current outstanding requests of all processes (e.g., request[2][1] = 3)

1. If request[i][j] <= need[i][j], to to step 2; otherwise, raise an error condition.
2. If request[i][j] > available[j], then the process must wait.
3. Otherwise, pretend to allocate the requested resources to P_i :
  1. available[j] = available[j] - request[i][j] allocation[i][j] = allocation[i][j] + request[i][j] need[i][j] = need[i][j] - request[i][j]
4. Once the resources are allocated, check to see if the system state is safe. If unsafe, the process must wait and the old resource-allocated state is restored.

Safety algorithm (to check for a safe state):

1. Let work be an integer array of length m, initialized to available.
2. Let finish be a boolean array of length n, initialized to false.
3. Find an i such that both:
  - finish[i] == false
  - need[i] <= work
4. If no such i exists, go to step 4

work = work + allocation[i];
1. finish[i] = true;
2. Go to step 2
3. If finish[i] == true for all i, then the system is in a safe state, otherwise unsafe.

Run-time complexity: O(m × n²).

Example: consider a system with 5 processes (P₀ ... P₄) and 3 resources types (A(10) B(5) C(7))

resource-allocation state at time t₀:

Is the system in a safe state? If so, which sequence satisfies the safety criteria?

< P₁, P₃, P₄, P₂, P₀ >

Now suppose, P₁ requests an additional instance of A and 2 more instances of type C.

request[1] = (1,0,2)

1. check if request[1] <= need[i] (yes)
2. check if request[1] <= available[i] (yes)
3. do pretend updates to the state

Is the system in a safe state? If so, which sequence satisfies the safety criteria?

<P₁, P₃, P₄, P₀, P₂>

Hence, we immediately grant the request.

Will a request of (3,3,0) by P₄ be granted?

Will a request of (0,2,0) by P₀ be granted?

Deadlock Detection

- requires an algorithm which examines the state of the system to determine whether a deadlock has occurred
- requires overhead
  - - run-time cost of maintaining necessary information and executing the detection algorithm
    - potential losses inherent in recovering from deadlock

Single instance of each resource type

- wait-graph
- P_i → P_j = P_i > R_q and R_q → P_j
- detect cycle: O(n²)
- overhead: maintain the graph + invoke algorithm

Resource-allocations graphs for deadlock detection

resource-allocation graph:

corresponding wait-for graph:

(regenerated from [OSC] Fig. 7.8 on p. 302)

Multiple instances of a resource type: use an algorithm similar to Banker's, which simply investigates every possible allocation sequence for the processes which remain to be completed.

Define:

available[m]
allocation[n][m]
request[n][m]
- with their usual semantics.

Algorithm:

1. Let work be an integer array of length m, initialized to available.
2. Let finish be a boolean array of length n.
3. For all i, if allocation[i] != 0, then finish[i] = false;
4. Otherwise finish[i] = true.
5. Find an i such that both
  1. - finish[i] == false // P_i is currently not involved in a deadlock

- - request[i] <= work

1. If no such i exists, go to step 4
2. // reclaim the resources of process P_i
3. work = work + allocation[i];
4. finish[i] = true;
5. Go to step 2
6. If finish[i] == false for some i,
7. Then the system is in a deadlocked state.
8. Moreover, if finish[i] == false, then process P_i is deadlocked.

Run-time complexity: O(m × n²).

Example: consider a system with 5 processes (P₀ .. P₄) and 3 resources types (A(7) B(2) C(6))

resource-allocation state at time t₀:

Is the system in a deadlocked state?

If not, which sequence results in finish[i] == true for all i ?

< P₀, P₂, P₃, P₁, P₄ >

Now suppose, P₂ requests an additional instance of C:

Is the system in a deadlocked state? Yes.

If not, which sequence results in finish[i] == true for all i ?

Although we can reclaim the resources held by P₀, the number of available resources is insufficient to fulfill the requests of the other processes.

Thus, a deadlock exists, consisting of processes P₁, P₂, P₃, and P₄.

When should we invoke the detection algorithm? Depends on:

- how often is a deadlock likely to occur
- how many processes will be affected by deadlock when it happens

If deadlocks occur frequently, then the algorithm should be invoked frequently.

Deadlocks only occur when some process makes a request which cannot be granted (if this request is the completes a chain of waiting processes).

- Extreme: invoke the algorithm every time a request is denied
- Alternative: invoke the algorithm at less frequent time intervals:
  - - once per hour
    - whenever CPU utilization < 40%
    - disadvantage: cannot determine exactly which process 'caused' the deadlock

Deadlock Recovery

How to deal with deadlock:

- inform operator and let them decide how to deal with it manually
- let the system recover from the deadlock automatically:
  - - abort or more of the deadlocked processes to break the circular wait
    - preempt some resources from one or more of the processes to break the circular wait

Process termination

Aborting a process is not easy; involves clean-up (e.g., file, printer).

- abort all deadlocked processes (disadvantage: wasteful)
- abort one process at a time until the circular wait is eliminated
  - - disadvantage: lot of overhead; must re-run algorithm after each kill
    - how to determine which process to terminate? minimize cost
      - priority of the process
        how long has it executed? how much more time does it need?
        how many and what type of resources has the process used?
        how many more resources will the process need to complete?
        how many processes will need to be terminated?
        is the process interactive or batch?

Resource Preemption

Incrementally preempt and re-allocate resources until the circular wait is broken.

- selecting a victim (see above)
- rollback: what should be done with process which lost the resource?
- clearly it cannot continue; must rollback to a safe state (???) => total rollback
- starvation: pick victim only small (finite) number of times; use number of rollbacks in decision

Summary

- definition of deadlock
- three methods of addressing deadlock
  - - ensure deadlock never arises
      - deadlock prevention: ensure at least one of the four necessary conditions for deadlock never holds
        mutual exclusion
        hold and wait
        no preemption
        circular wait
        deadlock avoidance
        less stringent, but requires a priori information
        Banker's algorithm
    - deadlock detection and recovery
      - terminate processes: selecting a victim
        preempt resources
        selecting a victim
        rollback
        starvation
    - ignore the problem (most common approach)
- there is no `silver bullet'
- often a combination of approaches should be employed to permit us to use an optimal approach for each class of resources in the system

References

- - - - [OSC]
      - [OSCJ]

- - - - A. Silberschatz, P.B. Galvin, and G. Gagne. Operating Systems Concepts. John Wiley and Sons, Inc., Eighth edition, 2009.
      - A. Silberschatz, P.B. Galvin, and G. Gagne. Operating Systems Concepts with Java. John Wiley and Sons, Inc., Seventh edition, 2007.