Process

Overview

    • processes (state, creation, termination)

    • (logical) layout of the program image

    • memory allocation/deallocation (malloc family of memory allocation functions)

Process

    • a process is a program in execution

    • executable module of program resides in a program image in main memory

    • each process id (do a ps in UNIX) and a process state (ready, running, blocked, and so on)

Logical layout of program image

(modified version of [USP] Fig. 2.1, p. 24; image courtesy Robbins & Robbins' [USP] Chapter 2 webpage)

Activation record

(modified version of [USP] Fig. 1.1, p. 16)

A stack of these activation records at run-time is called the execution stack

Process control block

For the duration of its existence, each process is represented by a process control block (PCB) which contains information such as:

    • state of the process (ready, running, blocked)

    • register values

    • priorities and scheduling information

    • memory management information

    • accounting information such as open files

    • allocated resources

Specifics of a PCB depend on the system and techniques used.

(ref. [OSIDP] Fig. 3.1 on p. 110; image courtesy [OSIDP] webpage)

Process life cycle

A process moves through various states during its life:

    • new,

    • ready,

    • running,

    • blocked, and

    • done states, and

    • transitions between them

(regenerated from [USP] Fig. 3.1, p. 62)

Many processes may be ready or waiting at the same time, but only one can be running at a time in a uniprocessor system.

Process and process lifecycle nomenclature

    • batch process: a process which is not connected to a terminal, or a non-interactive process

    • resident monitor: first software support for an operating system; transferred CPU control from process to process; provided automatic process sequencing

    • multiprogramming: multiple processes are kept in main memory and contend for the CPU with the goal of reducing the time the processor is idle

    • timesharing (or preemptive multi-tasking): an extension of multiprogramming in which the CPU is switched between processes very quickly; a small amount of CPU time is given to each process (called a time slice or quantum); gives rise to interactive systems and enables communication between a user and a process

    • non-preemptive multi-tasking = multiprogramming without timesharing

  • job scheduling: policy by which processes are loaded into main memory or, in other words, the ready queue

    • ready queue: contains all processes in main memory ready to execute; a process is dispatched from the ready queue to the CPU (called process scheduling); its processing may cause it to put on a device queue

  • process scheduling (or CPU scheduling): policy by which processes are granted access to the CPU

    • context switch: each process change requires a context switch which is purely overhead (i.e., no useful work is being accomplished); switching the CPU to another process requires i) saving the state of the old process and ii) loading the state for the new process

    • context switch time: time required to perform a context switch

    • system call: call to a core OS service, such as a read or write, and typically triggers a context switch;

    • `a system call is a request to the operating system for a service' ([USP] p. 119)

    • or

    • `a system call is a request to the operating system for service that causes the normal CPU cycle to be interrupted and control to be given to the operating system' ([USP] p. 7)

    • a system call, unlike a library call, often involves a context-switch; library calls are usually jackets for system calls (see [USP] exercise 4.19, p. 119 for more information).

    • interrupt (hardware): a hardware notification of an event, such as completion of I/O (e.g., read or write), error condition (e.g., divide by zero), request for an OS service (e.g., in response to a system call), typically generated by hardware; once loaded, an OS waits for an event to occur; events are signaled by interrupts; therefore, an OS is event-driven (or interrupt-driven) system

    • interrupt service routine: when an interrupt is generated, control is transferred to an interrupt service routine, which deals with the particular situation; after the interrupt is processed, control is returned to the interrupted process

    • (asynchronous or synchronous) signal (software): a software notification of an event

    • asynchronous event: any event which can occur at any time; interrupts and signals are asynchronous events in that there is no way to predict when they will occur

    • synchronous event

    • device driver: a program which identifies with and deals with a device (usually sends a signal in response to an interrupt)

  • UNIX is a multiprogramming and timeshared OS.

System calls cause context switches

(ref. [OSIDP] Fig. 1.5 on p. 16; image courtesy [OSIDP] webpage)

Process scheduling

    • allocating the CPU to a different process to reduce idle time

    • each process change requires a context switch

    • a context switch is pure overhead (i.e., involves no useful work)

Process identification

    • getuid, getgid, geteuid, getegid

    • process id, parent process id

    • getpid and getppid

    • real vs. effective user and group ids

    • cast results of these functions to long

    • output of ps command provides many of these details, including process state; experiment with the -a, -A, -l, and -o options

  • top

Process creation

    • an initial OS process begins running at boot time, which spawns (creates) other processes

    • a process hierarchy (tree) evolves with parent-child relationships

fork

    • system call used for process creation in UNIX

    • caller becomes parent and newly created process is child

    • processes are the same except for the process id

    • child inherits many attributes of its parent (e.g., environment, privileges, open files and devices)

    • returns 0 to the child and the child pid to the parent

    • execution proceeds after the call to fork in each process image

    • parent code and child code

    • effect of concurrency

    • sleep function, takes seconds to block as an int (e.g., sleep(30))

Graphical depiction of fork

(ref. [ATT] 6-11)

Process termination

    • returned to OS (in shell variable $? in UNIX system)

    • why 0 for success?

        • does not seem to follow C boolean convention

        • many ways to fail; see bottom of manpage for meanings of particular exit status

    • echo is the UNIX print command

    • for instance, echo hello world, echo $?

    • exit vs. return

        • in main they are the same

        • outside of main

            • exit can be used to exit the program (from any function)

            • return will transfer control to the caller

    • absence of return or exit in main has the same effect as exit(0)

    • to install an exit handler use atexit

    • how can a process terminate abnormally? external event (e.g., a <ctrl-c> (an asynchronous event)) or an internal error (e.g., illegal memory access (a synchronous event))

NULL pointer

    • it is illegal to read or write to address 0

    • a null pointer

        • any pointer with value 0

        • returned by pointer-returning functions to signify failure

    • NULL is defined in stdio.h as

        • #define NULL 0, or

        • #define NULL (void*) 0 in the new ANSI standard

    • not to be confused with the null character \0

Memory allocation/deallocation

malloc, calloc, realloc, and free

    • a general-purpose memory allocation package

    • prototyped in stdlib.h

    • malloc family allocates storage from the heap

    • malloc returns a pointer to a block of at least size bytes suitably aligned for any use

    • malloc returns a pointer of type void*; cast to appropriate type

    • to declare and array of 10 ints: int* int_ptr = (int*) malloc (sizeof(int)*10);

  • PCB* process_str_ptr = (PCB*) malloc (sizeof(PCB));

    • the argument to free is a pointer to a block previously allocated by malloc

    • after free is executed, this space is made available for further allocation by the application, though not returned to the system; memory is returned to the system only upon termination of the process (there is more to the story than this)

    • when passed NULL, recalloc acts like malloc

    • when passed a non-NULL pointer and a size of 0, recalloc acts like free

References

Threads

(ref. [OSIDP] Fig. 4.1 on p. 162; image courtesy [OSIDP] webpage)

Introduction

A thread is

    • an ADT within a process

    • has its own stack, program counter value, register set, and state

    • share process resources

(heavyweight) process vs. (lightweight) thread

We also have a thread control block (TCB).

Relationship between process and thread states

(ref. [OSIDP] Fig. 4.7 on p. 170; image courtesy [OSIDP] webpage)

variables are either static (exist for the life of the process) or automatic (allocated/deallocated on block entry/exit)

Declarations and definitions

    • defining a variable entails allocating storage for it (once per program)

    • declaring a variable entails announcing its type so that it can be used

    • every definition is also a declaration (e.g., the definition int x; is also a declaration)

    • some declarations are not definitions (e.g., the declaration extern int x; is not a definition)

extern modifier in C

    • /* x.c */ int x = 10; #include<stdio.h> /* main.c */ extern int x; main() { printf("%d\n", x); }

Storage and linkage classes

    • storage classes

        • static storage: once allocated, persists throughout lifetime of program (e.g., a global variable)

        • automatic storage: allocated and deallocated on block/function entry and exit, resp. (e.g., a variable local to a function)

    • linkage classes

        • internal linkage: visible only within defining file

        • external linkage: visible outside of defining file as well as within it

    • statics and externals have 0 as a default value

    • automatics have no default value

Storage class summary

notes:

    1. speed advantage

    2. for function arguments, value passed

    3. can be accessed from other files

    4. cannot be accessed from other files

(ref. [C])

static modifier in C

    • can you write an object-oriented program in C?

    • before a variable within a function

        • if present, static storage

        • if absent, automatic storage (default)

    • before a variable outside of a function

        • if present, internal linkage (like private in OO languages)

        • if absent, external linkage (default)

    • before a function

        • if present, internal linkage (like private in OO languages)

        • if absent, external linkage (default)

static modifier summary

(courtesy Robbins & Robbins' [USP] Chapter 2 webpage)

    • variables:

    • (ref. [USP] Table A.3, p. 814; HTML table courtesy Robbins & Robbins' [USP] Chapter 2 webpage)

    • functions:

    • (HTML table courtesy Robbins & Robbins' [USP] Chapter 2 webpage)

Summary of static reserved word

  • static keyword used in a variable declaration:

  • outside of any function:

    • /* x is static data and allocated in the static region of the memory image, and it has external linkage */ /* linkage class: ? storage class: ? */ int x; /* x is STILL static data, but now has internal linkage and thus cannot be referenced by another module (.o file) */ /* akin to "private" in C++ or Java */ /* linkage class: ? storage class: ? */ static int x;

  • inside of any function:

    • void f() { /* x is allocated on the stack (i.e., it is not static data) and this particular x can only be referenced within the body of this function */ /* linkage class: ? storage class: ? */ int x; /* x is now static data and allocated in the static region of the memory image */ /* linkage class: ? storage class: ? */ static int x; }

  • static keyword used in a function definition/declaration:

    • /* f() has external linkage and thus can be referenced by another module (i.e., .o file) */ /* linkage class: ? storage class: ? */ void f(); /* f() has internal linkage and thus cannot be referenced by another module (i.e., .o file) */ /* linkage class: ? storage class: ? */ void static f();

Bubblesort storage class exercise

    • for each object and function in the following code, give the storage and linkage class where appropriate

    • ref. [USP] program 2.5 (p. 41) and exercise 2.20 (p. 42)

    • /* a function which sorts an array of integers and counts the number of interchanges made in the process */ static int count = 0; /* return true if interchanges are made */ static int onepass (int a[], int n) { int i; int interchanges = 0; int temp; for (i = 0; i < n-1; i++) if (a[i] > a[i+1]) { temp = a[i]; a[i] = a[i+1]; a[i+1] = temp; interchanges = 1; count++; } return interchanges; } void clearcount (void) { count = 0; } int getcount (void) { return count; } /* sort a in ascending order */ void bubblesort (int a[], int n) { int i; for (i = 0; i < n-1; i++) if (!onepass (a, n-i)) break; }

    • count is a static variable with internal linkage (note use of static modifier for linkage, not storage)

    • all other variables use automatic storage with no linkage

    • onepass does not have a storage class; its linkage is internal

    • all other functions have external linkage; functions do not have a storage class

Synchronization

Thread-safe functions

    • static variables make the use of multiple threads unsafe

  • char* strtok (char* restrict s1, const char* restrict delimiter);

      • first and subsequent calls are different

      • tokenizes string in place (i.e., does not allocate new space for tokens)

      • strtok is not thread-safe because it uses a static pointer

      • graphical depiction of string before and after call to strtok

      • before:

      • (regenerated with minor modifications from [USP] Fig. 2.3, p. 36)

      • after:

      • (regenerated with minor modifications from [USP] Fig. 2.4, p. 36)

    • index into the string being tokenized (e.g., wordcount, wordaverage)

      • use POSIX thread-safe function strtok_r (_r stands for reentrant)

      • moral of the story: avoid using static storage

makeargv

    • ref. [USP] §2.6 (pp. 31-38)

    • three versions

        • char** makeargv (char* s);

        • int makeargv (char* s, char*** argvp);

      • int makeargv (const char* s, const char* delimiters, char*** argvp);

    • for instance, ./a.out -c tom and -m jerry

    • importance of cleaning up after yourself; [USP] example 2.19 on p. 38 (freemakeargv.c)

    • [USP] exercise 2.24 on p. 51 (getpaths.c)

Self-study

    • re-read [USP] §2.9 (pp. 42-48)

    • download, compile, and run codes

    • convince yourself you understand how the list works

    • exercise in implementing the third version of makeargv

References