The goal here is to help those who are interested learn the mathematics for computer programming.

You will learn how to use multiprocessing.

You will build and coordinate lots of very small, simple programs. The programs are intentionally small to strip out all excess distractions and focus on one particular mathematical idea at a time.

Multi-processing is difficult to understand. This is not for beginners. I will create lessons for beginners in the near future.

Computers are mathematics machines. Understanding mathematics will make you a better programmer.

Learn this because you want to learn it for your own benefit. Any financial, business, or job benefit is a bonus.

NOTE: At the moment the signal and pipe code is very rudimentary, barely above the level of the examples from the MAN pages.

This simplicity is intentional. I could hand you fully implemented software and it would be too complex for you to easily figure out what is going on.

By building one small part at a time, we can build up to that same fully implemented software, but you will be able to understand it by the time we get there.

methodology

There are numerous books that teach mathematics for computers. M.I.T. has a free PDF.

This web page shows how to build programs based on those ideas.

The approach here is to do things in their simplest form (to cut down on distractions) and then build up complexity.

You will need a computer (unfortunately I don’t have access to a smartphone or tablet to test these materials), a terminal (with BASH or zsh), a text editor (or the discipline to output only pure ASCII from a word processor), and a C compiler.

This material is much easier to understand if you have already learned programming in any programming language.

This web page starts with several small examples to show how to build up to minimal multi-programming. You can use the button below to jump right to the actual use of that simple multi-programming.

first program

The first program (first.c) outputs the digit “1” and a newline character.

#include <stdio.h>

int main()

{

    printf("1\n");

    return 0;

}

For those unfamiliar with C:

The line #include <stdio.h> gives your program access to the standard input/output functions.

All C programs must include a function named main: int main(). This is the starting point of the program when it is run.

The curly braces ({ and }) mark the beginning and end of the main function.

The line printf("1\n"); transmits the digit one and the new line character (\n) to standard out.

Save this text into a file, compile it, and run it from Terminal. It will output the 1 and a new line.

separate programs

We will be writing a bunch of small programs rather than functions or subprograms located all in one big program.

This reduces the complexity at any given moment. Less distraction makes it easier to focus on the main topic at any given moment.

The tradeoff is dramatically reduced efficiency. Optimization should always occur after correct code has been created. Metrics from working code are much better at identifying which code needs to be optimized. Optimization often (not always) makes code less readable, which also means the code is more difficult to maintain and more likely to be fragile.

Further, we will be making the jump from sequential-only code to code that works in parallel. That calls for separate threads or separate programs or both.

identity function

The identity function returns an exact copy of its input (identity.c).

#include <stdio.h>

int main()

{

    char character = '0';

    character = getchar();

    putchar(character);

    putchar('\n');

    return 0;

}

The statement starting with the keyword char is a storage declaration. Space is set aside for a single character with an initial value of '0' (that is the character zero, not the integer value of zero).

The getchar function inputs one character from STDIN.

The putchar function outputs one character to STDOUT.

If you save this text into a file and compile it, you can pipe the first program into identity from Terminal and get an output of 1.

    ./first | ./identity

always true function

The always true function always returns true as its output (alwaystrue.c).

This version ignores input.

#include <stdio.h>

int main()

{

    char character = '1';

    // character = getchar();

    putchar(character);

    putchar('\n');

    return 0;

}

The // comments out an entire line, starting at the //. We could have just deleted the line, but I wanted to show those new to C how this is done.

Save this text into a file and compile it. Run the program from Terminal and you will get a one (true).

    ./alwaystrue

always false function

The always false function always returns false as its output (alwaysfalse.c).

This version ignores input.

#include <stdio.h>

int main()

{

    char character = '0';

    putchar(character);

    putchar('\n');

    return 0;

}

Save this text into a file and compile it. Run the program from Terminal and you will get a zero (false).

    ./alwaysfalse

not function

The not function produces true if the input is false and false if the input is true (not.c).

We introduce one more level of complexity. This program includes a case or switch structure to evaluate the input and determine the correct matching output. For convenience, we indicate errors with a question mark (?).

#include <stdio.h>

int main()

{

    char character = '?'; /* default to an error condition */

    character = getchar();

    switch (character)

      {

        case '0':

          character = '1';

          break;

        case '1':

          character = '0';

          break;

        case 'F':

          character = 'T';

          break;

        case 'T':

          character = 'F';

          break;

        case 'f':

          character = 't';

          break;

        case 't':

          character = 'f';

          break;

        default:

          character = '?';

      }

    putchar(character);

    putchar('\n');

    return 0;

}

The switch statement is followed by an expression to evaluate inside parentheses and then a block of code between curly braces ({ and }).

The block has one or more case statements, each followed by a constant value, a colon (:), and zero or more statements.

Each block starts with an assignment statement. An expression (in this case, a constant character value) is assigned to a variable (in this case, character). The assignment is indicated by the equal sign (=). You will hear the portion to the left of the equals sign called the “lvalue” and the portion to the right of the equals sign called the “rvalue”.

The block’s statements can optionally end with a break keyword. If the break is used, then the program continues from the end of the switch block. If the break is left out, the program continues at the next following statement (even if there is an intervening case). This behavior often causes trouble for those who are new to C.

The optional default case is code that runs if none of the stated cases match the evaluated expression.

We examine the single input character. If we recognize it, we negate it and output the new value. If we don’t recognize it, we output the question mark as an error code.

While C++ can throw and catch exceptions, ANSI C cannot. C is a throwback to an era before exception handling was common in programming languages.

You can save this text into a file and compile it. From Terminal you can pipe the alwaystrue or alwaysfalse program into not and get a negated output.

    ./alwaystrue | ./not

    ./alwaysfalse | ./not

fork a process

Now we are making a quick jump in complexity. We are going to start a child process.

This involves the fork() call, whose underlying idea dates back to the early 1960s.

fork acts in a somewhat counterintuitive manner.

Each of the previous small programs ran as its own process. Each process occupied its own section of memory (that section can be moved around by the operating system).

Normally when a program or process is started, the program is loaded into memory and then the operating system starts it running.

fork makes a copy of the currently running program into its own separate memory. The program that runs the fork is called the parent and the program that is newly created is called the child.

The child program starts as an exact copy of the parent, as the parent process was at the moment the fork function was run.

The operating system then starts running the child program as an independent process. The parent and child processes do not normally share any memory (there are ways to create shared memory) and cannot modify each other.

You can create communications between the parent and child processes, and a lot of the early part of this discussion will involve how you create and use communications between processes.

There is one more important quirk about fork you need to know. Sometimes I will explain this to programmers and they will be stunned by how counter-intuitive it is.

The fork() function returns zero in the child process and a non-zero value in the parent process: normally the process ID of the new child, or -1 if the fork failed.

Only the parent runs the code that comes before the fork call. The child does not rerun it; instead the child receives its own copy of everything that was initialized before the fork and continues from the fork call onward. Anything that is supposed to be different between the parent and child must come after the fork.

The fork function will return an integer result.

If the integer result is zero, then it is the child running. This information is typically used to branch to the code unique to the child program.

If the integer result is non-zero (usually the process ID of the newly created and running child), then it is the parent running. This information is typically used to branch to the code unique to the parent program.

A child program can fork its own children, but the coding to keep track of where you are in the sequence of forks can get messy in a hurry.

Here is an example program:

#include <stdio.h>

#include <sys/types.h>

#include <unistd.h>

int main()

{

    if (fork() == 0)

      printf("Child process\n");

    else

      printf("Parent process\n");

    return 0;

}

As before, fork() returns zero in the child process and a non-zero value (normally the child's process ID) in the parent process.

Note that both the parent and child processes are directly outputting to STDOUT. The parent will often run before the child, but the order is decided by the operating system's scheduler and is not guaranteed. We will later introduce code to prevent a bunch of processes intermixing their outputs in a different manner each time you run the parent program.

fork subroutines

We break the single combined operation into two different subroutines. This allows each to grow in complexity independently without ending up with a tangled mess of combined code.

It is to our advantage to keep the two sets of code neatly separated.


#include <stdio.h>

#include <sys/types.h>

#include <unistd.h>

 

int childprocess()

{

    printf("Child process\n");

    return 0;

}

 

int parentprocess()

{

    printf("Parent process\n");

    return 0;

}

 

int main()

{

    char character = '0';

    if (fork() == 0)

      childprocess();

    else

      parentprocess();

    return 0;

}

In C, every function must be declared before it is called. The declaration (a prototype) can be placed earlier in the file or in a header file, which also makes it possible to write mutually recursive code.

Because of this requirement, you will often see C source code files sequentially starting with the lowest level functions and gradually building up to the main() function. This is probably the opposite pattern of what you have been using in modern programming languages (or even some of the early programming languages).

You can see the pattern C uses for declaring functions.

There is an extra variable declaration and initialization thrown in before the fork. Storage will be declared and initialized before the fork call and both the parent and child process will have a copy of that same information. This can be used for passing information on to the child process.

passing information

We add a little more to the forking program so that important information is passed to both the parent and the child.

Our parent program potentially has an array of command line arguments. The array is argv and the integer argc has a count of how many elements are in the array.

Our parent program potentially also has an array of environment variables. This is the envp array.

We will later see how either the parent or child process can modify, delete, or add environment variables.

This new modification passes the command line and environment information on to the functions (subroutines) for both the parent and child processes.

#include <stdlib.h>

#include <string.h>

#include <errno.h>

#include <stdio.h>

#include <unistd.h>

#include <sys/types.h>

#include <sys/wait.h>

#include <sys/stat.h>

#include <termios.h>

 

int childprocess(int argc, char *argv[], char *envp[])

{

    printf("Child process\n");

    return 0;

}

 

int parentprocess(int argc, char *argv[], char *envp[])

{

    printf("Parent process\n");

    return 0;

}

 

int main(int argc, char *argv[], char *envp[])

{

    char character = '0';

    int result;

    if (fork() == 0)

      result = childprocess(argc, argv, envp);

    else

      result = parentprocess(argc, argv, envp);

    return 0;

}

We changed the list of included header files in anticipation of upcoming coding.

We store the results of the parent and child processes (although we don’t use that result yet).

We are now passing variables to the functions. In this case, the variables are the number of command line arguments (int argc), a pointer to the array of command line arguments (char *argv[]), and a pointer to a NULL-terminated array of environment variables (char *envp[]).

exec

Now we exec one of our little programs.

As previously mentioned, an operating system normally loads a program into memory and then runs it.

exec allows us to replace the current running program (normally the child) with any program (within possible limits imposed by operating system security) and switch to running that new program.

This is the essence of how your shell (in terminal) runs whatever program you request.

You can optionally have the new program run using the parent’s command line and/or environment variables. You can modify the new program’s copy to be slightly (or greatly) different than the parent’s. You can even create whole new command line and/or environment variables for the new program.

There are several different versions of exec, each operating slightly differently. The following are the choices from Linux: execl, execlp, execle, execv, execvp, and execvpe.

We are going to use execv for our sample code.

You can read the man page for your operating system to see which variations you have available and what each does.

#include <stdlib.h>

#include <string.h>

#include <errno.h>

#include <stdio.h>

#include <unistd.h>

#include <sys/types.h>

#include <sys/wait.h>

#include <sys/stat.h>

#include <termios.h>

 

/*************/

/* CONSTANTS */

/*************/

 

/*************/

/* VARIABLES */

/*************/

 

/*************/

/* FUNCTIONS */

/*************/

 

int childprocess(int argc, char *argv[], char *envp[])

{

    char *const argument_array[] = { "/Users/userid/Sites/cgi/alwaystrue", NULL };

    printf("Child process\n");

    execv("/Users/userid/Sites/cgi/alwaystrue", argument_array);

    return 0;

}

 

int parentprocess(int argc, char *argv[], char *envp[])

{

    printf("Parent process\n");

    wait(NULL); /* to prevent child going zombie */

    return 0;

}

 

/****************/

/* MAIN PROGRAM */

/****************/

 

int main(int argc, char *argv[], char *envp[])

{

    char character = '0';

    int result;

    if (fork() == 0)

      result = childprocess(argc, argv, envp);

    else

      result = parentprocess(argc, argv, envp);

    return 0;

}

In the child function we now use execv to run one of our little programs. Make sure to use the full path name from root for your own computer.

In the parent function we have a wait(NULL). This is blocking: the parent pauses until the child has ended. Waiting also collects the child's exit status. A child that has ended but has not yet been waited for stays in the process table as a zombie process.

process ID

Now we will gather the process IDs for both the parent and child processes.

On UNIX and Linux systems, every process has its own process ID. This is an integer. Often the integers are incremented by one each time the operating system starts a new process.

The process ID is often called the PID.

BASH and most modern shells have a variable ($$) for finding out the PID of the current process.

There are several different utility programs that can give you the PID of any running process (or a list of all of them).

Both our parent and child processes can determine their own process ID with the getpid function.

In the parent process (only), the PID of the child is normally the integer result of the fork function.

#include <stdlib.h>

#include <string.h>

#include <errno.h>

#include <stdio.h>

#include <unistd.h>

#include <sys/types.h>

#include <sys/wait.h>

#include <sys/stat.h>

#include <termios.h>

 

/*************/

/* CONSTANTS */

/*************/

 

/*************/

/* VARIABLES */

/*************/

 

/*************/

/* FUNCTIONS */

/*************/

 

/****************/

/* CHILDPROCESS */

/****************/

 

int childprocess(int argc, char *argv[], char *envp[])

{

    pid_t childprocessID;

    char *const argument_array[] = { "/Users/userid/Sites/cgi/alwaystrue", NULL };

    childprocessID = getpid();

    printf("Child process with ID %d\n", childprocessID);

    execv("/Users/userid/Sites/cgi/alwaystrue", argument_array);

    return 0;

}

 

/*****************/

/* PARENTPROCESS */

/*****************/

 

int parentprocess(int argc, char *argv[], char *envp[], pid_t childprocessID)

{

    pid_t parentprocessID;

    parentprocessID = getpid();

    printf("Parent process running with ID of %d and child process of %d\n", parentprocessID , childprocessID);

    wait(NULL); /* to prevent child going zombie */

    return 0;

}

 

/****************/

/* MAIN PROGRAM */

/****************/

 

int main(int argc, char *argv[], char *envp[])

{

    char character = '0';

    int result;

    pid_t newprocessID;

    newprocessID = fork();

    if (newprocessID == 0)

      result = childprocess(argc, argv, envp);

    else

      result = parentprocess(argc, argv, envp, newprocessID);

    return 0;

}

The getpid function gets the process ID of the current running process (parent or child, whichever is running). In the case of the result of the fork function, in the parent process it returns the ID of the new child process and in the child process it returns zero.

We also used printf's ability to insert variables into its output. The %d in the format string is replaced by the number, converted into a character string.

signals

Now we start sending and catching signals. We will eventually want to reliably send user signals between processes and we want to catch the signal indicating that a child process has ended.

Signals are a low level method for sending information between running processes.

Each operating system has its own list of signals.

There is a default action (which might be ignore) for each possible signal.

A program can tell the operating system that it wants to ignore specific signals. Some signals, such as SIGKILL, cannot be ignored.

A program can set up its own handler to handle incoming signals.

The following program handles SIGCHLD, SIGUSR1, and SIGUSR2.

SIGCHLD is generated when a child process ends (whether normal or aborted). This lets us know when any child processes we created have ended.

SIGUSR1 and SIGUSR2 are reserved for programmer use and can mean anything the programmer wants them to mean.

You will notice that we have one handler and we have to check to see which signal we received.

Two very important things to realize are that the signal handler runs asynchronously and is an interrupt.

Asynchronous means that the signal can come in at any time. This opens up all the complexity of multi-processing. You now have to deal with the possibility of signals arriving in any order and even simultaneously (or near enough to simultaneous that it acts as the same thing).

Interrupt means that your main program suspends action and the interrupt process runs. Once the interrupt process ends, the main program resumes where it was suspended.

In some cases the operating system may actually let the main program run simultaneously. From our perspective, that makes no significant difference.

Because the signal handler is running as an interrupt, it is essential that it do as little processing as possible so it can end and get out of the way as soon as possible. During some testing we will use printf. That technique should never be used in deployed software, only for testing.

It also means that communications between the interrupt handler and the main program are asynchronous. You will have to plan carefully to make sure that the handler and the main program don’t write over the same variable and create a system for the main program to know a signal has arrived.

We will defer handling these complexities.

#include <stdlib.h>

#include <string.h>

#include <errno.h>

#include <stdio.h>

#include <signal.h>

#include <unistd.h>

#include <sys/types.h>

#include <sys/wait.h>

#include <sys/stat.h>

#include <sys/resource.h>

#include <termios.h>

 

/*************/

/* CONSTANTS */

/*************/

 

/*************/

/* VARIABLES */

/*************/

 

/*************/

/* FUNCTIONS */

/*************/

 

/*******************/

/* SIGNAL FUNCTION */

/*******************/

 

/******************/

/* SIGNAL_HANDLER */

/******************/

void signal_handler(int signalnumber)

{

    switch (signalnumber)

      {

        case SIGCHLD:

          printf("Inside signal child handler function with signal number %d\n", signalnumber);

          break;

        case SIGUSR1:

          printf("Inside signal user 1 handler function with signal number %d\n", signalnumber);

          break;

        case SIGUSR2:

          printf("Inside signal user 2 handler function with signal number %d\n", signalnumber);

          break;

      }

}

 

/****************/

/* CHILDPROCESS */

/****************/

 

int childprocess(int argc, char *argv[], char *envp[])

{

    pid_t childprocessID;

    char *const argument_array[] = { "/Users/userid/Sites/cgi/alwaystrue", NULL };

    childprocessID = getpid();

    printf("Child process with ID %d\n", childprocessID);

    execv("/Users/userid/Sites/cgi/alwaystrue", argument_array);

    return 0;

}

 

/*****************/

/* PARENTPROCESS */

/*****************/

 

int parentprocess(int argc, char *argv[], char *envp[], pid_t childprocessID)

{

    pid_t parentprocessID;

    parentprocessID = getpid();

    printf("Parent process running with ID of %d and child process of %d\n", parentprocessID , childprocessID);

    wait(NULL); /* to prevent child going zombie */

    return 0;

}

 

/****************/

/* MAIN PROGRAM */

/****************/

 

int main(int argc, char *argv[], char *envp[])

{

    char character = '0';

    int result;

    pid_t newprocessID;

    printf("running main process\n");

    /* Register signal handlers */

    signal(SIGUSR1,signal_handler);

    signal(SIGUSR2,signal_handler);

    signal(SIGCHLD,signal_handler);

    raise(SIGUSR1);

    raise(SIGUSR2);

    printf("Inside main function after raising SIGUSR1 and SIGUSR2\n");

    // FORK

    newprocessID = fork();

    if (newprocessID == 0)

      result = childprocess(argc, argv, envp);

    else

      result = parentprocess(argc, argv, envp, newprocessID);

    return 0;

}

The signal function sets up an interrupt handler to catch signals. It takes the signal number and the pointer to the handler function as its parameters.

The raise function generates a signal. It takes the signal number as its parameter.

For right now, all we do in the signal handler function is report which signal we caught.

Note that signal handlers are not supposed to use printf, but we are temporarily breaking the rules for testing purposes.

These signals are not actually doing anything (other than a debugging report that lets you know the signals are working). This is because of multi-processing.

Imagine that the child process sends two of the same signal. What happens if you are still processing the first signal and the second signal came in? If you aren’t very careful, the two interrupt processes can conflict with each other, potentially breaking both and anything that depends on them.

For now we are going to avoid this complexity by simply not using the signals.

Because any process can only end running once, we can use the child process death signal safely if we make sure that each child process is handled by separate and independent variables. You can use the signals for this purpose as long as you are careful.

signal action

We switch from the basic signal to the more sophisticated sigaction.

All of the caveats discussed for signal handler still apply.

#include <stdlib.h>

#include <string.h>

#include <errno.h>

#include <stdio.h>

#include <signal.h>

#include <unistd.h>

#include <sys/types.h>

#include <sys/wait.h>

#include <sys/stat.h>

#include <sys/resource.h>

#include <termios.h>

 

/*************/

/* CONSTANTS */

/*************/

 

/*************/

/* VARIABLES */

/*************/

    struct sigaction signalstructure;

 

/*************/

/* FUNCTIONS */

/*************/

 

/*******************/

/* SIGNAL FUNCTION */

/*******************/

 

/******************/

/* SIGNAL_HANDLER */

/******************/

void signal_handler(int signalnumber, siginfo_t * signalinfo, void *unused)

{

    switch (signalnumber)

      {

        case SIGCHLD:

          printf("Inside signal handler: child with signal number %d from process %d\n", signalnumber, signalinfo->si_pid);

          break;

        case SIGUSR1:

          printf("Inside signal handler: user 1 with signal number %d from process %d\n", signalnumber, signalinfo->si_pid);

          break;

        case SIGUSR2:

          printf("Inside signal handler: user 2 with signal number %d from process %d\n", signalnumber, signalinfo->si_pid);

          break;

      }

}

 

/****************/

/* CHILDPROCESS */

/****************/

 

int childprocess(int argc, char *argv[], char *envp[])

{

    pid_t childprocessID;

    char *const argument_array[] = { "/Users/userid/Sites/cgi/alwaystrue", NULL };

    childprocessID = getpid();

    printf("Child process with ID %d\n", childprocessID);

    execv("/Users/userid/Sites/cgi/alwaystrue", argument_array);

    return 0;

}

 

/*****************/

/* PARENTPROCESS */

/*****************/

 

int parentprocess(int argc, char *argv[], char *envp[], pid_t childprocessID)

{

    pid_t parentprocessID;

    parentprocessID = getpid();

    printf("Parent process running with ID of %d and child process of %d\n", parentprocessID , childprocessID);

    wait(NULL); /* to prevent child going zombie */

    return 0;

}

 

/****************/

/* MAIN PROGRAM */

/****************/

 

int main(int argc, char *argv[], char *envp[])

{

    char character = '0';

    int result;

    pid_t newprocessID;

    printf("running main process\n");

 

    /* Register signal handlers */

    signalstructure.sa_flags = SA_SIGINFO;

    sigemptyset(&signalstructure.sa_mask);

    signalstructure.sa_sigaction = signal_handler;

    sigaction(SIGUSR1, &signalstructure, NULL);

    sigaction(SIGUSR2, &signalstructure, NULL);

    sigaction(SIGCHLD, &signalstructure, NULL);

 

    raise(SIGUSR1);

    raise(SIGUSR2);

    printf("Inside main function after raising SIGUSR1 and SIGUSR2\n");

 

    // FORK

    newprocessID = fork();

    if (newprocessID == 0)

      result = childprocess(argc, argv, envp);

    else

      result = parentprocess(argc, argv, envp, newprocessID);

    return 0;

}

The sigaction function is the advanced version of signal handling. In particular, we want to know which process (by PID) generated the signal. We will eventually need to know which of more than one possible process we are interacting with.

Note that signal handlers are not supposed to use printf, but we are temporarily breaking the rules for testing purposes.

For now, we will move on to testing other features needed for multiprocessing.

anonymous pipes

The use of anonymous pipes for sending information between processes is the last basic building block needed before we return to work with mathematical concepts. We will be able to assemble multiple small mathematical concepts together and start doing some interesting things.

You have probably used anonymous pipes in terminal when you used the | symbol to pipe the output of one program to the input of another.

We can set up our own anonymous pipes between our parent and child processes.

It is possible to set up the anonymous pipes so that they are connected to the child process’s STDIN, STDOUT, and STDERR. We won't be doing that because we want to preserve the ability for our child processes to printf testing messages directly to terminal.

The write function is used to send data through a pipe.

The read function is used to receive data from a pipe.

The read and write functions are the same ones used for reading and writing ordinary files. In UNIX and Linux, anonymous pipes are treated as files.

Normally on UNIX and Linux, the file descriptors are 0 for STDIN, 1 for STDOUT, and 2 for STDERR, with the file descriptor being incremented for each new pipe or file opened.

The pipe function serves the same basic purpose as a file open function. A key difference is that it fills in a two element array of integers rather than returning a single file descriptor. The zero element is the read end and the one element is the write end.

You will need to close the unused end of the anonymous pipe.

If the pipe is for sending data from the parent to the child, on the parent side close the read (0) end and on the child side close the write (1) end.

If the pipe is for sending data from the child to the parent, on the parent side close the write (1) end and on the child side close the read (0) end.

Before we use the pipe, we also use the fcntl function to set the pipe to be non-blocking.

Normally the read and write functions block. That is, the program stops running until the read or write has finished.

Because we are building multi-processing, we need all the programs to run independently from each other and not have them come to a screeching halt while they are waiting for another program (unless we intentionally want them to halt so we can synchronize some operation or operations).

The sleep function is used to put the process to sleep for a designated number of seconds. We are using this to simulate time passing from some intricate set of computations.

#include <stdlib.h>

#include <string.h>

#include <errno.h>

#include <stdio.h>

#include <signal.h>

#include <unistd.h>

#include <sys/types.h>

#include <sys/wait.h>

#include <sys/stat.h>

#include <sys/resource.h>

#include <termios.h>

#include <fcntl.h>

 

/*************/

/* CONSTANTS */

/*************/

#define MSGSIZE 27 /* This is too small to be practical, this is just for testing purposes */

 

/*************/

/* VARIABLES */

/*************/

    struct sigaction signalstructure;

 

/*************/

/* FUNCTIONS */

/*************/

 

/*******************/

/* SIGNAL FUNCTION */

/*******************/

 

/******************/

/* SIGNAL_HANDLER */

/******************/

void signal_handler(int signalnumber, siginfo_t * signalinfo, void *unused)

{

    switch (signalnumber)

      {

        case SIGCHLD:

          printf("Inside signal handler: child with signal number %d from process %d\n", signalnumber, signalinfo->si_pid);

          break;

        case SIGUSR1:

          printf("Inside signal handler: user 1 with signal number %d from process %d\n", signalnumber, signalinfo->si_pid);

          break;

        case SIGUSR2:

          printf("Inside signal handler: user 2 with signal number %d from process %d\n", signalnumber, signalinfo->si_pid);

          break;

      }

}

 

/****************/

/* CHILDPROCESS */

/****************/

 

int childprocess(int argc, char *argv[], char *envp[], int pipearray1[])

{

    pid_t childprocessID;

    char *const argument_array[] = { "/Users/userid/Sites/cgi/alwaystrue", NULL };

    char *msg1 = "message from child count 1";

    char *msg2 = "message from child count 2";

    char *msg3 = "message from child count 3";

    char msg4[MSGSIZE] = "bye from child "; /* padded with zero bytes so writing MSGSIZE bytes stays inside the array */

    int i = 0;

 

    childprocessID = getpid();

    printf("Child process with ID %d\n", childprocessID);

    // read link

    close(pipearray1[0]); /* close the unused end of the pipe */

    // write 3 messages in 3 second intervals

    for (i = 0; i < 3; i++)

      {

        switch (i)

          {

            case 0:

              write(pipearray1[1], msg1, MSGSIZE);

              break;

            case 1:

              write(pipearray1[1], msg2, MSGSIZE);

              break;

            case 2:

              write(pipearray1[1], msg3, MSGSIZE);

              break;

          } /* END SWITCH */

          sleep(3);

      } /* END FOR */

    // write "bye" one time

    write(pipearray1[1], msg4, MSGSIZE);

    /* NOTE: We have not used the pipe in the following program. We have not closed one end of the pipe because we intend to have two way communication. */

    execv("/Users/userid/Sites/cgi/alwaystrue", argument_array);

    return 0;

}

 

/*****************/

/* PARENTPROCESS */

/*****************/

 

int parentprocess(int argc, char *argv[], char *envp[], pid_t childprocessID, int pipearray1[])

{

    pid_t parentprocessID;

    int endflag = 0;

    int nread;

    char buf[MSGSIZE];

 

    // write end

    close(pipearray1[1]); /* close unused end */

    while (endflag == 0)

    {

      // because of fcntl, read returns -1 right away when the pipe is empty

      nread = read(pipearray1[0], buf, MSGSIZE);

      switch (nread)

        {

          // -1 means the read failed; for an empty non-blocking pipe

          // errno is set to EAGAIN

          case -1:

            if (errno == EAGAIN)

              {

                printf("(parent pipe empty)\n");

                sleep(1);

                break;

              }

            else

              {

                perror("read");

                exit(4);

              }

            break;

          // case 0 means all bytes are read and EOF(end of conv.)

          case 0:

            printf("End of conversation\n");

            // read link

            close(pipearray1[0]);

            endflag = 1;

            break;

          default:

            // text read

            // by default return no. of bytes

            // which read call read at that time

            printf("MSG = %s\n", buf);

        } /* END SWITCH */

    } /* END WHILE */

    parentprocessID = getpid();

    printf("Parent process running with ID of %d and child process of %d\n", parentprocessID , childprocessID);

    wait(NULL); /* to prevent child going zombie */

    return 0;

}

 

/****************/

/* MAIN PROGRAM */

/****************/

 

int main(int argc, char *argv[], char *envp[])

{

    char character = '0';

    int result;

    pid_t newprocessID;

    int pipearray1[2];

 

    printf("running main process\n");

 

    /* Register signal handlers */

    signalstructure.sa_flags = SA_SIGINFO;

    sigemptyset(&signalstructure.sa_mask);

    signalstructure.sa_sigaction = signal_handler;

    sigaction(SIGUSR1, &signalstructure, NULL);

    sigaction(SIGUSR2, &signalstructure, NULL);

    sigaction(SIGCHLD, &signalstructure, NULL);

 

    raise(SIGUSR1);

    raise(SIGUSR2);

    printf("Inside main function after raising SIGUSR1 and SIGUSR2\n");

 

    //PIPE

    // error checking for pipe

    if (pipe(pipearray1) < 0)

      {

        printf("creation of pipe failed\n");

        exit(1);

      }

    // error checking for fcntl

    if (fcntl(pipearray1[0], F_SETFL, O_NONBLOCK) < 0)

      {

        printf("file control on pipe failed\n");

        exit(2);

      }

 

    // FORK

    newprocessID = fork();

    switch (newprocessID)

      {

        // error

        case -1:

          printf("fork failed\n");

          exit(3);

          break;

        // 0 for child process

        case 0:

          //child_write(p);

          result = childprocess(argc, argv, envp, pipearray1);

          break;

        // integer for parent process

        default:

          //parent_read(p);

          result = parentprocess(argc, argv, envp, newprocessID, pipearray1);

          break;

      } /* END SWITCH */

    return 0;

}

The process of forking has changed from an if and else pair to a switch to include some error checking.
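The three way switch on the result of fork can be sketched by itself, without the pipe or the signals. The function name child_exit_status is made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return the
   exit status the parent collects, or -1 if fork or waitpid fails. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    switch (pid)
      {
        case -1:                       /* fork failed, no child exists */
          return -1;
        case 0:                        /* child */
          _exit(code);
        default:                       /* parent: pid is the child's ID */
          if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
              return -1;
          return WEXITSTATUS(status);
      }
}
```

For example, child_exit_status(7) should return 7 in the parent.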

The main function now creates a pipe to allow communications between the parent and child processes. This occurs between the handling of signals and the handling of fork. Depending on the underlying OS the pipe may be usable in both directions, but the only use you can count on everywhere is one way: write to element one and read from element zero.

We need an integer array to hold the file descriptors for the read and the write ends of the pipe(s). This is pipearray1.

The pipe function creates the pipe and puts the integer file descriptors into our array. It returns 0 on success and -1 on error.

We pass this pipe file descriptor array to both the parent and child functions.

The fcntl function is used to make the read end of the pipe non-blocking, so a read on an empty pipe returns immediately instead of waiting.

On the parent side we go through a while loop checking for any data in the pipe. If we find data, we read it into a temporary array and print the data to STDOUT. If the pipe is empty, we print a message about that to STDOUT.
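The parent's polling loop can be sketched in a smaller form. This is not the listing above, just an illustration under some assumptions: the function name poll_child_records and the record size of 8 bytes are made up, and the delays are 10 milliseconds (using nanosleep) rather than whole seconds so the sketch finishes quickly.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define RECORDSIZE 8   /* made-up fixed record size for this sketch */

/* Hypothetical helper: a child writes `count` fixed-size records while the
   parent polls a non-blocking pipe. Returns the number of records the
   parent received, or -1 on failure. */
int poll_child_records(int count)
{
    int fds[2];
    int received = 0;
    int failed = 0;
    int done = 0;
    int i;
    char record[RECORDSIZE];
    struct timespec delay = { 0, 10 * 1000 * 1000 };   /* 10 milliseconds */
    ssize_t nread;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 || (pid = fork()) < 0)
      {
        close(fds[0]);
        close(fds[1]);
        return -1;
      }
    if (pid == 0)
      {                                    /* child: writer */
        close(fds[0]);
        memset(record, 'x', RECORDSIZE);
        for (i = 0; i < count; i++)
          {
            if (write(fds[1], record, RECORDSIZE) < 0)
                _exit(1);
            nanosleep(&delay, NULL);       /* simulate work between messages */
          }
        _exit(0);                          /* exiting closes the write end */
      }
    close(fds[1]);                         /* parent: reader */
    while (!done)
      {
        nread = read(fds[0], record, RECORDSIZE);
        if (nread > 0)
            received++;                    /* got a record */
        else if (nread == 0)
            done = 1;                      /* end of file: the writer is gone */
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
            nanosleep(&delay, NULL);       /* pipe empty: wait a bit, then retry */
        else
            failed = done = 1;             /* a real error */
      }
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return failed ? -1 : received;
}
```

The three outcomes of read (data, end of file, empty pipe) are the same three cases the parent's switch handles.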

On the child side we write three simple messages every three seconds, followed by a termination message.

The pipes are connected to the alwaystrue program that we run with execv, but they aren’t used yet. The alwaystrue program is still writing its output directly to STDOUT.

We have just one more minor thing to take care of before we read directly from the child program.

If you are wondering why it is taking so long to get this code published, for every five to ten minutes of coding and testing, it takes several hours of conversion to HTML with all of the indentation in place. Check daily for updates.

environment variables

Environment variables are normally used for passing information from a parent process to a child process. The parent can’t read any changes the child makes to its environment variables and can only send the environment variables at the time it creates the child.

The same restrictions apply when the child process replaces itself with another program using exec (for those forms of exec that pass on environment variables).
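The one way nature of environment variables can be checked with a small sketch. The function name and the variable name DEMO_VALUE are made up for this illustration. The child gets a copy of the parent's environment at fork time, and changes to that copy never travel back.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if the parent still sees its own value
   after the child changed its copy, 0 if not, -1 on failure. */
int parent_keeps_own_environment(void)
{
    pid_t pid;
    const char *value;
    int status;

    if (setenv("DEMO_VALUE", "parent", 1) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
      {
        /* The child starts with a copy of the parent's environment. */
        value = getenv("DEMO_VALUE");
        if (value == NULL || strcmp(value, "parent") != 0)
            _exit(1);
        setenv("DEMO_VALUE", "child", 1);  /* changes only the child's copy */
        _exit(0);
      }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    value = getenv("DEMO_VALUE");
    return value != NULL && strcmp(value, "parent") == 0;
}
```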

The most common place where programmers encounter environment variables is when a web server (such as Apache, LiteSpeed, Microsoft IIS, Nginx, or OpenResty) sends environment variables to a programming language (such as ASP.NET/C#, C, ColdFusion, Erlang, Golang, Java, JavaScript (Node.JS), Kotlin, Perl, PHP, Python, Ruby, Rust, Scala, or Solidity).

We are going to use a simplified (not practical) version to show how this is done. We will set environment variables so the child program will know the file descriptor numbers for the pipe.

Unlike a child function in a single source file, a separate program will not have access to the variables from the parent program because exec loads the new program into the child process's memory and overwrites the copy of the parent program that was there.

Unlike the previous examples, I will only show the parts that change, because we are making very few changes.

These changes occur in the function childprocess.

The setenv function is used to set an environment variable (which must have a character string name and a character string content).

 

/****************/

/* CHILDPROCESS */

/****************/

 

int childprocess(int argc, char *argv[], char *envp[], int pipearray1[])

Add these to the storage declarations:

    char pipe0[80];

    char pipe1[80];

And add this code right after the printf announcing that the child process has started:

    childprocessID = getpid();

    printf("Child process starting with ID %d\n", childprocessID);

    // ENVIRONMENT VARIABLES

    sprintf(pipe0, "%d", pipearray1[0]);

    sprintf(pipe1, "%d", pipearray1[1]);

    setenv("PIPEREAD", pipe0, 1);

    setenv("PIPEWRITE", pipe1, 1);

The sprintf function works like the printf function, except it writes the results to a string rather than to STDOUT. We use this function to convert an integer to its ASCII string equivalent.

The setenv function sets an environment variable. We use execv, which passes the current environment variables on to the new program. The first parameter is the name of the environment variable. The second parameter is the string holding the character value of the environment variable. The third parameter, when nonzero, says to overwrite any previous value.
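The integer to string step and the setenv call can be wrapped in one small function. This is only a sketch: the name export_fd is made up, and it uses snprintf (which takes the buffer size) instead of sprintf so the conversion can never write past the end of the buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: store the integer fd under the environment
   variable `name` as decimal text. Returns 0 on success, -1 on failure. */
int export_fd(const char *name, int fd)
{
    char text[16];   /* enough for a 32 bit int plus sign and NUL */
    int length = snprintf(text, sizeof text, "%d", fd);

    if (length < 0 || (size_t)length >= sizeof text)
        return -1;                  /* conversion failed or did not fit */
    return setenv(name, text, 1);   /* 1 = overwrite an existing value */
}
```

A call such as export_fd("PIPEWRITE", pipearray1[1]) would do the same job as the sprintf and setenv pair.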

Our next step will be modifying the simple programs we previously created so that they all write their results to the pipe.

read environment variables

Now we modify the alwaystrue.c program to make sure that it correctly received the environment variables containing the file descriptors for the pipe.

The getenv function is used to read an environment variable.

Note that if you use getenv on an environment variable name that doesn’t exist, you will get a NULL.

#include <stdio.h>

#include <stdlib.h>

int main()

{

    const char* pipe0;

    const char* pipe1;

    char character = '1';

 

    pipe0 = getenv("PIPEREAD");

    printf("PIPE0 :%s\n",(pipe0 != NULL)? pipe0 : "getenv returned NULL");

 

    pipe1 = getenv("PIPEWRITE");

    printf("PIPE1 :%s\n",(pipe1 != NULL)? pipe1 : "getenv returned NULL");

 

    putchar(character);

    putchar('\n');

    return 0;

}

The getenv function returns a pointer to the character string value of an environment variable (if the environment variable with that name exists) and returns NULL otherwise. It does not make a copy, so treat the string as read-only.
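If you want your own copy of the value in your own buffer, you have to make it yourself, because getenv only hands back a pointer. Here is a minimal sketch. The function name copy_env is made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the value of `name` into `buffer`.
   Returns the length of the value, or -1 if the variable is not set
   or the value does not fit. */
int copy_env(const char *name, char *buffer, size_t size)
{
    const char *value = getenv(name);   /* pointer to the value, or NULL */
    size_t length;

    if (value == NULL)
        return -1;
    length = strlen(value);
    if (length >= size)
        return -1;
    memcpy(buffer, value, length + 1);  /* include the terminating NUL */
    return (int)length;
}
```

Having a private copy matters when the program later changes the environment, since a later setenv may make the old pointer stale.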

convert from character string to integer

As previously mentioned, the environment variables must be character strings.

File descriptors are integers and environment variables are character strings. So, we need to make an ASCII string to integer conversion.

We will use the atoi function to convert the environment character string into an integer for the file descriptor.

I left out error checking. In the real world you should include error checking.

Note that if you use getenv on an environment variable name that doesn’t exist, you will get a NULL.

#include <stdio.h>

#include <stdlib.h>

int main()

{

    const char* pipe0;

    const char* pipe1;

    char character = '1';

    int fd0;

    int fd1;

 

    pipe0 = getenv("PIPEREAD");

    if (pipe0 != NULL)

      {

        printf("PIPE0 :%s\n",pipe0);

        fd0 = atoi(pipe0);

      }

    else

      {

        printf("getenv returned NULL\n");

      }

 

    pipe1 = getenv("PIPEWRITE");

    if (pipe1 != NULL)

      {

        printf("PIPE1 :%s\n",pipe1);

        fd1 = atoi(pipe1);

      }

    else

      {

        printf("getenv returned NULL\n");

      }

 

    putchar(character);

    putchar('\n');

    return 0;

}

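The atoi function converts an ASCII string to an integer. It skips leading white space, accepts an optional leading plus or minus sign, and then converts digits until it reaches the first character that is not a digit. Everything from that character on (such as a decimal point and the digits after it) is ignored. If no valid conversion can be performed, it returns zero, which can’t be told apart from a string that really contains zero.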
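For the error checking that atoi can't give you, one common option is strtol, which reports where the conversion stopped. The following is a sketch, not a replacement for the listing above. The function name parse_fd is made up, and rejecting negative numbers is my assumption about what counts as a usable file descriptor.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert decimal text to a file descriptor number.
   Returns 0 and stores the number in *fd_out on success, -1 on bad input. */
int parse_fd(const char *text, int *fd_out)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return -1;                      /* getenv returned NULL, or empty string */
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;                      /* overflow, no digits, or trailing junk */
    if (value < 0 || value > INT_MAX)
        return -1;                      /* not a possible file descriptor */
    *fd_out = (int)value;
    return 0;
}
```

Because it accepts NULL, the result of getenv can be passed straight in: parse_fd(getenv("PIPEWRITE"), &fd1).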

write to pipe

This code doesn’t introduce anything new. It confirms that the pipe has been correctly identified and used by the child program after exec.

This program is pipetest.c.

I had an error in my code (one incorrect character in a variable name).

I simplified the program to examine only the one thing that was failing. And I inserted a bunch of printf and write to help me see where to look for the error.

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

#include <ctype.h>

#include <string.h>

#include <signal.h>

 

int main()

{

 

    const char* pipe0 = NULL;

    const char* pipe1;

    char character = '1';

    char message[6] = { 0 }; /* zero-filled so all 5 bytes written below are defined */

    int fd0;

    int fd1;

    #define MSGSIZE 27

    char msg1[MSGSIZE] = "message through pipe "; /* padded with zero bytes so writing MSGSIZE bytes stays inside the array */

 

        printf("start of pipetest\n");

 

    /* read */

    pipe0 = getenv("PIPEREAD");

    if (pipe0 != NULL)

      {

        printf("PIPE0 :%s\n",pipe0);

        fd0 = atoi(pipe0);

      }

    else

      {

        printf("getenv returned NULL\n");

      }

 

    /* write */

    pipe1 = getenv("PIPEWRITE");

    if (pipe1 != NULL)

      {

        printf("printed from pipetest PIPE1 :%s\n",pipe1);

        fd1 = atoi(pipe1);

        printf("from pipetest: PIPE1 file descriptor is :%i\n",fd1);

        write(fd1, msg1, MSGSIZE);

        message[0] = '*';

        message[1] = '1';

        message[2] = '*';

        message[3] = '\n';

        write(fd1, message, 5);

      }

    else

      {

        printf("getenv returned NULL\n");

      }

 

    exit(0);

    return 0;

 

}

The one thing missing here is that it is good practice to close unused sides of a pipe.

At this point we are successfully opening a pipe, creating an environment variable, starting a child program, reading the environment variable, and writing data through the pipe. We are almost ready to create some real multi-programming.

We need to make sure the child can read from the parent and we have two way communication.

start

You may notice that we are still slowly adding the building blocks for the first real software. Please be patient.

The following is very simple code for a pair of programs.

The program main.c makes a call to a program called pipetest.c and sends a two character message (the digit one and a newline character). The program pipetest.c reads the message and then sends back the same two character message followed by a two character message of the letter Q and a newline character. The main program shuts down the child program and itself after receiving the second message.
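The exchange described above can be sketched in a single source file. To be clear, this is not main.c or pipetest.c. It is a simplified illustration that uses fork without exec, two ordinary blocking pipes (one for each direction), and a made-up function name echo_through_child. It also assumes the message is short, since the parent writes everything before it starts reading.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send `message` to a forked child, which echoes it
   back followed by "Q\n". Returns the bytes stored in `reply`, or -1. */
int echo_through_child(const char *message, char *reply, size_t replysize)
{
    int to_child[2], to_parent[2];
    char buffer[64];
    ssize_t nread;
    size_t total = 0;
    pid_t pid;

    if (replysize == 0 || pipe(to_child) < 0)
        return -1;
    if (pipe(to_parent) < 0 || (pid = fork()) < 0)
      {
        close(to_child[0]);
        close(to_child[1]);
        return -1;
      }
    if (pid == 0)
      {
        close(to_child[1]);            /* child only reads from to_child */
        close(to_parent[0]);           /* child only writes to to_parent */
        while ((nread = read(to_child[0], buffer, sizeof buffer)) > 0)
            if (write(to_parent[1], buffer, (size_t)nread) < 0)
                _exit(1);
        if (write(to_parent[1], "Q\n", 2) < 0)
            _exit(1);
        _exit(0);
      }
    close(to_child[0]);                /* parent only writes to to_child */
    close(to_parent[1]);               /* parent only reads from to_parent */
    if (write(to_child[1], message, strlen(message)) < 0)
        total = replysize;             /* skip the read loop on failure */
    close(to_child[1]);                /* end of file tells the child we are done */
    while (total + 1 < replysize &&
           (nread = read(to_parent[0], reply + total, replysize - 1 - total)) > 0)
        total += (size_t)nread;
    close(to_parent[0]);
    waitpid(pid, NULL, 0);
    if (total >= replysize)
        return -1;
    reply[total] = '\0';
    return (int)total;
}
```

Sending "1\n" should bring back the four characters "1\nQ\n".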

We will build upon this very simple foundation.

Initially our messages will be extremely simple in a very strict format sent through pipes. We will eventually move on to more flexible message formats and several other ways to send messages, including shared memory, sockets, files, message queues, and databases.

Our child programs will start with the four possible unary logic operations and the 16 possible binary logic operations. Each will be run as a one-shot program (which is highly inefficient, but easy to code). It will be possible to string together a series of sequential calls from the main program, allowing some simple logic computations.
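The counts of four and 16 come from the size of the truth tables: a unary operation has two rows and a binary operation has four rows, so there are 2 to the power 2 and 2 to the power 4 ways to fill them in. One way to see this (a sketch of the idea only, not the one-shot programs themselves) is to number each operation by its truth table. The function names and the bit ordering here are my own choices for the illustration.

```c
/* Hypothetical helpers: each operation is named by its truth table read
   as a binary number. */

/* op is 0..3. Bit number a of op is the output for input a.
   0 = always false, 1 = NOT, 2 = identity, 3 = always true. */
int unary_op(unsigned op, int a)
{
    return (int)((op >> (a & 1)) & 1u);
}

/* op is 0..15. Bit number (2*a + b) of op is the output for inputs a and b.
   For example 8 = AND, 14 = OR, 6 = XOR, 7 = NAND, 1 = NOR. */
int binary_op(unsigned op, int a, int b)
{
    return (int)((op >> (((a & 1) << 1) | (b & 1))) & 1u);
}
```

With this numbering binary_op(8, 1, 1) is 1 and binary_op(8, 1, 0) is 0, which is the AND truth table.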

We will be able to use test driven software. We write one program that runs all of our tests and reports pass or fail.

We will extend our child programs to also include a basic set of lambda calculus functions, again operated in a very inefficient sequential manner.

With the basic operations running, we will move to multi-processing.

This will include changes to the child programs so that they continue to run instead of being repeatedly started and stopped.

The changes in the child programs will require additional coding in the main program to keep track of which child programs are running and to eventually shut down child programs when they are no longer needed.

Most of the child programs will need additional code to keep track of state, as well as code for resetting the child program to the initial state.

Once this work is done we will look at how we have to write the parallel processing code to use the child programs without creating deadlocks, livelocks, race conditions, synchronization errors, and a myriad of other problems that occur with multi-processing that you may not have experienced with ordinary sequential coding.

We will then use this code to do several simple operations, including proofs of some basic logic theorems (such as De Morgan’s Theorem), creating some simple logic circuits (such as flip flops, decoders, multiplexers, adders, and state machines), and simple lambda calculus (such as identity function, expression evaluation, numbers, successor function, addition, multiplication, conditionals, and logic operations).
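As a preview of the kind of result we are heading toward, here is a sketch in plain sequential C (no child programs or pipes yet). With only two one-bit inputs there are just four cases, so De Morgan's Theorem can be proved by checking every case. The half adder is included as the smallest example of an adder circuit. The function names are made up for this illustration.

```c
/* Returns 1 if both De Morgan identities hold for every combination of
   two one-bit inputs, 0 otherwise. */
int de_morgan_holds(void)
{
    int a, b;

    for (a = 0; a <= 1; a++)
        for (b = 0; b <= 1; b++)
          {
            if (!(a && b) != (!a || !b))
                return 0;              /* NOT (a AND b) = (NOT a) OR (NOT b) */
            if (!(a || b) != (!a && !b))
                return 0;              /* NOT (a OR b) = (NOT a) AND (NOT b) */
          }
    return 1;
}

/* Half adder for one-bit inputs: the sum is XOR and the carry is AND. */
void half_adder(int a, int b, int *sum, int *carry)
{
    *sum = a ^ b;
    *carry = a & b;
}
```

Adding 1 and 1 with half_adder gives a sum of 0 and a carry of 1, which is binary 10.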

The next steps will include set theory, linear algebra, calculus, probability, algebraic structures, and other topics that you might typically find in a class on mathematics for computer science or discrete mathematics.

main program

The following is the code for the main program called main.c:

child program

The following is the code for the child program called pipetest.c:



open source MIT License

All of the materials on this web page are released under the MIT open source License.

That means you can use it any way you want as long as you include my original copyright notice and the MIT License permission notice.

Copyright 2022 Milo

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.