Advanced Programming in the UNIX Environment, 3rd Edition

Category: Programming
Author: W. Richard Stevens, Stephen A. Rago
4.6
All Stack Overflow 13
This Year Stack Overflow 2
This Month Stack Overflow 4

Comments

by _osorin_   2022-10-11
I would like to take it a step further and ask a question that has been bothering me for a while. During my time in the academy I studied the following two books (regarding C):

[1] Advanced Programming in the UNIX Environment https://www.amazon.com/Advanced-Programming-UNIX-Environment...

[2] C Programming Language https://www.amazon.com/Programming-Language-2nd-Brian-Kernig...

In combination with other classes (and books) on networking, operating systems and data structures, we covered a wide variety of use cases with C. My question is: how do I take this to the next level? For example, I feel I never missed a concept from those classes, but when I see C code posted on a thread here it often looks completely unreadable and complex. Can anyone provide resources to advance a little? What should I study if my goal is to make good and useful modern C software?

I feel like my typing is a bit abstract but I will be happy to clarify.

PS Yes, I've heard of C++ and Rust ;P

by bradfitz   2022-06-26
The Linux Programming Interface: https://man7.org/tlpi/

Advanced Programming in the UNIX Environment, 3rd Edition: https://www.amazon.com/gp/product/0321637739

by acomjean   2021-02-05
neat. Great way to explore the systems.

For me, out of university, it was a co-worker's books (some of which I ended up buying) that helped me understand the role of UNIX better (I used the Alpha machines at university, but was thrown into HP-UX/Solaris at work).

I think this was one: https://www.amazon.com/Advanced-Programming-UNIX-Environment...

Of course a little out of date now, but a lot of the general concepts are the same.

Unix Power Tools helped me a lot as well. https://www.oreilly.com/library/view/unix-power-tools/059600...

How did others learn this stuff?

by anonymous   2019-07-21

Here are some hints:

  1. Use select() with a timeout
  2. Set the FD to O_NONBLOCK with fcntl
  3. Only read from the FD when FD_ISSET returns true
  4. Read until you get EWOULDBLOCK or EAGAIN (which indicate that no more data is available right now; the timeout itself is signalled by select() returning 0). Repeat the read if you see EINTR.
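
The four hints above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function name read_with_timeout and its return convention are assumptions made for the example, and for simplicity an interrupted select() restarts with the full timeout.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of bytes read (> 0),
   0 on timeout or EOF, and -1 on error. */
ssize_t read_with_timeout(int fd, char *buf, size_t cap, int timeout_sec)
{
    assert(buf != NULL);

    /* Hint 2: put the descriptor into non-blocking mode. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    /* Hint 1: wait for readability, but no longer than the timeout. */
    fd_set readfds;
    struct timeval tv;
    int ready;
    do {
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    } while (ready < 0 && errno == EINTR);
    if (ready < 0)
        return -1;

    /* Hint 3: only read when select() reported the descriptor readable. */
    if (ready == 0 || !FD_ISSET(fd, &readfds))
        return 0;                       /* timed out */

    /* Hint 4: drain until EAGAIN/EWOULDBLOCK, retrying on EINTR. */
    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0)
            break;                      /* end of file */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                      /* nothing more available right now */
        return -1;
    }
    return (ssize_t)total;
}
```

A caller would typically invoke this in a loop, treating a return of 0 as "nothing arrived within the timeout" and deciding for itself whether to retry or give up.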

Here is a better answer: go to your library and get a copy of Stevens out. I believe this is the book you want: http://www.amazon.com/Programming-Environment-Addison-Wesley-Professional-Computing/dp/0321637739 (all of his are great). It is still the canonical reference volume for learning how to do this stuff and should be a core text for your course.

by anonymous   2019-07-21

System.in is an InputStream (i.e. a stream) associated with the stdin (a file descriptor that comes standard with an operating system process) of the JVM process that runs your program on the given operating system.

What is happening under the hood is quite eloquently described in the seminal work Advanced Programming in the UNIX Environment by Stevens and Rago. Basically, the Java implementation delegates to the standard I/O library (written/ported by Dennis Ritchie some 40 years ago!) implementation on your operating system.

Two characteristics of the standard I/O library are of essence:

  1. It deals with streams, rather than files.
  2. It provides buffering, an intermediate place in the RAM of your computer that is utilized before the input stream is read from or the output stream is written to.

The standard I/O library chooses the defaults for the buffer carefully and the whole thrust is to minimize the number of read and write system calls thereby reducing the CPU time required to carry out the I/O operation. Based on how the buffering occurs, there are three flavors of the streams: fully-buffered, line-buffered and unbuffered.

Now, in the above book, the following appears in section 5.4:

Most implementations default to the following types of buffering: Standard error is always unbuffered. All other streams are line buffered if they refer to a terminal device; otherwise, they are fully buffered. The four platforms discussed in this book follow these conventions for standard I/O buffering: standard error is unbuffered, streams open to terminal devices are line buffered, and all other streams are fully buffered.

This means that the standard input, by default, is going to be blocked (as if nothing happens) till you press the newline character. If you redirect the input from a file (e.g. java MyProgram < foo.txt) then you are reading from a stream that is fully-buffered by default.

There are some low-level details here, but when the program reads from the terminal device, it blocks until a newline or the EOF character is typed, at which point the buffered input is handed over. When reading from a file, since the stream is fully buffered, you don't notice this, as the buffer is already filled by the time your program starts reading from it. When an EOF is read, in both cases, hasNext() returns false.

by anonymous   2019-07-21

It's probably better to do this with threads in general. However, there is a way that will work, with limitations, for simple applications.

Please forgive the use of a C++11 lambda, I think it makes things a little clearer here.

#include <setjmp.h>
#include <signal.h>
#include <unistd.h>

namespace
{
   sigjmp_buf context;
} // namespace

void nonBlockingCall(int timeoutInSeconds)
{
    struct sigaction oldAction;
    struct sigaction newAction;
    // install a simple lambda as signal handler for the alarm that
    // effectively makes the call time out.
    // (e.g. if the call gets stuck inside something like poll() )
    newAction.sa_handler = [] (int) {
       siglongjmp(::context,1);
    };
    sigemptyset(&newAction.sa_mask);
    newAction.sa_flags = 0;
    // save the previous handler so it can be restored on the way out
    sigaction(SIGALRM,&newAction,&oldAction);
    // savemask = 1 so that siglongjmp restores the signal mask;
    // otherwise SIGALRM would stay blocked after the handler jumps out
    if (sigsetjmp(::context,1) == 0)
    {
        alarm(timeoutInSeconds); //timeout by raising SIGALRM
        BLOCKING_LIBRARY_CALL
        alarm(0); //cancel alarm
        //call did not time out
    }
    else
    {
        // timer expired during your call (SIGALRM was raised)
    }
    sigaction(SIGALRM,&oldAction,nullptr);
}

Limitations:

  • This is unsafe for multi-threaded code. If you have multi-threaded code it is better to have the timer in a monitoring thread and then kill the blocked thread.

  • timers and signals from elsewhere could interfere.

  • Unless BLOCKING_LIBRARY_CALL documents its behaviour very well you may be in undefined behaviour land.

    • It will not free resources properly if interrupted.
    • It might install signal handlers or masks or raise signals itself.
  • If using this idiom in C++ rather than C you must not allow any objects to be constructed or destroyed between setjmp and longjmp.

Others may find additional issues with this idiom.

I thought I'd seen this in Stevens somewhere, and indeed it is covered in the signals chapter, where he discusses using alarm() to implement sleep().

by anonymous   2019-07-21

Firstly, if you fork(), you will be creating additional processes, not additional threads. To create additional threads, you want to use pthread_create.
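
A small sketch of the difference, under the assumption that a shared counter is a good enough stand-in for "state": the function names below are invented for the illustration. An increment made in a forked child is invisible to the parent (separate address space), while an increment made by a thread created with pthread_create is visible (shared address space). On some systems this needs to be built with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;

/* Thread body: increments the counter shared with its creator. */
static void *bump(void *unused)
{
    (void)unused;
    counter++;
    return NULL;
}

/* Increment in a forked child; returns the parent's view afterwards. */
int counter_after_fork(void)
{
    counter = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter++;          /* changes only the child's private copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return counter;
}

/* Increment in a new thread; returns the creator's view afterwards. */
int counter_after_thread(void)
{
    pthread_t tid;
    counter = 0;
    if (pthread_create(&tid, NULL, bump, NULL) != 0)
        return -1;
    int rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;
    return counter;
}
```

The first function leaves the parent's counter untouched and the second one sees the thread's increment, which is the practical consequence of "processes, not threads".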

Secondly, as you are a student, the canonical answer here is 'read Stevens'. Not only is it an invaluable tool even for those of us experienced in writing socket I/O routines, but it also contains examples of non-threaded, non-forking async I/O, and various ways to add threads and forking to them. I believe the one you want is: http://www.amazon.com/Programming-Environment-Addison-Wesley-Professional-Computing/dp/0321637739 (chapter 14 if memory serves). This should be in your college library.

by hideo   2018-11-25
Unix System Programming. A course largely based around https://www.amazon.com/Advanced-Programming-UNIX-Environment...

An interesting side-effect of this course - I got _really_ good at using a console/command prompt and handling text with vim and pipes and filters and other text manipulation. I think this has helped me be productive at my jobs way more than I thought it would :)

by sureaboutthis   2018-10-04
Advanced Programming in the Unix Environment - Stevens, Rago [0]

Unix Network Programming - Stevens [1]

[0] https://www.amazon.com/Advanced-Programming-UNIX-Environment...

[1] https://www.amazon.com/Unix-Network-Programming-Sockets-Netw...

by anonymous   2018-08-16
`mydroplet.example.com` is ***not*** an FQDN. FQDNs end in a dot (`.`) to denote the top of the DNS tree. `mydroplet.example.com.` and `localhost.` are FQDNs. When the dot is present the resolver *should not* add suffixes from its search path. Whose DNS you use is a different story. Also see W. Richard Stevens' [Advanced Programming in the UNIX Environment](https://www.amazon.com/dp/0321637739).

by anonymous   2018-03-19

Functions like fread are intended to deal with files.

For various reasons, some of them historical, these functions are structured with the idea of a "file pointer" or FILE.

FILE contains information about which file it's presently pointing at, its length and, critically, where in the file it's pointing. When you call fread with a particular size/nmemb (this means "number of members") combination, it will internally advance the position stored in FILE without your help.

In fact, as your program shows later, the only way to access arbitrary regions of a file is to seek (fseek) to them.
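
A minimal sketch of both points, with helper names (read_and_tell, byte_at) that are made up for the illustration: fread moves the stream position by itself, and fseek is how you jump to an arbitrary offset.

```c
#include <assert.h>
#include <stdio.h>

/* Reads `nmemb` items of `size` bytes into buf and reports where the
   stream position ended up; fread advanced it on its own. */
long read_and_tell(FILE *fp, void *buf, size_t size, size_t nmemb)
{
    assert(fp != NULL && buf != NULL);
    size_t got = fread(buf, size, nmemb, fp);
    (void)got;              /* a short read simply advances less */
    return ftell(fp);
}

/* Jumps to an absolute offset and returns the byte stored there,
   or EOF if the seek or the read fails. */
int byte_at(FILE *fp, long offset)
{
    assert(fp != NULL);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return EOF;
    return fgetc(fp);
}
```

For a file containing "abcdefghij", reading 4 bytes leaves the position at 4, reading two 2-byte items after that leaves it at 8, and byte_at(fp, 2) goes back and fetches 'c'.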

It's just like the function below, which doesn't use the loop counter i to increment the value: all it needs is the pointer to num, which lives in main's stack frame, and the logic inside increment itself.

#include <stdio.h>

// adds one to the integer that num points at
void increment(int *num) {
    *num = *num + 1;
}

int main() {
    int num = 0;
    for (int i = 0; i < 100; i++) {
        increment(&num);
    }
    printf("%d\n", num); // this prints 100
    return 0;
}

This way of dealing with file input / output is part of what could be called a universal model of file I/O, which plays an absolutely critical role in the philosophy of Unix-based operating systems like Mac OS X and Linux, and an even more important one in some of the later attempts to refine these systems, like Plan 9.

If you want to be a skilled programmer, it's critical that you understand the concepts of these APIs and the reasoning behind them by cracking books like "The Linux Programming Interface", "The Art of Unix Programming", "Advanced Programming in the Unix Environment", etc.

by forkandwait   2018-03-04
I used the first edition, but I think all you need is this:

https://www.amazon.com/Advanced-Programming-UNIX-Environment...

by anonymous   2018-01-07
I recommend you get a book about Unix (or rather POSIX) systems programming, and read that. It should tell you all you need to know. I haven't read the latest edition, but I heartily recommend [Advanced Programming in the UNIX Environment](https://www.amazon.com/Advanced-Programming-UNIX-Environment-3rd/dp/0321637739).

by anonymous   2017-12-11
I suggest any of these 3 books (1 is enough to start with; you can get the others later): W Richard Stevens, Stephen A Rago [Advanced Programming in the Unix Environment, 3rd Edn](http://smile.amazon.com/Advanced-Programming-UNIX-Environment-Edition/dp/0321637739) — Marc J Rochkind [Advanced Unix Programming, 2nd Edn](http://smile.amazon.com/Advanced-UNIX-Programming-2nd-Edition/dp/0131411543) — Michael Kerrisk [The Linux Programming Interface: A Linux and Unix System Programming Handbook](http://smile.amazon.com/The-Linux-Programming-Interface-Handbook/dp/1593272200).

by techjuice   2017-08-20
If you want to become a professional and not just a dabbler I would recommend reading some of the following books I have in my bookshelf:

[0] RHCSA & RHCE Training and Exam Preparation Guide by Asghar Ghori. This book will help ensure you know your stuff on the system engineering/administration side.

[1] A Practical Guide to Linux Commands, Editor and Shell Programming Third Edition. This book will cover the majority of what you would need and want to know when connecting to a remote linux system over ssh.

If you want to get under the hood and become an expert, the following books should help get you started:

[2] Advanced Programming in the UNIX Environment

[3] The Linux Programming Interface: A Linux and UNIX System Programming Handbook

[4] Linux Kernel Development 3rd Edition

To get a nice general overview and get up and going quickly:

[5] How Linux works: What every superuser should know

[6] The Linux Command Line

[7] Python Crash Course

[8] Automate the boring stuff with Python. This is a great book to help you think about how to automate most of the repetitive things you will end up doing on a regular basis.

[0] https://www.amazon.com/RHCSA-RHCE-Red-Enterprise-Linux/dp/14...

[1] https://www.amazon.com/Practical-Guide-Commands-Editors-Prog...

[2] https://www.amazon.com/Advanced-Programming-UNIX-Environment...

[3] https://www.amazon.com/Linux-Programming-Interface-System-Ha...

[4] https://www.amazon.com/Linux-Kernel-Development-Robert-Love/...

[5] https://www.amazon.com/How-Linux-Works-Superuser-Should/dp/1...

[6] https://www.amazon.com/Linux-Command-Line-Complete-Introduct...

[7] https://www.amazon.com/Python-Crash-Course-Hands-Project-Bas...

[8] https://www.amazon.com/Automate-Boring-Stuff-Python-Programm...

by generic_user   2017-08-20
It's a bit tricky, I think.

> A secure coding standard from CERT should focus entirely on describing conventions and program properties that do not already follow from the standard as a matter of correctness.

from CERT 1.7 "The wiki also contains two platform-specific annexes at the time of this writing; one annex for POSIX and one for Windows. These annexes have been omitted from this standard because they are not part of the core standard."

So while CERT does use some examples from system interfaces, it's not a standard for programming the system interfaces for POSIX or Windows. It looks like they're trying to limit the standard to ISO C. The examples you gave fall into the system interface category. POSIX is huge, and the same goes for Windows; both are much bigger than ISO C.

I think in order to explain conventions for a system interface you really need a longer form publication like a book. So you can take 50 pages to describe an interface and how to use it and show examples etc.

The best way that I have found to figure this stuff out is the standard way. You get a copy of all the relevant standards as a foundation: ISO, POSIX, Windows and stuff like CERT. Then you get some of the system programming books (listed below). Then you find some good reference code that shows best practice, usually code from the operating system or utilities. Lastly, read all the compiler docs and tool docs to set up the best code analysis framework you can.

These are a few system programming books that I use.

(best intro book) GNU/Linux Application Programming https://www.amazon.com/GNU-Linux-Application-Programming/dp/...

UNIX Systems Programming https://www.amazon.com/UNIX-Systems-Programming-Communicatio...

Advanced Programming in the UNIX Environment https://www.amazon.com/Advanced-Programming-UNIX-Environment...

Windows System Programming https://www.amazon.com/Programming-Paperback-Addison-Wesley-...

The Linux Programming Interface http://www.man7.org/tlpi/

edit: I'm not sure of your skill level; you may have seen all of those, but I posted them regardless. There is a lot about security and convention in those books.

by AviewAnew   2017-08-20

Reference Style - All Levels

Beginner

Intermediate

Above Intermediate

Uncategorized Additional C Programming Books

  • Essential C (Free PDF) - Nick Parlante
  • The new C standard - an annotated reference (Free PDF) - Derek M. Jones

by anonymous   2017-08-20

It may be simplest to use the expect program; it does most of the necessary work for you.

The necessary work is fiddly. It involves using pseudo-ttys, which are devices that look to programs like terminals. If you're going to roll your own, then the POSIX system calls you need to know about are:

  • posix_openpt()
  • ptsname()
  • grantpt()
  • unlockpt()

The posix_openpt() interface is relatively new (Issue 6, compared with Issue 4, Version 2 for the other functions listed). If your system doesn't have posix_openpt(), you need to get yourself one of the Unix books (Stevens or Rochkind, probably) to find out how else to open the master side of a pty, or read your system manuals rather carefully. However, the rationale for posix_openpt() at the link above may also help; it also has guidelines for using the other functions. Linux has posix_openpt(); so does Mac OS X and, by inference, the BSD systems generally.
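
As a rough sketch of how the four calls listed above fit together, the helper below opens the master side and reports the name of the slave device. The function name open_pty_master and its error convention are assumptions made for the illustration; a real program would go on to fork, open the slave in the child, and make it the controlling terminal.

```c
#define _XOPEN_SOURCE 600   /* must precede the includes to expose the pty API */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: opens a pty master and copies the slave device
   name into slave_name (capacity cap). Returns the master fd, or -1. */
int open_pty_master(char *slave_name, size_t cap)
{
    assert(slave_name != NULL && cap > 0);

    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0)
        return -1;

    /* Fix ownership/permissions of the slave, then unlock it. */
    if (grantpt(master) != 0 || unlockpt(master) != 0) {
        close(master);
        return -1;
    }

    /* ptsname() returns a pointer to static storage, so copy it out. */
    const char *name = ptsname(master);
    if (name == NULL || strlen(name) >= cap) {
        close(master);
        return -1;
    }
    strcpy(slave_name, name);
    return master;
}
```

On Linux the reported name typically looks like /dev/pts/N; the exact form is system-dependent.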

Books:

by anonymous   2017-08-20

The userspace open function is what you are thinking of; it is a system call that returns a file descriptor (an int). There are plenty of good references for that, such as APUE 3.3.

Device driver "open method" is a function within file_operations structure. It is different than userspace "file open". With the device driver installed, when user code does open of the device (e.g. accessing /dev/scull0), this "open method" would then get called.

by anonymous   2017-08-20

The nohup command is the poor man's way of running a process as a daemon. As Bruno Ranschaert noted, when you run a command in an interactive shell, it has a controlling terminal and will receive a SIGHUP (hangup) signal when the controlling process (typically your login shell) exits. The nohup command arranges for input to come from /dev/null, and for both output and errors to go to nohup.out, and for the program to ignore interrupts, quit signals, and hangups. It actually still has the same controlling terminal - it just ignores the terminal's controls. Note that if you want the process to run in the background, you have to tell the shell to run it in the background - at least on Solaris (that is, you type 'nohup sleep 20 &'; without the ampersand, the process runs synchronously in the foreground).

Typically, a process run via nohup is something that takes time, but which does not hang around waiting for interaction from elsewhere.

Typically (which means if you try hard, you can find exceptions to these rules), a daemon process is something which lurks in the background, disconnected from any terminal, but waiting to respond to some input of some sort. Network daemons wait for connection requests or UDP messages to arrive over the network, do the appropriate work and send a response back again. Think of a web server, for example, or a DBMS.

When a process fully daemonizes itself, it goes through some of the steps that the nohup code goes through; it rearranges its I/O so it is not connected to any terminal, detaches itself from the process group, and ignores appropriate signals (which might mean it doesn't ignore any signals, since there is no terminal to send it any of the signals generated via a terminal). Typically, it forks once, and the parent exits successfully. The child process usually forks a second time, after fixing its process group and session ID and so on; the child then exits too. The grandchild process is now autonomous and won't show up in the ps output for the terminal where it was launched.
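
The double-fork sequence described above can be sketched as follows. This is an illustration with one deliberate deviation, and the names spawn_daemon and report_ids are invented for it: instead of the original process exiting after the first fork, the caller waits for the intermediate child and carries on, which makes the helper usable from a program that wants to launch a detached worker. Signal handling and closing of other inherited descriptors are left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs work(arg) in a detached grandchild.
   Returns 0 in the caller on success, -1 on failure. */
int spawn_daemon(void (*work)(void *), void *arg)
{
    assert(work != NULL);

    pid_t pid = fork();                     /* first fork */
    if (pid < 0)
        return -1;
    if (pid > 0) {                          /* caller: reap the first child */
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
    }

    if (setsid() < 0)                       /* new session, no controlling tty */
        _exit(1);
    pid = fork();                           /* second fork */
    if (pid < 0)
        _exit(1);
    if (pid > 0)
        _exit(0);                           /* session leader exits */

    /* Grandchild: not a session leader, so it cannot reacquire a tty. */
    umask(0);
    int rc = chdir("/");
    (void)rc;
    int fd = open("/dev/null", O_RDWR);     /* detach stdin/stdout/stderr */
    if (fd >= 0) {
        dup2(fd, STDIN_FILENO);
        dup2(fd, STDOUT_FILENO);
        dup2(fd, STDERR_FILENO);
        if (fd > STDERR_FILENO)
            close(fd);
    }
    work(arg);
    _exit(0);
}

/* Example worker: writes its pid and session id to the fd in *arg. */
void report_ids(void *arg)
{
    int fd = *(int *)arg;
    pid_t ids[2] = { getpid(), getsid(0) };
    ssize_t n = write(fd, ids, sizeof ids);
    (void)n;
}
```

Passing the write end of a pipe to report_ids lets the caller confirm that the worker ended up in a different session and is not that session's leader.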

You can look at Advanced Programming in the Unix Environment, 3rd Edn by W Richard Stevens and Stephen A Rago, or at Advanced Unix Programming, 2nd Edn by Marc J Rochkind for discussions of daemonization.

I have a program daemonize which will daemonize a program that doesn't know how to daemonize itself (properly). It was written to work around the defects in a program which was supposed to daemonize itself but didn't do the job properly. Contact me if you want it - see my profile.