Advanced Programming in the UNIX Environment, 3rd Edition
[1] Advanced Programming in the UNIX Environment https://www.amazon.com/Advanced-Programming-UNIX-Environment...
[2] C Programming Language https://www.amazon.com/Programming-Language-2nd-Brian-Kernig...
In combination with other classes (and books) on networking, operating systems, and data structures, we covered a wide variety of use cases with C. My question is: how do I take this to the next level? For example, I feel I never missed a concept from those classes, but when I see C code posted on a thread here it's usually something completely unreadable and complex. Can anyone provide resources to advance a little? What should I study if my goal is to make good and useful modern C software?
I feel like my typing is a bit abstract but I will be happy to clarify.
PS Yes, I've heard of C++ and Rust ;P
Advanced Programming in the UNIX Environment, 3rd Edition: https://www.amazon.com/gp/product/0321637739
For me, out of university, it was a co-worker's books (some of which I ended up buying) that helped me understand the role of UNIX better (I used the Alpha machines at university, but was thrown into HP-UX/Solaris at work).
I think this was one: https://www.amazon.com/Advanced-Programming-UNIX-Environment...
Of course a little out of date now, but a lot of the general concepts are the same.
Unix Power Tools helped me a lot as well. https://www.oreilly.com/library/view/unix-power-tools/059600...
How did others learn this stuff?
Here are some hints:
Use select() with a timeout.
Set O_NONBLOCK with fcntl.
Read only when FD_ISSET returns true.
Handle EWOULDBLOCK or EAGAIN (which indicate a timeout). Repeat the loop if you see EINTR.
Here is a better answer: go to your library and check out a copy of Stevens. I believe this is the book you want (all of his are great): http://www.amazon.com/Programming-Environment-Addison-Wesley-Professional-Computing/dp/0321637739. It is still the canonical reference for learning how to do this stuff and should be a core text for your course.
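The hints above can be put together roughly as follows. This is only a sketch: the helper names set_nonblocking and read_with_timeout and the -2 "timed out" return value are made up for illustration, and the timeout simply restarts after EINTR instead of being recomputed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read at most len bytes.
 * Returns bytes read (0 means EOF), -1 on error, -2 if nothing arrived in time.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL && timeout_ms >= 0);

    for (;;) {
        fd_set readfds;
        struct timeval tv;

        /* select() may modify both arguments, so rebuild them on every pass. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: repeat the loop */
            return -1;
        }
        if (ready == 0)
            return -2;              /* select() timed out */

        if (FD_ISSET(fd, &readfds)) {
            ssize_t n = read(fd, buf, len);
            if (n >= 0)
                return n;
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return -2;          /* nothing to read after all */
            return -1;
        }
    }
}
```

A caller would typically run set_nonblocking once per descriptor and then call read_with_timeout inside its own loop, treating -2 as "try again later" rather than as an error.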
System.in is an InputStream (i.e. a stream) associated with the stdin (a file descriptor that comes standard with an operating system process) of the JVM process that runs your program on the given operating system.
What is happening under the hood is quite eloquently described in the seminal work Advanced Programming in the UNIX Environment by Stevens and Rago. Basically, the Java implementation delegates to the standard I/O library (written/ported by Dennis Ritchie some 40 years ago!) implementation on your operating system.
Two characteristics of the standard I/O library are of essence:
The standard I/O library chooses the defaults for the buffer carefully, and the whole thrust is to minimize the number of read and write system calls, thereby reducing the CPU time required to carry out the I/O operation. Based on how the buffering occurs, there are three flavors of streams: fully-buffered, line-buffered and unbuffered.
Now, in the above book, the following appears in section 5.4:
This means that the standard input, by default, is going to block (as if nothing happens) until you press the newline character. If you redirect the input from a file (e.g. java MyProgram < foo.txt) then you are reading from a stream that is fully-buffered by default.
There are some low-level details here, but when the program reads from the terminal device, it blocks until the newline character or EOF character is typed, which flushes the buffer. When reading from a file, since the stream is fully buffered, you don't notice this, as the buffer is filled and flushed by the time your program starts reading it. When an EOF is read, in both cases, hasNext() returns false.
It's probably better to do this with threads in general. However, there is a way that will work with limitations for simple applications.
Please forgive the use of a C++11 lambda, I think it makes things a little clearer here.
Limitations:
This is unsafe for multi-threaded code. If you have multi-threaded code it is better to have the timer in a monitoring thread and then kill the blocked thread.
Timers and signals from elsewhere could interfere.
Unless BLOCKING_LIBRARY_CALL documents its behaviour very well you may be in undefined behaviour land.
If using this idiom in C++ rather than C you must not allow any objects to be constructed or destroyed between setjmp and longjmp.
Others may find additional issues with this idiom.
I thought I'd seen this in Stevens somewhere, and indeed it is covered in the signals chapter, where it discusses using alarm() to implement sleep().
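For readers who want to see the shape of the idiom, here is a rough plain-C sketch (no lambda). It is not the code from the original answer: the helper name read_or_give_up and its -2 return value are invented for illustration, read() stands in for BLOCKING_LIBRARY_CALL, and every limitation listed above still applies.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf timeout_env;   /* single global: one timeout at a time, one thread */

static void on_alarm(int signo)
{
    (void)signo;
    siglongjmp(timeout_env, 1);  /* jump out of the blocked call */
}

/*
 * Hypothetical helper: read() that gives up after `seconds`.
 * Returns the read() result, or -2 if the alarm fired first.
 */
static ssize_t read_or_give_up(int fd, void *buf, size_t len, unsigned seconds)
{
    struct sigaction sa, old_sa;
    ssize_t n;

    assert(seconds > 0);

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) == -1)
        return -1;

    /* Second argument 1: also save the signal mask so siglongjmp restores it. */
    if (sigsetjmp(timeout_env, 1) != 0) {
        /* Arrived here via siglongjmp(): the call timed out. */
        sigaction(SIGALRM, &old_sa, NULL);
        return -2;
    }

    alarm(seconds);
    n = read(fd, buf, len);      /* stands in for BLOCKING_LIBRARY_CALL */
    alarm(0);                    /* cancel the pending alarm */

    sigaction(SIGALRM, &old_sa, NULL);
    return n;
}
```

Using sigsetjmp/siglongjmp rather than plain setjmp/longjmp is a deliberate choice here: jumping out of a handler with longjmp can leave SIGALRM blocked on some systems, so a second timeout would never fire.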
Firstly, if you fork(), you will be creating additional processes, not additional threads. To create additional threads, you want to use pthread_create.
Secondly, as you are a student, the canonical answer here is 'read Stevens'. Not only is this an invaluable tool even for those of us experienced in writing socket I/O routines, but it also contains examples of non-threaded, non-forking async I/O, and various ways to add threads and forking to them. I believe the one you want is: http://www.amazon.com/Programming-Environment-Addison-Wesley-Professional-Computing/dp/0321637739 (chapter 14 if memory serves). This should be in your college library.
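The difference between the two calls is easiest to see with a shared variable. The following is a small sketch, not code from the question; the names counter, bump, bump_in_thread and bump_in_child_process are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;          /* lives in this process's address space */

static void *bump(void *arg)
{
    (void)arg;
    counter += 1;
    return NULL;
}

/* Run bump() in a new thread; returns the counter the caller sees afterwards. */
static int bump_in_thread(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, bump, NULL);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);    /* join also makes the update visible */
    assert(rc == 0);
    return counter;                  /* a thread shares the creator's memory */
}

/* Run bump() in a forked child; returns the counter the parent sees afterwards. */
static int bump_in_child_process(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        bump(NULL);                  /* changes only the child's private copy */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    return counter;                  /* unchanged in the parent */
}
```

Starting from counter == 0, bump_in_child_process() leaves the parent's value at 0, while bump_in_thread() raises it to 1. On many systems the program needs to be built with -pthread.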
An interesting side-effect of this course - I got _really_ good at using a console/command prompt and handling text with vim and pipes and filters and other text manipulation. I think this has helped me be productive at my jobs way more than I thought it would :)
Unix Network Programming - Stevens [1]
[0] https://www.amazon.com/Advanced-Programming-UNIX-Environment...
[1] https://www.amazon.com/Unix-Network-Programming-Sockets-Netw...
Functions like fread are intended to deal with files.
For various reasons, some of them historical, these functions are structured with the idea of a "file pointer" or FILE.
FILE contains information about which file it's presently pointing at, its length and, critically, where in the file it's pointing. When you call fread with a particular size/nmemb (this means "number of members") combination, it will internally advance the position stored in FILE without your help.
In fact, as your program shows later, the only way to access arbitrary regions of a file is to seek (fseek) to them.
Just like this function below doesn't actually use i to increment its value, it just has information about num internal to the stack frame of main and logic internal to increment to run.
This way of dealing with file input/output is part of what could be called a universal model of file I/O, which plays an absolutely critical role in the philosophy of Unix-based operating systems like Mac OS X and Linux, and an even more important one in some of the later attempts to refine these systems, like Plan 9.
If you want to be a skilled programmer, it's critical that you understand the concepts of these APIs and the reasoning behind them by cracking books like "The Linux Programming Interface", "The Art of Unix Programming", "Advanced Programming in the Unix Environment", etc.
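To make the point about fread concrete, here is a minimal sketch showing that the stream keeps track of its own position. The helper name read_and_tell is hypothetical and is not taken from the program under discussion.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read `count` bytes from the stream's current position
 * and report where the stream's position ends up. Returns -1 on a short read.
 */
static long read_and_tell(FILE *fp, char *dst, size_t count)
{
    assert(fp != NULL && dst != NULL);

    /* size = 1, nmemb = count: fread returns the number of members read. */
    if (fread(dst, 1, count, fp) != count)
        return -1L;

    /* We never passed an offset; the FILE object tracked it for us. */
    return ftell(fp);
}
```

On a stream holding "abcdefgh" and positioned at the start, read_and_tell(fp, buf, 4) fills buf with "abcd" and returns 4; calling it again with a count of 2 yields "ef" and returns 6. Only fseek(fp, 1L, SEEK_SET) moves the position back, so that the next read starts at 'b'.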
https://www.amazon.com/Advanced-Programming-UNIX-Environment...
[0] RHCSA & RHCE Training and Exam Preparation Guide by Asghar Ghori. This book will help ensure you know your stuff as a system engineer/administrator.
[1] A Practical Guide to Linux Commands, Editors, and Shell Programming, Third Edition. This book will cover the majority of what you would need and want to know when connecting to a remote Linux system over ssh.
If you want to get under the hood and become an expert, the following books should help get you started:
[2] Advanced Programming in the UNIX Environment
[3] The Linux Programming Interface: A Linux and UNIX System Programming Handbook
[4] Linux Kernel Development 3rd Edition
To get a nice general overview and get up and going quickly:
[5] How Linux works: What every superuser should know
[6] The Linux Command Line
[7] Python Crash Course
[8] Automate the boring stuff with Python. This is a great book to help you think about how to automate most of the repetitive things you will end up doing on a regular basis.
[0] https://www.amazon.com/RHCSA-RHCE-Red-Enterprise-Linux/dp/14...
[1] https://www.amazon.com/Practical-Guide-Commands-Editors-Prog...
[2] https://www.amazon.com/Advanced-Programming-UNIX-Environment...
[3] https://www.amazon.com/Linux-Programming-Interface-System-Ha...
[4] https://www.amazon.com/Linux-Kernel-Development-Robert-Love/...
[5] https://www.amazon.com/How-Linux-Works-Superuser-Should/dp/1...
[6] https://www.amazon.com/Linux-Command-Line-Complete-Introduct...
[7] https://www.amazon.com/Python-Crash-Course-Hands-Project-Bas...
[8] https://www.amazon.com/Automate-Boring-Stuff-Python-Programm...
> A secure coding standard from CERT should focus entirely on describing conventions and program properties that do not already follow from the standard as a matter of correctness.
from CERT 1.7 "The wiki also contains two platform-specific annexes at the time of this writing; one annex for POSIX and one for Windows. These annexes have been omitted from this standard because they are not part of the core standard."
So while the CERT standard does use some examples from system interfaces, it's not a standard for programming the system interfaces for POSIX or Windows. It looks like they're trying to limit the standard to ISO C. The examples you gave fall into the system interface category. POSIX is huge, and the same goes for Windows; both are much bigger than ISO C.
I think in order to explain conventions for a system interface you really need a longer form publication like a book. So you can take 50 pages to describe an interface and how to use it and show examples etc.
The best way that I have found to figure this stuff out is the standard way. You get a copy of all the relevant standards as a foundation: ISO, POSIX, Windows and stuff like CERT. Then you get some of the system programming books (listed below). Then you find some good reference code that shows best practice, usually code from the operating system or its utilities. Lastly, read all the compiler docs and tool docs to set up the best code analysis framework you can.
These are a few system programming books that I use.
(best intro book) GNU/Linux Application Programming https://www.amazon.com/GNU-Linux-Application-Programming/dp/...
UNIX Systems Programming https://www.amazon.com/UNIX-Systems-Programming-Communicatio...
Advanced Programming in the UNIX Environment https://www.amazon.com/Advanced-Programming-UNIX-Environment...
Windows System Programming https://www.amazon.com/Programming-Paperback-Addison-Wesley-...
The Linux Programming Interface http://www.man7.org/tlpi/
edit: I'm not sure of your skill level; you may have seen all of those, but I posted them regardless. There is a lot about security and convention in those books.
Reference Style - All Levels
Beginner
Intermediate
Above Intermediate
Uncategorized Additional C Programming Books
It may be simplest to use the expect program; it does most of the necessary work for you.
The necessary work is fiddly. It involves using pseudo-ttys, which are devices that look to programs like terminals. If you're going to roll your own, then the POSIX system calls you need to know about are:
posix_openpt()
ptsname()
grantpt()
unlockpt()
The posix_openpt() interface is relatively new (Issue 6, compared with Issue 4, Version 2 for the other functions listed). If your system doesn't have posix_openpt(), you need to get yourself one of the Unix books (Stevens or Rochkind, probably) to find out how else to open the master side of a pty, or read your system manuals rather carefully. However, the rationale for posix_openpt() at the link above may also help; it also has guidelines for using the other functions. Linux has posix_openpt(); so does Mac OS X and by inference the BSD systems generally.
Books:
W Richard Stevens, Stephen A Rago Advanced Programming in the Unix Environment, 3rd Edn
Marc J Rochkind Advanced Unix Programming, 2nd Edn
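As a rough illustration of how the four calls fit together, here is a sketch that opens only the master side. The helper name open_pty_master is made up, and the _XOPEN_SOURCE value is an assumption that works with glibc; other systems may need a different feature-test macro or none at all.

```c
#define _XOPEN_SOURCE 600        /* exposes posix_openpt() and friends on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the master side of a pseudo-terminal and copy the
 * slave device name into `name`. Returns the master fd, or -1 on failure.
 */
static int open_pty_master(char *name, size_t name_size)
{
    assert(name != NULL && name_size > 0);

    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master == -1)
        return -1;

    /* Fix ownership/permissions of the slave, then unlock it for opening. */
    if (grantpt(master) == -1 || unlockpt(master) == -1) {
        close(master);
        return -1;
    }

    /* ptsname() returns static storage, so copy the result out right away. */
    const char *slave = ptsname(master);
    if (slave == NULL || strlen(slave) >= name_size) {
        close(master);
        return -1;
    }
    strcpy(name, slave);
    return master;
}
```

The fiddly part comes afterwards: typically the program forks, the child calls setsid(), opens the slave name so that it becomes its controlling terminal, duplicates it onto stdin, stdout and stderr, and then execs the program to be driven, while the parent reads and writes the master descriptor.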
The userspace open function is what you are thinking of; it is a system call that returns a file descriptor (an int). There are plenty of good references for that, such as APUE section 3.3.
The device driver "open method" is a function within the file_operations structure. It is different from the userspace "file open". With the device driver installed, when user code opens the device (e.g. by accessing /dev/scull0), this "open method" gets called.
The nohup command is the poor man's way of running a process as a daemon. As Bruno Ranschaert noted, when you run a command in an interactive shell, it has a controlling terminal and will receive a SIGHUP (hangup) signal when the controlling process (typically your login shell) exits. The nohup command arranges for input to come from /dev/null, and for both output and errors to go to nohup.out, and for the program to ignore interrupts, quit signals, and hangups. It actually still has the same controlling terminal - it just ignores the terminal's controls. Note that if you want the process to run in the background, you have to tell the shell to run it in the background - at least on Solaris (that is, you type 'nohup sleep 20 &'; without the ampersand, the process runs synchronously in the foreground).
Typically, a process run via nohup is something that takes time, but which does not hang around waiting for interaction from elsewhere.
Typically (which means if you try hard, you can find exceptions to these rules), a daemon process is something which lurks in the background, disconnected from any terminal, but waiting to respond to some input of some sort. Network daemons wait for connection requests or UDP messages to arrive over the network, do the appropriate work and send a response back again. Think of a web server, for example, or a DBMS.
When a process fully daemonizes itself, it goes through some of the steps that the nohup code goes through; it rearranges its I/O so it is not connected to any terminal, detaches itself from the process group, ignores appropriate signals (which might mean it doesn't ignore any signals, since there is no terminal to send it any of the signals generated via a terminal). Typically, it forks once, and the parent exits successfully. The child process usually forks a second time, after fixing its process group and session ID and so on; the child then exits too. The grandchild process is now autonomous and won't show up in the ps output for the terminal where it was launched.
You can look at Advanced Programming in the Unix Environment, 3rd Edn by W Richard Stevens and Stephen A Rago, or at Advanced Unix Programming, 2nd Edn by Marc J Rochkind for discussions of daemonization.
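The double-fork sequence described above might be sketched like this. It is a bare-bones illustration, not the code from either book: the helper name become_daemon is invented, and real daemons usually also close inherited descriptors, write a PID file and set up logging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: detach the calling process from its terminal.
 * Only the grandchild returns (0 on success, -1 on failure);
 * the original process and the intermediate child call _exit().
 */
static int become_daemon(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid > 0)
        _exit(0);                /* original process exits successfully */

    if (setsid() == -1)          /* new session, no controlling terminal */
        return -1;

    signal(SIGHUP, SIG_IGN);     /* the session leader's exit may send SIGHUP */

    pid = fork();                /* second fork: give up session leadership */
    if (pid == -1)
        return -1;
    if (pid > 0)
        _exit(0);                /* intermediate child exits too */

    /* Not a session leader, so it cannot acquire a controlling terminal. */
    assert(getpid() != getsid(0));

    umask(0);
    if (chdir("/") == -1)
        return -1;

    /* Re-point stdin, stdout and stderr at /dev/null. */
    int fd = open("/dev/null", O_RDWR);
    if (fd == -1)
        return -1;
    if (dup2(fd, STDIN_FILENO) == -1 || dup2(fd, STDOUT_FILENO) == -1 ||
        dup2(fd, STDERR_FILENO) == -1)
        return -1;
    if (fd > STDERR_FILENO)
        close(fd);
    return 0;
}
```

A program would call become_daemon() once near the top of main() and carry on with its real work only if it returns 0; from the launching shell's point of view the command finishes immediately.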
I have a program daemonize which will daemonize a program that doesn't know how to daemonize itself (properly). It was written to work around the defects in a program which was supposed to daemonize itself but didn't do the job properly. Contact me if you want it - see my profile.