Programming Pearls (2nd Edition)
Programming Pearls might be another good book that promotes enjoyment of computing; certainly a bit more technical, but I think plenty readable for someone interested.
https://www.amazon.com/New-Hackers-Dictionary-3rd/dp/0262680...
https://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp...
Binary search is notoriously tricky to get exactly right. There is a very thorough analysis of the various problems and edge cases, along with a correct implementation, in Programming Pearls, a book that every programmer should probably have read at least once.
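To make the edge cases concrete, here is a minimal sketch of an iterative binary search in Python. It is not the implementation from the book; the function name binary_search and the convention of returning -1 for a missing value are assumptions chosen for illustration.

```python
def binary_search(items, target):
    """Return an index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    # Invariant: if target is present, its index lies in [low, high].
    while low <= high:
        # Written as low + (high - low) // 2, the form that avoids overflow
        # in languages with fixed-width integers (Python ints cannot overflow).
        mid = low + (high - low) // 2
        if items[mid] < target:
            low = mid + 1    # discard mid and everything to its left
        elif items[mid] > target:
            high = mid - 1   # discard mid and everything to its right
        else:
            return mid
    return -1                # the range became empty: target is absent
```

The two classic mistakes are both visible here: the loop must run while low <= high (not low < high), and each branch must exclude mid itself, otherwise the range never shrinks.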
Not true: in the extreme case it's possible that one file contains all the numbers.
Create the files based on the first or last x digits of the numbers (ignore the starting 1). When creating those files you can actually chop those digits because they are equal within a file. This is a lot better than hashing because although all the numbers can still end up in one file, now the range of those numbers is limited, so you can fit it into 10MB.
Each number can be represented by a single bit because the only information you need is whether the number occurred previously. You don't have to store the actual numbers; the address of the bit is the number. In 10MB you can store 80M bits, so you will need 1G/80M = 12.5 files, but remember, the files are keyed on digits, so the count must be a power of 10 and you will actually need 100 files (x=2).
Finally, you don't have to create those files; you can also scan the whole file multiple times. In this case you can have multiple bitmaps in memory, as a single one doesn't occupy the full 10MB.
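The multi-pass variant can be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: the function name find_duplicates, the read_numbers callable (which must return a fresh iterator over the input on every call, standing in for re-reading the file), and the bits_per_pass parameter are all hypothetical.

```python
def find_duplicates(read_numbers, max_value, bits_per_pass):
    """Return the set of numbers in 0..max_value that occur more than once.

    The input is scanned once per window of bits_per_pass values, so the
    bitmap held in memory never exceeds bits_per_pass bits.
    """
    duplicates = set()
    for start in range(0, max_value + 1, bits_per_pass):
        end = min(start + bits_per_pass, max_value + 1)
        bitmap = bytearray((end - start + 7) // 8)   # all bits start at 0
        for n in read_numbers():                     # one full scan per window
            if start <= n < end:
                offset = n - start                   # the bit address is the number
                byte, mask = offset >> 3, 1 << (offset & 7)
                if bitmap[byte] & mask:
                    duplicates.add(n)                # seen before in this window
                else:
                    bitmap[byte] |= mask
    return duplicates


data = [5, 17, 5, 99, 42, 17, 0]
print(find_duplicates(lambda: iter(data), max_value=99, bits_per_pass=32))
```

The trade-off is the usual one: fewer bits per pass means less memory but more scans of the input. The sketch also assumes the set of duplicates itself is small enough to keep in memory.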
I strongly suggest reading this book, it starts with an almost identical example: http://www.amazon.co.uk/Programming-Pearls-ACM-Press-Bentley/dp/0201657880
* Code Complete https://www.amazon.com/Code-Complete-Practical-Handbook-Cons...
* Programming Pearls https://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp...
* Pragmatic Programmer https://www.amazon.com/Pragmatic-Programmer-Journeyman-Maste...
I've wiki'd this post - could those with sufficient rep add in items to it.
System administration, general usage books
Nemeth et al., Linux System Administration
The Armadillo book, as mentioned by Bill The Lizard below.
Anything by Mark Sobell. He does a sort of theme-and-variations for various flavours of unix, so pick the book most appropriate to the environment in hand. The books are quite good. One of his was a prescribed text when I did my B.Sc.
Stevens' TCP/IP illustrated, vol. 1: The Protocols for a comprehensive run down on how TCP/IP works in detail.
I've never read this particular book, but many people here are recommending Unix Power Tools as mentioned by Hortitude.
Programming:
Anything by the late W. Richard Stevens, in particular Advanced Programming in the Unix Environment and Unix Network Programming Vol. 1 and Vol. 2.
Various classic C/Unix books, such as The Unix Programming Environment, Advanced Unix Programming, Programming Pearls and of course K&R. The C/Unix books tend to go into the underlying architecture, and will give a fair degree of insight that's relevant across the board - these are the underlying mechanisms within the system. Anyone trying to do system-level programming (basically anything using system services, no matter what the language) will find a grounding in this to be beneficial.
Specific tools (e.g. Sendmail)
Various of the books from O'Reilly and other publishers cover specific topics. Some of the key ones are:
The Bat book on sendmail - if you have occasion to experience the joys of working with sendmail.cf. If you have a choice of MTA, postfix or qmail are somewhat easier to work with (I've been using postfix since about 2000). O'Reilly publish guides to both of them.
Some classic works on perl: the Camel and Llama books (the latter written by none other than Randal Schwartz).
Sed and awk. Not sure what the critters on the cover are. My copy went south a while ago. While on the subject of this, Mastering Regular Expressions has also gotten a mention here and is a good book on the subject.
Samba. The hornbill (?) book covers this; there is also quite a lot of on-line documentation.
NFS/NIS for those using or maintaining unix or linux clients.
Some of these books have been in print for quite a while and are still relevant. Consequently they are also often available secondhand at much less than list price. Amazon marketplace is a good place to look for such items. It's quite a good way to do a shotgun approach to topics like this for not much money.
As an example, in New Zealand technical books are usuriously expensive due to a weak kiwi peso (as the $NZ is affectionately known in expat circles) and a tortuously long supply chain. You could spend 20% of a week's after-tax pay for a starting graduate on a single book. When I was living there just out of university I used this type of market a lot, often buying books for 1/4 of their list price - including the cost of shipping to New Zealand. If you're not living in a location with tier-1 incomes I recommend this.
E-Books and on-line resources (thanks to israkir for reminding me):
The Linux Documentation Project (www.tldp.org) has many specific topic guides known as HowTos that also often concern third party OSS tools and will be relevant to other Unix variants. It also has a series of FAQs and guides.
Unix Guru's Universe is a collection of unix resources with a somewhat more old-school flavour.
Google. There are many, many unix and linux resources on the web. Search strings like unix commands or learn unix will turn up any amount of online resources.
Safari. This is a subscription service, but you can search the texts of quite a large number of books. I can recommend this as I've used it. They also do site licences for corporate customers.
Some of the philosophy of Unix:
The Art of UNIX Programming by E S Raymond (available online and in print).
The Practice of Programming by B W Kernighan and R Pike.
Quoth cb3k
Here's your code with the minimal (necessary, but not sufficient) fix diagnosed by templatetypedef and a test harness.
Here's the output:
It is returning 0 regardless of whether the value sought is present in the array or not. This is incorrect behaviour.
You should take time out to study Programming Pearls by Jon Bentley. It covers a lot of the basics of testing binary searches in a variety of forms; the test harness shown is a variant on what he describes. Also take the time to read Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken. Maybe you should take reassurance that lots of other people have got binary search wrong over time. (IIRC, the first versions of binary search were published in the 1950s, but it wasn't until the early 1960s that a correct version was published, and then there's the Extra information from 2006, too.)
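For readers who want to try this style of testing, here is a rough Python sketch of an exhaustive harness in that spirit. It is not the harness from this answer or from the book: run_harness, search, and always_false are hypothetical names, and the function under test is a stand-in built on the standard bisect module that reports only whether the value is present, like the interface discussed here.

```python
from bisect import bisect_left


def search(values, target):
    """Stand-in function under test: True if target is in sorted values."""
    i = bisect_left(values, target)
    return i < len(values) and values[i] == target


def always_false(values, target):
    """Deliberately broken search that reports 'not found' every time."""
    return False


def run_harness(search_fn, max_size=20):
    """Probe every array size from 0 to max_size; return a list of failures.

    Each array holds 0, 10, 20, ... so that every element, and every gap
    between and around the elements, can be probed.
    """
    failures = []
    for n in range(max_size + 1):
        values = [10 * i for i in range(n)]
        if search_fn(values, -5):                  # below everything
            failures.append((n, -5, False))
        for i in range(n):
            if not search_fn(values, 10 * i):      # present values must be found
                failures.append((n, 10 * i, True))
            if search_fn(values, 10 * i + 5):      # values in the gaps must not
                failures.append((n, 10 * i + 5, False))
    return failures


print(len(run_harness(search)), len(run_harness(always_false)))
```

Because the arrays are tiny and every position is probed, a search that returns the same answer regardless of input shows up immediately as a list of failures on the "present" probes.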
When I added a printf() in the block after else if (values[middle] == values[low] || values[middle] == values[high]), it printed on every search that should have failed. Note that the interface makes it hard to spot what's happening: it doesn't report where the element is found, just whether it is found. You can add the debugging and code changes necessary to deal with the residual problems. (Hint: that condition is probably not part of the solution. However, when you do remove it, the code goes into a permanent loop because you don't eliminate the value known not to be in the range from the range that you check recursively.)
This seems to work; note that return 2; is never executed (because the final else if is never false).
Output:
I will give you a general solution with some Python pseudo code. What you are trying to solve here is the classical problem from the book "Programming Pearls" by Jon Bentley.
This is solved very efficiently with just a simple bit array, hence my comment asking how long the phone number is (how many digits it has).
Let's say the phone number is at most 10 digits long, then the max phone number you can have is 9 999 999 999 (spaces are used for better readability). Here we can use 1 bit per number to identify if the number is in the set or not (bit is set or not set respectively), thus we are going to use 9 999 999 999 bits to identify each number, i.e.:
bits[0] identifies the number 0 000 000 000
bits[193] identifies the number 0 000 000 193
bits[6592344567] identifies the number 6 592 344 567
Doing so we'd need to pre-allocate 9 999 999 999 bits initially set to 0, which is: 9 999 999 999 / 8 / 1024 / 1024 = around 1.2 GB of memory.
The intersection of numbers held at the end needs comparatively little space on top of the bit representation => at most 600k ints will be stored => 64bit * 600k = around 4.6 MB of raw data (actually int is not stored that efficiently and might use much more); if these are strings you'll probably end up with even more memory requirements.
Parsing a phone number string from a CSV file (line by line or with a buffered file reader), converting it to a number and then doing a constant time memory lookup will be IMO faster than dealing with strings and merging them. Unfortunately, I don't have these phone number files to test, but would be interested to hear your findings.
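A minimal sketch of the bit-array idea, using a plain bytearray from the standard library rather than a third-party package. The function name common_numbers, the max_digits parameter, and the assumption that both inputs yield phone numbers as strings of digits are illustrative choices, not part of the original answer.

```python
def common_numbers(first, second, max_digits=10):
    """Lazily yield each phone number that occurs in both inputs, once.

    first and second are iterables of digit strings. With max_digits=10 the
    bit array takes roughly 1.2 GB, as computed above.
    """
    bits = bytearray((10 ** max_digits + 7) // 8)   # one bit per possible number
    for phone in first:
        n = int(phone)
        bits[n >> 3] |= 1 << (n & 7)                # mark the number as seen
    for phone in second:
        n = int(phone)
        mask = 1 << (n & 7)
        if bits[n >> 3] & mask:                     # constant time lookup
            bits[n >> 3] &= ~mask & 0xFF            # reset so duplicates do not match
            yield n


# Small demonstration with 4-digit numbers to keep the bit array tiny.
first = ["0193", "5550", "0193"]
second = ["5550", "0001", "5550", "0193"]
print(list(common_numbers(first, second, max_digits=4)))   # [5550, 193]
```

Note that converting to int drops leading zeros, so a number is reported as 193 rather than "0193"; format it back with zero padding if the original form matters.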
I used BitArray from the bitstring pip package and it needed around 2 secs to initialize the entire bitstring. Afterwards, scanning the file will use constant memory. At the end I used a set to store the items.
Note 1: This algorithm can be modified to just use the list. In that case the second loop must reset the bit as soon as a number matches, so that duplicates do not match.
Note 2: Storing in the set/list occurs lazily, because we use the generator in the second for loop. Runtime complexity is linear, i.e. O(N).
I would add the following two books if you haven't read them already:
Even as it stands, you're going to have a very busy, hopefully productive break. Good luck!
Response to question edit: If you're interested in learning about databases, then I recommend Database in Depth by Chris Date. I hope by "create a GUI-based database" you mean implementing a front-end application for an existing database back end. There are plenty of database solutions out there, and it will be well worth it for your future career to learn a few of them.
Your solution is O(n^2). The optimal solution is linear. It works so that you scan the array from left to right, taking note of the best sum and the current sum:
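A minimal sketch of that linear scan in Python. The function name max_subarray_sum is hypothetical, and the sketch assumes the empty subarray is allowed, so an all-negative input gives 0.

```python
def max_subarray_sum(values):
    """Return the largest sum of any contiguous run of values (0 if none helps)."""
    best = current = 0
    for v in values:
        # Best sum of a run ending at this element; restart from 0 if the
        # running sum would otherwise go negative.
        current = max(0, current + v)
        best = max(best, current)
    return best


print(max_subarray_sum([31, -41, 59, 26, -53, 58, 97, -93, -23, 84]))   # 187
```

Each element is visited once and only two running values are kept, which is where the O(n) time and O(1) extra space come from.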
This problem was also discussed thoroughly in Programming Pearls: Algorithm Design Techniques (highly recommended). There you can also find a recursive solution, which is not optimal (O(n log n)), but better than O(n^2).
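For comparison, a recursive divide-and-conquer version can be sketched like this. It is an illustration of the general technique rather than a transcription of the book's code; the name max_subarray_dc and the half-open index convention are assumptions, and the empty subarray is again allowed (result 0 for all-negative input).

```python
def max_subarray_dc(values, lo=0, hi=None):
    """Largest contiguous sum in values[lo:hi], by divide and conquer."""
    if hi is None:
        hi = len(values)
    if lo >= hi:                      # empty range
        return 0
    if hi - lo == 1:                  # single element
        return max(0, values[lo])
    mid = (lo + hi) // 2

    # Best sum of a run that ends just left of mid, scanning leftwards.
    left_best = total = 0
    for i in range(mid - 1, lo - 1, -1):
        total += values[i]
        left_best = max(left_best, total)

    # Best sum of a run that starts at mid, scanning rightwards.
    right_best = total = 0
    for i in range(mid, hi):
        total += values[i]
        right_best = max(right_best, total)

    # The answer lies wholly in the left half, wholly in the right half,
    # or crosses the midpoint.
    return max(left_best + right_best,
               max_subarray_dc(values, lo, mid),
               max_subarray_dc(values, mid, hi))
```

Each level of recursion does linear work for the crossing case and there are about log n levels, which gives the O(n log n) bound.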
I suggest Programming Pearls, 2nd edition, by Jon Bentley. He talks a lot about algorithm design techniques and provides examples of real world problems, how they were solved, and how different algorithms affected the runtime.
Throughout the book, you learn algorithm design techniques, program verification methods to ensure your algorithms are correct, and you also learn a little bit about data structures. It's a very good book and I recommend it to anyone who wants to master algorithms. Go read the reviews on Amazon: http://www.amazon.com/Programming-Pearls-2nd-Edition-Bentley/dp/0201657880
You can have a look at some of the book's contents here: http://netlib.bell-labs.com/cm/cs/pearls/
Enjoy!
Bentley is also well known for his book "Programming Pearls." The 2nd edition is still in print.
http://www.amazon.com/Programming-Pearls-2nd-ACM-press/dp/02...
1. Programming Pearls, http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/...
2. Effective C++, http://www.amazon.com/Effective-Specific-Improve-Programs-De...
3. Programming Problems, http://www.amazon.com/Programming-Problems-Primer-Technical-...
The reason for these texts is not that they are overly insightful or well written; it is that they have a large number of problems with completely coded solutions. After working through these basics, programming interviews are much more enjoyable.
http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/...