All Comments
TopTalkedBooks posted at August 19, 2017
Thank you for your answers. To add something myself, the book Programming Pearls contains exactly this - short, elegant solutions, plus some extra food for thought and exercises.

http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/...

TopTalkedBooks posted at August 19, 2017
There are a number of programming books that I use to prepare for technical interviews. These are

1. Programming pearls, http://www.amazon.com/Programming-Pearls-2nd-Jon-Bentley/dp/...

2. Effective C++, http://www.amazon.com/Effective-Specific-Improve-Programs-De...

3. Programming Problems, http://www.amazon.com/Programming-Problems-Primer-Technical-...

The reason for these texts is not because they are overtly insightful or well written, it is because they have a large number of problems with completely coded solutions. After working through these basics, programming interviews are much more enjoyable.

TopTalkedBooks posted at August 19, 2017
The bibliography link in the original article was unfortunately broken.

Bentley is also well known for his book "Programming Pearls." The 2nd edition is still in print.

http://www.amazon.com/Programming-Pearls-2nd-ACM-press/dp/02...

TopTalkedBooks posted at August 20, 2017

I suggest Programming Pearls, 2nd edition, by Jon Bentley. He talks a lot about algorithm design techniques and provides examples of real world problems, how they were solved, and how different algorithms affected the runtime.

Throughout the book, you learn algorithm design techniques, program verification methods to ensure your algorithms are correct, and you also learn a little bit about data structures. It's a very good book and I recommend it to anyone who wants to master algorithms. Go read the reviews in amazon: http://www.amazon.com/Programming-Pearls-2nd-Edition-Bentley/dp/0201657880

You can have a look at some of the book's contents here: http://netlib.bell-labs.com/cm/cs/pearls/

Enjoy!

TopTalkedBooks posted at August 20, 2017

Your solution is O(n^2). The optimal solution is linear. It works so that you scan the array from left to right, taking note of the best sum and the current sum:

def get_max_sum_subset(x):
    bestSoFar = 0
    bestNow = 0
    bestStartIndexSoFar = -1
    bestStopIndexSoFar = -1
    bestStartIndexNow = -1
    for i in xrange(len(x)):
        value = bestNow + x[i]
        if value > 0:
            if bestNow == 0:
                bestStartIndexNow = i
            bestNow = value
        else:
            bestNow = 0

        if bestNow > bestSoFar:
            bestSoFar = bestNow
            bestStopIndexSoFar = i
            bestStartIndexSoFar = bestStartIndexNow

    return bestSoFar, bestStartIndexSoFar, bestStopIndexSoFar

This problem was also discussed thourougly in Programming Pearls: Algorithm Design Techniques (highly recommended). There you can also find a recursive solution, which is not optimal (O(n log n)), but better than O(n^2).

TopTalkedBooks posted at August 20, 2017

I would add the following two books if you haven't read them already:

Programming Pearls by Jon Bentley

Code by Charles Petzold

Even as it stands, you're going to have a very busy, hopefully productive break. Good luck!

Response to question edit: If you're interested in learning about databases, then I recommend Database in Depth by Chris Date. I hope by "create a GUI-based database" you mean implementing a front-end application for an existing database back end. There are plenty of database solutions out there, and it will be well worth it for your future career to learn a few of them.

TopTalkedBooks posted at August 20, 2017

I will give you general solution with some Python pseudo code. What you are trying to solve here is the classical problem from the book "Programming Pearls" by Jon Bentley.

This is solved very efficiently with just a simple bit array, hence my comment, how long is (how many digits does have) the phone number.

Let's say the phone number is at most 10 digits long, than the max phone number you can have is: 9 999 999 999 (spaces are used for better readability). Here we can use 1bit per number to identify if the number is in set or not (bit is set or not set respectively), thus we are going to use 9 999 999 999 bits to identify each number, i.e.:

  • bits[0] identifies the number 0 000 000 000
  • bits[193] identifies the number 0 000 000 193
  • having a number 659 234-4567 would be addressed by the bits[6592344567]

Doing so we'd need to pre-allocate 9 999 999 999 bits initially set to 0, which is: 9 999 999 999 / 8 / 1024 / 1024 = around 1.2 GB of memory.

I think that holding the intersection of numbers at the end will use more space than the bits representation => at most 600k ints will be stored => 64bit * 600k = around 4.6 GB (actually int is not stored that efficiently and might use much more), if these are string you'll probably end with even more memory requirements.

Parsing a phone number string from CSV file (line by line or buffered file reader), converting it to a number and than doing a constant time memory lookup will be IMO faster than dealing with strings and merging them. Unfortunately, I don't have these phone number files to test, but would be interested to hear your findings.

from bitstring import BitArray


max_number = 9999999999

found_phone_numbers = BitArray(length=max_number+1)


# replace this function with the file open function and retrieving
#  the next found phone number
def number_from_file_iteator(dummy_data):
    for number in dummy_data:
        yield number


def calculate_intersect():
    # should be open a file1 and getting the generator with numbers from it
    #  we use dummy data here
    for number in number_from_file_iteator([1, 25, 77, 224322323, 8292, 1232422]):
        found_phone_numbers[number] = True

    # open second file and check if the number is there
    for number in number_from_file_iteator([4, 24, 224322323, 1232422, max_number]):
        if found_phone_numbers[number]:
            yield number


number_intersection = set(calculate_intersect())

print number_intersection

I used BitArray from bitstring pip package and it needed around 2 secs to initialize the entire bitstring. Afterwards, scanning the file will use constant memory. At the end I used a set to store the items.

Note 1: This algorithm can be modified to just use the list. In that case a second loop as soon as but number matches must reset the bit, so that duplicates do not match.

Note 2: Storing in the set/list occurs lazy, because we use the generator in the second for loop. Runtime complexity is linear, i.e. O(N).

TopTalkedBooks posted at August 20, 2017

Quoth cb3k

That seemed to make it work...what might the other problems be?

Here's your code with the minimal (necessary, but not sufficient) fix diagnosed by templatetypedef and a test harness.

#include <stdio.h>

static
int search(int value, int values[], int n, int high, int low)
{
    if (n <= 0)
    {
        return 1;
    }

    int middle = (high + low) / 2;

    ///condition #1
    if (value == values[middle])
    {
        return 0;
    }

    // conditions #2 and #3 (account for the maxes and mins of the array because the operation to find middle truncates)
    else if (values[middle] == values[low] || values[middle] == values[high])
    {
        return 0;
    }

    else if (value > values[middle])
    {
        low = middle;
        return search(value, values, n, high, low);
    }

    else if (value < values[middle])
    {
        high = middle;
        return search(value, values, n, high, low);
    }

    return 2;
}

int main(void)
{
    int data[15];
    for (int i = 0; i < 15; i++)
        data[i] = 2 * i + 1;

    printf("Data:");
    for (int i = 0; i < 15; i++)
        printf("%3d", data[i]);
    putchar('\n');

    for (int i = -1; i < 2 * 15 + 3; i++)
        printf("Search for %2d - result %d\n", i, search(i, data, 15, 14, 0));
    return 0;
}

Here's the output:

Data:  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29
Search for -1 - result 0
Search for  0 - result 0
Search for  1 - result 0
Search for  2 - result 0
Search for  3 - result 0
Search for  4 - result 0
Search for  5 - result 0
Search for  6 - result 0
Search for  7 - result 0
Search for  8 - result 0
Search for  9 - result 0
Search for 10 - result 0
Search for 11 - result 0
Search for 12 - result 0
Search for 13 - result 0
Search for 14 - result 0
Search for 15 - result 0
Search for 16 - result 0
Search for 17 - result 0
Search for 18 - result 0
Search for 19 - result 0
Search for 20 - result 0
Search for 21 - result 0
Search for 22 - result 0
Search for 23 - result 0
Search for 24 - result 0
Search for 25 - result 0
Search for 26 - result 0
Search for 27 - result 0
Search for 28 - result 0
Search for 29 - result 0
Search for 30 - result 0
Search for 31 - result 0
Search for 32 - result 0

It is returning 0 regardless of whether the value sought is present in the array or not. This is incorrect behaviour.

You should take time out to study Programming Pearls by Jon Bentley. It covers a lot the basics of the testing of binary searches in a variety of forms — the test harness shown is a variant on what he describes. Also take the time to read Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken. Maybe you should take reassurance that lots of other people have got binary search wrong over time. (IIRC, the first versions of binary search were published in the 1950s, but it wasn't until the early 1960s that a correct version was published — and then there's the Extra information from 2006, too.)

When I added a printf() in the block after else if (values[middle] == values[low] || values[middle] == values[high]), it printed on every search that should have failed. Note that the interface makes it hard to spot what's happening — it doesn't report where the element is found, just whether it is found. You can add the debugging and code changes necessary to deal with the residual problems. (Hint: that condition is probably not part of the solution. However, when you do remove it, the code goes into a permanent loop because you don't eliminate the value known not to be in the range from the range that you check recursively.)

This seems to work — note that return 2; is never executed (because the final else if is never false.

#include <stdio.h>

static
int search(int value, int values[], int n, int high, int low)
{
    //if (n <= 0)
    if (n <= 0 || high < low)
    {
        return 1;
    }

    int middle = (high + low) / 2;

    ///condition #1
    if (value == values[middle])
    {
        return 0;
    }

#if 0
    // conditions #2 and #3 (account for the maxes and mins of the array because the operation to find middle truncates)
    else if (values[middle] == values[low] || values[middle] == values[high])
    {
        //printf(" (#2 || #3) ");
        return 0;
    }
#endif

    else if (value > values[middle])
    {
        //low = middle;
        low = middle + 1;
        return search(value, values, n, high, low);
    }

    else if (value < values[middle])
    {
        //high = middle;
        high = middle - 1;
        return search(value, values, n, high, low);
    }

    return 2;
}

int main(void)
{
    int data[15];
    for (int i = 0; i < 15; i++)
        data[i] = 2 * i + 1;

    printf("Data:");
    for (int i = 0; i < 15; i++)
        printf("%3d", data[i]);
    putchar('\n');

    for (int i = -1; i < 2 * 15 + 3; i++)
        printf("Search for %2d - result %d\n", i, search(i, data, 15, 14, 0));
    return 0;
}

Output:

Data:  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29
Search for -1 - result 1
Search for  0 - result 1
Search for  1 - result 0
Search for  2 - result 1
Search for  3 - result 0
Search for  4 - result 1
Search for  5 - result 0
Search for  6 - result 1
Search for  7 - result 0
Search for  8 - result 1
Search for  9 - result 0
Search for 10 - result 1
Search for 11 - result 0
Search for 12 - result 1
Search for 13 - result 0
Search for 14 - result 1
Search for 15 - result 0
Search for 16 - result 1
Search for 17 - result 0
Search for 18 - result 1
Search for 19 - result 0
Search for 20 - result 1
Search for 21 - result 0
Search for 22 - result 1
Search for 23 - result 0
Search for 24 - result 1
Search for 25 - result 0
Search for 26 - result 1
Search for 27 - result 0
Search for 28 - result 1
Search for 29 - result 0
Search for 30 - result 1
Search for 31 - result 1
Search for 32 - result 1
TopTalkedBooks posted at August 20, 2017

I've wiki'd this post - could those with sufficient rep add in items to it.

System administration, general usage books

Programming:

Specific tools (e.g. Sendmail)

Various of the books from O'Reilly and other publishers cover specific topics. Some of the key ones are:

Some of these books have been in print for quite a while and are still relevant. Consequently they are also often available secondhand at much less than list price. Amazon marketplace is a good place to look for such items. It's quite a good way to do a shotgun approach to topics like this for not much money.

As an example, in New Zealand technical books are usurously expensive due to a weak kiwi peso (as the $NZ is affectionately known in expat circles) and a tortuously long supply chain. You could spend 20% of a week's after-tax pay for a starting graduate on a single book. When I was living there just out of university I used this type of market a lot, often buying books for 1/4 of their list price - including the cost of shipping to New Zealand. If you're not living in a location with tier-1 incomes I recommend this.

E-Books and on-line resources (thanks to israkir for reminding me):

  • The Linux Documentation project (www.tldp.org), has many specific topic guides known as HowTos that also often concern third party OSS tools and will be relevant to other Unix variants. It also has a series of FAQ's and guides.

  • Unix Guru's Universe is a collection of unix resources with a somewhat more old-school flavour.

  • Google. There are many, many unix and linux resources on the web. Search strings like unix commands or learn unix will turn up any amount of online resources.

  • Safari. This is a subscription service, but you can search the texts of quite a large number of books. I can recommend this as I've used it. They also do site licences for corporate customers.

Some of the philosophy of Unix:

TopTalkedBooks posted at September 05, 2017

Repeated reallocation for a change of one in the array size is not particularly good. You'd likely do better with allocating as much space as you'll need all at once, or keep track of what's allocated and what's in use and only allocate more when necessary. Jon Bentley's Programming Pearls, 2nd Edn from 1999 and More Programming Pearls from 1988 include code for handling array-based heaps as you are using. You can find an outline of that code in this answer of mine.

Top Books
We collected top books from hacker news, stack overflow, Reddit, which are recommended by amazing people.