The task of determining the proper part of speech for a word in a text is called Part of Speech Tagging. The Brill tagger, for example, uses a mixture of dictionary(vocabulary) words and contextual rules. I believe that some of the important initial dictionary words for this task are the stop words.
Once you have (mostly correct) parts of speech for your words, you can start building larger structures. This industry-oriented book differentiates between recognizing noun phrases (NPs) and recognizing named entities.
About textbooks: Allen's Natural Language Understanding is a good, but a bit dated, book. Foundations of Statistical Natural Language Processing is a nice introduction to statistical NLP. Speech and Language Processing is a bit more rigorous and maybe more authoritative. The Association for Computational Linguistics is a leading scientific community on computational linguistics.
Without using NLP libraries you'll have to write some of their functionality yourself. Although this can be educational you should know it can also be very time consuming.
Some academic resources:
There's a short and a long answer to this.
The short answer:
Use the WordNet::Similarity Perl package. If Perl is not your language of choice, check the WordNet project page at Princeton, or google for a wrapper library.
The long answer:
Determining word similarity is a complicated issue, and research is still very hot in this area. To compute similarity, you need an appropriate represenation of the meaning of a word. But what would be a representation of the meaning of, say, 'chair'? In fact, what is the exact meaning of 'chair'? If you think long and hard about this, it will twist your mind, you will go slightly mad, and finally take up a research career in Philosophy or Computational Linguistics to find the truth™. Both philosophers and linguists have tried to come up with an answer for literally thousands of years, and there's no end in sight.
So, if you're interested in exploring this problem a little more in-depth, I highly recommend reading Chapter 20.7 in Speech and Language Processing by Jurafsky and Martin, some of which is available through Google Books. It gives a very good overview of the state-of-the-art of distributional methods, which use word co-occurrence statistics to define a measure for word similarity. You are not likely to find libraries implementing these, however.