To me, the key intuitive aspect that distinguishes XOR from the other logical operators is that it is lossless, or non-lossy, meaning that, unlike AND, and OR (and more similar to NOT in this regard), it is deterministcally reversible: You can exactly recover one of the input values given the rest of the computation history.
The following diagrams illustrate that AND and OR each have at least one case where the state of one of the inputs is irrecoverable, given a certain value of the other input. I indicate these as "lost" inputs.
For the XOR gate, there is no condition in which an input or output value cannot be recovered, given the rest of the computation history. In fact, there's a symmetry that knowing any two values of the tuple (in0, in1, out) allows you to recover the third; specifically, whichever one you choose as the third value is the XOR of the other two!
(in0, in1, out)
This picture suggests that another way to think of the XOR operation is as a controllable NOT gate. By toggling one of the inputs (the upper one in the example above), you can control whether or not the other (lower) one is negated.
Yet another equivalent view is that XOR implements the positive-logic not-equals (≠) function with respect to its two inputs. And thus also the equals function (=) under negative-logic.
In accordance with its symmetry and information-preserving properties, XOR should come to mind for problems that require reversibility or perfect data recovery. The most obvious example is that XORing a dataset with a constant 'key' trivially obscures the data such that knowing the key (which might be kept "secret"), allows for exact recovery.
Preserving all of the available information is also desirable in hashing. Because you want hash values that maximally discriminate amongst the source items, you want to make sure that as many of their distinguishing characteristics as possible are incorporated, minimizing loss, in the hash code. For example, to hash a 64-bit value into 32 bits, you would use the programming language XOR operator ^ because it's an easy way to guarantee that each of the 64 input bits has an opportunity to influence the output:
uint GetHashCode(ulong ul)
return (uint)ul ^ (uint)(ul >> 32);
Note that in this example, information is lost even though XOR was used. (In fact, ‘strategic information loss’ is kind of the whole point of hashing). The original value of ul is not recoverable from the hash code, because you don't have two out of the three 32-bit values that where used in the internal computation. Out of the resulting hash code and the two values that were XORed, you may have saved the result, but typically do not save either of the latter to use as a key value for obtaining the other.
As an amusing aside, XOR was uniquely helpful in the days of bit-twiddling hacks. My contribution back then was a way to Conditionally set or clear bits without branching in C/C++:
unsigned int v; // the value to modify
unsigned int m; // mask: the bits to set or clear
int f; // condition: 0 to 'set', or 1 to 'clear'
v ^= (-f ^ v) & m; // if (f) v |= m; else v &= ~m;
On a more serious note, the fact that XOR is non-lossy has important information-theoretical implications for futuristic computing, due to an important relationship between information processing and the Second Law of Thermodynamics. As explained in an excellent and accessible book by Charles Seife, "Decoding the Universe", it turns out that the loss of information during computation has an exact mathematical relationship with the black-body radiation emanated by a processing system. The notion of entropy plays a central role in quantifying how information loss is expressed as heat. Noting that modern CPU development faces fundamental toleration limitations on the watts/cm² of semiconducting materials, a solution (described by Seife) would be to design reversible, or lossless, computing systems, where XOR's ability to preserve information—and thus shunt away heat—would play an invaluable role.