One of the most useful things about this function is its differentiability. That is to say, the first derivative of the function is easily expressible; you can easily figure out how this function changes as its input changes.
The derivative of the function is f'(x) = f(x)(1 - f(x)), where f(x) is the sigmoid function applied to the net input in our neural network.
So, to get the derivative of our sigmoid function, we take the sigmoid of the net input and multiply it by one minus the sigmoid of the net input. This result tells us how the function changes as the input changes; the derivative tells us the rate of change. This becomes useful because the easy calculation of f(x)(1 - f(x)) can tell us exactly how our network is changing for any given f(x).
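To make that concrete, here is a minimal Python sketch (the function names are my own, not from any particular library) that computes the sigmoid and its derivative via f(x)(1 - f(x)), then checks the result against a numerical finite difference:

```python
import math

def sigmoid(x):
    # Squashes any real-valued net input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # The rate of change of the sigmoid is just f(x) * (1 - f(x)).
    fx = sigmoid(x)
    return fx * (1.0 - fx)

# Sanity check: compare against a numerical finite difference.
net_input = 0.5
h = 1e-6
numeric = (sigmoid(net_input + h) - sigmoid(net_input - h)) / (2 * h)
print(sigmoid_derivative(net_input))  # ~0.2350
print(numeric)                        # ~0.2350
```

The two printed values agree to several decimal places, which is exactly why f(x)(1 - f(x)) is so convenient when training a network.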
The sigmoid function is also useful because it is bounded between 0 and 1. If a function doesn't have an upper bound, real-world inputs can push its output to extreme values, which can damage the usefulness of a network.
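For example, even extreme inputs stay inside that range (again, a small sketch using my own function name):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# No matter how large or small the input, the output never leaves (0, 1).
print(sigmoid(-50))  # ~1.9e-22: effectively 0, but never below it
print(sigmoid(0))    # 0.5
print(sigmoid(50))   # ~1.0, but never above it
```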
A study of sigmoid functions will sometimes lead you to other types of sigmoid functions. I'd be interested to see how useful a double sigmoid function is. It still squashes, but it can be bounded between -1 and 1, preserving the sign of our original values. A double sigmoid function essentially bonds two sigmoid functions together, and it has the useful property of normalizing a function's output. It does have some drawbacks: it has four inflection points rather than one, so its curvature changes sign several times. However, this only affects the second derivative; the first derivative does not change sign. The only real issue is that the values might not be ideal near some of these inflection points.
Below is the formula for a specific double sigmoid function and its result in Mac OS X's Grapher application:

[Double sigmoid formula and Grapher plot]
Note the flat point at about (1, 0). This is the "bonding" location between the two sigmoid functions.
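As a rough sketch, here is one common way to build a double sigmoid (this is my own illustration with hypothetical `center` and `width` parameters, not necessarily the exact formula graphed above): two sigmoid-shaped halves bonded at x = 1, bounded between -1 and 1, with the flat point at (1, 0):

```python
import math

def double_sigmoid(x, center=1.0, width=1.0):
    # Two sigmoid-shaped halves bonded at `center`.
    # The output runs from -1 to 1, with a flat point at (center, 0).
    u = (x - center) / width
    half = 1.0 - math.exp(-u * u)  # rises from 0 toward 1 as |u| grows
    return half if x >= center else -half

for x in [-3, -1, 0, 1, 2, 3, 5]:
    print(x, round(double_sigmoid(x), 4))
```

It squashes, it stays between -1 and 1, and it flattens out at the bonding point, matching the behavior described above.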