Answer:
Its derivative is always nonzero, so Gradient Descent can always roll down the slope.
Step-by-step explanation:
The logistic activation function was a key ingredient in training the first MLPs because its derivative is always nonzero: for σ(z) = 1 / (1 + exp(−z)), the derivative is σ′(z) = σ(z)(1 − σ(z)), which is strictly positive for every z, so Gradient Descent can always roll down the slope. When the activation function is a step function, the derivative is zero everywhere (and undefined at the step itself), so there is no slope at all and Gradient Descent cannot move.
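
As a minimal numerical sketch (in Python with NumPy; the function names here are illustrative, not from the original answer), the contrast is easy to verify: the logistic function's derivative is positive at every input, while the step function's derivative is zero wherever it is defined, giving Gradient Descent nothing to follow.

    import numpy as np

    def logistic(x):
        # sigma(x) = 1 / (1 + exp(-x))
        return 1.0 / (1.0 + np.exp(-x))

    def logistic_derivative(x):
        # sigma'(x) = sigma(x) * (1 - sigma(x)); strictly positive for all x
        s = logistic(x)
        return s * (1.0 - s)

    def step(x):
        # Heaviside step: 0 for x < 0, 1 for x >= 0
        return np.where(x >= 0, 1.0, 0.0)

    def step_derivative(x):
        # Zero everywhere it is defined (undefined at x = 0),
        # so it provides no gradient signal
        return np.zeros_like(x)

    xs = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
    print(logistic_derivative(xs))  # approx [0.0177 0.1966 0.25 0.1966 0.0177] -- always > 0
    print(step_derivative(xs))      # [0. 0. 0. 0. 0.] -- no slope for Gradient Descent

Running this shows the logistic derivative peaking at 0.25 at x = 0 and shrinking toward zero in the tails (but never reaching it), whereas the step function's derivative is identically zero, which is exactly why Gradient Descent stalls with step activations.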