Mean Squared Error squares the difference between the correct output and the predicted output, and takes the average of all these squared errors. The square plays a similar role to the absolute value in Mean Absolute Error, making the sign of the error irrelevant.
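As a minimal sketch of that definition, the calculation can be written in a few lines of plain Python. The function name `mean_squared_error` and the sample values are assumptions made up for illustration; they are not taken from any particular library or dataset.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between correct and predicted outputs."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # Squaring removes the sign, so over- and under-predictions count the same.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative values only: the errors are 1, 0 and -2.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]
mse = mean_squared_error(y_true, y_pred)  # (1 + 0 + 4) / 3
```

Swapping the two arguments gives the same result, since each difference only changes sign before it is squared.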
The interesting thing about the formulas for Evaluation Metrics is that using different formulas creates different biases. For example, because each error is squared, its contribution to MSE grows quadratically: doubling an error quadruples its contribution. This makes MSE penalize large errors much more heavily than small ones.
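To make this bias concrete, here is a small sketch comparing Mean Absolute Error and MSE on two made-up sets of errors that have the same total absolute error. The helper names `mae` and `mse` and the error values are assumptions chosen for illustration.

```python
def mae(errors):
    """Mean Absolute Error computed from a list of raw errors."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean Squared Error computed from a list of raw errors."""
    return sum(e ** 2 for e in errors) / len(errors)


# Both sets have a total absolute error of 8, distributed differently.
spread_out = [2.0, 2.0, 2.0, 2.0]  # four moderate errors
one_large = [0.0, 0.0, 0.0, 8.0]   # a single large error

mae_spread, mae_large = mae(spread_out), mae(one_large)  # 2.0 and 2.0
mse_spread, mse_large = mse(spread_out), mse(one_large)  # 4.0 and 16.0
```

Mean Absolute Error treats the two cases as equally bad, while MSE rates the single large error four times worse than the four moderate ones.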
If you look carefully, this is the same as the cost function we used for linear regression!