You saw that the cost function for linear regression generates a surface with large flat regions when it is applied to logistic regression. Such a cost function is not suitable for gradient descent, because the gradient is close to zero in flat regions and the parameter updates stall there.

The improved cost function is:
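
As a reference sketch, the formula is assumed here to be the standard log loss (binary cross-entropy) for logistic regression. The symbols $w$ and $b$ (the two model parameters), $m$ (the number of examples), and $f_{w,b}$ (the logistic model) are assumed notation, not taken from the text:

$$
J(w, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log f_{w,b}\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - f_{w,b}\left(x^{(i)}\right)\right) \right],
\qquad
f_{w,b}(x) = \frac{1}{1 + e^{-(w x + b)}}
$$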

The three-dimensional graph below shows the values of the improved cost function for the same dataset, over a range of values of the two model parameters.
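
A surface like this can be computed by evaluating the cost at every point on a grid of parameter values. The following is a minimal sketch, assuming the improved cost function is the standard log loss. The dataset, the parameter names `w` and `b`, and the grid ranges are hypothetical and chosen only for illustration:

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss_cost(w, b, x, y, eps=1e-12):
    """Average log loss of a one-feature logistic model (assumed cost)."""
    p = sigmoid(w * x + b)
    # Clip predictions away from 0 and 1 so the logarithms stay finite.
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Hypothetical toy dataset: one feature, binary labels.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

# Grid of parameter values; each cell of cost_grid is one point on the surface.
w_values = np.linspace(-10.0, 10.0, 41)
b_values = np.linspace(-10.0, 10.0, 41)
cost_grid = np.array(
    [[log_loss_cost(w, b, x, y) for w in w_values] for b in b_values]
)
```

Rows of `cost_grid` correspond to values of `b` and columns to values of `w`, which is the layout that most 3D surface plotting functions expect.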

You can click and drag to rotate the graph, scroll to zoom in and out, and hover over the data points in the graph to see the two parameter values and the corresponding cost.

As you can see, for a logistic model, the improved cost function generates a surface with a single “fold.” The process of gradient descent can quickly and easily find the parameter values that minimize this function.
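
To illustrate, here is a minimal sketch of gradient descent on such a surface, assuming the improved cost function is the standard log loss. The dataset, the parameter names `w` and `b`, the learning rate, and the step count are hypothetical choices for illustration, not values from the lesson:

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss_cost(w, b, x, y, eps=1e-12):
    """Average log loss of a one-feature logistic model (assumed cost)."""
    p = np.clip(sigmoid(w * x + b), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def gradient_descent(x, y, learning_rate=0.1, num_steps=5000):
    """Minimize the log loss, starting from w = 0 and b = 0."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(num_steps):
        # For the log loss, both partial derivatives reduce to averages
        # of the prediction error (prediction minus label).
        error = sigmoid(w * x + b) - y
        grad_w = float(np.mean(error * x))
        grad_b = float(np.mean(error))
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(log_loss_cost(w, b, x, y))
    return w, b, history

# Hypothetical toy dataset: one feature, binary labels.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

w_fit, b_fit, history = gradient_descent(x, y)
print(f"w = {w_fit:.3f}, b = {b_fit:.3f}, final cost = {history[-1]:.4f}")
```

With a suitably small learning rate, the recorded cost in `history` falls at every step, which reflects the single-fold shape of the surface: there are no flat regions or separate local minima for the updates to get stuck in.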