Unraveling the Role of Derivatives in Machine Learning: I thought I was done with math!

By Daniel Glass

Originally Posted on LinkedIn

In our previous post, we discussed the role of approximating functions. In this post, we are going to discuss everyone's favorite topic: derivatives.

A derivative is typically thought of as a rate of change. For our purposes, we are going to think of derivatives a bit differently. Let's define a derivative as the measure of how sensitive the output of a function is to changes in its input. In other words, how much does `y` change when we nudge `x` up or down? This is the key to training our model.

Let's say we have a task that took 54 seconds to complete. We push the input through our model and we get 21 seconds as our predicted output. Not quite right, is it? We established that a derivative is the measure of how much the output of a function changes when its input is changed. In this case, we aren't going to change anything about the input. Instead, we are going to change things within our model -- the pieces of it that influence the output. So, if we need our output to go from 21 seconds to 54 seconds, how much do we need to change each piece to get us there? We can use derivatives to answer this question.

Let's use our `y = 2x + 5` example. We will define it symbolically as `y = mx + b` -- where `x` is the input, `y` is the output, and `m` and `b` are pieces of the function.

If `x` is 3, `m` is 2, and `b` is 5, then `y` will be 11 (11 = 2 * 3 + 5). Now, if we increase `m` by 1, what happens to `y`? If we decrease `m` by 1, what happens? The measure of how much `y` changes when `m` is changed represents the derivative of `y` with respect to `m`.
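To make the nudge concrete, here is a minimal Python sketch of the example above. The function name `predict` and the variable names are illustrative choices, not something from the original post. It nudges `m` up and down by 1 and measures how far `y` moves each time.

```python
def predict(x, m, b):
    """Evaluate the line y = m * x + b."""
    return m * x + b


x, m, b = 3, 2, 5

baseline = predict(x, m, b)      # 2 * 3 + 5 = 11
nudged_up = predict(x, m + 1, b)    # 3 * 3 + 5 = 14
nudged_down = predict(x, m - 1, b)  # 1 * 3 + 5 = 8

# How much did y move for each nudge of m?
change_up = nudged_up - baseline      # 3
change_down = nudged_down - baseline  # -3

print(baseline, change_up, change_down)
```

Each nudge of 1 in `m` moves `y` by 3, which is exactly the value of `x`. For this function, the derivative of `y` with respect to `m` is `x`. Nudging `b` instead would move `y` by exactly 1 per unit, so the two pieces influence the output by different amounts.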

Going back to our task prediction model -- we need our predicted output to go up (21 -> 54). Therefore, we need to figure out how much each piece of our model needs to be changed in order for our output to increase. Each piece will influence the output differently. Once we have changed each piece, we try pushing the input through our model again to see if our predicted output is any closer to the actual output. We repeat these steps again and again, for each task example that we have, until we are satisfied with how well our model predicts task times. And there you go! Training in a nutshell.
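The loop described above can be sketched in a few lines of Python. Several things here are assumptions made for illustration and do not come from the post: the model is the same `y = mx + b` line, the input value 8 is hypothetical (picked so the starting prediction is 21), the error is measured as squared error, and the learning rate and number of passes are arbitrary. Treat it as a rough sketch of one common approach (gradient descent), not as the way any particular model is trained.

```python
def predict(x, m, b):
    """Evaluate the line y = m * x + b."""
    return m * x + b


def train(examples, m, b, learning_rate=0.005, epochs=20):
    """Repeatedly adjust m and b to shrink the squared prediction error."""
    for _ in range(epochs):
        for x, actual in examples:
            error = predict(x, m, b) - actual
            # Derivatives of the squared error (error ** 2)
            # with respect to each piece of the model.
            grad_m = 2 * error * x
            grad_b = 2 * error
            # Move each piece a small step in the direction that reduces the error.
            m -= learning_rate * grad_m
            b -= learning_rate * grad_b
    return m, b


# One task example: hypothetical input 8, actual time 54 seconds.
examples = [(8, 54)]

m, b = 2.0, 5.0
before = predict(8, m, b)  # 21.0, the too-low prediction

m, b = train(examples, m, b)
after = predict(8, m, b)   # very close to 54 after training

print(before, after)
```

Notice that `grad_m` is scaled by `x` while `grad_b` is not, so `m` gets a bigger adjustment than `b`. That is the "each piece will influence the output differently" idea in action. With many task examples instead of one, the same loop would settle on values of `m` and `b` that work reasonably well across all of them.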

In upcoming posts, we are going to talk about large language models and how they utilize these concepts. Stay tuned!

Check out how Cellaware has implemented real-world applications for ML/AI in the warehousing and distribution industry. Schedule your free demo today!
