(A journey for dummies into logical thinking)

High school students are often fascinated by machine learning and artificial intelligence. But most of the time they are confused by the misinterpretations and misleading ideas about machine learning that spread through the media. Today we will explain the simplest form of machine learning to clear up that confusion and help students get started.

Machine learning is the new art of mathematics and data, where we solve a problem using data, math, and algorithms from computer science. When we were in high school, we used to find the best formula by thinking and scribbling with pen and paper. Then we would plug some values from a certain domain into that formula or function to get some output.

In machine learning we do just the opposite: we collect input values from a certain domain together with their output values. These are our food, or training set, on which the machine will be trained to predict future outcomes. Then we try to estimate the best function, formula, or model: one that gives a similar output value whenever an input value from the training set is fed into it. That means machines, just like humans reading books, can be trained by reading data. They can then do new tasks on the basis of the training data. It’s like machines are getting educated from data.

The whole process can be simplified like this.

If we try to find the best formula for a task with the classical method, we first work out y = f(x). Then we predict new outcomes by putting any value of x into the function f(x).
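As a tiny illustration of the classical method, here is a sketch in Python. The formula 2*x is only a hypothetical stand-in for whatever formula we would have worked out by hand.

```python
def f(x):
    # Hypothetical formula, worked out by hand with pen and paper.
    return 2 * x

# Predict a new outcome by plugging in a value of x.
print(f(7))  # 14
```

Here the human supplies the formula and the computer only evaluates it.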

But in machine learning, we collect some values of x and y for which y = f(x) holds. Then we form a set,

Training set = {x, y}, and plug it into a training function called T(Training set), whose output is the estimated function: F(X) = T(Training set).

Now we train our model to estimate the best function F(X).

T({x,y}) = F(X). The output is F(X).

Now an example:

Suppose we have some x values = {2,3,4,5,6}

and y values = {4,6,8,10,12}, but we don’t know the function f for which y = f(x). Our task is to determine a mechanism or function that follows the data above and its condition, y = f(x).

Solution, following the explanation above:

x = {2,3,4,5,6}

y = {4,6,8,10,12}

Output of the training function, T({x,y}) = F(X) = 2*x

Answer: 2*x, as it fulfills the above condition for every pair in the data (for example, 2*3 = 6).

The training function and training process guess and try different functions and methods over and over to find the best fit for our given data and condition.
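To make the "guess and try" idea concrete, here is a minimal sketch in Python. It assumes the function has the form m*x and simply tries a short, hypothetical list of guesses for m, keeping the one with the smallest total squared error. Real training procedures are smarter than this brute-force search; the names below are made up for illustration.

```python
# Training data from the example above.
xs = [2, 3, 4, 5, 6]
ys = [4, 6, 8, 10, 12]

def total_squared_error(m, xs, ys):
    """Sum of squared gaps between the guess m*x and the real y."""
    return sum((m * x - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical list of guesses: m = 0.0, 0.5, 1.0, ..., 4.0
candidates = [k * 0.5 for k in range(9)]

# Keep the guess whose predictions are closest to the data.
best_m = min(candidates, key=lambda m: total_squared_error(m, xs, ys))
print(best_m)  # 2.0
```

The guess m = 2 wins because it reproduces every y exactly, so its error is zero.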

I hope you now have a very basic idea of how machine learning works in its simplest form.

For curious minds who want to see the actual solution, it is given below.

average of x = ( 2+3+4+5+6 ) / 5 = 4

average of y = (4+6+8+10+12) / 5 = 8

distance or deviation from the average of x, Dx = { (2-4), (3-4), (4-4), (5-4), (6-4) } = { -2, -1, 0, 1, 2 }

distance or deviation from the average of y, Dy = { (4-8), (6-8), (8-8), (10-8), (12-8) } = { -4, -2, 0, 2, 4 }

Multiply the two deviation sets element by element,

{ -2, -1, 0, 1, 2 } x { -4, -2, 0, 2, 4 } = { (-2)*(-4), (-1)*(-2), 0*0, 1*2, 2*4 } = { 8, 2, 0, 2, 8 }, then sum them; the result is 20. Divide it by the number of elements, n = 5: 20 / 5 = 4. Thus we get the covariance of x and y.

This process is called covariance, and its formula is:

Covariance(x,y) = Sum[ (y - Average_of_y) * (x - Average_of_x) ] / n = 4

=> sum(Dx*Dy) / n

=> sum{ (-2)*(-4), (-1)*(-2), 0*0, 1*2, 2*4 } / 5

=> 20/5 = 4

Covariance(x,y) = 20/5 = 4
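The covariance steps above can be sketched in a few lines of Python. The function names are only illustrative, and the sum is divided by n exactly as in the hand calculation.

```python
xs = [2, 3, 4, 5, 6]
ys = [4, 6, 8, 10, 12]

def mean(values):
    """Average of a list of numbers."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """Average of the products of the deviations from each mean."""
    mean_x, mean_y = mean(xs), mean(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / len(xs)

print(covariance(xs, ys))  # 4.0
```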

Square each deviation from the average of x: Dx² = {4,1,0,1,4}. Sum them, sum(Dx²) = 10, and divide by the number of elements, n = 5: 10/5 = 2. Thus we get the variance of x = 2.

This process is called variance, and its formula is:

variance(x) = sum( (x - Average_of_x) ^ 2 ) / n = 2

=> sum(Dx²)/n

=> sum{4,1,0,1,4}/5

=> 10 / 5 = 2

variance(x) = 10/5 = 2
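The same variance calculation, as a short Python sketch. The function name is illustrative, and it divides by n to match the hand calculation above.

```python
def variance(values):
    """Average of the squared deviations from the mean."""
    mean_value = sum(values) / len(values)
    return sum((v - mean_value) ** 2 for v in values) / len(values)

print(variance([2, 3, 4, 5, 6]))  # 2.0
```

Python's standard library offers `statistics.pvariance`, which computes this same divide-by-n variance.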

Now, assume that our function is linear. Then the equation of a line is,

y = m*x + c

Then, c = y - m*x

and m = covariance(x,y) / variance(x)

m = 4 / 2 = 2, so m = 2

The fitted line passes through the point of averages, so c = average_of_y - m*average_of_x

c = 8 - 2*4 = 0, so c = 0

So, after putting in the values of m and c, the equation is,

y = 2*x + 0 = 2*x, so the estimated formula is y = 2*x

That’s how we determined the function, or answer, 2*x from the condition y = f(x) and the given dataset x = {2,3,4,5,6},

y = {4,6,8,10,12}
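Putting all the steps together, here is a minimal Python sketch of the whole calculation. The function name `fit_line` is made up for this post; it follows the hand calculation (m = covariance / variance, c = average of y - m * average of x) and is meant as a learning aid, not a production implementation.

```python
def fit_line(xs, ys):
    """Estimate m and c in y = m*x + c from training data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    m = cov_xy / var_x          # slope
    c = mean_y - m * mean_x     # intercept
    return m, c

m, c = fit_line([2, 3, 4, 5, 6], [4, 6, 8, 10, 12])
print(m, c)       # 2.0 0.0

# Use the estimated formula on an input that was not in the training set.
print(m * 7 + c)  # 14.0
```

Once m and c are known, the machine can predict y for inputs it has never seen, which is the whole point of training.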

Congratulations, you have solved your first machine learning and statistics problem. The model, or method, you used to solve this problem is called linear regression. Are you excited to learn machine learning? Then search the internet and keep learning. And welcome to my next blog.

Tags: #AI, #ML_for_students, #ML_Simplified, #Machine_Learning_for_High_school_students (A journey into logical thinking)