
By: Ivana Kolorici – Livnjak

In the world of AI, opinions vary widely, but one thing is certain: it pays to understand how it works at its core. As you may know, we’ve developed Botko, but have you ever wondered what’s in its brain? The sophisticated technology of Large Language Models (LLMs) and neural networks mimics human thinking, enabling Botko to simplify tasks, enhance efficiency, and revolutionize customer interactions. By exploring these neural networks, we’ll uncover how Botko and similar AI systems can become invaluable partners in our daily lives and businesses. Stay tuned as we dive deeper into the essence of LLMs, making AI’s complex world fun and accessible to everyone.

Artificial Neural Networks (ANNs)

Large Language Models are a subset of AI trained on vast amounts of data to generate coherent, high-quality, human-like text. But before we talk about how these models learn, I would like to walk you through the learning process of neural networks.

ANN Composition

A neural network is composed of layers: an input layer that receives the data, hidden layers that transform it, and an output layer that delivers the verdict. As we reach the finale of our neural network performance, we’re met with the output layer’s reveal: ‘a’.

But what is ‘a’? 

It’s the neural network’s final answer, a distilled essence of all the data and calculations that have flowed through the network’s layers. Depending on the task at hand, ‘a’ could manifest in various forms:

  • For Images: ‘a’ might be the recognition of an object, the face of a person, or the classification of an image into categories like ‘day’ or ‘night’.
  • For Texts: ‘a’ could represent the sentiment of a tweet, the summary of an article, or the next word prediction in a sentence you’re typing.
  • For Sounds: ‘a’ could be the transcription of speech into text, the identification of a musical note, or the detection of an environmental sound within an audio clip.

So, when you see ‘a’, think of it as the network’s vote of confidence, the culmination of many small judgments, all leading to one final decision. In the grand scheme of things, ‘a’ is the network’s educated guess, its attempt to make sense of the world – one image, one word, one sound at a time.

Attention! Math coming your way!

Let’s break down this neural network journey, step by step, to reveal how a simple decision is made, using an example with a single neuron. It’s like a relay race where each runner (node) passes the baton (input value) to the next, adding their own speed (weight) and flair (bias), aiming for the best time (lowest cost).

  • The activation function decides which information should be highlighted and which should be toned down.

For every piece of input data, the neural network multiplies it by a weight w (which signifies its importance) and adds a bias b (which can shift the decision one way or another). After adding up all these weighted inputs and biases, the total is handed to the activation function.

  • The activation function then takes that total and transforms it into the output in a specific way. If the transformed value crosses a certain threshold, the gate is opened: the node ‘activates’ and sends its output to the next layer to continue the process. This is how a neural network moves data from input to final decision, step by step, layer by layer (see the sketch just below).
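To make this concrete, here is a minimal sketch of a single neuron’s forward pass in Python. The sigmoid activation, the weights, the bias, and the inputs are all illustrative assumptions of mine, not Botko’s actual internals:

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1 / (1 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    # Weighted sum: each input multiplied by its weight, plus the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # The activation function transforms the total into the output 'a'.
    return sigmoid(z)

# Two inputs; the weights signify their importance, the bias shifts the decision.
a = neuron_output(inputs=[0.5, 0.8], weights=[0.4, -0.2], bias=0.1)
print(a)  # ~0.535: above a 0.5 threshold, so this neuron 'activates'
```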


Backpropagation – Learning from Mistakes

Remember the cost, or loss, function we talked about? It shows us the difference between what the network predicts and what it should have predicted. We calculate it in order to learn from our mistakes. The numerical difference is the grade, telling us how well the network performed.
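As a small illustration, here is one common choice of cost function, the mean squared error, in Python; the predictions and targets are made-up numbers:

```python
def mse(predictions, targets):
    # The average squared difference between what the network predicted
    # and what it should have predicted: the network's 'grade'.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

print(mse([0.9, 0.2, 0.8], [1.0, 0.0, 1.0]))  # 0.03
```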

Now, imagine if we tried to just guess the right weights and biases to improve our grade – it would be like throwing darts in the dark, hoping to hit the bullseye. Not very efficient, right? That’s where backpropagation comes in as our guiding light.

Backpropagation looks at the final score and works backward, figuring out which parts of our network need to change. It does this by checking:

  • How the cost function changes when the output a2 changes.
  • And how a2 changes when the weights w, the previous layer’s activation a1, or the bias b change (the chain rule, sketched below).
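In symbols, writing z for the weighted sum w·a1 + b that feeds the activation (a notational assumption consistent with the steps above), the chain rule strings these sensitivities together:

```latex
\frac{\partial C}{\partial w}
  = \frac{\partial C}{\partial a_2} \cdot \frac{\partial a_2}{\partial z} \cdot \frac{\partial z}{\partial w},
\qquad \text{where } z = w\,a_1 + b,\quad
\frac{\partial z}{\partial w} = a_1,\quad
\frac{\partial z}{\partial b} = 1.
```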

This is the neural network’s study session, where it goes through each weight and bias to understand their impact on the final score. In math-speak, we’re looking at derivatives, which tell us how sensitive our cost function is to each variable – a little like checking which questions we got wrong on a test.
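Here is a hedged Python sketch of that study session for a single sigmoid neuron with a squared-error cost. The variable names mirror the bullets above, and every number is an arbitrary example, not any production implementation:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Forward pass: a1 is the previous layer's activation; w and b are our parameters.
a1, w, b, target = 0.8, 0.4, 0.1, 1.0
z = w * a1 + b               # weighted input
a2 = sigmoid(z)              # the network's output
cost = (a2 - target) ** 2    # squared-error cost

# Backward pass: the chain rule, one link at a time.
dC_da2 = 2 * (a2 - target)   # how the cost changes when a2 changes
da2_dz = a2 * (1 - a2)       # derivative of the sigmoid at z
dz_dw, dz_db = a1, 1.0       # how z changes with w and with b

dC_dw = dC_da2 * da2_dz * dz_dw   # sensitivity of the cost to the weight
dC_db = dC_da2 * da2_dz * dz_db   # sensitivity of the cost to the bias
print(dC_dw, dC_db)
```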

Figure: Gradient descent towards minimizing prediction error with two parameters (red to blue)

By following the negative gradient of the cost function, the network is essentially rolling downhill towards the lowest possible score (the minimum error). 

Why negative? Think of it as correcting errors: if the network’s prediction is too high, it needs to go down, and if it’s too low, it needs to go up. The step is always taken in the opposite direction of the gradient.

But here’s the catch: we’re not rolling a ball down a hill in a 3D space; we’re navigating a landscape with thousands of dimensions. So, it’s less about visualizing and more about trusting the math to guide us.

Each step of backpropagation adjusts the weights and biases in the opposite direction of the gradient. 
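In symbols, with η standing for the learning rate (the size of each downhill step, a hyperparameter we choose):

```latex
w \leftarrow w - \eta\,\frac{\partial C}{\partial w},
\qquad
b \leftarrow b - \eta\,\frac{\partial C}{\partial b}
```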

By repeatedly adjusting and learning through the entire dataset, epoch after epoch, the network becomes smarter, aiming for that perfect score where the predictions are just right. Welcome to the learning process of neural networks, a harmonious blend of math, data, and a dash of intuition, all working together to make smarter decisions.
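Putting it all together, here is a minimal sketch of that loop in Python: forward pass, backward pass, and a gradient-descent update, repeated epoch after epoch. The toy dataset, learning rate, and epoch count are all arbitrary assumptions for illustration:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Toy dataset of (input, target) pairs and arbitrary starting parameters.
data = [(0.2, 0.0), (0.9, 1.0), (0.7, 1.0), (0.1, 0.0)]
w, b, lr = 0.5, 0.0, 0.5      # lr is the learning rate (step size)

for _ in range(1000):         # each full pass over the dataset is one epoch
    for x, target in data:
        a = sigmoid(w * x + b)            # forward pass
        # Gradient via the chain rule (squared-error cost).
        grad = 2 * (a - target) * a * (1 - a)
        w -= lr * grad * x    # step against the gradient...
        b -= lr * grad        # ...rolling downhill on the cost surface

print(w, b)                   # learned parameters
print(sigmoid(0.9 * w + b))   # prediction for a 'high' input: pushed toward 1
```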

So, as we close this chapter on neural networks, consider the limitless possibilities they present.

References

  1. [3Blue1Brown]. (2017, October 5). Neural networks [Video]. YouTube. https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
  2. IBM. (n.d.). What are neural networks? IBM. https://www.ibm.com/topics/neural-networks
  3. [Adam Dhalla]. (2021, March 1). The Complete Mathematics of Neural Networks and Deep Learning [Video]. YouTube. https://www.youtube.com/watch?v=Ixl3nykKG9M
  4. [HuggingFace]. (2022, December 13). Reinforcement Learning from Human Feedback: From Zero to chatGPT [Video]. YouTube. https://www.youtube.com/watch?v=2MBJOuVq380