Build Your First AI Model in Python: A Beginner's Guide (1 of 3)

Embark on your AI journey with this step-by-step tutorial, designed to guide beginners through building a basic AI model in Python.

Srinivas Chippagiri

Apr. 22, 25 · Tutorial

Artificial Intelligence (AI) is bringing fundamental changes to healthcare, finance, manufacturing and customer service through automated information processing and data-driven insights that lead to smarter business decisions. Machine learning models power this shift, identifying patterns in large datasets without direct human intervention.

In order to enhance AI models and optimize solutions, it is important to grasp the fundamentals of AI model development and have a basic understanding of pre-trained AI models. Hands-on experience in building AI systems allows professionals and aspiring AI developers like you to refine your skills, enabling you to customize models based on specific needs or requirements, solve problems more efficiently and achieve better model performance.

This step-by-step guide will help you build neural networks using TensorFlow with Keras APIs in Python. It aims to cover the important aspects of neural network development, from configuring the development environment and data preparation to designing network topology, training the model, and evaluation. By the end of this tutorial, you will gain the knowledge and skills that are needed to create and deploy your own neural networks.

Understanding AI Models

AI models are made up of interconnected processing units that analyze data, recognize patterns, and produce predictions. A neural network, which is loosely modeled on how the human brain processes information, enables machines to recognize patterns and learn from experience, improving their performance over time.

A basic artificial neural network consists of three main parts:

Input Layer: The entry point of the network, which receives the raw data. It contains one neuron per input feature and passes the data on to the next layer without performing any computation. In an image recognition model, for example, there is one input neuron for each pixel of the image.
Hidden Layers: Perform the computations. Each neuron combines its weighted inputs and applies an activation function, which lets the network learn patterns in the data.
Output Layer: Produces the final predictions or classifications from the processed information. The number of neurons depends on the task: a binary classification problem typically needs one sigmoid-activated neuron, while a multi-class classification task uses one softmax-activated neuron per class. The output layer turns the learned patterns into concrete results, such as detected objects in an image, stock price predictions, or generated text in natural language processing systems.
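
To make the difference between the two output activations concrete, here is a minimal NumPy sketch. The helper names sigmoid and softmax and the raw scores are hypothetical, chosen only for illustration. They are not part of the tutorial's model.

```python
import numpy as np

def sigmoid(z):
    """Squash a single score into a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw scores, chosen only for illustration
binary_score = 0.0
class_scores = np.array([2.0, 1.0, 0.1])

binary_prob = sigmoid(binary_score)   # one neuron: probability of the positive class
class_probs = softmax(class_scores)   # one neuron per class

print(binary_prob)
print(class_probs, class_probs.sum())
```

A single sigmoid neuron yields one probability, while softmax spreads a total probability of 1 across all classes.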

1. Setting Up the Environment

Before you begin, install the three libraries used in this tutorial: TensorFlow, NumPy and Matplotlib.

    Python
   
pip install tensorflow numpy matplotlib

Once installed, let’s import the required libraries:

    Python
   
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

2. Loading the MNIST Dataset

The MNIST dataset contains 28x28 grayscale images of handwritten digits, split into 60,000 training images and 10,000 test images.

    Python
   
# Load dataset
mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

We normalize the pixel values by dividing them by 255, which scales them to the range 0 to 1. Normalized inputs generally help the model train faster and reach better accuracy.

    Python
   
# Normalize pixel values to be between 0 and 1
x_train, x_test = x_train / 255.0, x_test / 255.0
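
As a quick sanity check of what this division does, the sketch below applies it to a small synthetic array rather than the real dataset. The array fake_images is a hypothetical stand-in, so nothing needs to be downloaded.

```python
import numpy as np

# Synthetic stand-in for a batch of MNIST images: two 28x28 uint8 arrays
fake_images = np.zeros((2, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255   # brightest possible pixel
fake_images[1, 5, 5] = 128   # mid-gray pixel

normalized = fake_images / 255.0  # same operation as in the tutorial

print(fake_images.dtype, "->", normalized.dtype)
print(normalized.min(), normalized.max())
```

The division also converts the integer pixels to floating-point values, which is the form the network's layers operate on.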

Let’s visualize some sample images from the dataset:

    Python
   
 

# Display first 10 images
plt.figure(figsize=(10, 5))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_train[i], cmap=plt.cm.binary)
    plt.title(y_train[i])
    plt.axis('off')

plt.show()

This helps us to understand the kind of data the model will process.

3. Building the Neural Network

In this section, we define a simple neural network model using Keras.

    Python
   
# Define the model
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),  # Input layer
    keras.layers.Dense(128, activation='relu'),  # Hidden layer with 128 neurons
    keras.layers.Dense(10, activation='softmax') # Output layer with 10 neurons (digits 0-9)
])

Understanding the Layers

Flatten Layer: Converts each 28x28 image into a one-dimensional array of 784 values.
Dense Layer (128 neurons, ReLU activation): The hidden layer, which extracts features from the input data.
Dense Layer (10 neurons, Softmax activation): Outputs probabilities for each digit (0-9).
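
To see how the data changes shape as it moves through these three layers, here is a rough NumPy sketch of a single forward pass. The weights are random and untrained, and the variable names are hypothetical. The sketch only mirrors the shapes of the Keras model, not its learned behavior.

```python
import numpy as np

rng = np.random.default_rng(0)

# One fake 28x28 image (random values stand in for real pixels)
image = rng.random((28, 28))

# Flatten: 28 * 28 = 784 values
flat = image.reshape(-1)

# Hidden Dense layer: 784 inputs -> 128 outputs, followed by ReLU
w1 = rng.normal(scale=0.05, size=(784, 128))
b1 = np.zeros(128)
hidden = np.maximum(0.0, flat @ w1 + b1)

# Output Dense layer: 128 inputs -> 10 outputs, followed by softmax
w2 = rng.normal(scale=0.05, size=(128, 10))
b2 = np.zeros(10)
logits = hidden @ w2 + b2
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(flat.shape, hidden.shape, probs.shape)
```

Each Dense layer is a matrix multiplication plus a bias, followed by its activation function.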

4. Compiling the Model

Before training, we compile the model and specify the following settings:

Loss function: Measures the difference between predicted and actual values.
Optimizer: Adjusts the model's weights to reduce the loss.
Metrics: Measures accuracy.

    Python
   
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
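
For intuition about what the sparse categorical cross-entropy loss measures, the sketch below computes it by hand in NumPy. It is a simplified illustration, not the Keras implementation (which also clips probabilities for numerical safety), and the predictions are made up.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean of -log(probability assigned to the true class)."""
    picked = y_prob[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))

# Hypothetical predictions for two samples over three classes
y_prob = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
])
y_true = np.array([0, 1])  # integer labels, not one-hot vectors

loss = sparse_categorical_crossentropy(y_true, y_prob)
print(round(loss, 4))
```

The "sparse" variant takes integer class labels, which is how the MNIST labels are stored. The loss shrinks as the probability assigned to the correct class approaches 1.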

5. Training the Model

We train the model on the training data.

    Python
   
model.fit(x_train, y_train, epochs=5)

The model trains for a fixed number of epochs. In each epoch it passes through the entire training set once, so with epochs=5 it sees the data five times.

Within each epoch, the model updates its weights after every batch in order to reduce the loss and improve accuracy.
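
Keras processes the training data in batches and applies one weight update per batch. The arithmetic below is a small sketch of how many updates that amounts to, assuming the default batch size of 32 that model.fit uses when none is given.

```python
import math

num_samples = 60_000   # size of the MNIST training set
batch_size = 32        # Keras default for model.fit when none is given
epochs = 5             # value used in the tutorial

steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)   # 1875
print(total_updates)     # 9375
```

The steps-per-epoch figure is the step count that the Keras progress bar reports for each epoch.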

6. Evaluating Model Performance

We now evaluate the model on test data that it did not see during training.

    Python
   
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)

print(f"Test Accuracy: {test_acc:.4f}")
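
The accuracy metric is simply the fraction of samples whose most probable class matches the true label. Here is a minimal sketch of that calculation in NumPy, using made-up probabilities and labels rather than real model output.

```python
import numpy as np

# Hypothetical softmax outputs for four samples over three classes
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])  # the last sample is misclassified

predicted = np.argmax(probs, axis=1)            # most probable class per row
accuracy = float(np.mean(predicted == labels))  # fraction of correct predictions

print(predicted)
print(accuracy)
```

Three of the four predictions match their labels, so the accuracy here is 0.75.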

7. Making Predictions

Next, we use the trained model to predict the digits in the test dataset.

    Python
   
 

predictions = model.predict(x_test)
# Display an image and its predicted label
index = 0  # Change this index to test different images
plt.imshow(x_test[index], cmap=plt.cm.binary)
plt.title(f"Predicted: {np.argmax(predictions[index])}")
plt.show()

The np.argmax function returns the index of the highest probability, which here is the predicted digit.
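
Each row of predictions holds ten probabilities, one per digit. The sketch below applies np.argmax to a single hypothetical row and also reads off how confident the model is in that choice. The numbers are invented for illustration.

```python
import numpy as np

# Hypothetical softmax output for one image (10 digit classes, sums to 1)
single_prediction = np.array(
    [0.01, 0.02, 0.05, 0.02, 0.01, 0.03, 0.01, 0.80, 0.03, 0.02]
)

digit = int(np.argmax(single_prediction))     # index of the largest value
confidence = float(single_prediction[digit])  # probability assigned to it

print(f"Predicted: {digit} (confidence {confidence:.2f})")
```

Because the classes are the digits 0 to 9 in order, the index returned by np.argmax is itself the predicted digit.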

8. Improving the Model

If the model’s accuracy is not satisfactory, we can improve it by:

Adding more layers

    Python
   
 

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
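
Extra layers add capacity, but they also add trainable parameters. The helper below, a hypothetical function named dense_param_count, is a small sketch that counts weights and biases for a stack of fully connected layers so the two architectures can be compared.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total

# 784 inputs come from flattening each 28x28 image
original = dense_param_count([784, 128, 10])
deeper = dense_param_count([784, 256, 128, 10])

print(original)  # 101770
print(deeper)    # 235146
```

The deeper network has more than twice as many parameters, so it trains more slowly per epoch.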
  

Using Convolutional Neural Networks (CNNs)

    Python
   
 

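# Note: Conv2D expects a channel axis, so reshape the images to
# (28, 28, 1) before training, e.g. x_train[..., np.newaxis]
model = keras.Sequential([
    keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    keras.layers.MaxPooling2D(2,2),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])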
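
The CNN declares input_shape=(28,28,1), while the MNIST images load with shape (28, 28). The sketch below shows one way to add the missing channel axis and works through the layer output sizes. It uses a synthetic array in place of the real images and assumes the Keras defaults: padding='valid' and a stride of 1 for Conv2D, and a stride equal to the pool size for MaxPooling2D.

```python
import numpy as np

# Synthetic stand-in for MNIST images as loaded: (num_images, 28, 28)
images = np.zeros((5, 28, 28), dtype=np.float32)

# Conv2D with input_shape=(28, 28, 1) expects a trailing channel axis
images_with_channel = images[..., np.newaxis]  # -> (5, 28, 28, 1)

# Shape arithmetic for the layers, under the default settings noted above
conv_side = 28 - 3 + 1                   # 3x3 kernel -> 26
pool_side = conv_side // 2               # 2x2 pooling -> 13
flat_size = pool_side * pool_side * 32   # 32 filters -> 5408

print(images_with_channel.shape)
print(conv_side, pool_side, flat_size)
```

The same reshaping would apply to both x_train and x_test before fitting or evaluating the CNN.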
  

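Whichever variant you choose, compile and train the new model again (steps 4 and 5) before evaluating it.

And that's it, your very first AI model!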

Conclusion

Building models yourself early in your deep dive into AI is key, and in this tutorial our neural network successfully classified handwritten digits. To build more advanced AI systems, keep experimenting with different techniques and model architectures. In the next part of this series, you will learn how to evaluate AI models using metrics and visualization tools.

I highly encourage you to explore building your own image classification system using a CNN approach. Please share your thoughts and results in the comments!
