Building Product to Learn AI, Part 2: Shake and Bake

In part 1, we gathered the crucial "ingredients" for our AI creation — the data. Now, transform that data into a fully functioning Large Language Model (LLM).

By Obaid Sarvana · Indrajit Bhattacharya
Aug. 13, 24 · Tutorial

If you haven't already, be sure to review Part 1, where we covered data collection and prepared a dataset for our model to train on. That data forms the foundation of our model: the quality of the ingredients (your data) directly impacts the quality of the final dish (your model's performance).

Now, we'll transform that data into a fully functioning Large Language Model (LLM). By the end of this section, you'll be interacting with your very own AI!

Choosing Your Base Layer

Before we dive into training, we’ll explore the different approaches to training your LLM. This is like choosing the right flour for your bread recipe — it significantly influences the capabilities and limitations of your final creation. 

There are many ways to go about training an ML model. This is also an active area of research, with new methodologies emerging every day. Let’s take a look at the major tried-and-true categories of methods of model development. (Note: These methods are not necessarily mutually exclusive.)

[Image: Person kneading a bowl of bread dough]

Key Approaches

1. Start From Scratch (Pretraining Your Own Model)

This offers the most flexibility, but it's also the most resource-intensive path: the vast amounts of data and compute required mean that only the most well-resourced organizations are able to pretrain novel models from scratch.

2. Fine-Tuning (Building on a Pre-trained Model)

This involves starting with a powerful, existing LLM and adapting it to our specific meal-planning task. It's like using pre-made dough — you don't have to start from zero, but you can still customize it.

3. Leveraging Open-Source Models

A growing number of open-source models, often pre-trained on common tasks, let you experiment without the need for extensive pre-training.

4. Using Commercial Off-the-Shelf Models

For production-ready applications, consider commercial LLMs (e.g., from Google, OpenAI, Microsoft) for optimized performance, but with potential customization limits.

5. Cloud Services 

Managed cloud services streamline training and deployment with powerful tools and infrastructure, simplifying the process end to end.

Choosing the Right Approach

The best foundation for your LLM depends on your specific needs:

  • Time and resources: Do you have the capacity for pretraining, or do you need a faster solution?
  • Customization: How much control over the model's behavior do you require?
  • Cost: What's your budget? Can you invest in commercial solutions?
  • Performance: What level of accuracy and performance do you need?
  • Capabilities: What level of technical skills and/or compute resources do you have access to?

Moving Forward

We'll focus on fine-tuning Gemini Pro in this tutorial, striking a balance between effort and functionality for our meal-planning model.

[Image: Bread rising in the oven]

Getting Ready to Train: Export Your Dataset

Now that we've chosen our base layer, let's get our data ready for training. Since we're using Google Cloud Platform (GCP), we need our data in JSONL format.

Note: Each model may have specific data format requirements, so always consult the documentation before proceeding.

Luckily, converting data from Google Sheets to JSONL is straightforward with a little Python.

  1. Export to CSV: First, export your data from Google Sheets as a CSV file.
  2. Convert CSV to JSONL: Run the following Python script, replacing your_recipes.csv with your actual filename:
Python

import csv
import json

csv_file = 'your_recipes.csv'  # Replace 'your_recipes.csv' with your CSV filename
jsonl_file = 'recipes.jsonl'

# The trailing backslash continues the with-statement onto the next line
with open(csv_file, 'r', encoding='utf-8') as infile, \
        open(jsonl_file, 'w', encoding='utf-8') as outfile:

    reader = csv.DictReader(infile)

    for row in reader:
        # Split multi-line cells into lists of lines so they survive JSON encoding
        row['Prompt'] = row['Prompt'].splitlines()
        row['Response'] = row['Response'].splitlines()
        json.dump(row, outfile)
        outfile.write('\n')
This will create a recipes.jsonl file where each line is a JSON object representing a meal plan.
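Before uploading, it's worth sanity-checking the output file. Here is a minimal validation sketch (the filename and the Prompt/Response keys are assumed from the conversion script above; the sample records are illustrative):

```python
import json

def validate_jsonl(path, required_keys=("Prompt", "Response")):
    """Check that every line is valid JSON and contains the expected keys."""
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            record = json.loads(line)  # raises json.JSONDecodeError on malformed JSON
            for key in required_keys:
                if key not in record:
                    raise KeyError(f"Line {line_no} is missing '{key}'")
    return True

# Write a couple of illustrative records so the example is self-contained
sample = [
    {"Prompt": ["Plan a high-protein day"], "Response": ["Breakfast: eggs and spinach"]},
    {"Prompt": ["Plan a vegan day"], "Response": ["Breakfast: oatmeal with fruit"]},
]
with open("recipes.jsonl", "w", encoding="utf-8") as f:
    for rec in sample:
        f.write(json.dumps(rec) + "\n")

print(validate_jsonl("recipes.jsonl"))  # prints True if the file is well-formed
```

Running this before upload catches malformed lines early, which is much cheaper than discovering them after a failed tuning job.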

[Image: Bread browning in the oven]

Training Your Model

We’re finally ready to start training our LLM. Let’s dive in!

1. Project Setup

  1. Google Cloud Project: Create a new Google Cloud project if you don't have one already (free tier available).
  2. Enable APIs: Search for "Vertex AI" in your console, and on the Vertex AI page, click Enable All Recommended APIs.
  3. Authentication: Search for "Service Accounts," and on that page, click Create Service Account. Use the walkthrough to set up a service account and download the required credentials for secure access.
  4. Cloud Storage Bucket: Find the "Cloud Storage" page and create a storage bucket.

2. Vertex AI Setup

  1. Navigate to Vertex AI Studio (free tier available).
  2. Click Try it in Console in a browser where you are already logged in to your Google Cloud Account.
  3. In the left-hand pane find and click Language.
  4. Navigate to the “Tune and Distill” tab:

[Screenshot: the "Tune and Distill" tab]

3. Model Training

  • Click Create Tuned Model.
  • For this example, we’ll do a basic fine-tuning task, so select “Supervised Tuning” (should be selected by default).
  • Give your model a name.
  • Select a base model: We’ll use Gemini Pro 1.0 002 for this example.
  • Click Continue.
  • Upload your JSONL file that you generated in Step 2.
  • You'll be asked for a "dataset location": this is simply where your JSONL file will be stored in the cloud. You can use the UI to create a bucket for it, or reuse the one you created during project setup.

Click Start and wait for the model to train! With this step, you have entered the LLM arena. The quality of the model you produce is limited only by your imagination and by the quality of the data you can find, prepare, and/or generate for your use case.

For our use case, we used the data we generated earlier, which included prompts about how individuals could achieve their specific health goals, and meal plans that matched those constraints.
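For illustration, a single line of the tuning file might look roughly like this (the content is hypothetical; the field names come from the Prompt/Response columns in our export script):

```json
{"Prompt": ["Goal: lose weight on 1,800 calories a day, vegetarian."], "Response": ["Day 1", "Breakfast: Greek yogurt with berries", "Lunch: lentil soup with whole-grain bread"]}
```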

4. Test Your Model

Once your model is trained, you can test it by navigating to it on the Tune and Distill main page. In that interface, you can interact with the newly created model the same way you would with any other chatbot. 
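Beyond the console chat UI, you can also exercise the model programmatically. As a sketch, the request body for a Gemini-style generateContent call can be assembled locally like this (the generation settings are illustrative defaults, and you would still need to send the payload to your tuned model's endpoint with an HTTP client of your choice):

```python
import json

def build_generate_request(prompt: str) -> dict:
    """Assemble a generateContent-style request body for a single user prompt."""
    return {
        "contents": [
            {"role": "user", "parts": [{"text": prompt}]}
        ],
        # Illustrative settings: low temperature for more deterministic meal plans
        "generationConfig": {"temperature": 0.2, "maxOutputTokens": 1024},
    }

payload = build_generate_request("Plan three high-protein dinners under 600 calories.")
print(json.dumps(payload, indent=2))
```

Building and inspecting the payload locally makes it easy to verify the structure before wiring up authentication and networking.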

In the next section, we will show you how to host your newly created model to run evaluations and wire it up for an actual application!

[Image: Slicing hot bread]

Deploying Your Model

You've trained your meal-planning LLM on Vertex AI, and it's ready to start generating personalized culinary masterpieces. Now it's time to make your AI chef accessible to the world! This section will guide you through deploying your model on Vertex AI.

  1. Create an endpoint:
    • Navigate to the Vertex AI section in the Google Cloud Console.
    • Select "Endpoints" from the left-hand menu and click "Create Endpoint."
    • Give your endpoint a descriptive name (e.g., "meal-planning-endpoint").
  2. Deploy your model:
    • Within your endpoint, click "Deploy model."
    • Select your trained model from the Cloud Storage bucket where you saved it.
    • Specify a machine type suitable for serving predictions (consider traffic expectations).
    • Choose a deployment scale (e.g., "Manual Scaling" for initial testing, "Auto Scaling" for handling variable demand).
    • Deploy the model.

Congratulations! You've now trained, tested, and deployed your very own LLM on Google's Vertex AI. You are now an AI engineer! In the next and final installment of this series, we'll create a user-friendly interface and unleash your meal-planning AI upon the world. Stay tuned for the grand finale of our LLM adventure.

