How To Deploy Llama2 on AWS With Walrus in Minutes

In this blog, we will explore how to deploy Llama2 on AWS with Walrus. Walrus is an open-source application management platform.

By Ally Lynn · Nov. 21, 23 · Tutorial

In the realm of artificial intelligence, the advent of large language models has been nothing short of a revolution. Models like GPT-4 and, more recently, Llama2, have ushered in a new era of natural language understanding and generation. 

However, while the development and training of these models mark significant milestones, their true value is unlocked only when they are effectively deployed and integrated into practical use cases. 

In this blog, we will explore how to deploy Llama2 on AWS with Walrus. Walrus is an open-source application management platform that simplifies application deployment and management on any infrastructure. It helps platform engineers build golden paths for developers and empowers developers with self-service capabilities.

Prerequisites

To follow this tutorial, you will need:

  1. An AWS account with associated credentials and sufficient permissions to create EC2 instances.
  2. Walrus installed.
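
Before starting, it can help to confirm that the AWS credentials actually work. The following is a minimal sketch and not part of the original tutorial: it assumes the AWS CLI is installed and configured with the same credentials, and the helper names `require_cmd` and `check_aws_identity` are hypothetical.

```shell
# Hypothetical helper (not from the tutorial): succeed only if a command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print the AWS account and identity behind the current credentials.
# Assumes the AWS CLI is installed and configured; fails with a hint otherwise.
check_aws_identity() {
  if ! require_cmd aws; then
    echo "aws CLI not found; install it or verify the credentials another way" >&2
    return 1
  fi
  aws sts get-caller-identity
}
```

Calling `check_aws_identity` in a configured shell should print the account ID and ARN that the credentials resolve to. It does not verify EC2 permissions, so treat it as a first sanity check only.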

Note: Running on a CPU is cheaper than running on a GPU, but you are still charged for the EC2 instance.

The Simple Way

With Walrus, you can have a running llama-2 instance on AWS with a user-friendly web UI in about a minute. Just follow these steps:

Add the Llama-2 Service Template

  1. Log in to Walrus, click Operations Center in the left navigation, go to the Templates tab, and click the New Template button.
  2. Enter a template name, e.g., llama-2.
  3. In the source field, enter https://github.com/walrus-tutorials/llama2-on-aws.
  4. Click Save.


Configure Environment and AWS Credentials

  1. In the left navigation, click Application Management, go to the default project view, and click the Connectors tab.
  2. Click the New Connector button and select the Cloud Provider type.
  3. Enter a connector name, e.g., aws.
  4. Choose AWS for the Type option.
  5. Select Tokyo (ap-northeast-1) for the Region option.
  6. Click Save.

Note: This region is specified because the subsequent steps use an AMI from that region. To use a different region, you can copy the AMI to your region or see the section below on building the llama-2 image from scratch.
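
Moving an AMI between regions can be done with the AWS CLI `copy-image` command. The sketch below only prints the command instead of running it. It rests on several assumptions: the helper name `print_copy_ami_cmd` is hypothetical, `ami-EXAMPLE` is a placeholder because the AMI ID is not given here, and copying succeeds only if the AMI's owner allows your account to copy it.

```shell
# Hypothetical helper: print (do not run) the AWS CLI command that copies an AMI
# from Tokyo (ap-northeast-1) into another region.
#   $1 = source AMI ID (placeholder; not stated in this tutorial)
#   $2 = destination region
#   $3 = name to give the copied AMI
print_copy_ami_cmd() {
  printf 'aws ec2 copy-image --source-region ap-northeast-1 --source-image-id %s --region %s --name %s\n' \
    "$1" "$2" "$3"
}

# Example with placeholder values.
print_copy_ami_cmd "ami-EXAMPLE" "us-east-1" "llama-2-copy"
```

After a copy, the connector region and the AMI referenced by the service would both need to point at the new region. How the template exposes the AMI setting is not covered here.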


  1. Click the Environments tab, then click the New Environment button.
  2. Enter an environment name, e.g., dev.
  3. Click the Add Connector button and select the aws connector created in the previous step.
  4. Click Save.


Create the Llama-2 Service

  1. In the Environments tab, click the name of the dev environment to enter its view.
  2. Click the New Service button.
  3. Enter a service name, e.g., my-llama-2.
  4. Choose llama-2 in the Template option.
  5. Click Save.

Note: The default service configuration assumes your AWS account has a default VPC in the corresponding region. If you don't have a default VPC, create a new VPC and associate a subnet and a security group with it in the AWS VPC console. The security group must allow inbound TCP traffic on port 7860 (used to access the llama-2 web UI). You can set your VPC name and security group name in the service configuration.
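
The VPC check and the port rule can also be expressed with the AWS CLI instead of the console. This is a sketch under assumptions, not the tutorial's method: the helper names (`run`, `find_default_vpc`, `allow_webui_ingress`) are hypothetical, `sg-EXAMPLE` and the CIDR are placeholders, and by default the commands are only printed (`DRY_RUN=1`) rather than executed.

```shell
# Hypothetical wrapper: print the command in dry-run mode (the default),
# or execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Look up the default VPC ID in a region ($1 = region).
find_default_vpc() {
  run aws ec2 describe-vpcs --region "$1" \
    --filters Name=is-default,Values=true \
    --query 'Vpcs[0].VpcId' --output text
}

# Allow inbound TCP 7860 (the llama-2 web UI port) on a security group.
#   $1 = security group ID, $2 = CIDR allowed to connect, $3 = region
allow_webui_ingress() {
  run aws ec2 authorize-security-group-ingress \
    --region "$3" --group-id "$1" --protocol tcp --port 7860 --cidr "$2"
}

# Dry-run example with placeholder values.
find_default_vpc ap-northeast-1
allow_webui_ingress sg-EXAMPLE 203.0.113.10/32 ap-northeast-1
```

Restricting the CIDR to your own address, rather than opening the port to everyone, keeps the web UI from being publicly reachable.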

Accessing the Llama-2 Web UI

You can see the deployment and running status of the llama-2 service on its details page. Once the llama-2 service deployment is complete, you can access its web UI by clicking the access link of the service in the Walrus UI.
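
If the link does not respond right away, the service may still be starting. The sketch below polls the web UI until it answers. It is an illustration under assumptions: the helper names are hypothetical, it assumes the UI is served over plain HTTP on port 7860, and the `PROBE` override exists only so the loop can be exercised without a network.

```shell
# Probe a URL once; succeeds if the server answers without an HTTP error status.
probe_http() {
  curl --silent --fail --output /dev/null --max-time 5 "$1"
}

# Hypothetical helper: retry the probe until it succeeds or attempts run out.
#   $1 = URL, $2 = maximum attempts, $3 = seconds to wait between attempts
# Set PROBE to another command to replace the curl-based probe.
wait_for_webui() {
  url=$1; attempts=$2; delay=$3
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "${PROBE:-probe_http}" "$url"; then
      echo "web UI reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "web UI not reachable at $url after $attempts attempts" >&2
  return 1
}
```

A call might look like `wait_for_webui "http://<instance-address>:7860/" 30 10`, where the address is whatever the access link in the Walrus UI points to.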


Deep Dive: Building the Llama-2 Image From Scratch

The instructions above use a pre-built llama-2 image. This saves time: when you create a new llama-2 instance, you don't need to download the large model file or build the inference service. This section explains how that image is built.

You can find the complete build process here.

Key steps include:

# Get text-generation-webui
git clone https://github.com/oobabooga/text-generation-webui && cd text-generation-webui
# Configure text-generation-webui: link the Docker files into the repo root
# (the brace expansion requires bash) and create .env from the example
ln -s docker/{Dockerfile,docker-compose.yml,.dockerignore} .
cp docker/.env.example .env
# Replace the CLI_ARGS line in .env so the web UI loads the quantized model
sed -i '/^CLI_ARGS=/s/.*/CLI_ARGS=--model llama-2-7b-chat.ggmlv3.q4_K_M.bin --wbits 4 --listen --auto-devices/' .env
# Delete everything from the "deploy:" key to the end of the compose file
sed -i '/^\s*deploy:/,$d' docker/docker-compose.yml
# Download the quantized llama-2 model into ./models
curl -L https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q4_K_M.bin --output ./models/llama-2-7b-chat.ggmlv3.q4_K_M.bin
# Build the image and start the service
docker compose up --build


In essence, this process downloads the quantized llama-2-7b-chat model, then builds text-generation-webui and uses it to launch the llama-2 service.
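
To see what the two `sed` edits in the build steps do, they can be tried on throwaway files first. The sketch below is illustrative only: the sample `.env` and compose contents are made-up stand-ins for the real files, and it uses `-i.bak` and `[[:space:]]` in place of `-i` and `\s` so that it also runs with BSD sed.

```shell
# Work in a throwaway directory so no real checkout is touched.
workdir=$(mktemp -d)

# Made-up stand-in for .env; the real file has more settings.
printf 'CLI_ARGS=\nOTHER_SETTING=unchanged\n' > "$workdir/.env"

# Same substitution as in the build steps: replace the whole CLI_ARGS line.
sed -i.bak '/^CLI_ARGS=/s/.*/CLI_ARGS=--model llama-2-7b-chat.ggmlv3.q4_K_M.bin --wbits 4 --listen --auto-devices/' "$workdir/.env"
cli_args_line=$(grep '^CLI_ARGS=' "$workdir/.env")
echo "$cli_args_line"

# Made-up stand-in for docker-compose.yml with a trailing deploy section.
printf 'services:\n  webui:\n    build: .\n    deploy:\n      resources: {}\n' > "$workdir/compose.yml"

# Delete from the first "deploy:" line through the end of the file.
sed -i.bak '/^[[:space:]]*deploy:/,$d' "$workdir/compose.yml"
compose_lines=$(wc -l < "$workdir/compose.yml" | tr -d ' ')
cat "$workdir/compose.yml"
```

Only the `CLI_ARGS` line changes in the first file, and the second file keeps just the lines above `deploy:`.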

Congratulations! You have successfully deployed Llama-2 on AWS using Walrus. If you have any other questions about Walrus, feel free to join our community and communicate directly with our developers.


Published at DZone with permission of Ally Lynn. See the original article here.
