Building Generative AI Services: An Introductory and Practical Guide
Amazon Bedrock simplifies AI app development with serverless APIs, offering Q&A, summarization, and image generation using top models like Claude and Stability AI.
Amazon Web Services (AWS) offers a broad range of generative AI solutions that let developers add advanced AI capabilities to their applications without worrying about the underlying infrastructure. This guide focuses on building functional applications with Amazon Bedrock, a serverless, API-based offering that provides access to foundation models from leading providers, including Anthropic, Stability AI, and Amazon.
As demand for AI-powered applications grows, developers are looking for simple, scalable ways to integrate generative AI into their products. AWS meets this need with a suite of generative AI services, the standout of which is Amazon Bedrock. Bedrock gives you API access to foundation models without the burden of managing infrastructure, scaling, or model training.
Through this practical guide, you will learn how to use Bedrock for a variety of generative tasks, including Q&A, summarization, image generation, conversational AI, and semantic search.
Local Environment Setup
Let's get started by setting up the AWS SDK for Python and configuring our AWS credentials.
pip install boto3
aws configure
Confirm that your account has access to the Bedrock service and underlying foundation models via the AWS console. Once done, we can experiment with some generative AI use cases!
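As a quick sanity check, you can list the foundation models your account can see from the Bedrock control plane. A minimal sketch (the helper names below are our own, not part of the SDK):

```python
def anthropic_model_ids(models: list) -> list:
    """Filter list_foundation_models() summaries down to Anthropic model IDs."""
    return [m["modelId"] for m in models if m["modelId"].startswith("anthropic.")]

def list_available_anthropic_models(region: str = "us-east-1") -> list:
    """Query the Bedrock control plane (requires AWS credentials)."""
    import boto3  # "bedrock" is the control-plane client; "bedrock-runtime" handles inference

    bedrock = boto3.client("bedrock", region_name=region)
    return anthropic_model_ids(bedrock.list_foundation_models()["modelSummaries"])
```

If `list_available_anthropic_models()` returns an empty list, grant model access in the Bedrock console before proceeding.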
Intelligent Q&A With Claude v2
This example demonstrates how to create a question-and-answer assistant using Anthropic's Claude v2 model. Framing the input as a conversation lets you instruct the assistant to give concise, on-topic answers to user questions. This pattern is especially well suited for customer service, knowledge bases, and virtual helpdesk agents.
Let's take a look at a practical example of talking with Claude:
import boto3
import json
client = boto3.client("bedrock-runtime", region_name="us-east-1")
body = {
    "prompt": "Human: How can I reset my password?\n\nAssistant:",
    "max_tokens_to_sample": 200,
    "temperature": 0.7,
    "stop_sequences": ["\nHuman:"]
}

response = client.invoke_model(
    modelId="anthropic.claude-v2",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body)
)
print(response['body'].read().decode())
This prompt style simulates a human question to which a knowledgeable assistant gives a structured, coherent answer. Variations of this method can be used to build custom assistants that respond accurately to user queries.
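In practice you will usually want the model's text rather than the raw JSON. Claude v2 returns its answer under a `completion` key, so a small parsing helper (the function name is our own) keeps call sites clean:

```python
import json

def extract_completion(raw_body: bytes) -> str:
    """Pull the assistant's text out of a Claude v2 response body."""
    return json.loads(raw_body)["completion"].strip()

# Shaped like a Bedrock Claude v2 response, for illustration:
sample = json.dumps({
    "completion": " You can reset it from the login page.",
    "stop_reason": "stop_sequence"
}).encode()
print(extract_completion(sample))  # → You can reset it from the login page.
```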
Summarization Using Amazon Titan
The Amazon Titan text model makes it easy to condense long texts into concise, meaningful summaries. This improves readability, boosts user engagement, and reduces cognitive load in applications such as news aggregation, legal documents, and research papers.
body = {
    # Titan Text is steered via the prompt itself; there is no separate task-type field
    "inputText": "Summarize the following text:\nCloud computing provides scalable IT resources via the internet...",
    "textGenerationConfig": {
        "maxTokenCount": 512,
        "temperature": 0.5
    }
}

response = client.invoke_model(
    modelId="amazon.titan-text-lite-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body)
)
print(response['body'].read().decode())
By altering the nature of the task and the source text, we can implement the same strategy in content simplification, keyword extraction, and paraphrasing.
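One way to organize those variants is a small map of prompt prefixes from which request bodies are built. The helper and task names below are our own convention, not a Bedrock API:

```python
TASK_PREFIXES = {
    "summarize": "Summarize the following text:",
    "simplify": "Rewrite the following text in plain language:",
    "keywords": "List the key terms in the following text:",
    "paraphrase": "Paraphrase the following text:",
}

def titan_body(task: str, text: str, max_tokens: int = 512) -> dict:
    """Build a Titan Text request body for a given task by prefixing the prompt."""
    return {
        "inputText": f"{TASK_PREFIXES[task]}\n{text}",
        "textGenerationConfig": {"maxTokenCount": max_tokens},
    }

print(titan_body("keywords", "Cloud computing provides scalable IT resources.")["inputText"])
```

The resulting dict can be passed straight to `json.dumps(...)` in the `invoke_model` call shown above.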
Text-to-Image Generation Using Stability AI
Visual content is crucial to marketing, social media, and product design. Using Stability AI's Stable Diffusion model in Bedrock, a user can generate images from text prompts, thus simplifying creative workflows or enabling real-time content generation features.
import base64
from PIL import Image
from io import BytesIO
body = {
    # Stability models expect prompts as a list of text_prompts objects
    "text_prompts": [
        {"text": "A futuristic smart ring with a holographic display on a table"}
    ],
    "cfg_scale": 10,
    "steps": 50
}

response = client.invoke_model(
    modelId="stability.stable-diffusion-xl-v0",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body)
)
image_data = json.loads(response['body'].read())
img_bytes = base64.b64decode(image_data['artifacts'][0]['base64'])
Image.open(BytesIO(img_bytes)).show()
This technique is especially well-adapted to user interface mockups, game industry asset production, or real-time visualization tools in design software.
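For batch or server-side use you will typically persist the image rather than display it. A sketch that writes every returned artifact to disk, assuming the `artifacts` list shape shown above:

```python
import base64
import json
from pathlib import Path

def save_artifacts(raw_body: bytes, out_dir: str = ".") -> list:
    """Decode each base64 artifact in a Stability response and write it as a PNG."""
    payload = json.loads(raw_body)
    paths = []
    for i, artifact in enumerate(payload["artifacts"]):
        path = Path(out_dir) / f"generation_{i}.png"
        path.write_bytes(base64.b64decode(artifact["base64"]))
        paths.append(str(path))
    return paths
```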
Conversation With Claude v2
Let's expand on the Q&A example with a multi-turn conversation in Claude v2. The assistant maintains context and responds appropriately across conversational turns:
conversation = """
Human: Help me plan a trip to Seattle.
Assistant: Sure! Business or leisure?
Human: Leisure.
Assistant:
"""
body = {
    "prompt": conversation,
    "max_tokens_to_sample": 200,
    "temperature": 0.5,
    "stop_sequences": ["\nHuman:"]
}

response = client.invoke_model(
    modelId="anthropic.claude-v2",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body)
)

print(response['body'].read().decode())
Interacting in multi-turn conversations is crucial for building booking agents, chatbots, or any agent that is meant to gather sequential information from users.
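To keep such a dialogue going programmatically, maintain the transcript as a string and append each new turn before re-invoking the model. A minimal helper (the function name is ours):

```python
def append_turn(transcript: str, human_msg: str, assistant_reply: str = "") -> str:
    """Add a Human turn (and optionally the Assistant's reply) to a Claude-style transcript."""
    transcript = transcript.rstrip() + f"\n\nHuman: {human_msg}\n\nAssistant:"
    if assistant_reply:
        transcript += f" {assistant_reply}"
    return transcript

# Build the first turn, send it, then fold the model's reply back in:
prompt = append_turn("", "Help me plan a trip to Seattle.")
print(prompt)
```

Each call to `invoke_model` then receives the full transcript so the model sees all prior turns.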
Using Embeddings for Retrieval
Text embeddings are numerical representations of text that capture semantic meaning. Amazon Titan generates embeddings that can be stored in vector databases and used for semantic search, recommendation systems, or similarity measurement.
body = {
    "inputText": "Explain zero trust architecture."
}

response = client.invoke_model(
    modelId="amazon.titan-embed-text-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body)
)

embedding_vector = json.loads(response['body'].read())['embedding']
print(len(embedding_vector))
You can retrieve documents by meaning using embeddings, which greatly improves retrieval efficiency for consumer and enterprise applications.
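Semantic search over embeddings typically ranks documents by cosine similarity to the query vector. A dependency-free sketch with toy vectors standing in for Titan embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vectors standing in for Titan embeddings:
query, doc1, doc2 = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
ranked = sorted([("doc1", doc1), ("doc2", doc2)],
                key=lambda kv: cosine_similarity(query, kv[1]), reverse=True)
print([name for name, _ in ranked])  # → ['doc1', 'doc2']
```

In production you would delegate this ranking to a vector database rather than computing it in application code.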
Additional Day-to-Day Applications
By integrating these important usage scenarios, developers can build well-architected production-grade applications. For example:
- A customer service system can make use of Claude to interact in question-and-answer conversations, utilize Titan to summarize content, and employ embeddings to search for documents.
- A design application can utilize Stable Diffusion to generate images based on user-defined parameters.
- A Claude-driven bot can escalate requests to a human agent via AWS Lambda functions.
AWS Bedrock provides out-of-the-box integration with services including Amazon Kendra (enterprise search across documents), AWS Lambda (serverless backend functionality), and Amazon API Gateway (scalable APIs) to enable full-stack generative applications.
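As a sketch of the Lambda piece, a handler behind API Gateway could wrap the Claude Q&A call from earlier. Everything here except the boto3 calls is illustrative naming:

```python
import json

def answer_question(client, question: str) -> str:
    """Ask Claude v2 a single question via a supplied bedrock-runtime client."""
    body = {
        "prompt": f"\n\nHuman: {question}\n\nAssistant:",
        "max_tokens_to_sample": 200,
        "stop_sequences": ["\nHuman:"],
    }
    response = client.invoke_model(
        modelId="anthropic.claude-v2",
        contentType="application/json",
        accept="application/json",
        body=json.dumps(body),
    )
    return json.loads(response["body"].read())["completion"].strip()

def lambda_handler(event, context):
    import boto3  # imported lazily so the module loads without boto3 in local tests

    client = boto3.client("bedrock-runtime")
    question = json.loads(event["body"])["question"]
    return {"statusCode": 200,
            "body": json.dumps({"answer": answer_question(client, question)})}
```

Passing the client into `answer_question` keeps the Bedrock call testable with a stubbed client, independent of the Lambda runtime.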
Conclusion
Generative AI services from AWS, especially Amazon Bedrock, provide developers with versatile, scalable tools to implement advanced AI use cases with ease. By using serverless APIs to invoke text, image, and embedding models, you can accelerate product development without managing model infrastructure. Whether building assistants, summarizers, generators, or search engines, Bedrock delivers enterprise-grade performance and simplicity.
Opinions expressed by DZone contributors are their own.