DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

SBOMs are essential to circumventing software supply chain attacks, and they provide visibility into various software components.

Related

  • Google Cloud Document AI Basics
  • Build a Local AI-Powered Document Summarization Tool
  • How To Build Translate Solutions With Google Cloud Translate AI
  • A Look at Intelligent Document Processing and E-Invoicing

Trending

  • Reducing Hallucinations Using Prompt Engineering and RAG
  • Beyond the Glass Slab: How AI Voice Assistants are Morphing Into Our Real-Life JARVIS
  • Jakarta EE 11 and the Road Ahead With Jakarta EE 12
  • A Keycloak Example: Building My First MCP Server Tools With Quarkus
  1. DZone
  2. Data Engineering
  3. AI/ML
  4. How I Built an AI Portal for Document Q and A, Summarization, Transcription, Translation, and Extraction

How I Built an AI Portal for Document Q and A, Summarization, Transcription, Translation, and Extraction

Fed up with juggling disconnected AI tools, I built a simple, all-in-one web portal for document processing—summaries, transcriptions, translations, and more. Here’s how.

By 
Sanjay Krishnegowda user avatar
Sanjay Krishnegowda
·
Jun. 05, 25 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
1.6K Views

Join the DZone community and get the full member experience.

Join For Free

These days, AI is everywhere, but most people at work are still stuck using a mix of disconnected tools. Some folks use a chatbot here, someone else copies text into a summarizer there, and there’s always a messy process to get meeting recordings transcribed or translated. It’s kind of a headache.

I kept hearing the same complaints from my team: “Why can’t all of this just be in one place?”

So I set out to build it—a single web portal where anyone can upload documents, ask questions, get summaries, transcribe meetings, translate files, and even pull tables out of PDFs. It’s not flashy; it’s just practical and solves real problems we face every day.

In this article, I’ll show you how I put the whole thing together, why I picked certain technologies, and how you can run it yourself. Everything is open source. No AI jargon required.

What This Portal Can Do

Here’s what you’ll get—all in one website, one login, one simple UI:

  • Chat with Data: Upload a document, ask it anything, and get real answers (not just keyword matches).
  • Summarize: Drop in a long report or policy and get a short, clear summary—customized if you want.
  • Transcribe: Upload meeting recordings and get a written transcript (fast).
  • Translate: Convert documents into other languages, keeping the format.
  • Extract: Grab tables and key data from PDFs and download as JSON or Excel.

No more bouncing between apps. Just upload, pick a feature, and get what you need.

Why I Built It

I’ve worked in data and AI for years, but when it comes to day-to-day work, most solutions felt disconnected. Employees waste time moving files between tools, or don’t bother with AI because it’s “one more app to learn.” I wanted something that anyone could use—no training, no plugins, no fuss.

If you’re a developer, manager, or even a business user who’s ever thought, “AI is cool, but I just want it to save me time,” this portal is for you.

How It All Fits Together

I wanted to keep things simple but strong under the hood. Here’s the big picture:

  • Frontend: React
  • Backend: FastAPI (Python)
  • LLM & Embeddings: Azure OpenAI
  • Vector DB: Pinecone
  • Audio Transcription: Whisper (runs locally)
  • Translation: Azure Translator
  • Document Extraction: Azure Document Intelligence
  • Storage: Local or Azure Blob Storage (demo uses local)
  • Auth: (Add Azure AD/OAuth for production)

Here’s how it all connects:

Enterprise AI portal


Feature Walkthroughs (With Diagrams and Code)

Chat With Data

Ever wish you could just ask a big PDF or report a question and get a direct answer—without reading the whole thing? That’s what “Chat with Data” does. You upload a document, ask a question in plain English, and get the answer pulled right from your file.
This saves time for everyone—legal, finance, compliance, or anyone who deals with lengthy docs.

How It Works

The portal splits and embeds your document, stores it in Pinecone, and uses Azure OpenAI to answer any question you type.

An image of chat with data in Azure OpenAI


Backend: FastAPI Endpoint for Uploading Documents

Python
 
# Python backend example
@app.post("/upload/")
async def upload_file(file: UploadFile = File(...)):
    contents = await file.read()
    text = contents.decode("utf-8", errors="ignore")
    upsert_document(text)
    return {"status": "uploaded"}


Backend: FastAPI Endpoint for Chat

Python
 
# Python backend for chat
@app.post("/chat/")
async def chat(query: str = Form(...)):
    matches = query_pinecone(query)
    context = " ".join([m['text'] for m in matches])
    answer = get_answer(query, context)
    return {"answer": answer}

See the rest of the backend code in GitHub.

Summarization

Let’s face it—most business documents are too long. The summarization feature lets anyone upload a big file and get a short, focused summary. You can even add a custom prompt like “summarize key risks for compliance” or “give me the main action items.”
This is a game-changer for managers, analysts, or anyone who needs to make decisions quickly.

How It Works

Your doc is uploaded, the backend sends it (plus your prompt) to Azure OpenAI, and you get a summary back—no need to read the whole thing.

An image of Azure OpenAI workflow


Backend: FastAPI Summarization Endpoint

Python
 
# Summarization endpoint
@app.post("/summarize/")
async def summarize(file: UploadFile = File(...), prompt: str = Form("Summarize this document:")):
    contents = await file.read()
    text = contents.decode("utf-8", errors="ignore")
    summary = summarize_text(text, prompt)
    return JSONResponse(content={"summary": summary})

See the React SummarizeForm component.

Audio Transcription

We’ve all sat through meetings or calls that get recorded and then… nobody wants to listen back. With this feature, just upload your audio or video file and get a written transcript—fast.
It helps teams document decisions, catch up quickly, and make meetings more inclusive.

How It Works

Your audio is uploaded, the backend uses Whisper to transcribe everything, and you get the full text back in your browser.

Backend: FastAPI Audio Transcription Endpoint

Python
 
# Audio transcription endpoint
@app.post("/transcribe/")
async def transcribe(file: UploadFile = File(...)):
    audio_bytes = await file.read()
    transcript = transcribe_audio_file(audio_bytes, file.filename)
    return JSONResponse(content={"transcript": transcript})

TranscribeForm.js

Language Translation

Global teams need to work across languages, and translating documents is usually slow and manual. This feature lets you upload any file, pick your target language, and get a translated version—keeping the format as close as possible.

This is super useful for HR, compliance, or anyone working with partners abroad.

How It Works

Upload a doc, pick your language, and the backend calls Azure Translator. You get the translated text right away.

An image of Azure Translator workflow


Backend: FastAPI Translation Endpoint

Python
 
# Translation endpoint
@app.post("/translate/")
async def translate(
    file: UploadFile = File(...),
    to_language: str = Form(...)
):
    contents = await file.read()
    text = contents.decode("utf-8", errors="ignore")
    translated = fake_translate(text, to_language)
    return JSONResponse(content={"translated": translated})

TranslateForm.js

Document Extractor

Extracting tables and key-value data from PDFs and forms is usually a nightmare. This feature does it for you.

It’s great for finance teams, analysts, or anyone who needs to pull structured data for reporting, spreadsheets, or automation.

How It Works

Upload a PDF (or scanned doc), pick whether you want JSON or Excel, and the portal uses Azure Document Intelligence to pull out tables and key-value pairs. Download and you’re done.

An image of document extractor workflow


Backend: FastAPI Extraction Endpoint

Python
 
# Document extractor endpoint
@app.post("/extract/")
async def extract(
    file: UploadFile = File(...),
    output_format: str = Form("json")
):
    contents = await file.read()
    filename = file.filename

    if output_format == "excel":
        xls_path = extract_tables_and_kv(contents, filename, output_format="excel")
        return FileResponse(xls_path, media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", filename="extracted_tables.xlsx")
    else:
        output = extract_tables_and_kv(contents, filename, output_format="json")
        return JSONResponse(content=output)

ExtractForm.js

How to Run the Portal Yourself

Everything is open source.
Code: github.com/sanjaybk7/AIPortal

  1. Clone the repo: 

    git clone https://github.com/sanjaybk7/AIPortal.git

  2. Backend:
    Go into backend, set up Python, install requirements, and run FastAPI.

  3. Frontend:
    Go into frontend, install dependencies, and run React.

  4. Open your browser:
    Head to http://localhost:3000 and start uploading.

You’ll need API keys for Azure and Pinecone, but setup is explained in the repo.

Wrapping Up

I built this portal because I was tired of bouncing between tools and wanted something my team would actually use. If you feel the same way, or just want to see what’s possible when you connect modern AI tools together, give it a try.

I’d love to hear your thoughts or improvements. Feel free to fork the repo, open an issue, or just let me know how you’re using it.

AI Document Translation

Opinions expressed by DZone contributors are their own.

Related

  • Google Cloud Document AI Basics
  • Build a Local AI-Powered Document Summarization Tool
  • How To Build Translate Solutions With Google Cloud Translate AI
  • A Look at Intelligent Document Processing and E-Invoicing

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • [email protected]

Let's be friends: