
The AI Firewall: Using Local Small Language Models (SLMs) to Scrub PII Before Cloud Processing

Learn how to run small language models locally as an AI firewall to detect and scrub PII from data before sending it to cloud AI services.

By Rambabu Bandam · Feb. 10, 26 · Tutorial

As organizations increasingly rely on powerful cloud-based AI services like GPT-4, Claude, and Gemini for sophisticated text analysis, summarization, and generation tasks, a critical security concern emerges: what happens to sensitive data when it's sent to external AI providers?

Personally Identifiable Information (PII), including names, email addresses, phone numbers, Social Security numbers, and financial data, can inadvertently be exposed during cloud AI processing. This creates compliance risks under regulations like GDPR, HIPAA, and CCPA, and opens the door to potential data breaches.

The solution? An AI Firewall — a local small language model (SLM) that acts as a security gateway, automatically detecting and scrubbing PII from data before it ever leaves your infrastructure. This tutorial walks you through implementing this pattern from scratch.

Why Local SLMs as a PII Firewall?

Before diving into implementation, let's understand why local small language models are ideal for this use case:

  • Data Never Leaves Your Infrastructure: Unlike cloud APIs, local models process data entirely on your machines
  • Low Latency: Processing happens locally without network round-trips
  • Cost Efficiency: No per-token charges after initial hardware investment
  • Compliance Friendly: Easier to demonstrate data governance for audits
  • Customizable: Fine-tune models for your specific PII patterns

Architecture Overview

The AI Firewall pattern follows this flow:

Plain Text
 
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Raw Data      │────▶│  Local SLM       │────▶│   Cloud AI      │
│   with PII      │     │  (PII Detector   │     │   (Safe Data    │
│                 │     │   & Scrubber)    │     │    Processing)  │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                               │
                               ▼
                        ┌──────────────────┐
                        │  PII Vault       │
                        │  (Secure Store)  │
                        └──────────────────┘
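
The flow in the diagram can be sketched in a few lines. This is a minimal illustration, not the implementation built in the steps that follow: a single email regex stands in for the local SLM, and `scrub_locally`, `firewall_flow`, and the `<<EMAIL_001>>` token format are hypothetical names chosen for this sketch.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for the local SLM: one email pattern.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scrub_locally(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each email with a token; return the scrubbed text and the vault."""
    vault: Dict[str, str] = {}

    def _swap(match: re.Match) -> str:
        token = f"<<EMAIL_{len(vault) + 1:03d}>>"
        vault[token] = match.group()
        return token

    return EMAIL_RE.sub(_swap, text), vault


def firewall_flow(text: str, cloud_call: Callable[[str], str]) -> Tuple[str, Dict[str, str]]:
    """Raw data -> local scrub -> cloud call. The vault stays in this process."""
    safe_text, vault = scrub_locally(text)
    return cloud_call(safe_text), vault


# A fake cloud service that echoes whatever it receives.
reply, vault = firewall_flow("Contact ann@example.com today", lambda s: f"SEEN: {s}")
print(reply)  # SEEN: Contact <<EMAIL_001>> today
```

The point of the shape is that only `safe_text` crosses the process boundary, while the token-to-value mapping is returned to the caller and never passed to `cloud_call`.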


Step 1: Setting Up Your Local Environment

Hardware Requirements

For running SLMs locally, you'll need:

  • Minimum: 8GB RAM, 4-core CPU (for CPU inference)
  • Recommended: 16GB+ RAM, GPU with 8GB+ VRAM (for faster inference)
  • Production: 32GB+ RAM, NVIDIA GPU with 16GB+ VRAM
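
As a rough sanity check on these numbers, weight memory can be estimated as parameter count times bits per weight. The sketch below covers weights only (no KV cache or runtime overhead) and uses the nominal model sizes from this article. The 4-bit figure assumes a quantized build, which is common for locally served models but should be checked for the tag you actually pull.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Nominal sizes as quoted in this article.
for name, size_b in [("Phi-3 Mini", 3.8), ("Llama 3.2 3B", 3.0), ("Mistral 7B", 7.0)]:
    q4 = estimate_weight_memory_gb(size_b, 4)
    fp16 = estimate_weight_memory_gb(size_b, 16)
    print(f"{name}: ~{q4:.1f} GB at 4-bit, ~{fp16:.1f} GB at 16-bit")
```

Under these assumptions a 7B model needs about 3.5 GB for 4-bit weights, which is why it fits the 8GB tier only with little headroom for context and the rest of the system.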

Installing Ollama

Ollama is a popular and easy way to run LLMs locally. Install it with:

Plain Text
 
# macOS
curl -fsSL https://ollama.ai/install.sh | sh

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Windows (PowerShell, requires WSL2)
irm https://ollama.ai/install.ps1 | iex


Pulling the Right Model

For PII detection, we need a model that's small enough for fast inference but capable enough for entity recognition. I recommend these options:

Plain Text
 
# Option 1: Phi-3 Mini (lightweight, fast)
ollama pull phi3

# Option 2: Llama 3.2 3B (comparable in size to Phi-3 Mini)
ollama pull llama3.2:3b

# Option 3: Mistral (good balance)
ollama pull mistral:7b-instruct
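
Before wiring the model into code, it helps to confirm that the Ollama server is running and the model was actually pulled. The sketch below queries Ollama's local REST endpoint `/api/tags` on the default port 11434. The helper names are hypothetical, and the matching rule (a bare name matches any tag) is an assumption made for convenience.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def parse_model_names(tags_payload: dict) -> List[str]:
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def is_model_available(wanted: str, names: List[str]) -> bool:
    """True if `wanted` matches a pulled model, with or without an explicit tag."""
    if ":" in wanted:
        return wanted in names
    return any(n.split(":", 1)[0] == wanted for n in names)


def fetch_model_names(base_url: str = OLLAMA_URL) -> List[str]:
    """Ask the local Ollama server which models are installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(json.load(resp))
```

Calling `is_model_available("phi3", fetch_model_names())` at service start-up lets the firewall refuse to run rather than fail on the first request.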


Step 2: Building the PII Detection System

Core Detection Class

Create a Python class that interfaces with the local Ollama model for PII detection:

Plain Text
 
import ollama
import json
import re
from dataclasses import dataclass
from typing import List, Dict, Tuple

@dataclass
class PIIEntity:
    """Represents a detected PII entity."""
    text: str
    category: str
    start_pos: int
    end_pos: int
    confidence: float

class LocalPIIDetector:
    """
    Uses a local SLM to detect PII in text.
    Acts as an AI Firewall before cloud processing.
    """
    
    PII_CATEGORIES = [
        "PERSON_NAME", "EMAIL", "PHONE_NUMBER", 
        "SSN", "CREDIT_CARD", "ADDRESS",
        "DATE_OF_BIRTH", "BANK_ACCOUNT", "IP_ADDRESS"
    ]
    
    def __init__(self, model_name: str = "phi3"):
        self.model = model_name
        self.detection_prompt = self._build_detection_prompt()
    
    def _build_detection_prompt(self) -> str:
        return """You are a PII (Personally Identifiable Information) detection system.
Analyze the following text and identify ALL instances of PII.

For each PII found, output in this exact JSON format:
{
    "entities": [
        {"text": "exact text", "category": "CATEGORY", "confidence": 0.95}
    ]
}

Categories to detect: PERSON_NAME, EMAIL, PHONE_NUMBER, SSN, CREDIT_CARD, 
ADDRESS, DATE_OF_BIRTH, BANK_ACCOUNT, IP_ADDRESS

If no PII is found, return: {"entities": []}

TEXT TO ANALYZE:
"""
    
    def detect(self, text: str) -> List[PIIEntity]:
        """Detect PII entities in the given text."""
        response = ollama.chat(
            model=self.model,
            messages=[
                {
                    "role": "user",
                    "content": f"{self.detection_prompt}\n{text}"
                }
            ],
            options={"temperature": 0.1}  # Low temp for consistency
        )
        
        return self._parse_response(response["message"]["content"], text)
    
    def _parse_response(self, response: str, original_text: str) -> List[PIIEntity]:
        """Parse the LLM response into PIIEntity objects."""
        entities = []
        seen = set()  # spans already recorded, to drop duplicate detections
        try:
            # Take the outermost JSON object; models often wrap it in prose
            first, last = response.find("{"), response.rfind("}")
            if first == -1 or last <= first:
                return entities
            data = json.loads(response[first:last + 1])
            for item in data.get("entities", []):
                needle = item["text"]
                if not needle:
                    continue
                # Record every occurrence, not just the first, so repeated
                # PII is not left behind in the scrubbed text
                for match in re.finditer(re.escape(needle), original_text):
                    span = (match.start(), match.end())
                    if span in seen:
                        continue
                    seen.add(span)
                    entities.append(PIIEntity(
                        text=needle,
                        category=item["category"],
                        start_pos=match.start(),
                        end_pos=match.end(),
                        confidence=item.get("confidence", 0.8)
                    ))
        except (json.JSONDecodeError, KeyError, TypeError, AttributeError) as e:
            print(f"Warning: Could not parse LLM response: {e}")
        
        return entities
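
One design point worth considering: the parser above returns an empty list when the model's reply cannot be parsed, which means the text would pass through unscrubbed. A stricter variant fails closed instead. The following is a sketch of that idea using only the standard library. `UnparseableDetection` and `parse_entities_strict` are hypothetical names, not part of the class above.

```python
import json


class UnparseableDetection(Exception):
    """Raised when the model reply cannot be trusted as a detection result."""


def parse_entities_strict(reply: str) -> list:
    """Return the 'entities' list from a model reply, or raise instead of guessing."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise UnparseableDetection("no JSON object in model reply")
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError as exc:
        raise UnparseableDetection(str(exc)) from exc
    entities = data.get("entities") if isinstance(data, dict) else None
    if not isinstance(entities, list):
        raise UnparseableDetection("reply has no 'entities' list")
    return entities
```

A caller can catch `UnparseableDetection` and either retry the local model or block the request, so that a malformed reply never results in raw text reaching the cloud.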


Step 3: Implementing the PII Scrubber

Scrubbing Strategies

Once PII is detected, we need to scrub it. Here are common strategies:

  • Redaction: Replace with [REDACTED] or category markers like [EMAIL]
  • Tokenization: Replace with reversible tokens (PII-001, PII-002)
  • Pseudonymization: Replace with fake but realistic data
  • Hash-based: Replace with hashed values for consistency
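
For the hash-based strategy, note that values such as SSNs and phone numbers come from a small enough space that a plain, unkeyed hash can be reversed by brute force. A keyed hash avoids this. The sketch below uses HMAC-SHA256 from the standard library. The function name, the bracketed output format, and the 12-character truncation are assumptions for illustration.

```python
import hashlib
import hmac


def keyed_pii_token(value: str, category: str, secret_key: bytes, length: int = 12) -> str:
    """Deterministic placeholder for a PII value, derived with a secret key.

    The same value and key always give the same token, so references stay
    consistent across documents, but the token cannot be recomputed or
    brute-forced without the key.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[{category}:{digest[:length]}]"


# Hypothetical key for demonstration; load a real one from a secrets manager.
demo_key = b"replace-with-a-random-32-byte-key"
print(keyed_pii_token("123-45-6789", "SSN", demo_key))
```

The key should live alongside the vault, not in source code, and rotating it changes every token.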

Plain Text
 
import hashlib
from typing import Dict, List, Tuple

class PIIScrubber:
    """
    Scrubs detected PII from text using various strategies.
    Maintains a vault for reversible operations.
    """
    
    def __init__(self, strategy: str = "tokenize"):
        self.strategy = strategy
        self.vault: Dict[str, str] = {}  # token -> original value
        self.token_counter = 0
    
    def scrub(self, text: str, entities: List[PIIEntity]) -> Tuple[str, Dict]:
        """
        Scrub PII from text and return scrubbed version with mapping.
        """
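        # Merge overlapping spans first so every detected character is
        # covered exactly once; splicing overlapping spans separately
        # would corrupt offsets or leave fragments of PII behind
        merged: List[PIIEntity] = []
        for e in sorted(entities, key=lambda x: (x.start_pos, -x.end_pos)):
            if merged and e.start_pos < merged[-1].end_pos:
                prev = merged[-1]
                prev.end_pos = max(prev.end_pos, e.end_pos)
                prev.text = text[prev.start_pos:prev.end_pos]
            else:
                # Copy so the caller's entity objects are not mutated
                merged.append(PIIEntity(e.text, e.category, e.start_pos,
                                        e.end_pos, e.confidence))
        
        scrubbed = text
        mappings = {}
        
        # Replace from the end so earlier offsets stay valid
        for entity in reversed(merged):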
            replacement = self._get_replacement(entity)
            mappings[replacement] = entity.text
            self.vault[replacement] = entity.text
            
            scrubbed = (
                scrubbed[:entity.start_pos] + 
                replacement + 
                scrubbed[entity.end_pos:]
            )
        
        return scrubbed, mappings
    
    def _get_replacement(self, entity: PIIEntity) -> str:
        """Generate replacement based on strategy."""
        if self.strategy == "redact":
            return f"[{entity.category}]"
        
        elif self.strategy == "tokenize":
            self.token_counter += 1
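            # Unique placeholder per entity so restore() can map it back;
            # this exact token format is an assumption, any unique marker works
            return f"<<{entity.category}_{self.token_counter:03d}>>"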
        
        elif self.strategy == "hash":
            hash_val = hashlib.sha256(entity.text.encode()).hexdigest()[:8]
            return f"[{entity.category}:{hash_val}]"
        
        elif self.strategy == "pseudonymize":
            return self._generate_fake(entity.category)
        
        return "[REDACTED]"
    
    def _generate_fake(self, category: str) -> str:
        """Generate fake but realistic replacement data."""
        fakes = {
            "PERSON_NAME": "John Smith",
            "EMAIL": "[email protected]",
            "PHONE_NUMBER": "(555) 123-4567",
            "SSN": "XXX-XX-XXXX",
            "ADDRESS": "123 Main St, Anytown, ST 12345",
        }
        return fakes.get(category, "[REDACTED]")
    
    def restore(self, scrubbed_text: str) -> str:
        """Restore original PII from vault (for authorized use only)."""
        restored = scrubbed_text
        for token, original in self.vault.items():
            restored = restored.replace(token, original)
        return restored
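
To make the tokenize-and-restore round trip concrete, here is a compact, self-contained sketch of the same idea in function form. It is a simplified illustration rather than a replacement for the class above. `tokenize_spans`, `restore_tokens`, and the `<<CATEGORY_NNN>>` token format are hypothetical, and the spans are hard-coded instead of coming from a detector.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, category)


def tokenize_spans(text: str, spans: List[Span]) -> Tuple[str, Dict[str, str]]:
    """Replace each span with a unique token and return the token vault."""
    vault: Dict[str, str] = {}
    out = text
    # Work from the end of the text so earlier offsets stay valid.
    for i, (start, end, category) in enumerate(sorted(spans, reverse=True), 1):
        token = f"<<{category}_{i:03d}>>"
        vault[token] = text[start:end]
        out = out[:start] + token + out[end:]
    return out, vault


def restore_tokens(text: str, vault: Dict[str, str]) -> str:
    """Put the original values back in place of their tokens."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


msg = "Call Sarah at 415-555-0123."
name_at, phone_at = msg.find("Sarah"), msg.find("415-555-0123")
spans = [(name_at, name_at + 5, "PERSON_NAME"), (phone_at, phone_at + 12, "PHONE_NUMBER")]

scrubbed, vault = tokenize_spans(msg, spans)
print(scrubbed)  # Call <<PERSON_NAME_002>> at <<PHONE_NUMBER_001>>.
print(restore_tokens(scrubbed, vault) == msg)  # True
```

Because replacement runs from the end of the text, the last entity receives the lowest counter value. Only the tokenize strategy round-trips cleanly: category markers such as [EMAIL] and fixed pseudonyms are not unique per entity, so they cannot be mapped back reliably.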


Step 4: Creating the AI Firewall Gateway

Now let's combine everything into a complete firewall gateway:

Plain Text
 
import openai  # For cloud AI calls
from datetime import datetime
import logging

class AIFirewall:
    """
    Main AI Firewall class that orchestrates PII detection,
    scrubbing, cloud processing, and response handling.
    """
    
    def __init__(
        self,
        local_model: str = "phi3",
        scrub_strategy: str = "tokenize",
        cloud_provider: str = "openai"
    ):
        self.detector = LocalPIIDetector(model_name=local_model)
        self.scrubber = PIIScrubber(strategy=scrub_strategy)
        self.cloud_provider = cloud_provider
        self.audit_log = []
        
        logging.basicConfig(level=logging.INFO)
        self.logger = logging.getLogger("AIFirewall")
    
    def process(
        self, 
        text: str, 
        cloud_prompt: str,
        restore_response: bool = False
    ) -> dict:
        """
        Main processing pipeline:
        1. Detect PII locally
        2. Scrub PII from text
        3. Send safe text to cloud AI
        4. Optionally restore PII in response
        5. Return results with audit trail
        """
        timestamp = datetime.utcnow().isoformat()
        
        # Step 1: Detect PII using local SLM
        self.logger.info("Detecting PII with local model...")
        entities = self.detector.detect(text)
        self.logger.info(f"Found {len(entities)} PII entities")
        
        # Step 2: Scrub PII
        scrubbed_text, mappings = self.scrubber.scrub(text, entities)
        self.logger.info("PII scrubbed from text")
        
        # Step 3: Send to cloud AI
        self.logger.info("Sending sanitized text to cloud AI...")
        cloud_response = self._call_cloud_ai(scrubbed_text, cloud_prompt)
        
        # Step 4: Optionally restore PII in response
        if restore_response and mappings:
            final_response = self.scrubber.restore(cloud_response)
        else:
            final_response = cloud_response
        
        # Step 5: Create audit record
        audit_record = {
            "timestamp": timestamp,
            "pii_detected": len(entities),
            "pii_categories": list(set(e.category for e in entities)),
            "scrub_strategy": self.scrubber.strategy,
            "text_length_original": len(text),
            "text_length_scrubbed": len(scrubbed_text),
        }
        self.audit_log.append(audit_record)
        
        return {
            "original_text": text,
            "scrubbed_text": scrubbed_text,
            "cloud_response": cloud_response,
            "final_response": final_response,
            "pii_entities": [
                {"text": e.text, "category": e.category, "confidence": e.confidence}
                for e in entities
            ],
            "audit": audit_record
        }
    
    def _call_cloud_ai(self, safe_text: str, prompt: str) -> str:
        """Send sanitized text to cloud AI service."""
        try:
            response = openai.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "system", "content": prompt},
                    {"role": "user", "content": safe_text}
                ]
            )
            return response.choices[0].message.content
        except Exception as e:
            self.logger.error(f"Cloud AI error: {e}")
            return f"Error: {str(e)}"
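
Since the whole point of the gateway is that no detected PII leaves the machine, a cheap last check before the cloud call can catch scrubbing mistakes. The sketch below is one possible guard, with hypothetical function names. It only catches values the detector already found, so it complements detection rather than replacing it.

```python
from typing import Iterable, List


def find_leaks(scrubbed_text: str, pii_values: Iterable[str]) -> List[str]:
    """Return detected PII values that still appear in the scrubbed text."""
    lowered = scrubbed_text.lower()
    return [v for v in pii_values if v and v.lower() in lowered]


def assert_safe_to_send(scrubbed_text: str, pii_values: Iterable[str]) -> None:
    """Raise before the cloud call if any detected value survived scrubbing."""
    leaks = find_leaks(scrubbed_text, pii_values)
    if leaks:
        # Report only the count so the error message itself leaks nothing.
        raise ValueError(f"{len(leaks)} PII value(s) still present; refusing to send")
```

In the `process` method this would sit between the scrub step and the cloud call, fed with `[e.text for e in entities]`.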


Step 5: Real-World Usage Example

Let's see the AI Firewall in action with a realistic scenario:

Plain Text
 
def main():
    # Initialize the firewall
    firewall = AIFirewall(
        local_model="phi3",
        scrub_strategy="tokenize"
    )
    
    # Sample text with PII
    customer_feedback = """
    Hi, my name is Sarah Johnson and I'm writing about my order #12345.
    You can reach me at [email protected] or call me at (415) 555-0123.
    My billing address is 742 Evergreen Terrace, Springfield, IL 62704.
    I paid with my card ending in 4532 and my SSN is 123-45-6789 which
    was required for the credit check.
    """
    
    # Process through firewall
    result = firewall.process(
        text=customer_feedback,
        cloud_prompt="Summarize this customer feedback and identify the main concern.",
        restore_response=False  # Keep PII scrubbed in response
    )
    
    print("=== ORIGINAL TEXT ===")
    print(result["original_text"])
    
    print("\n=== SCRUBBED TEXT (Sent to Cloud) ===")
    print(result["scrubbed_text"])
    
    print("\n=== PII DETECTED ===")
    for entity in result["pii_entities"]:
        print(f"  {entity['category']}: {entity['text']} ({entity['confidence']:.0%})")
    
    print("\n=== CLOUD AI RESPONSE ===")
    print(result["final_response"])
    
    print("\n=== AUDIT LOG ===")
    print(result["audit"])

if __name__ == "__main__":
    main()


Expected Output

Plain Text
 
=== SCRUBBED TEXT (Sent to Cloud) ===
Hi, my name is <<PERSON_NAME_006>> and I'm writing about my order #12345.
You can reach me at <<EMAIL_005>> or call me at <<PHONE_NUMBER_004>>.
My billing address is <<ADDRESS_003>>.
I paid with my card ending in <<CREDIT_CARD_002>> and my SSN is <<SSN_001>> which
was required for the credit check.

=== PII DETECTED ===
  PERSON_NAME: Sarah Johnson (95%)
  EMAIL: [email protected] (98%)
  PHONE_NUMBER: (415) 555-0123 (97%)
  ADDRESS: 742 Evergreen Terrace, Springfield, IL 62704 (92%)
  CREDIT_CARD: 4532 (85%)
  SSN: 123-45-6789 (99%)


Step 6: Hybrid Approach with Regex Pre-filtering

For maximum accuracy and speed, combine pattern-based detection with the SLM:

Plain Text
 
class HybridPIIDetector:
    """
    Combines regex patterns (fast, high-precision) with
    SLM detection (catches context-dependent PII).
    """
    
    PATTERNS = {
        "EMAIL": r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}',
        # \s (whitespace) in the separator classes, so "(415) 555-0123" matches
        "PHONE_NUMBER": r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}',
        "SSN": r'\b\d{3}-\d{2}-\d{4}\b',
        "CREDIT_CARD": r'\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b',
        "IP_ADDRESS": r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b',
    }
    
    def __init__(self, model_name: str = "phi3"):
        self.slm_detector = LocalPIIDetector(model_name)
        self.compiled_patterns = {
            k: re.compile(v) for k, v in self.PATTERNS.items()
        }
    
    def detect(self, text: str) -> List[PIIEntity]:
        """Detect PII using both regex and SLM."""
        entities = []
        
        # Fast regex pass
        for category, pattern in self.compiled_patterns.items():
            for match in pattern.finditer(text):
                entities.append(PIIEntity(
                    text=match.group(),
                    category=category,
                    start_pos=match.start(),
                    end_pos=match.end(),
                    confidence=0.99  # Regex matches are high confidence
                ))
        
        # SLM pass for context-dependent PII (names, addresses)
        slm_entities = self.slm_detector.detect(text)
        
        # Merge, avoiding duplicates
        existing_positions = {(e.start_pos, e.end_pos) for e in entities}
        for e in slm_entities:
            if (e.start_pos, e.end_pos) not in existing_positions:
                entities.append(e)
        
        return entities
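
A 16-digit regex will also match order numbers and other long identifiers. One common way to cut those false positives is to validate candidate card numbers with the Luhn checksum before treating them as PII. The helper below is a sketch. `luhn_valid` is a hypothetical name, and the minimum length of 13 digits is an assumption.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:  # assumed lower bound for payment card numbers
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111-1111-1111-1112"))  # False
```

Whether to drop or merely down-weight matches that fail the check is a policy choice. For a firewall, lowering the confidence while still scrubbing is the more conservative option.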


Step 7: Deploying as a REST API

Wrap the firewall in a FastAPI service for easy integration:

Plain Text
 
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional

app = FastAPI(title="AI Firewall API", version="1.0.0")

firewall = AIFirewall(local_model="phi3", scrub_strategy="tokenize")

class ProcessRequest(BaseModel):
    text: str
    cloud_prompt: str
    restore_response: Optional[bool] = False

@app.post("/process")
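def process_text(request: ProcessRequest):
    """Process text through the AI Firewall.

    Declared with plain `def` so FastAPI runs it in a worker thread;
    the blocking Ollama and cloud calls would otherwise stall the event loop.
    """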
    try:
        result = firewall.process(
            text=request.text,
            cloud_prompt=request.cloud_prompt,
            restore_response=request.restore_response
        )
        return result
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy", "model": firewall.detector.model}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
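
Once the service is running, any internal application can call it over HTTP. The client sketch below uses only the standard library and assumes the service listens on `http://localhost:8000`, as in the uvicorn command above. The helper names are hypothetical.

```python
import json
import urllib.request


def build_process_request(
    text: str,
    cloud_prompt: str,
    base_url: str = "http://localhost:8000",
    restore_response: bool = False,
) -> urllib.request.Request:
    """Build the POST /process request expected by the firewall API."""
    body = json.dumps({
        "text": text,
        "cloud_prompt": cloud_prompt,
        "restore_response": restore_response,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_firewall(req: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Keep in mind that the response from `/process` includes the original text and the detected entities, so the endpoint should only be reachable from trusted internal callers.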


Performance Considerations

Model          Size   Speed (CPU)   Speed (GPU)   Accuracy
Phi-3 Mini     3.8B   ~2s/req       ~200ms/req    Good
Llama 3.2 3B   3B     ~1.5s/req     ~150ms/req    Good
Mistral 7B     7B     ~4s/req       ~300ms/req    Excellent
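
Figures like these depend heavily on hardware, quantization, and input length, so it is worth measuring on your own machine. The harness below is a small sketch. `benchmark` is a hypothetical helper, and the function under test would be something like the detector's `detect` method.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(fn: Callable[[str], object], samples: List[str], warmup: int = 1) -> Dict[str, float]:
    """Time `fn` on each sample and return latency statistics in milliseconds."""
    # Warm-up calls absorb one-time costs such as model loading.
    for text in samples[:warmup]:
        fn(text)

    timings_ms: List[float] = []
    for text in samples:
        t0 = time.perf_counter()
        fn(text)
        timings_ms.append((time.perf_counter() - t0) * 1000)

    return {
        "n": float(len(timings_ms)),
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }
```

For example, `benchmark(detector.detect, sample_texts)` with a few dozen representative inputs gives a more reliable picture than a single request.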


Security Best Practices

  • Vault encryption: Encrypt the PII vault at rest using AES-256
  • Access control: Implement RBAC for the restore() function
  • Audit logging: Log all PII access and scrubbing operations
  • Network isolation: Run the local SLM on an isolated network segment
  • Regular updates: Keep Ollama and models updated for security patches
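
The access-control and audit-logging points can be combined in a thin wrapper around restoration. The sketch below is one way to do it, under stated assumptions: the role names are hypothetical, the vault is a plain dict of token to original value, and the audit log is an in-memory list standing in for a real log sink.

```python
from typing import Dict, Iterable, List

# Hypothetical roles allowed to see original PII.
RESTORE_ROLES = frozenset({"privacy_officer", "compliance_admin"})


def restore_if_authorized(
    scrubbed_text: str,
    vault: Dict[str, str],
    user: str,
    roles: Iterable[str],
    audit_log: List[dict],
) -> str:
    """Restore PII only for permitted roles, recording every attempt."""
    allowed = bool(RESTORE_ROLES.intersection(roles))
    # Log before acting so denied attempts are captured too.
    audit_log.append({"action": "restore", "user": user, "allowed": allowed})
    if not allowed:
        raise PermissionError(f"user {user!r} may not restore PII")

    restored = scrubbed_text
    for token, original in vault.items():
        restored = restored.replace(token, original)
    return restored
```

The audit entry deliberately records who asked and whether it was allowed, but not the restored values themselves.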

Conclusion

Implementing a local SLM as an AI Firewall provides a robust solution for protecting PII while still leveraging the power of cloud AI services. Key takeaways:

  • Defense in depth: The local SLM adds a security layer without replacing other measures
  • Regulatory compliance: Demonstrates proactive data protection for GDPR, HIPAA, CCPA
  • Practical hybrid: Combine regex patterns with SLM for best accuracy and speed
  • Reversible when needed: Tokenization allows authorized restoration of PII

As AI becomes more integral to business operations, the AI Firewall pattern will become essential for organizations that need to balance innovation with data protection.
