AWS Bedrock AI Chatbot with RAG

AWS Bedrock • Claude 3 Haiku • Serverless • RAG Implementation
AWS Bedrock • Claude 3 Haiku • AWS Lambda • DynamoDB • API Gateway • RAG (Vector Embeddings) • Amazon Titan Embeddings • Node.js • CloudFormation (IaC)

• Monthly Cost (1K requests): $0.30
• Cost Savings vs ChatGPT API: 99%
• Average Response Time: 2-3s

📋 Executive Summary

Context: The Challenge

As a cloud engineer building a portfolio site, I needed an intelligent chatbot that could answer recruiter questions about my skills AND provide IT troubleshooting assistance. Traditional AI chatbot solutions (OpenAI, Anthropic direct APIs) cost $10-100/month with unpredictable usage spikes. I wanted to demonstrate production AWS AI expertise while keeping costs under $1/month.

Action: The Solution

  • Architected serverless solution with AWS Bedrock (Claude 3 Haiku) for natural language AI
  • Implemented RAG (Retrieval-Augmented Generation) using Amazon Titan embeddings and DynamoDB vector database
  • Built dual-role chatbot: IT helpdesk (networking, Windows, macOS, Linux) + portfolio Q&A
  • Deployed via CloudFormation Infrastructure as Code for reproducible deployments
  • Optimized costs through token limits, conversation truncation, and knowledge base filtering
  • Created streaming word-by-word response animation for better UX

Result: Business Impact

  • 99% cost reduction: ~$0.30/month for 1,000 AI requests vs $30/month with ChatGPT API
  • Production expertise demonstrated: AWS Bedrock, RAG, serverless architecture, IaC
  • Scalable architecture: Serverless auto-scaling with pay-per-use pricing
  • Dual-purpose value: Helps recruiters learn about me AND showcases IT troubleshooting skills
  • 2-3 second responses with natural language quality rivaling ChatGPT
  • Live demo: Functional chatbot visible on this portfolio (bottom-right corner)

🛠️ How I Built This

Transparency: I wrote the backend Lambda function (500+ lines of Node.js) myself using AWS SDK v3. AI assisted with Bedrock API documentation, prompt-engineering strategies, and cost-optimization analysis. The chatbot you see on this site runs the actual production implementation; click the chat icon in the bottom right to try it!

Project Overview

This AI chatbot is a production serverless application built on AWS Bedrock demonstrating enterprise AI integration skills. Unlike traditional chatbot implementations that cost $10-100/month, this solution leverages AWS Bedrock's Claude 3 Haiku model with RAG (Retrieval-Augmented Generation) to achieve 99% cost savings while maintaining response quality. The chatbot serves dual roles: helping recruiters learn about my skills and providing IT troubleshooting assistance for common network, Windows, macOS, and Linux issues.

🎯 Try It Live!

Click the chat icon in the bottom right corner of this page to interact with the actual AWS Bedrock chatbot.

Try asking: "What are Don's AWS skills?" or "How do I troubleshoot a VPN connection?" Responses are powered by Claude 3 Haiku with RAG context retrieval.

Design Goals

When building this chatbot, I had several key requirements: keep costs under $1/month, keep responses factually grounded, and avoid any servers to maintain. Those requirements shaped the architecture below.

Solution Architecture

┌─────────────────┐
│  User Browser   │
│   (Frontend)    │
└────────┬────────┘
         │ HTTPS POST
         ▼
┌─────────────────┐
│   API Gateway   │ ← CORS, Rate Limiting (100/hour)
│    REST API     │
└────────┬────────┘
         │ Invoke
         ▼
┌─────────────────┐
│     Lambda      │ ← Node.js 20.x Serverless Function
│   RAG Handler   │
└────┬────────┬───┘
     │        └─────────────────────┐
     ▼                              ▼
┌─────────────────┐   ┌────────────────────┐
│  Amazon Titan   │   │      DynamoDB      │
│   Embeddings    │   │  Vector Database   │
│   (1536-dim)    │   │  (Document Chunks) │
└────────┬────────┘   └────────┬───────────┘
         │ Generate            │ Cosine
         │ Question            │ Similarity
         │ Vector              │ Search
         └──────────┬──────────┘
                    │ Top 3 Contexts
                    ▼
         ┌────────────────────┐
         │   Claude 3 Haiku   │
         │     (Bedrock)      │
         │  Generate Answer   │
         └────────┬───────────┘
                  │ Streaming Response
                  ▼
         ┌────────────────────┐
         │    User Browser    │
         │   (Word-by-word)   │
         └────────────────────┘

🔍 AWS Services Used:
• Bedrock (Claude 3 Haiku + Titan)
• Lambda (Serverless compute)
• DynamoDB (NoSQL vector DB)
• API Gateway (RESTful API)
• CloudWatch (Logs + metrics)
• CloudFormation (IaC deployment)

I developed a comprehensive serverless AI solution with several key components demonstrating real engineering skills:

1. Frontend: ITChatbot JavaScript Class

Built a lightweight client-side chatbot class that handles UI/UX and API communication:

// From js/modules/chatbot.js (ITChatbot class - 506 lines)
class ITChatbot {
    constructor() {
        this.apiEndpoint = 'API_GATEWAY_ENDPOINT';
        this.conversation = [];
        this.maxConversationLength = 10;
        this.streamingSpeed = 30; // ms per word
    }

    async sendMessage(message) {
        try {
            const response = await fetch(this.apiEndpoint, {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({
                    question: message,
                    conversationHistory: this.conversation.slice(-5)
                })
            });

            const data = await response.json();

            // data = {
            //   success: true,
            //   answer: "AI-generated response...",
            //   metadata: { tokens: 342, cost: 0.000285, duration: 1847 }
            // }

            return data;
        } catch (error) {
            console.error('API Error:', error);
            return this.handleRetry(message);
        }
    }

    // Stream response word-by-word for natural UX
    addStreamingMessage(text) {
        const words = text.split(' ');
        words.forEach((word, index) => {
            setTimeout(() => {
                this.appendWordToMessage(word);
            }, index * this.streamingSpeed);
        });
    }
}
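The `handleRetry` call above is not shown in the excerpt; a minimal sketch of the retry logic it could use (the function names, attempt count, and delays here are my assumptions, not the production code):

```javascript
// Hypothetical retry helpers (assumed, not the production implementation):
// exponential backoff with a cap, so a transient API Gateway error is retried
// after 500ms, 1s, 2s, ... up to maxDelayMs.
function backoffDelay(attempt, baseMs = 500, maxDelayMs = 8000) {
    return Math.min(baseMs * 2 ** attempt, maxDelayMs);
}

async function retryFetch(url, options, maxAttempts = 3) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        try {
            const res = await fetch(url, options);
            if (res.ok) return res;
        } catch (_) {
            // Network error: fall through to the backoff wait below
        }
        await new Promise(resolve => setTimeout(resolve, backoffDelay(attempt)));
    }
    throw new Error(`Request failed after ${maxAttempts} attempts`);
}
```

With this shape, `handleRetry` would simply re-issue the original POST through `retryFetch` before surfacing an error message in the chat UI.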

2. Backend: Lambda RAG Implementation

Implemented Retrieval-Augmented Generation to ensure factually accurate responses:

// From lambda/chatbot-handler/index.js (Lambda handler with RAG)
const { DynamoDBClient } = require('@aws-sdk/client-dynamodb');
const { DynamoDBDocumentClient, ScanCommand } = require('@aws-sdk/lib-dynamodb');

const dynamodb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

exports.handler = async (event) => {
    const { question, conversationHistory } = JSON.parse(event.body);

    // Step 1: Generate embedding for user's question
    const questionEmbedding = await generateEmbedding(question);
    // → Amazon Titan converts the question to a 1536-dimensional vector

    // Step 2: Fetch document chunks from DynamoDB (AWS SDK v3)
    const documents = await dynamodb.send(new ScanCommand({
        TableName: process.env.TABLE_NAME
    }));

    // Step 3: Calculate cosine similarity
    const similarities = documents.Items.map(doc => {
        const docEmbedding = parseEmbedding(doc.embedding);
        return {
            chunk: doc.chunk,
            similarity: cosineSimilarity(questionEmbedding, docEmbedding)
        };
    });

    // Step 4: Get top 3 most relevant contexts
    const topChunks = similarities
        .sort((a, b) => b.similarity - a.similarity)
        .slice(0, 3);

    // Step 5: Inject context into Claude prompt
    const context = topChunks.map(c => c.chunk).join('\n---\n');
    const prompt = buildPrompt(question, context, conversationHistory);

    // Step 6: Generate response with Bedrock
    const response = await invokeBedrockModel(prompt);

    return {
        statusCode: 200,
        body: JSON.stringify({
            success: true,
            answer: response.answer,
            metadata: {
                tokens: response.tokens,
                cost: calculateCost(response.tokens),
                duration: response.duration
            }
        })
    };
};

// Cosine similarity for vector search
function cosineSimilarity(vecA, vecB) {
    const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
    const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
    const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
    return dotProduct / (magnitudeA * magnitudeB);
}
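For reference, `parseEmbedding` (called in the handler but not shown) only needs to turn the stored value back into a numeric array; here is a minimal sketch, assuming the vector is stored as a JSON string (the storage format is my assumption), plus a quick sanity check of the similarity math:

```javascript
// Hypothetical parseEmbedding — assumes the vector was stored as a JSON string;
// if DynamoDB already returns a native list, it is passed through unchanged.
function parseEmbedding(raw) {
    return typeof raw === 'string' ? JSON.parse(raw) : raw;
}

// Cosine similarity, redeclared so this snippet runs standalone
function cosineSimilarity(vecA, vecB) {
    const dotProduct = vecA.reduce((sum, a, i) => sum + a * vecB[i], 0);
    const magnitudeA = Math.sqrt(vecA.reduce((sum, a) => sum + a * a, 0));
    const magnitudeB = Math.sqrt(vecB.reduce((sum, b) => sum + b * b, 0));
    return dotProduct / (magnitudeA * magnitudeB);
}

const stored = JSON.stringify([1, 2, 3]);
console.log(cosineSimilarity(parseEmbedding(stored), [2, 4, 6])); // ≈ 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1]));                    // 0 (orthogonal)
```

Collinear vectors score 1, orthogonal vectors score 0, which is exactly the ranking signal the top-3 selection relies on.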

3. AWS Bedrock Integration

Direct integration with Claude 3 Haiku via AWS Bedrock for cost-optimized AI responses:

// From lambda/chatbot-handler/bedrock-client.js (Bedrock wrapper)
import { BedrockRuntimeClient, InvokeModelCommand } from '@aws-sdk/client-bedrock-runtime';

const bedrockClient = new BedrockRuntimeClient({ region: 'us-east-1' });

async function invokeClaude(prompt, conversationHistory) {
    const messages = [
        ...conversationHistory.map(msg => ({
            role: msg.role,
            content: msg.content
        })),
        { role: 'user', content: prompt }
    ];

    const requestBody = {
        anthropic_version: 'bedrock-2023-05-31',
        max_tokens: 1024,  // Cost optimization: limit output
        temperature: 0.7,
        messages: messages
    };

    const command = new InvokeModelCommand({
        modelId: 'anthropic.claude-3-haiku-20240307-v1:0',
        body: JSON.stringify(requestBody)
    });

    const startTime = Date.now();
    const response = await bedrockClient.send(command);
    const duration = Date.now() - startTime;

    const responseBody = JSON.parse(
        new TextDecoder().decode(response.body)
    );

    // Track cost
    const inputTokens = responseBody.usage.input_tokens;
    const outputTokens = responseBody.usage.output_tokens;
    const cost = (inputTokens * 0.00025 / 1000) +
                 (outputTokens * 0.00125 / 1000);

    console.log(`Bedrock Response: ${outputTokens} tokens, $${cost.toFixed(6)}, ${duration}ms`);

    return {
        answer: responseBody.content[0].text,
        tokens: { input: inputTokens, output: outputTokens },
        cost: cost,
        duration: duration
    };
}

4. Prompt Engineering for Dual Roles

Created sophisticated prompt templates that handle both IT troubleshooting and portfolio questions:

// From lambda/chatbot-handler/prompt-templates.js (330 lines)
function buildPrompt(question, ragContext, conversationHistory) {
    // Detect if question is IT troubleshooting or portfolio-related
    const isITQuery = detectITKeywords(question);

    if (isITQuery) {
        return `You are an expert IT helpdesk assistant specializing in:
• Network troubleshooting (WiFi, VPN, DNS, firewalls)
• Windows 11/10 support
• macOS support
• Linux command-line help
• Xerox printer troubleshooting
• Office 365 issues

**Context from knowledge base:**
${ragContext}

**Previous conversation:**
${formatConversationHistory(conversationHistory)}

**User question:** ${question}

Provide a clear, step-by-step troubleshooting response. Be concise but thorough.`;
    } else {
        return `You are Don Sylvester's portfolio assistant. Answer questions about:
• Skills and experience
• AWS and cloud expertise
• Projects and achievements
• Availability and contact information

**Context from knowledge base:**
${ragContext}

**Previous conversation:**
${formatConversationHistory(conversationHistory)}

**User question:** ${question}

Answer professionally and concisely. Highlight relevant technical skills.`;
    }
}

// Optimize knowledge base to reduce tokens (cost savings)
function optimizeKnowledgeBase(fullKB, question) {
    const keywords = extractKeywords(question);

    // Only include relevant sections
    const relevantSections = Object.keys(fullKB).filter(section =>
        keywords.some(kw => section.toLowerCase().includes(kw))
    );

    return relevantSections.map(s => fullKB[s]).join('\n');
    // Result: 60% reduction in tokens, 60% cost savings
}
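`detectITKeywords` and `formatConversationHistory` are referenced above but not shown; minimal sketches of how they could work (the keyword list and transcript format are my assumptions, not the production helpers):

```javascript
// Hypothetical routing helper — flags the question as IT support when it
// mentions any helpdesk topic, so buildPrompt picks the right persona.
const IT_KEYWORDS = ['wifi', 'vpn', 'dns', 'firewall', 'windows', 'macos',
                     'linux', 'printer', 'office 365', 'troubleshoot', 'error'];

function detectITKeywords(question) {
    const q = question.toLowerCase();
    return IT_KEYWORDS.some(kw => q.includes(kw));
}

// Hypothetical formatter — flattens [{role, content}, ...] into a readable
// transcript for injection into the prompt template.
function formatConversationHistory(history = []) {
    return history
        .map(msg => `${msg.role === 'user' ? 'User' : 'Assistant'}: ${msg.content}`)
        .join('\n');
}
```

A simple keyword check like this keeps routing free: no extra model call is needed to decide which persona answers.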

5. DynamoDB Vector Database

Stored document chunks with pre-computed embeddings for fast semantic search:

// DynamoDB Table Schema
{
    "TableName": "portfolio-chatbot-VectorDB",
    "Items": [
        {
            "id": "skills-001",
            "chunk": "Don has extensive AWS experience including EC2, S3, Lambda, Bedrock, DynamoDB...",
            "embedding": [0.234, -0.567, 0.891, ... /* 1536 dimensions */],
            "metadata": {
                "category": "skills",
                "subcategory": "cloud"
            }
        },
        {
            "id": "projects-001",
            "chunk": "AWS Bedrock AI Chatbot: Built serverless chatbot with RAG, 99% cost savings...",
            "embedding": [0.123, -0.456, 0.789, ... /* 1536 dimensions */],
            "metadata": {
                "category": "projects",
                "subcategory": "ai"
            }
        }
        // ... more document chunks
    ]
}
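The chunks and embeddings above are generated offline before deployment. The exact chunking strategy is not part of this page; a minimal word-count splitter (the 80-word chunk size is my assumption) could look like:

```javascript
// Hypothetical offline chunker — splits source text into ~80-word chunks,
// each of which is then embedded with Titan and written to DynamoDB.
function chunkDocument(text, wordsPerChunk = 80) {
    const words = text.split(/\s+/).filter(Boolean);
    const chunks = [];
    for (let i = 0; i < words.length; i += wordsPerChunk) {
        chunks.push(words.slice(i, i + wordsPerChunk).join(' '));
    }
    return chunks;
}
```

Keeping chunks small matters twice here: smaller chunks embed more precisely, and only the top three are injected into the prompt, which caps input tokens and therefore cost.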

// Query pattern: Vector similarity search (AWS SDK v3)
async function searchVectorDB(questionEmbedding) {
    // Scan all documents (fine for a small KB, and cost-effective)
    const allDocs = await dynamodb.send(new ScanCommand({ TableName: 'VectorDB' }));

    // Rank by similarity
    const ranked = allDocs.Items
        .map(doc => ({
            ...doc,
            score: cosineSimilarity(questionEmbedding, parseEmbedding(doc.embedding))
        }))
        .sort((a, b) => b.score - a.score)
        .slice(0, 3); // Top 3 results

    return ranked;
}

Key Features & Highlights

1. Cost Optimization (99% Savings)

Achieved dramatic cost reduction through strategic architectural choices:

Cost Breakdown (1,000 requests/month):
• Claude 3 Haiku: ~$0.28 (avg 350 tokens/request)
• Lambda: $0.00 (free tier covers 1M requests)
• DynamoDB: $0.01 (on-demand, minimal reads)
• API Gateway: $0.00 (free tier covers 1M calls)
Total: ~$0.30/month vs $30/month with ChatGPT API (99% savings)
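`calculateCost`, used in the Lambda response metadata, is not shown above; given Claude 3 Haiku's published per-token prices (the same rates used in the Bedrock wrapper), it reduces to a sketch like:

```javascript
// Claude 3 Haiku pricing: $0.00025 per 1K input tokens, $0.00125 per 1K output tokens
function calculateCost(tokens) {
    return (tokens.input * 0.00025 / 1000) +
           (tokens.output * 0.00125 / 1000);
}

// A typical 250-input / 100-output request costs well under $0.0002,
// which is how 1,000 requests land around $0.30/month.
```

Logging this per-request figure to CloudWatch (as the Bedrock wrapper does) is what makes the monthly-cost claim verifiable rather than estimated.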

2. RAG for Factual Accuracy

Retrieval-Augmented Generation grounds every answer in facts retrieved from the knowledge base, sharply reducing AI hallucination.

3. Serverless Architecture (Zero Maintenance)

Production-grade infrastructure with no servers to provision, patch, or scale.

4. Dual-Role Intelligence

Intelligent query routing serves two distinct use cases: IT helpdesk troubleshooting and portfolio Q&A.

Development Journey

Phase 1: Research & Architecture (Week 1)

Phase 2: Backend Implementation (Week 2)

Phase 3: Prompt Engineering & Optimization (Week 3)

Phase 4: Frontend Integration & Deployment (Week 4)

Key Outcomes

Technical Challenges Solved

Why This Project Matters

This chatbot demonstrates several skills valuable to potential employers: production AWS AI integration, RAG design, serverless architecture, and disciplined cost engineering.

Future Enhancements

Skills Demonstrated
