Devlog

Phase 4: Embeddings, Semantic Search & Cached AI Summaries

Integrating AI deeper into the site: project cards now show cached AI-generated TL;DRs, and a semantic search bar lets visitors find projects by meaning, not just keywords.

Embeddings · RAG · Vercel Cache · Next.js 16 · use cache

What's New

Phase 4 embeds AI as functional site features rather than just a chat widget. Two new capabilities:

  1. AI Project Summaries: each project card shows a one-sentence TL;DR generated by Claude, cached with Next.js 'use cache' so it isn't regenerated on every request
  2. Semantic Project Search: a search bar that understands meaning, not just keywords, powered by OpenAI embeddings + cosine similarity

AI Summaries with 'use cache'

Next.js 16 introduced Cache Components (cacheComponents: true in next.config.ts), which enables the 'use cache' directive — a string you drop at the top of an async function to cache its return value server-side.
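For reference, turning the flag on might look like the minimal sketch below. Only cacheComponents: true comes from this setup; a real config would normally be typed as NextConfig (imported from "next") and carry whatever other options the project needs.

```typescript
// next.config.ts
// Minimal sketch: only the cacheComponents flag is taken from this post.
// In a real project this object is usually typed as NextConfig from "next".
const nextConfig = {
  cacheComponents: true, // enables Cache Components and the 'use cache' directive
};

export default nextConfig;
```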

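// lib/ai/generate-summary.ts
import { generateText } from "ai";
import { cacheLife, cacheTag } from "next/cache";

export async function generateProjectSummary(
  projectId: string,
  longDescription: string
): Promise<string> {
  "use cache";
  cacheLife("days"); // built-in cache profile
  cacheTag(`project-summary-${projectId}`); // enables per-project invalidation

  const { text } = await generateText({
    model: "anthropic/claude-sonnet-4.6",
    prompt: `Write a single sentence TL;DR...\n\n${longDescription}`,
    maxOutputTokens: 60, // this option was named maxTokens before AI SDK 5
  });
  return text.trim();
}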

Key things happening here:

  • "use cache" tells Next.js to cache the return value (keyed by function identity + arguments)
  • cacheLife("days") applies the built-in "days" cache profile: the entry is revalidated in the background roughly once a day instead of on every request
  • cacheTag(...) lets us invalidate per-project if needed with revalidateTag()

The first request generates the summary via the model. Subsequent requests are served from the cache, with no model call, until the entry is revalidated or its tag is invalidated.
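To make the "keyed by function identity + arguments" and tag-invalidation ideas concrete, here is a toy in-memory model. It is a conceptual sketch only, not how Next.js implements 'use cache'; TaggedCache, getOrCompute, invalidateTag and the summarize stand-in are all hypothetical names invented for this illustration.

```typescript
// Conceptual model only: this is NOT the Next.js implementation of 'use cache'.
// TaggedCache, getOrCompute and invalidateTag are hypothetical names.
type Entry = { value: string; tags: string[] };

class TaggedCache {
  private entries = new Map<string, Entry>();

  // Return the cached value for this key, or compute and store it on a miss.
  getOrCompute(key: string, tags: string[], compute: () => string): string {
    const hit = this.entries.get(key);
    if (hit !== undefined) return hit.value;
    const value = compute();
    this.entries.set(key, { value, tags });
    return value;
  }

  // Drop every entry carrying the tag, so the next call recomputes.
  invalidateTag(tag: string): void {
    this.entries.forEach((entry, key) => {
      if (entry.tags.indexOf(tag) !== -1) this.entries.delete(key);
    });
  }
}

const summaryCache = new TaggedCache();
let modelCalls = 0;

// Stand-in for the expensive AI call; it only counts invocations.
function summarize(projectId: string, longDescription: string): string {
  // The key is derived from the function name plus its arguments.
  const key = JSON.stringify(["summarize", projectId, longDescription]);
  return summaryCache.getOrCompute(key, [`project-summary-${projectId}`], () => {
    modelCalls += 1;
    return `TL;DR for ${projectId}`;
  });
}

summarize("alpha", "A long description");
summarize("alpha", "A long description"); // same arguments: cache hit
summarize("alpha", "An edited description"); // different argument: new entry
summaryCache.invalidateTag("project-summary-alpha");
summarize("alpha", "A long description"); // recomputed after invalidation
console.log(modelCalls);
```

One consequence worth noticing in this model: because longDescription is part of the key, editing a description produces a fresh summary even without an explicit invalidation.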

Semantic Search Architecture

The search doesn't use a vector database. For a portfolio with a handful of projects, pre-computed JSON embeddings are more appropriate than provisioning Neon + pgvector.

Build-time: Run scripts/generate-embeddings.ts to embed all project descriptions with embedMany and write vectors to lib/embeddings/project-embeddings.json.
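The build step can be sketched roughly as follows. The Project and StoredEmbedding shapes are assumptions (the post does not show them), and the embedder is injected as a plain function so the sketch runs without network access; in the real script that function would wrap the AI SDK's embedMany call.

```typescript
// Hypothetical shapes: the post does not show the project or JSON record types.
type Project = { id: string; longDescription: string };
type StoredEmbedding = { id: string; embedding: number[] };

// Any function that turns texts into vectors, one vector per input, in order.
// In the real script this would wrap the AI SDK's embedMany call.
type Embedder = (values: string[]) => Promise<number[][]>;

async function buildEmbeddingRecords(
  projects: Project[],
  embedTexts: Embedder
): Promise<StoredEmbedding[]> {
  const vectors = await embedTexts(projects.map((p) => p.longDescription));
  if (vectors.length !== projects.length) {
    throw new Error("Embedder returned a different number of vectors than inputs");
  }
  // Pair each vector with its project id so search results can be mapped back.
  return projects.map((p, i) => ({ id: p.id, embedding: vectors[i] }));
}

// Fake embedder for a dry run: [text length, 1]. Not a real embedding.
const fakeEmbedder: Embedder = async (values) => values.map((v) => [v.length, 1]);

const demoProjects: Project[] = [
  { id: "alpha", longDescription: "A chat widget" },
  { id: "beta", longDescription: "Semantic search" },
];

buildEmbeddingRecords(demoProjects, fakeEmbedder).then((records) => {
  // The real script would write this string to lib/embeddings/project-embeddings.json.
  console.log(JSON.stringify(records));
});
```

Storing the id next to each vector is what lets the search route return ranked project ids without touching the project data itself.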

Search-time (app/api/search/route.ts):

  1. embed the user's query string
  2. Load stored embeddings from the JSON file
  3. Compute cosineSimilarity between query and each project
  4. Return projects ranked by score

// Excerpt from the route handler; embed and cosineSimilarity are imported from "ai"
const { embedding: queryEmbedding } = await embed({
  model: "openai/text-embedding-3-small",
  value: query,
});

const results = storedEmbeddings
  .map((item) => ({
    id: item.id,
    score: cosineSimilarity(queryEmbedding, item.embedding),
  }))
  .sort((a, b) => b.score - a.score);

The AI SDK's cosineSimilarity function is built-in — no external math library needed.
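For readers curious what that score means, here is a hand-written equivalent applied to toy vectors. It is an illustration only: the route uses the SDK function, and the cosine helper, the zero-vector handling, and the 2-D sample vectors are assumptions made for this sketch.

```typescript
// Hand-written equivalent for illustration; the route itself uses the
// AI SDK's cosineSimilarity. The name `cosine` is hypothetical.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as "no similarity" to avoid dividing by zero
  // (a choice made for this sketch).
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 2-D vectors; real embedding vectors have far more dimensions.
const queryVector = [1, 0];
const stored = [
  { id: "orthogonal", embedding: [0, 1] },
  { id: "same-direction", embedding: [2, 0] },
  { id: "diagonal", embedding: [1, 1] },
];

const ranked = stored
  .map((item) => ({ id: item.id, score: cosine(queryVector, item.embedding) }))
  .sort((x, y) => y.score - x.score);

console.log(ranked.map((r) => r.id));
```

The score depends only on direction, not magnitude, which is why [2, 0] ranks as a perfect match for [1, 0].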

Client-side (ProjectSearchContainer): A client component with a debounced search input. While a query is active, it sends a POST request to /api/search and reorders the project grid to match the ranked results.
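The reordering step might look something like the sketch below. The Project shape, the reorderByResults helper, and the policy of keeping unranked projects at the end are assumptions for illustration; the post only shows that results carry an id and a score.

```typescript
// Hypothetical shapes: the post shows only { id, score } in the search results.
type Project = { id: string; title: string };
type SearchResult = { id: string; score: number };

// Reorder the grid to follow the ranked results. Result ids that match no
// project are skipped; projects missing from the results keep their original
// relative order at the end (a policy chosen for this sketch).
function reorderByResults(projects: Project[], results: SearchResult[]): Project[] {
  const byId = new Map<string, Project>();
  projects.forEach((p) => byId.set(p.id, p));

  const seen = new Set<string>();
  const ordered: Project[] = [];
  results.forEach((r) => {
    const project = byId.get(r.id);
    if (project !== undefined && !seen.has(r.id)) {
      ordered.push(project);
      seen.add(r.id);
    }
  });
  projects.forEach((p) => {
    if (!seen.has(p.id)) ordered.push(p);
  });
  return ordered;
}

const grid: Project[] = [
  { id: "a", title: "Project A" },
  { id: "b", title: "Project B" },
  { id: "c", title: "Project C" },
];
const apiResults: SearchResult[] = [
  { id: "c", score: 0.9 },
  { id: "a", score: 0.4 },
  { id: "zzz", score: 0.1 }, // stale id with no matching project
];

console.log(reorderByResults(grid, apiResults).map((p) => p.id));
```

Keeping this as a pure function means the client component only has to hold the latest results in state and derive the displayed order from them.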

Server/Client Component Boundary

The trickiest design decision was the component tree:

  • ProjectsSection (Server Component, async) — awaits cached summaries and enriches project data
  • ProjectSearchContainer (Client Component) — receives the enriched projects as props, handles search state
  • ProjectCard (shared component) — renders from both server and client contexts since it has no server-only APIs

This keeps AI-expensive work server-side and cached, while making search interactive client-side.

Cursor Skills Used

  • next-cache-components: 'use cache' directive patterns, cacheLife, cacheTag
  • vercel-storage: storage decision matrix (chose JSON over Neon for small dataset)
  • ai-sdk: embed, embedMany, cosineSimilarity from the AI SDK docs