Case Study: LectureLens — AI Study Planner
LectureLens converts raw course materials (PDFs, notes, slides) into distilled summaries, targeted flashcards, interactive Q&A, and a milestone-driven study schedule using an incremental AI processing pipeline.
Current Status (WIP)
LectureLens is in an MVP build cycle. Core schema and row-level security policies are codified. Next focus: completing auth flows, wiring the processing trigger, and surfacing topic extraction in the dashboard before moving into flashcards, practice Q&A, and semantic search.
Implemented
- Postgres schema (lectures, chunks, topics)
- RLS policies (migrations committed)
- Edge function scaffold for processing pipeline
- Initial upload flow prototype
In Progress / Next
- Auth UI (signup / signin / reset)
- Processing trigger + status badges
- Lecture detail & topic navigation
- Flashcards & study mode (Milestone 3)
- Practice Q&A + quiz mode (Milestone 4)
- Semantic search & RAG (Milestone 5)
- Study plans & reminders (Milestone 6)
Milestone Roadmap
- M1: Auth foundation & processing integration
- M2: Lecture detail pages & topic UX
- M3: Flashcards & study mode
- M4: Practice Q&A generation
- M5: Semantic search & retrieval-augmented answers
- M6: Study schedules & progress tracking
- M7: Deployment, analytics & polish
The Challenge
Students juggle heterogeneous materials (slides, lecture notes, textbook PDFs) that vary in density and formatting. Translating this into a sustainable plan with spaced repetition and concept reinforcement is tedious and error-prone. Repetition without structure leads to plateaued retention and wasted time.
The Solution
Product Flow
- Import course material (notes, PDFs, slides)
- Generate summaries and flashcards
- Ask follow-up questions in Q&A
- Create a study schedule with milestones
Architecture
- Next.js (App Router) for UI and server actions
- Supabase for authentication, Postgres, and secure storage
- TypeScript for correctness across the stack
AI Processing Pipeline
- Extraction & Segmentation: Uploaded PDF/notes are parsed into logical blocks (headings, paragraphs, enumerations) to preserve hierarchy.
- Chunk Formation: Blocks are grouped under a token threshold to keep prompt context focused and costs predictable.
- Per-Chunk Summarization: Summaries generated with consistency directives (terminology normalization) to reduce drift across sections.
- Global Consolidation: Overlapping concepts are merged and redundant glossary-style lines eliminated.
- Flashcard Generation: Pattern-based prompts classify facts, processes, and edge cases; candidates are deduplicated via similarity scoring.
- Interactive Q&A: User questions are mapped to the most relevant blocks (future: vector search) before response synthesis.
- Schedule Synthesis: Summary difficulty + card density drive milestone allocation (intro → drill → review cycles).
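The chunk-formation step above can be sketched as follows. This is a minimal illustration, not the project's implementation: the character-based token estimate (characters / 4) is an assumption standing in for a real tokenizer.

```typescript
// Sketch of chunk formation: group ordered blocks until an (assumed)
// token budget would be exceeded, then start a new chunk.
interface Block {
  ordinal: number;
  text: string;
}

// Rough token estimate; a real tokenizer would replace this.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

function formChunks(blocks: Block[], maxTokens: number): Block[][] {
  const chunks: Block[][] = [];
  let current: Block[] = [];
  let budget = 0;
  for (const block of blocks) {
    const cost = approxTokens(block.text);
    // Close the current chunk when adding this block would exceed the budget.
    if (current.length > 0 && budget + cost > maxTokens) {
      chunks.push(current);
      current = [];
      budget = 0;
    }
    current.push(block);
    budget += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Keeping chunks bounded this way is what makes per-chunk summarization cost and latency predictable.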
Data Model
- documents: id, title, original_filename, processing_state
- blocks: id, document_id, ordinal, text, type (heading|paragraph|list)
- summaries: block_ids[], text, level (chunk|section|global)
- cards: question, answer, source_block_ids[], difficulty, next_review_at
- schedules: milestone_index, start_at, focus (new|review), target_card_ids[]
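Client-side, the tables above could be typed roughly as below. Field names mirror the data model sketch; the specific `processing_state` values are assumptions, not confirmed states from the codebase.

```typescript
// Types mirroring the data model. ProcessingState values are illustrative.
type ProcessingState = "pending" | "processing" | "complete" | "failed";

interface DocumentRow {
  id: string;
  title: string;
  original_filename: string;
  processing_state: ProcessingState;
}

interface BlockRow {
  id: string;
  document_id: string;
  ordinal: number;
  text: string;
  type: "heading" | "paragraph" | "list";
}

interface SummaryRow {
  block_ids: string[];
  text: string;
  level: "chunk" | "section" | "global";
}

interface CardRow {
  question: string;
  answer: string;
  source_block_ids: string[];
  difficulty: number;
  next_review_at: string; // ISO timestamp
}

interface ScheduleRow {
  milestone_index: number;
  start_at: string;
  focus: "new" | "review";
  target_card_ids: string[];
}
```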
Key Constraints
- Token Limits: Chunk sizing prevents context-window overflow and latency outliers.
- Terminology Drift: Consolidation normalizes synonyms to avoid mismatched flashcards.
- Redundancy: Similar blocks (slide bullets vs notes) filtered before card generation.
- Fair Scheduling: Milestones weighted by conceptual density, not raw character count.
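One way the redundancy constraint could be enforced is token-set Jaccard similarity against already-kept blocks. This is a sketch under assumptions: the 0.8 threshold and whitespace tokenization are illustrative, not the project's actual scoring.

```typescript
// Sketch of redundancy filtering: drop a candidate whose token-set
// Jaccard similarity to any kept block meets the threshold.
const tokenize = (text: string): Set<string> =>
  new Set(text.toLowerCase().split(/\W+/).filter(Boolean));

function jaccard(a: Set<string>, b: Set<string>): number {
  let intersection = 0;
  for (const token of a) if (b.has(token)) intersection++;
  const union = a.size + b.size - intersection;
  return union === 0 ? 0 : intersection / union;
}

function dedupe(texts: string[], threshold = 0.8): string[] {
  const kept: string[] = [];
  const keptSets: Set<string>[] = [];
  for (const text of texts) {
    const tokens = tokenize(text);
    const isDuplicate = keptSets.some((s) => jaccard(s, tokens) >= threshold);
    if (!isDuplicate) {
      kept.push(text);
      keptSets.push(tokens);
    }
  }
  return kept;
}
```

A set-based similarity like this catches the common case of slide bullets restating note paragraphs nearly verbatim, while leaving genuinely distinct blocks alone.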
Results & Impact
Study Outcomes
- ✅ Consistent review via flashcards and Q&A
- ✅ Structured schedules reduce planning overhead
- ✅ Centralizes materials and progress in one place
- ✅ Reduces cognitive load when choosing what to study next
Platform Qualities
- ✅ Open-source transparency and extensibility
- ✅ Next.js + Supabase foundation for rapid iteration
- ✅ Clear roadmap for export and collaboration
- ✅ Strong typing reduces regression surface in pipeline changes
Challenges & Lessons
- Noisy PDFs: Hyphenated line breaks and multi-column layouts required normalization heuristics to prevent fragmented prompts.
- Duplicate Concepts: Overlapping slide + note content inflated card counts; similarity checks trimmed 20–30% of redundant candidates.
- Prompt Consistency: Early versions produced variable answer formats; adding explicit output schemas (lists, definitions) improved dedupe.
- Scheduling Bias: Long early chapters dominated milestone allocation until normalized by concept density metrics.
- Auth & Isolation: Supabase row-level security ensures users can access only their own documents and derived artifacts.
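The normalization heuristics for noisy PDFs could look roughly like this. A minimal sketch, assuming the common failure modes named above (hyphenated line breaks, hard-wrapped lines); real PDF extraction needs more cases than this.

```typescript
// Heuristic sketch of PDF text cleanup: re-join words hyphenated across
// line breaks, collapse single newlines (hard wraps) into spaces, and
// preserve blank-line paragraph breaks.
function normalizePdfText(raw: string): string {
  return raw
    .replace(/(\w)-\n(\w)/g, "$1$2")    // "seg-\nmentation" -> "segmentation"
    .replace(/([^\n])\n(?!\n)/g, "$1 ") // hard-wrapped line -> space
    .replace(/\n{2,}/g, "\n\n");        // normalize paragraph breaks
}
```

Without this pass, hyphen fragments and mid-sentence breaks leak into chunks and degrade summarization quality.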
Next Steps
- Vector embeddings (pgvector) for semantic retrieval in Q&A.
- Adaptive spaced repetition using historical success rate per card.
- Calendar (ICS) export & sync.
- Collaborative study sets with shared progress overlays.
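The adaptive spaced-repetition idea could be sketched as scaling each card's review interval by its historical success rate. All constants here are illustrative assumptions, loosely inspired by SM-2-style ease factors, not a committed design.

```typescript
// Sketch of adaptive spacing: map a card's success rate in [0, 1] to an
// ease multiplier in [0.5, 2.5] and scale the previous interval by it.
// Struggling cards (ease < 1) shrink the interval; easy cards grow it.
function nextIntervalDays(previousDays: number, successRate: number): number {
  const clamped = Math.min(1, Math.max(0, successRate));
  const ease = 0.5 + 2 * clamped;
  return Math.max(1, Math.round(previousDays * ease));
}
```

The per-card success rate would come from review history already implied by `next_review_at` and quiz outcomes, so this layer needs no new tables beyond an attempts log.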
Note: This case study reflects an in-progress build. Impact metrics and advanced features (semantic search, adaptive scheduling) will be updated as milestones are delivered.