If you've ever had to log timesheets at the end of a long workday, you know the feeling: you stare at a form, trying to remember exactly what you worked on, for how long, and under which project code. It's tedious — and it's the kind of work that AI can actually take off your plate.
This post walks through a real architecture our team designed to do exactly that: an AI-assisted timesheet tool built on top of an existing ETM (Employee Time Management) system. No magic, just a clear pipeline you can follow step by step.
The big picture
Think of the system as a smart assistant sitting between you and the timesheet form. Instead of filling in fields manually, you just talk to it — or type naturally — and it figures out the rest.
Here's what happens under the hood:
- You send a voice message or typed text from your phone or browser.
- An API Gateway (built with NestJS) receives the request.
- An AI Orchestrator takes over — it coordinates several specialist modules to understand your input.
- If your input is complete, it logs the timesheet entry automatically.
- If something is missing, it asks you a follow-up question — like a chat conversation.
Step 1 — Your input reaches the system
The client (web app or mobile) accepts two kinds of input:
- Text: "I worked on the ETM dashboard bug fix for 3 hours this morning, project ETM-2024."
- Voice: You speak, and a Speech-to-Text (STT) module converts it to the same text string.
Either way, everything becomes text before the AI starts working on it. This keeps the rest of the pipeline simple and consistent.
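The convergence point above can be sketched in a few lines of TypeScript. Everything here is illustrative: `transcribe` is a stand-in for whatever STT service you plug in (Whisper, a cloud API, etc.), not a real client.

```typescript
// Sketch: both input channels converge on one plain-text payload
// before the AI Orchestrator runs.
type ClientInput =
  | { kind: "text"; text: string }
  | { kind: "voice"; audio: Uint8Array };

function transcribe(audio: Uint8Array): string {
  // Placeholder: a real implementation calls the STT service here.
  return "[transcribed speech]";
}

function normalize(input: ClientInput): string {
  return input.kind === "text" ? input.text : transcribe(input.audio);
}
```

The discriminated union makes it hard to forget a channel: adding a third input type (say, email) forces you to handle it in `normalize` or the compiler complains.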
Step 2 — The LLM parses your intent
Once the text arrives at the AI Orchestrator, a Large Language Model (LLM) reads it and tries to extract a structured timesheet entry. Think of this as filling in a form automatically:
```json
{
  "project": "ETM-2024",
  "task": "Dashboard bug fix",
  "duration_hours": 3,
  "date": "2025-04-17",
  "time_of_day": "morning"
}
```
The LLM is prompted to return JSON output, not freeform text. This is called structured output — it's a standard technique where you tell the model exactly what shape you need the answer in.
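Even with structured output enabled, it's worth coercing the model's raw reply into a typed object defensively. A minimal sketch (the `TimesheetEntry` fields mirror the JSON example above; the fence-stripping is a pragmatic guard, since some models still wrap JSON in markdown):

```typescript
// The shape we expect back from the LLM; names mirror the example above.
interface TimesheetEntry {
  project: string | null;
  task: string | null;
  duration_hours: number | null;
  date: string | null;       // ISO date, e.g. "2025-04-17"
  time_of_day: string | null;
}

function parseLlmReply(raw: string): TimesheetEntry {
  // Models sometimes wrap JSON in markdown fences; strip them defensively.
  const cleaned = raw.replace(/^```(?:json)?\s*/i, "").replace(/\s*```$/, "");
  const data = JSON.parse(cleaned) as Record<string, unknown>;
  // Coerce each field: anything missing or mistyped becomes null,
  // which the Validator step treats as "still needed".
  return {
    project: typeof data.project === "string" ? data.project : null,
    task: typeof data.task === "string" ? data.task : null,
    duration_hours: typeof data.duration_hours === "number" ? data.duration_hours : null,
    date: typeof data.date === "string" ? data.date : null,
    time_of_day: typeof data.time_of_day === "string" ? data.time_of_day : null,
  };
}
```

Mapping bad fields to `null` rather than throwing is deliberate: a half-parsed entry is still useful, because the later steps can fill the gaps.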
Step 3 — RAG fills in context you didn't mention
What if you just said "worked on the bug fix" without specifying a project? That's where the RAG (Retrieval-Augmented Generation) layer comes in.
RAG connects the LLM to a vector database that stores your project history, team assignments, and recent activity. So if "bug fix" has been associated with "ETM-2024" in your last 10 entries, the system can infer it — and fill it in for you.
This is similar to how your email app suggests completions based on your writing history, but smarter because it reasons over structured records.
Step 4 — Validation catches what's still missing
The Validator module checks the parsed entry against required fields. A valid timesheet typically needs:
- Project code
- Task description
- Duration
- Date
If any of these are still missing after the LLM and RAG steps, the system doesn't just fail silently. It triggers a conversational loop.
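The Validator itself can be very small. A sketch, assuming the same field names as the earlier JSON example — its only job is to report which required fields are still empty, so the orchestrator knows what to ask about:

```typescript
// A draft entry: any field may still be unset at this point.
interface DraftEntry {
  project?: string | null;
  task?: string | null;
  duration_hours?: number | null;
  date?: string | null;
}

// The four required fields listed above.
const REQUIRED: (keyof DraftEntry)[] = ["project", "task", "duration_hours", "date"];

function missingFields(entry: DraftEntry): string[] {
  // `== null` catches both null and undefined.
  return REQUIRED.filter((field) => entry[field] == null);
}
```

An empty result means the entry is ready to submit; a non-empty one drives the conversational loop in the next step.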
Step 5 — The chatbot asks, you answer
This is where it gets interesting. Instead of returning an error, the assistant sends you a friendly follow-up message:
"Got it — 3 hours on the dashboard bug fix. Which project should I log this under?"
You reply: "ETM-2024."
The system merges your reply with the existing data and re-validates. If everything checks out, it proceeds. If not, it asks again — until the entry is complete. This loop is lightweight and fast because the LLM only needs to parse one short answer at a time.
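The merge-and-revalidate loop can be sketched as two small functions. In production the follow-up question would come from the LLM; a template keeps this sketch deterministic, and the field names are again illustrative:

```typescript
type Draft = Record<string, string | number | null>;

// Merge a parsed reply into the draft. Replies only fill gaps or
// override fields they mention — they never erase existing data.
function mergeReply(draft: Draft, reply: Partial<Draft>): Draft {
  const merged = { ...draft };
  for (const [key, value] of Object.entries(reply)) {
    if (value != null) merged[key] = value;
  }
  return merged;
}

// Return the next follow-up question, or null when the entry is complete.
function nextQuestion(draft: Draft, required: string[]): string | null {
  const missing = required.filter((field) => draft[field] == null);
  if (missing.length === 0) return null; // complete — ready to submit
  return `Which ${missing[0].replace("_", " ")} should I log this under?`;
}
```

The loop terminates naturally: each turn can only shrink the set of missing fields, and `nextQuestion` returning `null` is the signal to move on to submission.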
Step 6 — The entry is submitted
Once the entry is fully validated, the Action Router sends it to the appropriate API — either the Timesheet API or Leave API depending on the type of entry. The APIs write the data to the SQL database, just like a normal form submission would. The AI layer is completely transparent to the backend.
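The Action Router reduces to a lookup table. A sketch — the endpoint paths here are invented for illustration, not the real ETM routes:

```typescript
// The two entry kinds the router distinguishes.
type EntryKind = "timesheet" | "leave";

interface RoutedRequest {
  endpoint: string;
  payload: object;
}

function routeEntry(kind: EntryKind, payload: object): RoutedRequest {
  // Hypothetical endpoint paths; substitute your real API routes.
  const endpoints: Record<EntryKind, string> = {
    timesheet: "/api/timesheets",
    leave: "/api/leave-requests",
  };
  // The backend receives an ordinary POST body — the AI layer is
  // invisible to it, exactly as a form submission would be.
  return { endpoint: endpoints[kind], payload };
}
```

Because the router emits plain API calls, adding a new entry type (expenses, say) means adding one line to the table, not touching the parsing or validation logic.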
Why this architecture works for teams
A few things make this approach practical for a real company:
- No backend changes needed. The AI layer sits on top of existing APIs. ETM doesn't know or care that an AI submitted the timesheet.
- Graceful degradation. If the LLM is unavailable, the system falls back to a standard form. The AI is an enhancement, not a dependency.
- Auditable. Every step is logged — what the user said, what the LLM parsed, what was submitted. You can trace any entry back to its origin.
- Extensible. The same pattern works for leave requests, expense claims, or any structured form where users find manual entry painful.
Putting it all together
The core insight is simple: users shouldn't have to think in database schemas. They should be able to say what they did in plain language, and the system should figure out the rest — asking only when it genuinely doesn't know.
This is what good AI tooling looks like in a real enterprise context: not a chatbot bolted on top, but a layer that speaks the user's language and the system's language at the same time.
If you're building something similar, the stack is straightforward — NestJS for the gateway, any major LLM with function calling or structured output support, a vector store like pgvector or Qdrant for RAG, and your existing REST APIs on the backend. The hard part isn't the technology — it's designing the validation logic and the conversational prompts so they feel natural, not robotic.