The word “agent” has quickly become one of the most overused buzzwords in the AI space. Every week,...
We Found Out.
Five days. A handful of AI tools. A team where most people had never written a line of code. Here’s what happened.
The Setup
A few months ago, we gave the team a challenge: pick up Claude Code or Codex, build something, and see what happens. No prior coding experience required. No expectation of anything polished. Just—go make a thing.
The goal wasn’t to turn everyone into engineers. It was to get hands-on with the tools that are reshaping what’s possible for teams like ours—the same tools we talk about with customers every day. We wanted to understand them from the inside, not just describe them from the outside.
Three quarters of the participants had never touched AI-assisted development before. Some had never really thought of themselves as builders at all.
I think we were all surprised by what we were each able to accomplish.
What the Team Built
The range of projects that came out of the week says more than any summary could. People didn’t just pick safe, obvious ideas—they reached for things they actually wanted to exist.
Personal Project & Goal Management System
One team member—who came in describing themselves as “not a coder”—built a working web app for tracking goals, both personal and Betty initiatives, breaking projects into tasks, and visualizing progress over time. They used Claude Code to go from idea to a functional app with its own user interface in under a week. It’s now part of their daily flow for planning and updating work.
“I went from ‘I can’t code’ to having a working app that I actually use every day. The biggest surprise was how Claude helped me think through the problem, not just write the code.”
AI Agents for Data Ingestion Strategy
A member of our customer success team built Python-based agents to handle the messy, manual work of structuring and validating incoming data, using Claude to plan the build, with plans to move implementation to GitHub Copilot so the entire team can use it. The agents route content automatically based on rules defined in plain English. Early testing with real customers looks promising, and the approach has already shaped how we talk through ingestion challenges with clients.
Automated Testing and Review Tools
Quality assurance is one of those areas where the work is constant but rarely glamorous. One team member built Node.js-based tools that flag common issues before anything reaches a human reviewer. The tools went from preliminary testing to real use within a couple of days, and the concept is already influencing how we frame QA conversations with customers.
“I built something that could save our team significant time each week. That’s not just cool—that’s immediately useful.”
Customer Feedback Processing Agents
Manually sorting through feedback—categorizing it, summarizing it, figuring out what actually matters—is the kind of work that tends to fall through the cracks. One of our product team members built an agent using Claude’s API that reads, categorizes, and summarizes responses automatically. It’s processing test batches now to evaluate accuracy before broader use.
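For readers curious what the categorize-and-summarize loop can look like, here is a minimal hypothetical sketch. The category list, prompt wording, and `fake_model` stub are assumptions, not the team's implementation; a real version would replace the stub with a call to Claude's API (e.g. via the `anthropic` SDK).

```python
# Illustrative feedback-triage loop; taxonomy and stub model are assumptions.
from collections import Counter

CATEGORIES = ["bug", "feature_request", "praise", "other"]

def build_prompt(feedback: str) -> str:
    """Ask the model to pick exactly one category for a piece of feedback."""
    return (
        f"Classify this feedback as one of {', '.join(CATEGORIES)} "
        f"and reply with only the category.\nFeedback: {feedback}"
    )

def categorize_batch(items, ask_model):
    """Classify each item, defaulting to 'other' on unexpected replies."""
    labels = []
    for item in items:
        label = ask_model(build_prompt(item)).strip().lower()
        labels.append(label if label in CATEGORIES else "other")
    return labels

def summarize(labels):
    """Roll labels up into counts, the 'what actually matters' view."""
    return dict(Counter(labels))

# Offline stand-in for the model so the sketch is runnable as-is.
def fake_model(prompt: str) -> str:
    text = prompt.lower()
    if "crash" in text:
        return "bug"
    if "please add" in text:
        return "feature_request"
    return "other"

labels = categorize_batch(
    ["The export crashes on large files", "Please add dark mode", "Meh"],
    fake_model,
)
print(summarize(labels))  # {'bug': 1, 'feature_request': 1, 'other': 1}
```

Running the classifier over test batches first, as described above, is exactly how you'd measure whether the model's labels agree with a human's before trusting the counts.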
What We Actually Learned
We went in expecting an interesting learning experience. We came out with a few things we didn’t fully expect, including multiple tools polished enough that they were in daily use the week after the challenge.
The barrier really is lower
This gets said a lot, but the challenge made it concrete: people who had been genuinely intimidated by the idea of building software built working tools in a few days, alongside their full-time responsibilities. That’s not a small thing. It changes who gets to participate in creating solutions—and that matters for any team.
The ceiling is higher than expected
The projects weren’t simple scripts. They involved API integrations, database operations, and real logic. First-time users built things that had actual complexity to them. That was encouraging in a way that surprised even us.
Prompting is a real skill, and it develops fast
The team got noticeably better at describing problems, breaking them into steps, and iterating based on AI feedback over the five days. It’s not a fixed ability—it’s something you practice. And it compounds.
Cross-team energy is hard to manufacture and easy to lose
Some of the best moments were team members sharing what they’d built with each other, getting curious about each other’s approaches, and asking “could we do something like that for this?” That kind of thinking doesn’t always happen naturally. The challenge created the conditions for it, and the Demo Day session left everyone with better ideas for their own builds and for what to tackle next.
Being Honest About the Hard Parts
We’d rather be useful than impressive, so here’s what was genuinely difficult:
- Environment setup took longer than expected for people without technical backgrounds. Budget extra time here if you run your own version of this, or set up a shared remote environment people can log in to.
- AI-generated code works, but it isn’t always clean. Security considerations, error handling, and edge cases don’t get handled automatically. Human review before anything goes near production is non-negotiable.
- When things broke in unexpected ways—especially involving external APIs—debugging required knowledge the team was still building. Having someone with technical experience available to unblock people was genuinely important.
- The learning curve is real. Lower than before, but still real. Fluency with these tools takes time and repetition, not just one good week.
Why This Matters for Us
At Betty, we spend a lot of time helping customers think through how AI can work inside their organizations. Doing this challenge made that work more grounded. When someone asks us about implementation friction, our team now has a lived version of that answer, not just a theoretical one.
The ingestion strategy work has already influenced how we approach data validation conversations with clients, the QA framework has become a starting point in quality assurance discussions and has already saved more than a dozen hours of testing and re-testing, and the feedback agent shows promise for transforming whole parts of our process. That kind of practical credibility is hard to fake and easy to build—if you actually do the work.
We’re planning to do this again. How could we not? The tools are moving fast, and keeping our own fluency sharp is part of how we stay genuinely useful to the people we work with.
One Last Thing
If you’re thinking about running something like this with your own team—whether it’s a formal challenge or just an afternoon of experimentation—we’re happy to share what worked for us and what we’d do differently. Reach out.
The tools are here. The advantage goes to teams that get comfortable with them early.