Betty's Blog

Content Sources 101

Written by Sanjit Reddy | Apr 6, 2026 4:12:46 PM

When organizations first think about powering an AI knowledge assistant, it is tempting to think bigger is better. More sources, more systems, more content, more coverage. On paper, that sounds like the fastest path to value. In practice, it often creates the opposite result.

The goal is not to connect everything at once. The goal is to connect the content that will help members get better answers quickly, while giving the organization a clear path to ROI. At Betty, that starting point looks different for every customer because there is no universal best source type. The right choice depends on where an association’s most useful, most trusted knowledge already lives.

That is the real starting principle. Fast value does not come from boiling the ocean. It comes from identifying the content sources that already contain the answers members are most likely to need, then making those answers easier to find and use.

Start with the Most Trusted Answers

The most valuable content source is usually not the biggest repository. It is the source that already holds trusted answers to common questions. For one association, that might be a well-maintained website with clear pages on membership, events, certifications, and benefits. For another, it might be a structured feed that powers publications or a document repository where staff maintain official guidance. In yet another case, it may be content delivered through SFTP, RSS, or a custom API feed because those sources give the organization more control over what is shared and how it is updated. The distinction between the biggest source and the most trusted one matters.

Fast value is not just about getting content into the system. It is about getting the right content into the system. A smaller, curated source will usually create stronger results than a much larger source full of duplicate, thin, outdated, or loosely structured information. That is why the first question is not, “What can we connect?” It is, “Where do our best answers already live?”

Why Some Sources Deliver Value Faster

In many cases, the content sources that create value fastest are the ones that are already organized, maintained, and easy for the customer to control. Structured feeds are a good example. Sources like RSS, SFTP, and custom API feeds are often strong starting points because they reflect content the organization is already managing with intention. They tend to be more predictable, easier to keep current, and better aligned with how the customer wants content to be ingested over time. That does not make them automatically best for every organization. It does make them a strong fit when the goal is high-quality answers with a clear maintenance path.

Document repositories can also be high-value starting points when important knowledge lives outside the public website. Google Drive, SharePoint, uploaded files, and PDFs can all play a role here. For some organizations, the public site explains the basics, but the most useful knowledge lives in internal documents, policy files, educational materials, or member resources. In those cases, starting with documents may create faster value than focusing only on public pages.

Website-based sources can also be effective, especially when the site is already rich in content and well organized. A customer may start with their website through a crawler, sitemap, WordPress, Drupal, or direct links if those sources reflect the content members already rely on. When a site is current, structured, and content-rich, it can be a practical starting point.

The key is not the source type by itself. The key is whether the source contains authoritative content in a form that supports good answers.

Quality Matters More Than Quantity

One of the easiest mistakes to make is assuming that more connected content will automatically lead to better results. Often, it does not. More content only helps when that content is accurate, relevant, and structured in a way that supports retrieval.

If a source has very little substance, it will not add much value. If it is highly unstructured, inconsistent, or full of duplicate material, it can make it harder to surface the right answer. This is one reason public website crawling should be approached thoughtfully. A website may seem like the obvious first place to start, but public pages do not always contain the highest-value knowledge. Sometimes they are light in detail. Sometimes they are built for navigation or marketing rather than answering questions directly.

That does not mean that website content is unimportant. It often is important. It just means that public content is not automatically the best first source. In many cases, the strongest outcomes come from starting with the content that is most complete, most current, and most trusted, whether that lives on the website, in a feed, or in a document repository.

Clean, authoritative content usually beats broad, messy coverage.
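
One rough way to make this concrete before connecting a source is to measure how much of it is duplicated or thin. The sketch below is only an illustration, not part of Betty: the function name `audit_source`, the 50-word threshold, and the idea of passing each page or file in as a plain-text string are all assumptions made for the example.

```python
import hashlib
import re


def audit_source(documents, min_words=50):
    """Summarize how much of a candidate source is duplicated or thin.

    documents: list of plain-text strings, one per page or file.
    min_words: hypothetical cutoff; shorter documents are counted as thin.
    """
    seen = set()
    duplicates = 0
    thin = 0
    for text in documents:
        # Collapse whitespace and lowercase so trivially different copies match.
        normalized = re.sub(r"\s+", " ", text).strip().lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
        if len(normalized.split()) < min_words:
            thin += 1
    total = len(documents)
    return {
        "total": total,
        "duplicate_share": duplicates / total if total else 0.0,
        "thin_share": thin / total if total else 0.0,
    }
```

A source with a high duplicate or thin share is not necessarily a bad source, but it may be a sign that a smaller, curated subset is a better place to begin. This check only catches exact copies after normalization; near-duplicates would need a different technique.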

Think in Terms of Use Cases

The best way to choose a starting source is to work backward from the questions members are most likely to ask.

If the goal is to answer questions about events, education, publications, or news, a structured feed or content-managed website may be the strongest fit. If the goal is to answer detailed questions about certification requirements, member policies, governance, or internal resources, documents or repository-based sources may be more important. If the goal is to support discoverability across a broad set of published content, website and CMS-connected sources may play a larger role.
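
The pairing of use cases and source types described above can be written down as a simple lookup. This is a minimal sketch for planning purposes, not a Betty feature: the dictionary keys, the source labels, and the `suggest_sources` function are hypothetical names that paraphrase the paragraph above.

```python
# Hypothetical lookup that paraphrases the guidance in the text.
FEED_OR_CMS = ["structured feed", "content-managed website"]
DOCUMENTS = ["document repository"]
WEB_AND_CMS = ["website", "CMS-connected source"]

USE_CASE_SOURCES = {
    "events": FEED_OR_CMS,
    "education": FEED_OR_CMS,
    "publications": FEED_OR_CMS,
    "news": FEED_OR_CMS,
    "certification requirements": DOCUMENTS,
    "member policies": DOCUMENTS,
    "governance": DOCUMENTS,
    "internal resources": DOCUMENTS,
    "broad discoverability": WEB_AND_CMS,
}


def suggest_sources(use_cases):
    """Return candidate source types, most frequently suggested first.

    Use cases that are not in the lookup are ignored.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = {}
    for use_case in use_cases:
        for source in USE_CASE_SOURCES.get(use_case, []):
            counts[source] = counts.get(source, 0) + 1
    return sorted(counts, key=lambda source: (-counts[source], source))
```

For example, `suggest_sources(["events", "governance", "news"])` lists the feed and content-managed website options ahead of the document repository, because two of the three use cases point to them. The output is a starting point for discussion, not a ranking of which source is best.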

This is why there is no single best source type. The right starting point depends on the use case, the content, and how the organization manages information today. Betty supports multiple ways of connecting content because associations do not publish knowledge in the same way. Some rely on websites, some on feeds, and others on shared drives. Some rely on a mix.

The important thing is to choose the source that best matches the member experience the organization wants to improve first.

What Not to Do

A common instinct is to connect every available source before launch so nothing gets missed. That sounds thorough, but it is often not the fastest route to value.

A better approach is to stop treating every repository as equally useful. Not all content deserves to be first. Some sources are rich with trusted knowledge. Others are sparse, outdated, or not especially helpful for answering member questions. Connecting everything at once can make the content picture bigger without making it better.

The stronger move is to start with the sources that already reflect how the organization communicates its most important answers. Then, once usage patterns become clearer, additional sources can be added where they truly improve coverage.

That gives the assistant cleaner content to draw on from the beginning and gives the organization a clearer line to ROI.

A Better Way to Think About “First”

The question is not which integration is best in the abstract. The better question is which content source will help this organization deliver useful answers fastest.

In many cases, that will be a source the customer already controls closely, such as RSS, SFTP, or a custom API feed. In other cases, it will be a familiar content home like Google Drive, SharePoint, WordPress, Drupal, or the public website. Media sources like YouTube or Vimeo may also add value, but more often as a supplement than as the best place to begin. More specialized source types may have a role too, but not every option needs to be central to the first decision.

Fast value comes from fit. It comes from choosing content that is trusted, maintained, relevant, and rich enough to answer the questions that matter most.

Conclusion

When organizations think about content sources, the smartest first move is usually not the broadest one. It is the most focused one.

The best starting source is the one that already contains the most reliable answers to the questions members ask most often. That source may be a feed, a document repository, a website, or something else entirely. What matters is not the format alone. What matters is the quality of the content, how well it maps to member needs, and how effectively it can stay current over time.

That is where fast value comes from. Not from connecting everything at once, but from connecting the right content first.