Use this page when your first question is not “which tool do I call?” but “what kind of source do I have?” If you prefer to browse by job to be done, start at Capabilities.

At a Glance

Code & Repositories

GitHub repositories, package source code, file trees, grep, and code search workflows.

Documentation Sites

Crawl public docs, honor llms.txt when available, and search pages semantically or with regex.

PDFs & Research Papers

Structured PDF parsing, section-aware retrieval, and paper search for technical documents.

HuggingFace Datasets

Index dataset rows and schema-like structure for semantic retrieval.

Google Drive

Connect Drive, browse files and shared drives, select content, and keep it synced.

Spreadsheets & Tables

CSV, TSV, XLSX, and XLS ingestion with row-aware indexing.

Slack & Conversations

Index Slack history and search conversations with org-scoped isolation.

Local Knowledge

Local folders, databases, and chat history via direct folder indexing or continuous sync.

Source-Type Matrix

Source type | Bring it in with | Best tools after that | Notes
--- | --- | --- | ---
Code & repositories | index, Tracer, get_github_file_tree | search, nia_read, nia_grep, nia_explore | Package source code also works without indexing via nia_package_search_hybrid
Documentation sites | index | search, nia_read, nia_grep, nia_explore | llms.txt aware, supports crawl filters
PDFs & research papers | index, PDF Indexing | search, nia_read | Tree-guided retrieval for long documents
HuggingFace datasets | index, HuggingFace Datasets | search, nia_read, nia_explore | Large datasets are sampled intelligently
Google Drive | Google Drive Integration | search, nia_read, nia_grep, nia_explore | Supports selected files, folders, shared drives, and incremental sync
Spreadsheets & tables | index with CSV, TSV, XLSX, or XLS | search, nia_read, nia_explore | Row and header aware
Slack | Slack Search | search, nia_grep | Workspace data stays org-scoped
Local folders, databases, chat history | index(folder_path=...), Local Sync | search, nia_read, nia_grep, nia_explore | Best fit for continuously changing personal or team knowledge
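
If you want to route on source type programmatically, the matrix can be written down as a small lookup table. The sketch below is only an illustration: the short keys (`code`, `docs`, and so on) and both helper functions are hypothetical and not part of Nia; only the tool names are copied from the matrix.

```python
# Hypothetical transcription of the source-type matrix into a lookup table.
# The short keys and both helpers are illustrative only; the tool names
# are copied from the matrix.
ALL_FOUR = ["search", "nia_read", "nia_grep", "nia_explore"]

SOURCE_MATRIX = {
    "code": {"bring_in": ["index", "Tracer", "get_github_file_tree"], "after": ALL_FOUR},
    "docs": {"bring_in": ["index"], "after": ALL_FOUR},
    "pdf": {"bring_in": ["index", "PDF Indexing"], "after": ["search", "nia_read"]},
    "huggingface": {
        "bring_in": ["index", "HuggingFace Datasets"],
        "after": ["search", "nia_read", "nia_explore"],
    },
    "drive": {"bring_in": ["Google Drive Integration"], "after": ALL_FOUR},
    "spreadsheet": {"bring_in": ["index"], "after": ["search", "nia_read", "nia_explore"]},
    "slack": {"bring_in": ["Slack Search"], "after": ["search", "nia_grep"]},
    "local": {"bring_in": ["index(folder_path=...)", "Local Sync"], "after": ALL_FOUR},
}


def tools_after(source_type: str) -> list[str]:
    """Return the follow-up tools the matrix lists for a source type."""
    return list(SOURCE_MATRIX[source_type]["after"])


def sources_supporting(tool: str) -> list[str]:
    """Return the source types whose 'best tools after that' column lists the tool."""
    return sorted(key for key, row in SOURCE_MATRIX.items() if tool in row["after"])
```

For example, `tools_after("slack")` gives the two tools listed for Slack, and `sources_supporting("nia_grep")` lists every source type where grep-style matching is among the suggested follow-ups.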

Code & Repositories

Best for source code, implementation patterns, architecture exploration, and exact file reading.
Use when:
  • you want to index a GitHub repository and search it semantically
  • you need grep-style matching across a repo
  • you want public package source code without indexing first
  • you want GitHub code search without maintaining an index
Start with:
  • index for repositories you want in your own workspace
  • nia_package_search_hybrid for package source code
  • Tracer for public GitHub repo search without indexing
  • get_github_file_tree for quick structure inspection
Typical prompts:
"Index https://github.com/vercel/ai"
"Search the indexed repo for how streaming responses are implemented"
"Use Tracer to find how auth middleware works in the Next.js repo"
"Search the fastapi package for authentication examples"

Documentation Sites

Best for framework docs, product docs, API docs, and structured technical websites.
Use when:
  • you want grounded answers from official docs
  • you need a source your agents can cite and revisit
  • you want to crawl a docs site with include or exclude patterns
Start with:
  • index on the docs URL
  • search to find relevant pages
  • nia_read and nia_grep for deeper inspection
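
To make the include and exclude patterns mentioned above concrete, here is a minimal sketch of how a crawl filter of that kind could decide which pages to visit. It assumes glob-style patterns matched against the URL path; this page does not specify Nia's actual pattern syntax, and the `should_crawl` helper and the sample URLs are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def should_crawl(url: str, include: list[str], exclude: list[str]) -> bool:
    """Decide whether a URL passes glob-style include/exclude filters.

    Assumption: patterns are globs matched against the URL path, and
    exclude patterns win over include patterns.
    """
    path = urlparse(url).path
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    # An empty include list means "everything not excluded".
    return not include or any(fnmatchcase(path, pattern) for pattern in include)


# Illustrative URLs only.
urls = [
    "https://example.com/docs/app/api-reference",
    "https://example.com/docs/legacy/setup",
    "https://example.com/blog/release-notes",
]
kept = [
    u for u in urls
    if should_crawl(u, include=["/docs/*"], exclude=["/docs/legacy/*"])
]
```

Here `kept` contains only the first URL: the second is excluded explicitly, and the third never matches an include pattern.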
Typical prompts:
"Index https://nextjs.org/docs"
"Search the docs for cache invalidation"
"Read the page about route handlers"

PDFs & Research Papers

Best for long technical documents where section structure matters.
Use when:
  • you have PDFs, papers, filings, manuals, or legal docs
  • you need section-aware retrieval instead of flat chunk search
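Start with:
  • index to bring the document in (see PDF Indexing)
  • search and nia_read for retrieval once it is indexed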
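
The difference between section-aware retrieval and flat chunk search can be pictured with a small sketch. It is a conceptual illustration only, not Nia's PDF parser: it assumes the text has already been extracted and that headings look like `2.1 Title`, and the `split_sections` and `find_sections` helpers are hypothetical.

```python
import re

# Assumption: headings are numbered like "1 Introduction" or "2.1 Indexing".
# A body line that happens to start with a number would be misread as a heading.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(.+)$")


def split_sections(text: str) -> list[dict]:
    """Group body lines under the most recent numbered heading."""
    sections = []
    current = {"number": "0", "title": "Front matter", "body": []}
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADING.match(line)
        if match:
            sections.append(current)
            current = {"number": match.group(1), "title": match.group(2), "body": []}
        elif line:
            current["body"].append(line)
    sections.append(current)
    # Drop headings that have no body text of their own.
    return [s for s in sections if s["body"]]


def find_sections(sections: list[dict], term: str) -> list[tuple[str, str]]:
    """Return (number, title) for every section whose body mentions the term."""
    needle = term.lower()
    return [
        (s["number"], s["title"])
        for s in sections
        if any(needle in line.lower() for line in s["body"])
    ]


paper = """1 Introduction
We study retrieval.
2 Method
2.1 Indexing
Pages are parsed into a tree.
2.2 Retrieval
Queries walk the tree.
"""
hits = find_sections(split_sections(paper), "tree")
```

Each hit carries its section number and title, so a match can be reported as "2.1 Indexing" rather than as an anonymous chunk.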

HuggingFace Datasets

Best for row-level search, schema discovery, and agentic retrieval over dataset contents.
Use when:
  • you need to search examples, records, or splits
  • you want natural-language access to a dataset instead of manual browsing
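Start with:
  • index to bring the dataset in (see HuggingFace Datasets)
  • search, nia_read, and nia_explore to query and inspect it once it is indexed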

Google Drive

Best for cloud-hosted files and folders you want to browse selectively, index deeply, and keep in sync over time.
Use when:
  • your working knowledge already lives in Drive instead of a repo or docs site
  • you need selected files or folders rather than a full bucket import
  • you want shared drives and incremental sync support
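Start with:
  • Google Drive Integration to connect Drive and select files, folders, or shared drives
  • search, nia_read, nia_grep, and nia_explore once the content is synced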

Spreadsheets & Tables

Best for CSVs, TSVs, Excel files, and structured business data.
Use when:
  • your source is tabular
  • you want row-aware retrieval and header-aware indexing
  • you need a lighter-weight alternative to database sync
Start with:
  • index on CSV, TSV, XLSX, or XLS files
  • search for semantic lookup
  • nia_read and nia_explore to inspect rows and structure
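
As a rough picture of what row-aware, header-aware indexing means, the sketch below turns each CSV row into one text chunk that carries its column headers with it. This is an assumption about the general technique, not a description of how Nia indexes spreadsheets; `rows_to_chunks` and the sample data are hypothetical.

```python
import csv
import io


def rows_to_chunks(csv_text: str) -> list[dict]:
    """Turn each data row into one chunk that repeats the column headers.

    Keeping "header: value" pairs together means a hit on a value can
    still be interpreted without looking up the header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks = []
    for row_number, row in enumerate(reader, start=1):
        text = "; ".join(f"{header}: {value}" for header, value in row.items())
        chunks.append({"row": row_number, "text": text})
    return chunks


# Made-up sample data for illustration.
sample = "region,quarter,revenue\nEMEA,Q1,1200\nAPAC,Q1,950\n"
chunks = rows_to_chunks(sample)
```

The first chunk reads `region: EMEA; quarter: Q1; revenue: 1200`, so a search hit on `1200` still says which column and row it came from.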

Slack & Conversations

Best for operational knowledge, team decisions, support threads, and internal discussion history.
Use when:
  • important context lives in Slack instead of docs
  • you want semantic and keyword retrieval over conversations
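Start with:
  • Slack Search to bring Slack history in
  • search and nia_grep for semantic and keyword retrieval over conversations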

Local Knowledge

Best for internal notes, local folders, private documents, databases, and saved chat history.
Use when:
  • the knowledge lives on disk or in a local database
  • you want a continuously fresh private index
  • you want to sync chat history and local project context into Nia
Start with:
  • index(folder_path=...) for one-off folder indexing
  • Local Sync for continuous synchronization
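
The practical difference between one-off indexing and continuous synchronization is that a sync has to notice what changed since the last run. The sketch below shows one common way to do that, by comparing content hashes between two snapshots of a folder. It is a generic illustration and makes no claim about how Local Sync is implemented; `snapshot` and `diff_snapshots` are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's relative path to a SHA-256 hash of its contents."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed, or changed between snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in new.keys() & old.keys() if new[k] != old[k]),
    }
```

A one-off index corresponds to taking a single snapshot. A continuous sync would repeat the snapshot and re-index only the paths reported by `diff_snapshots`.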