POST /data-sources/{source_id}/grep
Search data source with regex
curl --request POST \
  --url https://apigcp.trynia.ai/v2/data-sources/{source_id}/grep \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "pattern": "authentication.*token",
  "path": "/",
  "context_lines": 5,
  "A": 10,
  "B": 10,
  "case_sensitive": false,
  "whole_word": false,
  "fixed_string": false,
  "max_matches_per_file": 10,
  "max_total_matches": 100,
  "output_mode": "content",
  "highlight": false,
  "include_line_numbers": true,
  "group_by_file": true,
  "exhaustive": true
}
'

Example response:

{
  "success": true,
  "source_type": "documentation",
  "matches": [
    {
      "path": "<string>",
      "split": "<string>",
      "row_index": 123,
      "line": "<string>",
      "line_number": 123,
      "context": [
        "<string>"
      ],
      "context_start_line": 123
    }
  ],
  "files": [
    "<string>"
  ],
  "counts": {},
  "pattern": "<string>",
  "path_filter": "<string>",
  "total_matches": 123,
  "files_searched": 123,
  "files_with_matches": 123,
  "truncated": true,
  "options": {
    "case_sensitive": true,
    "whole_word": true,
    "lines_before": 123,
    "lines_after": 123,
    "output_mode": "<string>"
  }
}

Authorizations

Authorization
string
header
required

API key, sent in the Authorization header as a Bearer token (Authorization: Bearer <token>).

Path Parameters

source_id
string
required

Flexible data source identifier (UUID, display name, or URL)
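
Because source_id may be a display name or a URL, it can contain characters such as "/" or spaces that would otherwise break the request path. The sketch below builds the same request as the curl example in Python, percent-encoding the identifier first. It assumes the API accepts a percent-encoded identifier in the path, which this page does not confirm; build_grep_request is a hypothetical helper name, and the token and example identifier are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example on this page.
BASE_URL = "https://apigcp.trynia.ai/v2"


def build_grep_request(source_id: str, token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the grep endpoint."""
    # Assumption: the API accepts a percent-encoded identifier. safe="" also
    # encodes "/" so a URL-style identifier stays a single path segment.
    encoded_id = urllib.parse.quote(source_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/data-sources/{encoded_id}/grep",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_grep_request(
    "https://docs.example.com/guide",  # placeholder identifier
    "<token>",                         # placeholder API key
    {"pattern": "authentication.*token"},
)
print(req.full_url)
```

To actually send the request, pass req to urllib.request.urlopen and decode the JSON body of the response.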

Body

application/json
pattern
string
required

Regex pattern to search for

Example:

"authentication.*token"

path
string
default:/

Limit search to this virtual path prefix

context_lines
integer

Number of context lines to return both before and after each match (shorthand for setting A and B to the same value). If A or B is specified, it overrides this value for its side.

Required range: 0 <= x <= 10
A
integer

Lines after each match (like grep -A). Overrides context_lines for the lines after the match.

Required range: 0 <= x <= 20
B
integer

Lines before each match (like grep -B). Overrides context_lines for the lines before the match.

Required range: 0 <= x <= 20
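
The precedence between context_lines, A, and B can be sketched as a small client-side function. This is an illustration of the rules described above, not the service's implementation. effective_context is a hypothetical name, and the fallback to 0 when nothing is set is an assumption, since this page does not state a default for these fields.

```python
from typing import Optional, Tuple


def effective_context(
    context_lines: Optional[int] = None,
    A: Optional[int] = None,
    B: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (lines_before, lines_after) after applying the override rules."""
    # Documented ranges: context_lines is 0..10, A and B are 0..20.
    if context_lines is not None and not 0 <= context_lines <= 10:
        raise ValueError("context_lines must be between 0 and 10")
    for name, value in (("A", A), ("B", B)):
        if value is not None and not 0 <= value <= 20:
            raise ValueError(f"{name} must be between 0 and 20")

    # Assumption: no context at all when none of the three fields is given.
    base = context_lines if context_lines is not None else 0
    lines_before = B if B is not None else base  # B overrides the "before" side
    lines_after = A if A is not None else base   # A overrides the "after" side
    return lines_before, lines_after


print(effective_context(context_lines=5, A=10, B=10))
print(effective_context(context_lines=5, A=2))
```

With the values from the example request (context_lines 5, A 10, B 10), both sides resolve to 10, so context_lines has no effect there.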
case_sensitive
boolean
default:false

Case-sensitive matching (default is case-insensitive)

whole_word
boolean
default:false

Match whole words only

fixed_string
boolean
default:false

Treat pattern as literal string, not regex
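
To get a feel for how case_sensitive, whole_word, and fixed_string combine, the following sketch approximates them locally with Python's re module. The regex dialect used by the service is not stated on this page, so Python's syntax is only a stand-in, and compile_local_pattern is a hypothetical helper for experimenting with a pattern before sending it.

```python
import re


def compile_local_pattern(
    pattern: str,
    case_sensitive: bool = False,
    whole_word: bool = False,
    fixed_string: bool = False,
) -> "re.Pattern[str]":
    """Local approximation of the three matching flags (not the server's code)."""
    # fixed_string: escape every metacharacter so the pattern matches literally.
    source = re.escape(pattern) if fixed_string else pattern
    # whole_word: require word boundaries on both sides of the match.
    if whole_word:
        source = rf"\b(?:{source})\b"
    # The documented default is case-insensitive matching.
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(source, flags)


line = "Authentication uses a bearer TOKEN"
print(bool(compile_local_pattern("authentication.*token").search(line)))
print(bool(compile_local_pattern("authentication.*token", case_sensitive=True).search(line)))
```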

max_matches_per_file
integer
default:10

Maximum matches to return per file

Required range: 1 <= x <= 100
max_total_matches
integer
default:100

Maximum total matches to return

Required range: 1 <= x <= 1000
output_mode
enum<string>
default:content

Output format:

  • content: Return matched lines with context
  • files_with_matches: Return only file paths that matched
  • count: Return match counts per file
Available options: content, files_with_matches, count
highlight
boolean
default:false

Add >>markers<< around matched text in results
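
When highlight is enabled, a client may want to pull out the marked text or remove the markers again. The sketch below assumes the markers are the literal character pairs ">>" and "<<" wrapped around each match, as the description suggests; the exact output format is not shown on this page, and both helper names are hypothetical.

```python
import re

# Assumption: each match is wrapped as >>matched text<< in the returned line.
MARKED = re.compile(r">>(.*?)<<")


def highlighted_spans(line: str) -> list:
    """Return the text of every highlighted match in a result line."""
    return MARKED.findall(line)


def strip_markers(line: str) -> str:
    """Remove the markers, leaving the original line text."""
    return MARKED.sub(r"\1", line)


sample = "check the >>authentication bearer token<< first"
print(highlighted_spans(sample))
print(strip_markers(sample))
```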

include_line_numbers
boolean
default:true

Include line numbers in results

group_by_file
boolean
default:true

Group matches by file in results

exhaustive
boolean
default:true

Search all indexed chunks for complete results. When true, every indexed chunk is scanned so that no match is missed (like real grep). When false, a BM25 keyword search selects the top candidate chunks first, which is faster but may miss matches.
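
One way a client might use this trade-off is to try the faster non-exhaustive search first and fall back to an exhaustive search only when nothing is found. This is a possible usage pattern, not something the API prescribes. grep_with_fallback is a hypothetical helper, and search stands for any function that sends a request body to the endpoint and returns the decoded JSON response.

```python
from typing import Callable


def grep_with_fallback(search: Callable[[dict], dict], body: dict) -> dict:
    """Try the BM25-assisted search first, then the exhaustive one if it finds nothing."""
    fast = search({**body, "exhaustive": False})
    if fast.get("total_matches", 0) > 0:
        return fast
    # Nothing found among the top candidates: scan every indexed chunk instead.
    return search({**body, "exhaustive": True})
```

A fast result with some matches may still be incomplete, so this pattern suits "is it mentioned anywhere?" lookups better than full audits.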

Response

Search completed successfully

success
boolean
source_type
enum<string>

Type of data source searched

Available options: documentation, huggingface_dataset, research_paper
matches
(HuggingFaceGrepMatch · object | DocGrepFileMatch · object)[]

Matches (format depends on source_type):

  • For documentation: grouped by file with nested matches
  • For HuggingFace datasets: flat list with split/row_index

Flat match for HuggingFace datasets

files
string[]

List of file paths (when output_mode is 'files_with_matches')

counts
object

Match counts per file (when output_mode is 'count')

pattern
string

The pattern that was searched

path_filter
string

Path filter that was applied

total_matches
integer
files_searched
integer
files_with_matches
integer

Number of files that contained matches

truncated
boolean

Whether results were truncated due to limits

options
object

Applied search options
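
Since the populated fields depend on output_mode and source_type, a client typically branches on them when reading the response. The sketch below is one way to do that under stated assumptions: that options.output_mode echoes the mode that was applied, that counts maps file paths to integers, and that flat HuggingFace matches carry split, row_index, and line as in the example response. The nested shape of documentation matches is not shown on this page, so only their path is reported. render_results is a hypothetical name.

```python
def render_results(response: dict) -> list:
    """Turn a grep response into printable lines, depending on the output mode."""
    mode = (response.get("options") or {}).get("output_mode", "content")

    if mode == "files_with_matches":
        return list(response.get("files") or [])

    if mode == "count":
        # Assumption: counts maps each file path to its number of matches.
        counts = response.get("counts") or {}
        return [f"{path}: {count}" for path, count in sorted(counts.items())]

    lines = []
    for match in response.get("matches") or []:
        if "row_index" in match:
            # Flat HuggingFace dataset match.
            lines.append(f'{match["split"]}[{match["row_index"]}]: {match["line"]}')
        else:
            # Documentation match: nested layout not documented here, report the path only.
            lines.append(str(match.get("path", "<unknown>")))
    if response.get("truncated"):
        lines.append("(truncated)")  # results were cut off by the match limits
    return lines
```

When truncated is true, raising max_matches_per_file or max_total_matches (within their documented ranges) or narrowing path are the options this page offers for reaching the remaining matches.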