
✨ AI-Powered Search

Curiositi’s semantic search goes beyond keyword matching. Find files by meaning, context, and intent — even when you don’t remember exact filenames or terms.

Traditional search looks for exact keyword matches:

Search: "quarterly report"
Matches: Files containing the words "quarterly" AND "report"
Misses: "Q1 financial summary", "quarter earnings review"

Semantic search understands meaning:

Search: "quarterly report"
Matches: "Q1 financial summary", "quarter earnings review", "fiscal report Q1"
Because: AI understands these all relate to periodic financial reports

When files are processed, their content is split into chunks and each chunk is converted into a 1536-dimensional vector embedding that captures semantic meaning:

"quarterly sales report" → [0.023, -0.156, 0.892, ...]
"Q1 revenue summary" → [0.019, -0.142, 0.887, ...]
Similar meaning = Similar vectors = Close in vector space

When you search, your query text is also converted to a vector embedding using the same model.

PostgreSQL with pgvector performs a cosine similarity search, comparing your query vector against all stored chunk vectors and returning the closest matches.
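The similarity metric itself can be sketched in plain TypeScript. This is an illustration of the cosine measure that pgvector computes, not Curiositi's actual implementation (in practice the comparison runs inside PostgreSQL):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
// Embeddings with similar meaning score close to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector dimensions must match");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical vectors score 1.0, unrelated (orthogonal) vectors score 0.0, which is why the two embeddings in the example above, with nearly identical components, would land very close together.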

Matching chunks are grouped by their source file and returned with similarity scores.
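The grouping step can be sketched as follows. The `ChunkMatch` and `FileResult` shapes here are illustrative assumptions, not Curiositi's actual types:

```typescript
interface ChunkMatch {
  fileId: string;
  similarity: number; // 0.0 to 1.0, higher is closer
}

interface FileResult {
  fileId: string;
  bestSimilarity: number; // highest-scoring chunk for this file
  matchCount: number;     // number of matching chunks in this file
}

// Group chunk matches by source file, keep the best score per file,
// then sort files so the most relevant appear first.
function groupByFile(chunks: ChunkMatch[]): FileResult[] {
  const byFile = new Map<string, FileResult>();
  for (const chunk of chunks) {
    const existing = byFile.get(chunk.fileId);
    if (!existing) {
      byFile.set(chunk.fileId, {
        fileId: chunk.fileId,
        bestSimilarity: chunk.similarity,
        matchCount: 1,
      });
    } else {
      existing.bestSimilarity = Math.max(existing.bestSimilarity, chunk.similarity);
      existing.matchCount += 1;
    }
  }
  return [...byFile.values()].sort((a, b) => b.bestSimilarity - a.bestSimilarity);
}
```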

Curiositi provides two search procedures:

Use `searchWithAI` for pure semantic search using vector embeddings:

```ts
const results = trpc.file.searchWithAI.useQuery({
  query: "quarterly sales report",
  limit: 10, // optional, max 100
  minSimilarity: 0.7, // optional, 0.0 to 1.0
});
```

Use `search` for combined filename and semantic search:

```ts
const results = trpc.file.search.useQuery({
  query: "report",
  limit: 20, // optional, max 50
});
```

This combines traditional filename matching with semantic search for broader coverage.

The search is automatically scoped to the user’s active workspace.

Simply describe what you’re looking for:

"meeting notes about the product launch"
"contract with Acme Corporation"
"presentation about Q4 marketing strategy"

Images are searchable by their AI-generated descriptions. When an image is processed, a vision model generates a text description, which is then embedded:

Search: "team photo from offsite"
Finds: IMG_2847.jpg (description: "Group of employees at mountain retreat")
Search: "dashboard mockup with blue theme"
Finds: design-v2.png (description: "UI mockup showing analytics dashboard")
Tips for writing effective queries:

  1. Be specific — “Q1 marketing campaign budget” rather than “budget”
  2. Use natural language — Ask as you would a colleague
  3. Include context — “last month’s sales data” rather than “sales”
  4. Try variations — If one query doesn’t find what you need, rephrase it

Results include a similarity score (0.0 to 1.0):

| Score | Meaning |
| --- | --- |
| 0.90+ | Very high relevance |
| 0.80-0.89 | High relevance |
| 0.70-0.79 | Good relevance |
| 0.60-0.69 | Moderate relevance |
| < 0.60 | Lower relevance |
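The score bands above can be expressed as a small helper for client-side display. This is a hypothetical utility, not part of the Curiositi API:

```typescript
// Map a similarity score (0.0 to 1.0) to its relevance band.
function relevanceLabel(score: number): string {
  if (score >= 0.9) return "Very high relevance";
  if (score >= 0.8) return "High relevance";
  if (score >= 0.7) return "Good relevance";
  if (score >= 0.6) return "Moderate relevance";
  return "Lower relevance";
}
```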
If a search returns no results:

  1. Check that the file has completed processing (status: completed)
  2. Try different phrasing
  3. Lower the `minSimilarity` threshold
  4. Verify you’re in the correct workspace
If a search returns too many irrelevant results:

  1. Make your query more specific
  2. Increase the `minSimilarity` threshold
  3. Check similarity scores in results to gauge relevance