# Building a Custom Connector
You can connect any data source to Trove by building a local connector. Your code fetches data from the source and pushes it to Trove via the Sync API.
## When to Build a Custom Connector

Build a custom connector when you have a data source Trove doesn’t support yet (e.g., Notion, Slack, email, local files, a proprietary database).
## Choose Your Execution Mode

- **Local** (recommended for custom connectors). Your code fetches data and pushes it to Trove. You control when and how data is fetched.
- **Cloud.** Trove runs the connector on a schedule. Only available for built-in connector types.
## Step-by-Step Walkthrough

### 1. Create the Connector

Register your connector with Trove. This tells Trove to expect data from a new source.

```bash
curl -X POST https://api.ontrove.sh/graphql \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "query": "mutation { createConnector(input: { connectorType: \"notion\", name: \"My Notion\", execution: LOCAL, config: {} }) { id name status } }" }'
```

Save the returned connector `id`. You need it for every sync request.
### 2. Fetch Data from Your Source

Write code that reads from your data source. Use the source’s API, read files, scrape pages, whatever works. Trove doesn’t care how you get the data, only how you send it.
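As one concrete example of “read files”, here is a minimal sketch that treats a local folder of Markdown files as the source. The helper names (`fileToDocument`, `readMarkdownFolder`) and the `file-` prefix for IDs are illustrative choices, not part of Trove; the output shape matches the Sync API document fields described in step 3.

```typescript
import { readdirSync, readFileSync, statSync } from 'node:fs';
import { join } from 'node:path';

// Shape mirrors the Sync API document fields (see step 3).
interface SourceDocument {
  external_id: string;
  title: string;
  text: string;
  date: string; // ISO 8601
  content_type: 'text';
}

// Pure mapping: file name + contents + mtime -> Sync API document.
// The file name doubles as a stable external_id (see step 5 on dedup).
function fileToDocument(name: string, content: string, mtime: Date): SourceDocument {
  const firstLine = content.split('\n')[0] ?? '';
  return {
    external_id: `file-${name}`,
    // Use the first heading as the title, falling back to the file name.
    title: firstLine.replace(/^#+\s*/, '') || name,
    text: content,
    date: mtime.toISOString(),
    content_type: 'text',
  };
}

// Walk a directory and collect a document for every .md file.
function readMarkdownFolder(dir: string): SourceDocument[] {
  return readdirSync(dir)
    .filter(name => name.endsWith('.md'))
    .map(name => {
      const path = join(dir, name);
      return fileToDocument(name, readFileSync(path, 'utf8'), statSync(path).mtime);
    });
}
```

The same pattern applies to any source: one function that maps a native item to the document shape, and one that enumerates items.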
### 3. Push Documents to Trove

Send documents to the Sync API in batches of up to 50:

```bash
curl -X POST https://api.ontrove.sh/api/sync/documents \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "connector_id": "YOUR_CONNECTOR_ID",
    "documents": [
      {
        "external_id": "notion-page-abc123",
        "title": "Meeting Notes - Q1 Planning",
        "text": "Full text of the Notion page...",
        "author": "Jane Smith",
        "date": "2026-03-20T10:00:00Z",
        "content_type": "text",
        "tags": ["meetings", "planning"]
      }
    ]
  }'
```

Each document in the batch accepts these fields:
| Field | Required | Description |
|---|---|---|
| `external_id` | Yes | Stable identifier from the source system (used for dedup) |
| `title` | No | Document title |
| `text` | No* | Full text content |
| `url` | No | Original URL |
| `author` | No | Content author |
| `date` | No | Original creation date (ISO 8601) |
| `content_type` | No | `text`, `transcript`, `highlight`, or `bookmark` (default: `text`) |
| `tags` | No | Array of string tags |
| `metadata` | No | Arbitrary JSON for connector-specific data |

\*At least one of `text` or `audio_url` must be provided. If you provide `audio_url`, Trove queues a transcription workflow.
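If you are writing your connector in TypeScript, it can help to encode the table above as a type, plus a local pre-flight check for the “at least `text` or `audio_url`” rule so empty documents fail before the API call rather than in the response's `errors` array. This is an illustrative sketch based on the field list above, not an official SDK type:

```typescript
// Field names mirror the Sync API table; audio_url comes from the footnote.
interface SyncDocument {
  external_id: string; // required, stable per source item
  title?: string;
  text?: string;
  url?: string;
  author?: string;
  date?: string; // ISO 8601
  content_type?: 'text' | 'transcript' | 'highlight' | 'bookmark';
  tags?: string[];
  metadata?: Record<string, unknown>;
  audio_url?: string; // triggers a transcription workflow if provided
}

// True if the document satisfies the text-or-audio requirement.
// Whitespace-only text is treated as empty here (an assumption).
function hasContent(doc: SyncDocument): boolean {
  return Boolean(doc.text?.trim() || doc.audio_url);
}
```

Filtering your batch with `hasContent` before sending keeps “Text content is empty” errors out of your sync runs.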
### 4. Handle Pagination

For large datasets, send documents in batches of 50. Use the `cursor` field to track your sync position:

```json
{
  "connector_id": "YOUR_CONNECTOR_ID",
  "cursor": "page-3-of-10",
  "documents": [...]
}
```

The cursor is stored on the connector and returned in the response. On the next run, read the cursor to know where you left off.
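The batching and cursor bookkeeping can be sketched as two small helpers. The `batch-N-of-M` cursor format is just one possible convention (the cursor is an opaque string you choose); `chunk` and `buildRequests` are hypothetical names:

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build one Sync API request body per batch of up to 50 documents,
// attaching a positional cursor so a restarted sync knows its place.
function buildRequests(connectorId: string, documents: object[]) {
  const batches = chunk(documents, 50);
  return batches.map((batch, i) => ({
    connector_id: connectorId,
    cursor: `batch-${i + 1}-of-${batches.length}`,
    documents: batch,
  }));
}
```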
### 5. Handle Deduplication

Use stable `external_id` values (e.g., the Notion page ID, the Slack message timestamp, the email message ID). If you sync the same `external_id` from the same connector again, Trove skips it automatically. Re-running your sync is safe and idempotent.
## Example: Notion Connector (TypeScript)

```typescript
async function syncNotion(connectorId: string, token: string) {
  const pages = await fetchNotionPages(); // Your Notion API calls

  const documents = pages.map(page => ({
    external_id: page.id,
    title: page.title,
    text: page.content,
    author: page.author,
    date: page.lastEdited,
    content_type: 'text' as const,
    tags: page.tags,
  }));

  // Send in batches of 50
  for (let i = 0; i < documents.length; i += 50) {
    const batch = documents.slice(i, i + 50);
    const response = await fetch('https://api.ontrove.sh/api/sync/documents', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        connector_id: connectorId,
        documents: batch,
      }),
    });

    const result = await response.json();
    console.log(`Indexed: ${result.documentsIndexed}, Skipped: ${result.documentsSkipped}`);

    if (result.errors.length > 0) {
      console.error('Errors:', result.errors);
    }
  }
}
```

## Response Format
The Sync API returns:

```json
{
  "documentsIndexed": 12,
  "documentsSkipped": 3,
  "errors": [
    { "external_id": "page-xyz", "error": "Text content is empty" }
  ],
  "cursor": "page-3-of-10"
}
```

- `documentsIndexed`. Number of new documents successfully ingested.
- `documentsSkipped`. Number of documents skipped due to deduplication.
- `errors`. Array of per-document errors with the `external_id` that failed and the reason.
- `cursor`. Echoed back if you provided one.
## Incremental Sync

For efficiency, only sync new or updated content on subsequent runs.
- Store the cursor value between runs (in a file, database, or environment variable).
- On the next run, use the cursor to fetch only content created or modified since the last sync.
- How you implement “fetch since cursor” depends on the source API. Some support `updated_after` parameters; others require you to track timestamps yourself.
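The "store the cursor between runs" step can be sketched with a local JSON state file. The file name and shape here are assumptions; any durable store (database row, environment variable, etc.) works the same way:

```typescript
import { readFileSync, writeFileSync, existsSync } from 'node:fs';

// Hypothetical default location for the sync state file.
const STATE_FILE = '.trove-sync-state.json';

// Returns the cursor from the previous run, or null on the first run
// (a null cursor means: do a full sync from the beginning).
function loadCursor(file: string = STATE_FILE): string | null {
  if (!existsSync(file)) return null;
  return JSON.parse(readFileSync(file, 'utf8')).cursor ?? null;
}

// Persist the cursor returned by the Sync API for the next run.
function saveCursor(cursor: string, file: string = STATE_FILE): void {
  writeFileSync(file, JSON.stringify({ cursor }));
}
```

A typical run then becomes: `loadCursor()`, fetch content changed since that position, push it, and `saveCursor()` with the value echoed back in the response.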