Overview
JobSync MCP is a Model Context Protocol server that gives Claude (or any MCP-compatible LLM client) direct, structured access to job-search infrastructure — ATS APIs, an application pipeline, a deduplication cache, Airtable, and your personal profile. Instead of asking Claude to "find me jobs," you run automated multi-step workflows entirely inside the model context window: fetch → classify → deduplicate → store → track.
The server exposes 23 tools across eight functional domains. Each tool has a recommended model hint (⚡ haiku, ⚖ sonnet, 🔬 opus) so orchestrators can route calls to the cheapest capable model. All persistent state lives under ~/.jobsync/ — no cloud services required beyond Airtable (which is optional if you use the markdown sink).
Architecture at a glance
[Diagram: Claude / any MCP host ⇄ JSON over stdio ⇄ jobsync-mcp server ⇄ ATS APIs · Airtable · ~/.jobsync/ state]
The server runs as a subprocess of the MCP host. All tool calls are synchronous JSON-over-stdio. The server reads ~/.jobsync/config.json on startup and validates required credentials based on the configured sink.
Installation & Configuration
Install
npm install -g jobsync-mcp
# Initialise config interactively
jobsync-mcp init
The init command creates ~/.jobsync/config.json with your Airtable credentials and preferred settings. You can also write the file manually — the schema is documented below.
Config fields — ~/.jobsync/config.json
| Field | Type | Default | Description |
|---|---|---|---|
| airtable.pat | string | — | Airtable Personal Access Token. Required when sink is airtable or both. Needs scopes: data.records:write, schema.bases:read. Add schema.bases:write if you want airtable_create_base. |
| airtable.baseId | string | — | Airtable base ID (starts with app…). Required with airtable sink. |
| airtable.tableName | string | "Jobs" | Name of the table within the base where job records are upserted. |
| airtable.fieldMap | object | undefined | Optional override mapping from JobSync field names to your Airtable column names. Useful if you renamed columns in an existing base. |
| sink | "airtable" | "markdown" | "both" | "airtable" | Where jobs are written. markdown writes to markdownPath. both writes to both simultaneously. |
| markdownPath | string | ~/.jobsync/jobs.md | Absolute path for markdown output. Only used when sink includes markdown. |
| lookbackHours | number | 12 | How far back (in hours) the scrape workflow considers a job "recently seen" for dedup purposes. Increase if you run scrapes less frequently. |
| usOnly | boolean | true | When true, the location filter rejects roles with clearly non-US locations. Set to false for remote-global or international searches. |
| enableFastPath | boolean | false | Gate for the four ATS fetcher tools. Set to true to enable fetch_greenhouse_jobs, fetch_lever_jobs, fetch_ashby_jobs, and fetch_workday_jobs. |
| profileDir | string | ~/.jobsync/profile | Directory containing skills.md, experience.md, projects.md, roles.json, and raw-resume.txt. |
| brandedOutput | boolean | true | When true, tool responses include JobSync branding headers. Set to false for plain JSON output. |
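Putting the fields together, a minimal config that writes to both sinks might look like this (all values are illustrative placeholders):
{
  "airtable": {
    "pat": "patXXXXXXXXXXXXXX",
    "baseId": "appXXXXXXXXXXXXXX",
    "tableName": "Jobs"
  },
  "sink": "both",
  "markdownPath": "/Users/me/.jobsync/jobs.md",
  "lookbackHours": 24,
  "usOnly": true,
  "enableFastPath": true
}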
If sink requires Airtable ("airtable" or "both") but airtable.pat or airtable.baseId is missing, the server will throw immediately on startup — not at call time.
File system layout
~/.jobsync/
├── config.json # Server config (PAT, baseId, sink, …)
├── cache.db # SQLite dedup cache
├── pipeline.tsv # Application tracking (tab-separated)
├── portals.yml # Portal scanner config (search queries, companies)
├── jobs.md # Markdown sink output (if sink includes "markdown")
└── profile/
├── skills.md # Curated skill list
├── experience.md # Work history in markdown
├── projects.md # Notable projects
├── roles.json # Detected / custom / excluded job roles
└── raw-resume.txt # Plaintext resume parsed from PDF/DOCX
Claude Desktop setup
Add the server to claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\):
{
"mcpServers": {
"jobsync": {
"command": "npx",
"args": ["-y", "jobsync-mcp"]
}
}
}
After saving, restart Claude Desktop. The server appears in the tool list as jobsync.
Profile & Resume Tools
The profile system stores your professional identity in structured markdown files. Claude reads these files to understand your background during fit-analysis and job-match scoring. You can bootstrap the profile by pasting your resume — the parser handles PDF, DOCX, TXT, and MD.
profile_read
Returns the full contents of every profile file in a single call. Use this at the start of a fit-analysis session so Claude has your background available in context.
Parameters
None.
Returns
| Field | Type | Description |
|---|---|---|
| skills | string | Contents of skills.md |
| experience | string | Contents of experience.md |
| projects | string | Contents of projects.md |
| roles | object | Raw roles.json — detected, custom, excluded arrays |
| activeRoles | string[] | Computed: (detected ∪ custom) − excluded |
| rawResume | string | Contents of raw-resume.txt (empty string if absent) |
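A sketch of the response, with file contents abridged (all values illustrative):
{
  "skills": "## Languages\nTypeScript, Python",
  "experience": "## Acme Corp — Senior Engineer",
  "projects": "## Notable projects",
  "roles": { "detected": ["Software Engineer"], "custom": ["Developer Advocate"], "excluded": [] },
  "activeRoles": ["Software Engineer", "Developer Advocate"],
  "rawResume": ""
}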
Overwrites one of the three profile markdown files. Use this after onboarding to save structured skill or experience content that Claude generated from your resume.
Parameters
| Param | Type | Description |
|---|---|---|
| file | "skills" \| "experience" \| "projects" | Required. Which file to write. |
| content | string | Required. Full markdown content to write. Overwrites existing file. |
profile_update_roles
Manages the three-way role system: detected (auto-inferred from resume), custom (user-added), and excluded (suppressed from searches). The activeRoles set used for scraping is (detected ∪ custom) − excluded.
You can either replace entire arrays (detected, custom, excluded) or perform incremental mutations (addCustom, addExcluded, removeCustom, removeExcluded). Incremental mutations are preferred for interactive editing.
Parameters (all optional, at least one required)
| Param | Type | Description |
|---|---|---|
| detected | string[] | Replace the full detected roles array (from resume parsing). |
| custom | string[] | Replace the full custom roles array. |
| excluded | string[] | Replace the full excluded roles array. |
| addCustom | string[] | Append to custom array without touching the rest. |
| addExcluded | string[] | Append to excluded array. |
| removeCustom | string[] | Remove specific entries from custom array. |
| removeExcluded | string[] | Remove specific entries from excluded array. |
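For example, an interactive edit that adds one custom role and suppresses another (role names illustrative):
{
  "addCustom": ["Developer Advocate"],
  "addExcluded": ["Engineering Manager"]
}
Afterward, activeRoles is recomputed as (detected ∪ custom) − excluded.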
Parses a resume file into plaintext and saves it to raw-resume.txt in the profile directory. Supports PDF, DOCX, TXT, and MD formats. After parsing, run profile_update_roles to extract role keywords from the raw text.
Parameters
| Param | Type | Description |
|---|---|---|
| path | string | Required. Absolute path to the resume file on the local filesystem. |
Returns
{ success, savedPath, charCount } — path where raw-resume.txt was written and the character count of extracted text.
Portals Configuration Tools
The portals config (~/.jobsync/portals.yml) drives scrape workflows — it defines which companies to monitor and which search queries to run. These tools let Claude read, update, and initialise this file from a built-in master company list.
Returns the raw YAML content of portals.yml plus an exists flag. If the file does not yet exist, exists is false and content is empty — use this to trigger onboarding.
Returns
| Field | Type | Description |
|---|---|---|
| content | string | Raw YAML string, or empty string if file absent. |
| exists | boolean | Whether the file exists. |
portals_write
Overwrites portals.yml with the provided YAML string. Used during onboarding after Claude selects relevant companies from the master list. Can also be called directly for manual edits.
Parameters
| Param | Type | Description |
|---|---|---|
| content | string | Required. Full YAML content to write. Must not be empty. |
portals_master_list
Returns the built-in YAML snippet covering ~60 companies across AI labs, developer tools, voice AI, SaaS, and fintech. During onboarding, Claude calls this, filters by the user's target roles, then passes the subset to portals_write.
Returns
| Field | Type | Description |
|---|---|---|
| masterCompanyList | string | YAML snippet — all built-in companies with their ATS slugs. |
| path | string | Target path where portals.yml would be written. |
Portals YAML structure
search_queries:
- "software engineer AI"
- "backend engineer"
tracked_companies:
- name: Anthropic
ats: ashby
slug: anthropic
- name: OpenAI
ats: greenhouse
slug: openai
- name: Vercel
ats: lever
slug: vercel
Each entry in tracked_companies maps directly to a fast-path fetcher call. The ats field must be one of greenhouse, lever, ashby, or workday.
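For example, the Vercel entry above maps to a fetch_lever_jobs call with:
{ "slug": "vercel" }
Batching all Lever companies from the file instead uses the plural slugs parameter.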
ATS Fast-Path Fetchers
These four tools call public ATS job-board APIs directly — no browser, no scraping. They return structured job arrays ready for classification and dedup. All four are gated behind "enableFastPath": true in ~/.jobsync/config.json.
fetch_greenhouse_jobs
Fetches open roles from the Greenhouse job board API at boards-api.greenhouse.io/v1/boards/{slug}/jobs. Supports a single slug or an array of slugs for batch fetching.
Parameters
| Param | Type | Description |
|---|---|---|
| slug | string | Single company slug (e.g. "openai"). |
| slugs | string[] | Array of slugs for batch fetching. Provide either slug or slugs. |
Returns
{ total, jobs[] } where each job has: id, positionTitle, company, location, applyLink, datePosted, jobBoard: "greenhouse".
Notes
Link verification for Greenhouse uses the API (a 404 on /jobs/{id} means the role is closed) rather than a browser GET — faster and more reliable.
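An illustrative single-slug call and abridged response (all values invented):
{ "slug": "openai" }

{
  "total": 1,
  "jobs": [
    {
      "id": "4012345",
      "positionTitle": "Software Engineer, Backend",
      "company": "OpenAI",
      "location": "San Francisco, CA",
      "applyLink": "https://boards.greenhouse.io/openai/jobs/4012345",
      "datePosted": "2025-01-15",
      "jobBoard": "greenhouse"
    }
  ]
}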
fetch_lever_jobs
Fetches postings from the Lever public API at api.lever.co/v0/postings/{slug}?mode=json.
Parameters
| Param | Type | Description |
|---|---|---|
| slug | string | Single company slug (e.g. "vercel"). |
| slugs | string[] | Array of slugs for batch fetching. |
Returns
{ total, jobs[] } — each job: id, positionTitle, company, location, applyLink, datePosted, team, jobBoard: "lever".
Notes
Link verification checks the state field via the Lever API — a state of "closed" marks the link inactive.
fetch_ashby_jobs
Fetches postings from the Ashby HQ posting API at api.ashbyhq.com/posting-api/job-board/{slug}. Ashby is popular among AI-native companies (Anthropic, Perplexity, etc.).
Parameters
| Param | Type | Description |
|---|---|---|
| slug | string | Single Ashby company slug. |
| slugs | string[] | Array of slugs for batch fetching. |
Returns
{ total, jobs[] } — each job includes isListed field; only listed jobs are returned.
The Ashby API may return null for datePosted. The server stamps today's date as a fallback before upserting to Airtable. Do not rely on this date for recency filtering.
fetch_workday_jobs
Fetches jobs from Workday's internal CXS (Candidate Experience) API. Accepts the full board URL rather than a slug, since Workday tenants have unique subdomains and paths.
Parameters
| Param | Type | Description |
|---|---|---|
| boardUrl | string | Full Workday board URL (e.g. "https://company.wd1.myworkdayjobs.com/en-US/External"). |
| boardUrls | string[] | Array of board URLs for batch fetching. |
Notes
Workday's datePosted is also unreliable from the API payload — use rawFields.postedOn instead for accurate post dates. Link verification uses a GET request combined with soft-close pattern matching in the response body.
Filters & Classification
These tools form the classification pipeline. Run them after fetching to strip irrelevant roles before touching the cache or Airtable. For most workflows, classify_job_batch is the single call you need — it composes all filters internally.
filter_us_location
Determines whether a location string is likely non-US. Uses pattern matching against known country/city patterns — does not make a network call.
Parameters
| Param | Type | Description |
|---|---|---|
| location | string | Required. Location string from job posting (e.g. "London, UK", "Remote"). |
Returns
{ likelyNonUS: boolean }
filter_title_keywords
Tests a job title (and optionally location) against include/exclude keyword lists. Returns passes: false if the title matches an exclude keyword or fails to match any include keyword.
Parameters
| Param | Type | Description |
|---|---|---|
| title | string | Required. Job title to evaluate. |
| location | string | Location string — used for US-only check when usOnly is set. |
| include | string[] | Whitelist keywords. Title must match at least one. Case-insensitive. |
| exclude | string[] | Blacklist keywords. Title must match none. Case-insensitive. |
| usOnly | boolean | When true, non-US locations also fail the filter. |
Returns
{ passes: boolean, title, location }
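For example, an exclude match overrides an include match (keywords illustrative):
{ "title": "Sales Engineer", "location": "Remote", "include": ["engineer"], "exclude": ["sales"] }

{ "passes": false, "title": "Sales Engineer", "location": "Remote" }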
detect_industry_tags
Classifies a job into an industry and enriches it with tags. Uses a combination of heuristic pattern matching and LLM-assisted reasoning (hence sonnet). Also flags H1B sponsorship likelihood and identifies the originating job board.
Parameters
| Param | Type | Description |
|---|---|---|
| company | string | Required. Company name. |
| positionTitle | string | Required. Job title. |
| applyLink | string | Apply URL — used to infer ATS and job board. |
| jobDescription | string | Full job description text for richer classification. |
Returns
| Field | Type | Description |
|---|---|---|
| industry | string | Classified industry (e.g. "AI / ML", "Fintech"). |
| tags | string | Comma-separated tags (e.g. "FAANG+,YC,Remote"). |
| h1bSponsor | boolean | Estimated H1B sponsorship likelihood. |
| jobBoard | string | Inferred originating job board. |
| industryOptions | string[] | Alternate industry classifications considered. |
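An illustrative response for an AI-lab posting:
{
  "industry": "AI / ML",
  "tags": "FAANG+,Remote",
  "h1bSponsor": true,
  "jobBoard": "greenhouse",
  "industryOptions": ["AI / ML", "Developer Tools"]
}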
classify_job_batch
The recommended single-call entry point for classification. Internally runs filter_us_location, filter_title_keywords, and detect_industry_tags for every job in the batch, returning a split of accepted vs. rejected with per-job reasoning.
Parameters
| Param | Type | Description |
|---|---|---|
| jobs | object[] | Required. Array of job objects, each with at minimum positionTitle, company, applyLink. Optional: location, jobDescription. |
| include | string[] | Title include keywords. Defaults to active roles from profile if omitted. |
| exclude | string[] | Title exclude keywords. |
| usOnly | boolean | Defaults to config.usOnly. |
Returns
| Field | Type | Description |
|---|---|---|
| total | number | Input job count. |
| accepted | number | Jobs that passed all filters. |
| rejected | number | Jobs that failed at least one filter. |
| results | object[] | Each result: { job, passes, rejectReason?, industry, tags }. |
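An abridged, illustrative result for a two-job batch (the rejectReason wording is paraphrased, not the server's exact string):
{
  "total": 2,
  "accepted": 1,
  "rejected": 1,
  "results": [
    {
      "job": { "positionTitle": "Backend Engineer", "company": "Vercel", "applyLink": "https://jobs.lever.co/vercel/123" },
      "passes": true,
      "industry": "Developer Tools",
      "tags": "Remote"
    },
    {
      "job": { "positionTitle": "Account Executive", "company": "Vercel", "applyLink": "https://jobs.lever.co/vercel/456" },
      "passes": false,
      "rejectReason": "title matched no include keyword"
    }
  ]
}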
Link Verification
Job postings go stale. These tools check whether an apply link is still active before storing or presenting it. Board-specific logic handles the nuances of each ATS's closure patterns.
Checks whether a single job apply URL is still active. Uses board-specific strategies for known ATS domains and a generic GET + soft-close pattern match for unknown boards.
Parameters
| Param | Type | Description |
|---|---|---|
| url | string | Required. The apply URL to verify. |
Returns
| Field | Type | Description |
|---|---|---|
| url | string | The verified URL. |
| active | boolean | true if the role appears open. |
| statusCode | number | HTTP status code returned. |
| reason | string | Human-readable explanation for the verdict. |
Board-specific strategies
| ATS | Strategy |
|---|---|
| Greenhouse | API 404 on /jobs/{id} → inactive |
| Lever | API state === "closed" → inactive |
| Ashby | API isListed === false → inactive |
| Workday | GET + soft-close regex on response body |
| Generic | GET + SOFT_CLOSED_PATTERNS regex set |
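For a closed Greenhouse role, a response might look like this (the reason wording is illustrative):
{
  "url": "https://boards.greenhouse.io/openai/jobs/4012345",
  "active": false,
  "statusCode": 404,
  "reason": "Greenhouse API returned 404 for this job id"
}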
Verifies up to 20 URLs in parallel. Use this to audit existing pipeline entries or clean up stale Airtable records.
Parameters
| Param | Type | Description |
|---|---|---|
| urls | string[] | Required. Array of apply URLs to verify. Maximum 20 per call. |
Returns
{ total, active, inactive, results[] } — each result matches the single-link shape.
Cache Layer
The dedup cache is a SQLite database at ~/.jobsync/cache.db. Every processed job is stamped here by its apply link. On subsequent scrape runs, seen links are skipped before hitting any classification or storage logic — keeping downstream calls cheap.
Checks whether one or more apply links have already been processed. Accepts a single URL or a batch array.
Parameters
| Param | Type | Description |
|---|---|---|
| applyLink | string | Single URL to check. |
| applyLinks | string[] | Array of URLs to check in bulk. |
Returns
Single: { seen: boolean, record? } — record contains the cache entry if seen.
Batch: { results: Array<{ applyLink, seen, record? }> }
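An illustrative batch check (URLs invented, cache record abridged):
{ "applyLinks": ["https://jobs.lever.co/vercel/123", "https://jobs.lever.co/vercel/456"] }

{
  "results": [
    { "applyLink": "https://jobs.lever.co/vercel/123", "seen": true, "record": { "company": "Vercel", "positionTitle": "Backend Engineer" } },
    { "applyLink": "https://jobs.lever.co/vercel/456", "seen": false }
  ]
}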
cache_mark_seen
Marks one or more jobs as seen in the cache. Call this after a job has been classified and stored — not before, so a mid-run crash doesn't permanently skip unprocessed jobs.
Parameters
| Param | Type | Description |
|---|---|---|
| jobs | object[] | Required. Each entry: { applyLink, company, positionTitle }. All three are stored in the cache record. |
Returns
{ marked: number, cacheSize: number }
Deletes cache entries older than the given number of days. Run it monthly or on a schedule to prevent the SQLite file from growing unbounded.
Parameters
| Param | Type | Description |
|---|---|---|
| days | number | Entries older than this many days are deleted. Default: 90. |
Returns
{ deleted: number, cacheSize: number }
Airtable Integration
Airtable is the default authoritative sink — the place where all passing jobs land after classification. The schema is fixed: the server creates exactly the columns it knows about and maps them consistently. You can inspect that schema at any time with airtable_get_schema.
Creates or updates job records in Airtable. Uses applyLink as the dedup key — if a record with that link already exists it is updated in-place, otherwise a new record is created.
Parameters
| Param | Type | Description |
|---|---|---|
| jobs | object[] | Required. Array of job objects. Required fields per job: positionTitle, company, applyLink. Optional: location, datePosted, industry, tags, fitScore, notes. |
Returns
{ created: number, failed: number, rejected: number } — rejected counts jobs that failed schema validation before even hitting the API.
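A minimal, illustrative upsert payload:
{
  "jobs": [
    {
      "positionTitle": "Backend Engineer",
      "company": "Vercel",
      "applyLink": "https://jobs.lever.co/vercel/123",
      "location": "Remote (US)",
      "industry": "Developer Tools",
      "tags": "Remote",
      "fitScore": "8"
    }
  ]
}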
Call cache_mark_seen after a successful upsert batch, not before. This ensures that a failed upsert doesn't permanently skip a job on the next run.
airtable_list_recent_jobs
Fetches recently created Airtable records for dedup reconciliation. Use this when the SQLite cache has been cleared or on a new machine — pull existing Airtable records, then re-seed the cache.
Parameters
| Param | Type | Description |
|---|---|---|
| lookbackDays | number | How many days back to fetch records. Default: 14. |
| maxRecords | number | Maximum number of records to return. Default: 500. |
airtable_get_schema
Returns the field map the server uses when writing to Airtable — useful for debugging column name mismatches or understanding the exact output format.
Parameters
None.
Lists all Airtable bases accessible to the configured PAT. Useful during setup to find your base ID or to verify the PAT has the right permissions.
Parameters
None.
Returns
{ bases: Array<{ id, name, permissionLevel }> }
airtable_create_base
Creates a new Airtable base pre-configured with the JobSync schema (all required columns, correct field types). Optionally sets the new base as the active base in config. Requires the schema.bases:write PAT scope.
Parameters
| Param | Type | Description |
|---|---|---|
| workspaceId | string | Required. Airtable workspace ID to create the base in. |
| name | string | Base name. Defaults to "JobSync". |
| setAsActive | boolean | If true, writes the new baseId to config.json. Default: true. |
Application Pipeline
The pipeline is a tab-separated file (~/.jobsync/pipeline.tsv) that tracks every job you're considering, have applied to, or are interviewing for. It's intentionally a flat file — easy to inspect, easy to back up, easy to query with a spreadsheet tool.
Pipeline statuses
pending · applied · interviewing · offer · rejected · withdrawn
pipeline_upsert_jobs
Adds new jobs to the pipeline or refreshes metadata for existing entries. The applyLink is the dedup key. For existing entries, status, appliedAt, and notes are preserved — only metadata fields (title, company, fit score, tags) are updated. Call this after fit analysis to persist all scored roles.
Parameters — jobs array items
| Field | Type | Description |
|---|---|---|
| positionTitle | string | Required. Job title. |
| company | string | Required. Company name. |
| applyLink | string | Required. Primary dedup key. |
| id | string | Stable job ID from ATS scraper. |
| location | string | Job location string. |
| datePosted | string | YYYY-MM-DD format. |
| industry | string | Classified industry (from detect_industry_tags). |
| tags | string | Comma-separated tag string. |
| fitScore | string | Score 1–10 as a string. |
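An illustrative scored entry as passed after fit analysis:
{
  "jobs": [
    {
      "positionTitle": "Research Engineer",
      "company": "Anthropic",
      "applyLink": "https://jobs.ashbyhq.com/anthropic/123",
      "datePosted": "2025-01-15",
      "industry": "AI / ML",
      "tags": "Remote",
      "fitScore": "9"
    }
  ]
}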
Marks one or more jobs as applied by their exact apply links. Sets appliedAt to today's date (YYYY-MM-DD). Call this whenever the user confirms they have submitted an application.
Parameters
| Param | Type | Description |
|---|---|---|
| applyLinks | string[] | Required. Exact apply link URLs of submitted applications. |
Updates the status and/or notes of one or more pipeline entries. Identify jobs by applyLink (preferred) or id.
Parameters — updates array items
| Field | Type | Description |
|---|---|---|
| status | string | Required. New status value (see valid values above). |
| applyLink | string | Primary lookup key. |
| id | string | Fallback if applyLink is unavailable. |
| notes | string | Optional free-text note (e.g. recruiter name, next steps). |
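For example, recording an interview stage with a note (values illustrative):
{
  "updates": [
    {
      "applyLink": "https://jobs.ashbyhq.com/anthropic/123",
      "status": "interviewing",
      "notes": "Recruiter: Sam; phone screen scheduled"
    }
  ]
}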
Reads the full pipeline and returns entries grouped by status with summary counts. Feed this to a dashboard view or use it to decide where to focus next.
Parameters
| Param | Type | Description |
|---|---|---|
| statusFilter | string[] | If provided, only return entries with these statuses. Omit for all entries. |
Returns
{ summary: { pending, applied, interviewing, offer, rejected, withdrawn }, grouped: { [status]: job[] } }
Data Flow & Workflow Architecture
A complete scrape workflow — from cold start to Airtable record — follows this sequence:
1. Onboarding (first run): parse your resume into the profile, set active roles with profile_update_roles, pull the master company list with portals_master_list, and write a filtered portals.yml with portals_write.
2. Scrape workflow (recurring): fetch postings via the fast-path fetchers for each tracked company, skip already-seen links against the cache, run classify_job_batch on the remainder, verify apply links, upsert survivors to Airtable, then stamp them with cache_mark_seen.
3. Application tracking (ongoing): persist scored roles with pipeline_upsert_jobs, mark submissions as applied, and update statuses as interviews progress.
Model Hints
Every tool is tagged with a recommended model hint. MCP orchestrators that support model routing can use this to minimize cost without sacrificing accuracy.
| Hint | Model | Used for |
|---|---|---|
| ⚡ haiku | claude-haiku-* | Fast, deterministic operations: cache lookups, TSV reads/writes, ATS API fetches, simple YAML reads. |
| ⚖ sonnet | claude-sonnet-* | Mid-complexity classification and write operations: industry detection, job batch classification, Airtable upserts, resume parsing. |
| 🔬 opus | claude-opus-* | High-judgment tasks: role extraction and management where nuanced understanding of career trajectories matters. |
Tool descriptions embed the hint as literal ⚡ [Model hint: haiku] text so orchestrators can parse and act on them.
Frequently Asked Questions
"sink": "markdown" in config and point markdownPath to any file. The Airtable PAT and baseId are only validated when sink is "airtable" or "both". The application pipeline (pipeline.tsv) works independently of the sink setting."enableFastPath": true explicitly once you understand what will be called — then the four fetcher tools appear in Claude's tool list.datePosted for recency filtering on Ashby jobs — use the Airtable createdTime field instead.applyLink URLs have been processed (fetched, classified, attempted upsert). Airtable dedup is based on whether a record with that applyLink already exists in the table. If you clear the cache on a new machine, use airtable_list_recent_jobs to pull existing records and re-seed the cache with cache_mark_seen.portals_master_list notes — some companies have specific Workday board URLs that must be used instead.cache.db and concurrent TSV writes to pipeline.tsv can cause race conditions. Run scrape sessions sequentially. The MCP stdio transport is single-tenant by design — one Claude session per server process.portals.yml directly (or use portals_write) and add an entry under tracked_companies with the correct ats and slug. For Workday companies, use the boardUrl field instead of slug since Workday tenants don't follow a uniform slug pattern.~/.jobsync/config.json, or change "sink" to "markdown" if you don't want Airtable. Run jobsync-mcp init to re-run the interactive setup, which will walk you through both options.profile_read to load your background, then run each job (or a batch) through classify_job_batch which returns enriched results. Have Claude add a fitScore (1–10) field based on your skills and experience, then call pipeline_upsert_jobs with the scored results. The score is stored in the pipeline TSV and forwarded to Airtable on upsert.~/.jobsync/profile/ on your local machine. They are read into the Claude context window when you call profile_read, which means the contents are sent to Anthropic's API as part of your conversation — the same as any other text you share with Claude. No other third party receives them.