@skill-tools/router
BM25 skill selection middleware for Agent Skills. Uses full-text search to select which skills to inject into an agent's context window from large skill catalogs. Zero external dependencies — built-in Okapi BM25 engine.
npm install @skill-tools/router
Quick Start
import { SkillRouter } from '@skill-tools/router';
const router = new SkillRouter();
await router.indexSkills([
{ name: 'bap-browser', description: 'AI-powered browser automation via BAP...' },
{ name: 'run-tests', description: 'Execute test suites...' },
{ name: 'lint-code', description: 'Run ESLint or Biome...' },
]);
const results = await router.select('open a webpage and fill out a form');
// [{ skill: 'bap-browser', score: 1.0, metadata: {...} }]
API
SkillRouter
Main class. The constructor accepts an optional SkillRouterOptions object that carries the embedding config, the BM25 parameters, and the context flag.
indexSkills(skills)
Index skill descriptions into the BM25 inverted index. Tokenizes, removes stop words, and builds posting lists.
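The indexing steps above (tokenize, drop stop words, build posting lists) can be sketched in a few lines. This is a minimal illustration, not the package's actual implementation: the names tokenize, buildPostings, and Posting are hypothetical, and the stop-word list is a tiny sample chosen for the example.

```typescript
// Hypothetical stop-word sample; a real list would be much longer.
const STOP_WORDS = new Set(["a", "an", "and", "the", "or", "to", "of", "via"]);

interface Posting {
  docId: number; // position of the skill in the input array
  tf: number; // occurrences of the term in that description
}

// Lowercase, split on non-alphanumeric characters, drop stop words.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0 && !STOP_WORDS.has(t));
}

// Build term -> posting list, with one posting per (term, document) pair.
function buildPostings(descriptions: string[]): Map<string, Posting[]> {
  const postings = new Map<string, Posting[]>();
  descriptions.forEach((text, docId) => {
    const counts = new Map<string, number>();
    for (const term of tokenize(text)) {
      counts.set(term, (counts.get(term) ?? 0) + 1);
    }
    counts.forEach((tf, term) => {
      const list = postings.get(term) ?? [];
      list.push({ docId, tf });
      postings.set(term, list);
    });
  });
  return postings;
}

const postings = buildPostings([
  "AI-powered browser automation via BAP",
  "Execute test suites",
  "Run ESLint or Biome",
]);
console.log(postings.get("browser")); // one posting: docId 0, tf 1
```

Because each term maps only to the documents that contain it, a query never has to visit skills that share no terms with it.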
indexDirectory(dirPath)
Auto-discover and index SKILL.md files in a directory.
select(query, options?)
Find the top-K most relevant skills for a natural language query.
| Option | Default | Description |
|---|---|---|
| topK | 5 | Number of results |
| threshold | 0.0 | Minimum BM25 score (normalized 0–1) |
| boost | none | Skill names to boost by 1.2x |
| exclude | none | Skill names or wildcard patterns to exclude |
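To make the interaction between these options concrete, here is a small sketch that applies them to an already-scored result list. It is an illustration under assumptions, not the router's code: applyOptions and wildcardToRegExp are hypothetical names, and the order of operations (exclude, then boost, then threshold, then topK) and the cap at 1.0 after boosting are guesses made for the example.

```typescript
interface Scored {
  skill: string;
  score: number; // assumed to be normalized to [0, 1] already
}

interface SelectOptions {
  topK?: number;
  threshold?: number;
  boost?: string[];
  exclude?: string[];
}

// Turn a wildcard pattern such as "lint-*" into an anchored RegExp.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function applyOptions(results: Scored[], options: SelectOptions = {}): Scored[] {
  const { topK = 5, threshold = 0, boost = [], exclude = [] } = options;
  const excluded = exclude.map(wildcardToRegExp);
  const boosted = new Set(boost);
  return results
    .filter((r) => !excluded.some((re) => re.test(r.skill)))
    .map((r) => ({
      skill: r.skill,
      // Assumption: boosted scores are capped so they stay within [0, 1].
      score: boosted.has(r.skill) ? Math.min(1, r.score * 1.2) : r.score,
    }))
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

const picked = applyOptions(
  [
    { skill: "bap-browser", score: 0.5 },
    { skill: "run-tests", score: 0.55 },
    { skill: "lint-code", score: 0.9 },
  ],
  { topK: 1, threshold: 0.2, boost: ["bap-browser"], exclude: ["lint-*"] }
);
console.log(picked); // bap-browser wins: 0.5 * 1.2 = 0.6 beats 0.55
```

In this example the wildcard removes lint-code before ranking, and the boost is what lifts bap-browser above run-tests.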
detectConflicts(threshold?)
Find skills with highly similar descriptions that may conflict.
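The source does not say which similarity measure detectConflicts uses. As one plausible illustration, the sketch below compares every pair of descriptions with Jaccard similarity over their term sets; the function names, the Conflict shape, and the 0.6 default threshold are all assumptions made for this example.

```typescript
interface NamedSkill {
  name: string;
  description: string;
}

interface Conflict {
  a: string;
  b: string;
  similarity: number;
}

// Unique lowercase terms of a description.
function termSet(text: string): Set<string> {
  return new Set(
    text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((t) => t.length > 0)
  );
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Report every pair whose similarity reaches the (hypothetical) threshold.
function findConflicts(skills: NamedSkill[], threshold = 0.6): Conflict[] {
  const sets = skills.map((s) => termSet(s.description));
  const conflicts: Conflict[] = [];
  for (let i = 0; i < skills.length; i++) {
    for (let j = i + 1; j < skills.length; j++) {
      const similarity = jaccard(sets[i], sets[j]);
      if (similarity >= threshold) {
        conflicts.push({ a: skills[i].name, b: skills[j].name, similarity });
      }
    }
  }
  return conflicts;
}

const conflicts = findConflicts([
  { name: "test-vitest", description: "Run unit tests with Vitest" },
  { name: "test-jest", description: "Run unit tests with Jest" },
  { name: "deploy-aws", description: "Deploy to AWS" },
]);
console.log(conflicts.map((c) => `${c.a} <-> ${c.b}`)); // one pair: the two test skills
```

The two test skills share four of six distinct terms, so they are flagged, while the deploy skill shares none and is not.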
save() / load(snapshot)
Serialize/restore the full index as JSON. Useful for persisting the index so that startup can skip re-indexing.
SkillRouter.fromSnapshot(snapshot)
Static factory to restore from a serialized snapshot.
BM25 Parameters
| Parameter | Default | Description |
|---|---|---|
| k1 | 1.2 | Term frequency saturation. Higher values give more weight to repeated terms. |
| b | 0.75 | Length normalization. 0 = ignore length, 1 = fully normalize. |
const router = new SkillRouter({
bm25: { k1: 1.5, b: 0.8 },
});
Contextual Enrichment (v0.2.2)
Before indexing, context terms are extracted deterministically from the skill body and prepended to the description. This improves recall for queries that match terms in the skill's instructions, headings, or inline code — without needing LLMs or embeddings.
- Name parts: the skill name split on - and _
- Section headings: all markdown headings from the body
- Inline code refs: backtick-wrapped terms from the body
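The three sources listed above can be sketched as a single deterministic function. This is a rough approximation under assumptions rather than the package's extractor: extractContext and its regular expressions are hypothetical, single-character tokens are dropped everywhere (not only in name parts), and the sketch also applies the token cap and the de-duplication against the description that this section describes.

```typescript
const MAX_CONTEXT_TOKENS = 80;

// Lowercase word tokens; single-character tokens are dropped (assumption).
function words(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 1);
}

function extractContext(name: string, description: string, body: string): string {
  const candidates: string[] = [];

  // 1. Name parts: split the skill name on "-" and "_".
  candidates.push(...name.split(/[-_]/));

  // 2. Section headings: every markdown heading line in the body.
  const headingRe = /^#{1,6}\s+(.+)$/gm;
  let m: RegExpExecArray | null;
  while ((m = headingRe.exec(body)) !== null) candidates.push(m[1]);

  // 3. Inline code refs: backtick-wrapped spans in the body.
  const codeRe = /`([^`\n]+)`/g;
  while ((m = codeRe.exec(body)) !== null) candidates.push(m[1]);

  // Skip tokens already in the description or already emitted, then cap.
  const seen = new Set(words(description));
  const context: string[] = [];
  for (const candidate of candidates) {
    for (const token of words(candidate)) {
      if (seen.has(token)) continue;
      seen.add(token);
      context.push(token);
      if (context.length === MAX_CONTEXT_TOKENS) return context.join(" ");
    }
  }
  return context.join(" ");
}

const description = "Execute test suites";
const body = "## Watch mode\nUse `vitest --watch` to rerun on change.\n";
const context = extractContext("run-tests", description, body);
console.log(context); // "run tests watch mode vitest"

// The enriched text that would be indexed: context terms, then the description.
const indexedText = `${context} ${description}`;
console.log(indexedText);
```

A query such as "vitest watch" would now match this skill even though neither term appears in its description.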
At most 80 context tokens are added per skill, and they are de-duplicated against the description tokens to avoid inflating term frequency. Disable enrichment with context: false:
const router = new SkillRouter({ context: false });
Embedding Providers (Advanced)
By default, the router uses BM25 full-text search (zero dependencies). For custom semantic search, you can provide your own embedding function:
const router = new SkillRouter({
embedding: {
provider: 'custom',
dimensions: 1536,
embed: async (texts) => {
// Call your embedding API here; Promise.all also handles an async myEmbedFunction
return Promise.all(texts.map((t) => myEmbedFunction(t)));
},
},
});
Architecture
- Inverted index maps terms to posting lists — only documents containing query terms are scored
- IDF pre-computation at index time, so a query with q terms needs only O(q) IDF lookups before walking the posting lists
- Float64Array score accumulator for efficient scoring
- Score normalization to [0, 1] — divide by max score per query
- Scales to 10,000+ skills with zero external dependencies
- Snapshot format includes inverted index, IDF cache, and BM25 parameters
- Dual-path: BM25 default, embedding + vector store fallback for custom providers
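The query path described in this list (posting-list traversal, a Float64Array accumulator, and division by the per-query maximum) might look roughly like the sketch below. The TinyIndex shape, the search function, and the pluggable termScore callback are hypothetical; the example uses a simple idf × tf stand-in for the per-term score so that the focus stays on the accumulator, and the IDF values are made up.

```typescript
interface Posting {
  docId: number;
  tf: number;
}

interface TinyIndex {
  postings: Map<string, Posting[]>; // term -> documents that contain it
  idf: Map<string, number>; // cached at index time
  docLengths: number[]; // token count per document
}

type TermScore = (idf: number, tf: number, docLength: number) => number;

function search(
  index: TinyIndex,
  queryTerms: string[],
  termScore: TermScore
): { docId: number; score: number }[] {
  // One slot per document; only documents in some posting list are touched.
  const scores = new Float64Array(index.docLengths.length);
  for (const term of queryTerms) {
    const list = index.postings.get(term);
    const idf = index.idf.get(term);
    if (list === undefined || idf === undefined) continue;
    for (const { docId, tf } of list) {
      scores[docId] += termScore(idf, tf, index.docLengths[docId]);
    }
  }

  // Normalize by the best score so the top result is exactly 1.0.
  let max = 0;
  for (let i = 0; i < scores.length; i++) {
    if (scores[i] > max) max = scores[i];
  }
  const results: { docId: number; score: number }[] = [];
  for (let i = 0; i < scores.length; i++) {
    if (scores[i] > 0) results.push({ docId: i, score: scores[i] / max });
  }
  return results.sort((x, y) => y.score - x.score);
}

// Three tiny documents with made-up IDF values.
const tiny: TinyIndex = {
  postings: new Map<string, Posting[]>([
    ["browser", [{ docId: 0, tf: 1 }, { docId: 2, tf: 2 }]],
    ["automation", [{ docId: 0, tf: 1 }]],
    ["tests", [{ docId: 1, tf: 1 }, { docId: 2, tf: 1 }]],
  ]),
  idf: new Map<string, number>([
    ["browser", 0.5],
    ["automation", 1.0],
    ["tests", 0.5],
  ]),
  docLengths: [2, 1, 3],
};

const hits = search(tiny, ["browser", "automation"], (idf, tf) => idf * tf);
console.log(hits); // doc 0 scores 1, doc 2 scores about 0.67, doc 1 is absent
```

Document 1 contains neither query term, so it is never visited and never appears in the results.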
BM25 Scoring Formulas
IDF(t) = log((N - df(t) + 0.5) / (df(t) + 0.5) + 1)
score(q, d) = Σ IDF(t) × (tf(t,d) × (k1 + 1))
/ (tf(t,d) + k1 × (1 - b + b × |d| / avgdl))
normalized = score / max_score // top result = 1.0
Where N = total documents, df(t) = documents containing term t,
tf(t,d) = frequency of t in document d,
|d| = document length in tokens, avgdl = average document length.
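The formulas above translate directly into code. The sketch below is a plain transcription with hypothetical function names (idf, termScore, normalize), not the package's internals; it also shows two properties that follow from the formula: a term that occurs once in an average-length document contributes exactly its IDF, and the term-frequency factor saturates below k1 + 1.

```typescript
// IDF(t) = log((N - df + 0.5) / (df + 0.5) + 1)
function idf(N: number, df: number): number {
  return Math.log((N - df + 0.5) / (df + 0.5) + 1);
}

// One term's contribution to score(q, d).
function termScore(
  idfValue: number,
  tf: number,
  docLength: number,
  avgdl: number,
  k1 = 1.2,
  b = 0.75
): number {
  const lengthNorm = 1 - b + b * (docLength / avgdl);
  return (idfValue * (tf * (k1 + 1))) / (tf + k1 * lengthNorm);
}

// normalized = score / max_score, so the top result is 1.0.
function normalize(scores: number[]): number[] {
  const max = Math.max(...scores);
  return max > 0 ? scores.map((s) => s / max) : scores.map(() => 0);
}

// A term found in 1 of 3 documents.
const rare = idf(3, 1);
console.log(rare.toFixed(4)); // "0.9808"

// tf = 1 in an average-length document: the fraction equals 1.
console.log(termScore(1, 1, 10, 10));

// Saturation: with idf = 1 the score approaches k1 + 1 = 2.2 but stays below it.
console.log(termScore(1, 2, 10, 10)); // 4.4 / 3.2 = 1.375
console.log(termScore(1, 100, 10, 10) < 2.2); // true

// Length normalization: the same tf scores lower in a longer document.
console.log(termScore(1, 1, 20, 10) < termScore(1, 1, 10, 10)); // true

console.log(normalize([2, 1])); // [ 1, 0.5 ]
```

Raising k1 lifts the saturation ceiling, while setting b to 0 makes the last comparison an equality because document length is then ignored.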
BM25Index (Direct Use)
The BM25 engine is also exported for standalone use:
import { BM25Index } from '@skill-tools/router';
const index = new BM25Index();
index.add('doc-1', 'AI-powered browser automation via BAP');
index.add('doc-2', 'Run unit tests with Vitest');
const results = index.search('browser', { topK: 5 });
// [{ id: 'doc-1', score: 1.0 }]
Test Coverage
74 tests across 6 suites (the sum of the per-suite counts listed below). Run with npx vitest run from the package directory.
router.test.ts (18 tests)
- Initializes with zero count, indexes skills, handles empty lists
- Selects relevant skills for a query
- Respects topK and threshold options
- Applies boost and exclude filters (including wildcard patterns)
- Returns metadata in selection results
- Saves, loads, and restores from snapshots (load + fromSnapshot)
- Detects conflicting skills
- Uses custom embedding providers, throws for unsupported ones
- Result scores fall between 0 and 1, and results have the correct shape
bm25.test.ts (17 tests)
- Starts empty, adds documents, reports correct size
- Ranks deploy, test, and database queries correctly
- Normalizes scores to [0, 1] range
- Respects topK limit and threshold filter
- Removes documents by ID
- Serializes and deserializes round-trip
- Returns empty for empty query and no-match queries
- Accepts custom k1 and b parameters
- Preserves document metadata, handles single-document and incremental adds
context-extractor.test.ts (11 tests)
- Returns empty string for skills with no body or sections
- Extracts inline code references, section headings, key terms from content
- Splits skill name on hyphens and underscores
- Deduplicates against description terms and within extracted terms
- Handles empty body and sections gracefully
- Truncates to max ~80 tokens
- Strips leading dashes from code refs, filters single-character name parts
contextual-routing.test.ts (10 tests)
- Body terms improve ranking for specific queries
- AWS-specific, jest-specific, and biome-specific terms route correctly
- context: false disables enrichment
- Skills without body/sections behave identically to v0.1
- Original metadata.description preserved in results
- Correct count after indexing, save/load round-trip
- Mixed skills (some with body, some without)
memory-store.test.ts (10 tests)
- Starts empty, adds entries, reports size
- Searches by cosine similarity
- Respects topK limit and similarity threshold
- Removes entries, serializes/deserializes
- Throws on unsupported version and dimension mismatch
- Preserves metadata through search results
local-embedding.test.ts (8 tests)
- Returns vectors of configured dimension, defaults to 256
- Produces L2-normalized vectors, zero vector for empty text
- Embeds multiple texts in batch
- Produces similar vectors for similar texts
- buildVocabulary populates IDF values