Content Resolution
Content Resolution is the technical mechanism behind Context Orchestration. It is the process of extracting a specific segment from a source file based on a ResourceNode to provide high-fidelity evidence to an agent.
Modes of Resolution
CiteKit supports two modes of resolution depending on your environment:
- Physical Resolution (Default): Extracts a new file (clip, mini-PDF, crop) to your disk. Requires local binaries like FFmpeg.
- Virtual Resolution: Returns only the metadata (timestamps, pages, bounding boxes) without creating a file. Perfect for serverless and direct AI model consumption.
See the Virtual Resolution Guide for more.
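The difference between the two modes can be pictured as one result shape with an optional file path. This is only an illustrative sketch: `ResolvedEvidence`, its fields, and `mode_of` are hypothetical names, not CiteKit's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResolvedEvidence:
    """Hypothetical result type; not CiteKit's actual return value."""
    modality: str                      # e.g. "video", "pdf", "image", "text"
    location: dict                     # timestamps, pages, or a bounding box
    output_path: Optional[str] = None  # filled in only by physical resolution


def mode_of(evidence: ResolvedEvidence) -> str:
    """Report which mode produced this evidence."""
    return "virtual" if evidence.output_path is None else "physical"


# Physical: a file was written, so a path accompanies the metadata.
physical = ResolvedEvidence("video", {"start": 12.0, "end": 30.5},
                            output_path=".citekit_output/clip.mp4")
# Virtual: metadata only, nothing touches the disk.
virtual = ResolvedEvidence("video", {"start": 12.0, "end": 30.5})
```

In this sketch both modes carry the same location metadata, so downstream code can treat the file path as an optional extra.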
Mechanism by Modality (Physical)
1. Video & Audio (FFmpeg)
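CiteKit uses ffmpeg for fast, lossless extraction. Because the streams are copied rather than re-encoded, video cuts generally land on keyframes, so clip boundaries may differ slightly from the requested timestamps.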
Key Command:
ffmpeg -ss <start> -to <end> -i <input> -c copy <output>
- -ss/-to: Seek to the requested start and end timestamps.
- -c copy: Stream Copy. This instructs FFmpeg to copy the audio/video bitstreams directly without re-encoding.
- Speed: Near instant (on the order of 100x faster than re-encoding, since nothing is decoded or encoded).
- Quality: 100% lossless (bit-perfect clone).
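As a minimal sketch, the command above could be assembled and run from Python roughly as follows. The function names `build_clip_command` and `extract_clip` are hypothetical and are not part of CiteKit; the sketch assumes timestamps are given in seconds and that `ffmpeg` is on the PATH.

```python
import subprocess


def build_clip_command(source: str, start: float, end: float, output: str) -> list[str]:
    """Build the stream-copy argument list for the command shown above."""
    if end <= start:
        raise ValueError("end must be greater than start")
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",  # seek to the start timestamp (seconds)
        "-to", f"{end:.3f}",    # stop at the end timestamp (seconds)
        "-i", source,
        "-c", "copy",           # copy streams without re-encoding
        output,
    ]


def extract_clip(source: str, start: float, end: float, output: str) -> None:
    """Run FFmpeg; raises CalledProcessError if FFmpeg exits with an error."""
    subprocess.run(build_clip_command(source, start, end, output), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.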
2. PDF (PyMuPDF / pdf-lib)
CiteKit creates a valid, standalone PDF for the requested page range.
- Process:
- Open source PDF.
- Create new empty PDF.
- Copy pages [start_page...end_page] to new PDF.
- Save.
- Output: A clean PDF file containing only relevant context, preserving all vectors, text layers, and images.
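The steps above map fairly directly onto PyMuPDF. The following is a sketch, not CiteKit's actual code: it assumes page numbers are 1-based and inclusive (the source does not say), and the names `to_zero_based` and `extract_pages` are hypothetical.

```python
def to_zero_based(start_page: int, end_page: int) -> tuple[int, int]:
    """Convert a 1-based inclusive range (an assumption) to 0-based inclusive indices."""
    if start_page < 1 or end_page < start_page:
        raise ValueError("invalid page range")
    return start_page - 1, end_page - 1


def extract_pages(source: str, start_page: int, end_page: int, output: str) -> None:
    """Copy the requested pages into a new standalone PDF."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    first, last = to_zero_based(start_page, end_page)
    with fitz.open(source) as src:             # open source PDF
        if last >= src.page_count:
            raise ValueError("page range exceeds document length")
        new_doc = fitz.open()                  # create new empty PDF
        # insert_pdf takes 0-based, inclusive from_page/to_page
        new_doc.insert_pdf(src, from_page=first, to_page=last)
        new_doc.save(output)                   # save
        new_doc.close()
```

Because whole pages are copied rather than rendered, text layers and vector content carry over to the new file unchanged.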
3. Images (Pillow / Sharp)
CiteKit creates a new image file cropped to the semantic region of interest.
- Process:
- Load image.
- Crop to bbox: [x, y, w, h].
- Save as original format (JPG/PNG).
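With Pillow, the crop could look roughly like this. It is a sketch under assumptions: the bbox is taken to be in pixels with the origin at the top-left corner (the source does not state units), and `bbox_to_box` and `crop_image` are hypothetical names.

```python
def bbox_to_box(bbox):
    """Convert [x, y, w, h] to Pillow's (left, upper, right, lower) box."""
    x, y, w, h = bbox
    if w <= 0 or h <= 0:
        raise ValueError("width and height must be positive")
    return (x, y, x + w, y + h)


def crop_image(source: str, bbox, output: str) -> None:
    """Crop the region and save it in the source image's format."""
    from PIL import Image  # Pillow; imported lazily so the helper works without it

    with Image.open(source) as img:            # load image
        region = img.crop(bbox_to_box(bbox))   # crop to bbox
        region.save(output, format=img.format) # keep original format (e.g. JPEG, PNG)
```

The conversion matters because Pillow's `crop` expects corner coordinates, not a width and height.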
4. Text (Line Slicing)
For text files (.txt, .md, .py), resolution is plain Python list slicing over the file's lines:
- Read the source file content.
- Extract lines [start_line : end_line].
- Write the subset to a new file.
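A minimal sketch of these three steps follows. It assumes line numbers are 1-based and inclusive, which the source does not specify, and `slice_lines` is a hypothetical name.

```python
from pathlib import Path


def slice_lines(source: str, start_line: int, end_line: int, output: str) -> str:
    """Write lines start_line..end_line (1-based, inclusive: an assumption) to output."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("invalid line range")
    # Read the source file, keeping line endings so the subset is copied verbatim.
    lines = Path(source).read_text(encoding="utf-8").splitlines(keepends=True)
    # Convert the 1-based inclusive range to a Python slice.
    subset = "".join(lines[start_line - 1:end_line])
    Path(output).write_text(subset, encoding="utf-8")
    return subset
```

If the range runs past the end of the file, the slice simply stops at the last line instead of raising.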
Caching Strategy
Resolved evidence is hashed and cached in .citekit_output/.
If an agent requests the same segment twice, the file system path is returned instantly without re-processing.
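One way such a cache could work is sketched below. The hashing scheme (SHA-256 over the source path and segment location) and the names `cache_key` and `resolve_cached` are assumptions for illustration; the source does not describe how CiteKit computes its hashes.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(source: str, location: dict) -> str:
    """Derive a stable key from the source path and segment location (assumed scheme)."""
    payload = json.dumps({"source": source, "location": location}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def resolve_cached(source: str, location: dict, suffix: str,
                   produce: Callable[[Path], None],
                   cache_dir: str = ".citekit_output") -> Path:
    """Return the cached file if present; otherwise call produce(target) once."""
    out_dir = Path(cache_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{cache_key(source, location)}{suffix}"
    if not target.exists():
        produce(target)  # only runs on a cache miss
    return target
```

Sorting the JSON keys keeps the hash stable regardless of the order in which the location fields were supplied.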
Best Practice: Reuse Resolved Nodes
Agents should be aware that resolve() operations, while fast for audio/video stream-copy, still involve file I/O.
Recommendation: Before calling resolve(), check if you already have the evidence. CiteKit handles this internally, but your agent logic can also track "active" evidence.
- Internal Caching: CiteKit checks .citekit_output first. If the file exists and matches the hash, it returns the path immediately.
- Agent Persistence: If your agent runs across multiple sessions, consider persisting the mapping of node_id -> local_path to avoid even the micro-latency of the hash check.
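For the agent-side persistence, a small JSON-backed index is one possible approach. This sketch is independent of CiteKit: the class `EvidenceIndex` and its file layout are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


class EvidenceIndex:
    """Persist a node_id -> local_path mapping across agent sessions (hypothetical helper)."""

    def __init__(self, index_path: str):
        self.index_path = Path(index_path)
        self.mapping: dict[str, str] = {}
        if self.index_path.exists():
            self.mapping = json.loads(self.index_path.read_text(encoding="utf-8"))

    def get(self, node_id: str) -> Optional[str]:
        """Return the stored path, or None if unknown or the file has been removed."""
        path = self.mapping.get(node_id)
        if path is not None and Path(path).exists():
            return path
        return None

    def put(self, node_id: str, local_path: str) -> None:
        """Record a resolved path and write the index to disk immediately."""
        self.mapping[node_id] = str(local_path)
        self.index_path.write_text(json.dumps(self.mapping, indent=2), encoding="utf-8")
```

An agent would call `get(node_id)` first and fall back to `resolve()` only when it returns `None`, then store the new path with `put`. Checking that the file still exists guards against a cache directory that was cleaned between sessions.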