About the Role
We’re looking for a Senior Software Engineer with deep experience in AI and LLM systems to lead the development of our intelligent document processing pipeline. You’ll own the end-to-end architecture — from ingesting raw PDFs and extracting structured data via OCR, to designing LLM-powered reasoning workflows that produce clean, reliable outputs.
This is a high-impact, technical role where your work will directly shape the core product. You’ll work closely with stakeholders to understand document processing challenges and translate them into robust, scalable AI solutions. If you thrive in ambiguous, fast-moving environments and love solving problems at the intersection of AI and real-world data, this role is for you.
Schedule:
- Full-time: Flexible hours, fully remote
- Part-time: Open to discussion based on timezone overlap
What You’ll Be Building
- End-to-End AI Pipeline: Design and ship a production-ready pipeline spanning OCR ingestion, structured extraction, LLM-powered reasoning, and clean data output.
- LLM Workflow Development: Build and iterate on GenAI workflows that are reliable and structured, and that integrate seamlessly into the broader system.
- Data Extraction & Schema Design: Transform noisy, unstructured document data into consistent, queryable schemas.
- Prompt Engineering: Craft and refine prompts that make LLM outputs predictable and production-safe.
- OCR & Document Processing: Extract text, layout, and relationships from PDFs — the foundation of the entire pipeline.
- Backend / API Development: Build services that orchestrate the pipeline, handle file processing, and expose results to downstream consumers.
What We Need From You
Required:
- Expert-level GenAI / LLM workflow development (end-to-end pipeline design in production)
- Strong prompt engineering skills that produce structured, reliable, and usable outputs
- OCR & document processing experience — text, layout, and relationship extraction from PDFs
- Data extraction and schema design — handling messy real-world document data
- PDF processing — multi-page documents, grouping, and page-level context
- Backend / API development for pipeline orchestration and file processing
Nice to Have:
- Computer vision — detecting symbols, geometry, and spatial relationships in technical drawings
- Experience with vector databases or retrieval-augmented generation (RAG) workflows
- Familiarity with multi-modal models combining vision and language understanding
- Prior exposure to engineering, architectural, or technical document types
Soft Skills We Value
- Strong communicator — clear and concise across async and sync channels
- Self-directed — able to scope, prioritize, and execute with minimal hand-holding
- Comfortable with ambiguity — thrives in fast-moving, early-stage environments
- Detail-oriented — high bar for output quality and consistency
- Collaborative — works well with cross-functional teams and stakeholders
- Iterative thinker — ships incrementally, learns fast, and improves continuously