Are you an expert at navigating the complex architecture of Large Language Models? Welo Data is seeking a highly technical Senior Prompt Engineer based in Japan to lead the end-to-end migration of template workflows into high-performance LLM autoraters.
This is a specialized role for a technical architect who understands that "perfecting a prompt" is a rigorous engineering discipline. You will leverage advanced APG/APO tools and manual refinement to ensure our automated systems meet, and ideally exceed, human accuracy baselines in both Japanese and English contexts.
The Mission: Automated Quality at Scale
- Architectural Migration: Take full ownership of the end-to-end technical migration of templates to LLM autoraters.
- Optimization Leadership: Utilize Automatic Prompt Generation (APG) and supervise Automated Prompt Optimization (APO) tools to push model performance past plateaus and logic deadlocks.
- Metrics-Driven Excellence: Continuously measure quality against "gold data" baselines, tracking precision, recall, and F1 scores to justify launch readiness.
- Edge-Case Engineering: Manually draft and refine complex prompts to overcome anti-patterns and architecture gaps that automated tools cannot solve.
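For candidates who want a concrete picture of the metrics-driven work above, here is a minimal sketch of scoring autorater output against gold data. The label values ("pass"/"fail"), the function name `score_against_gold`, and the sample data are hypothetical and chosen only for illustration; they do not describe Welo Data's internal tooling.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    precision: float
    recall: float
    f1: float


def score_against_gold(gold, predicted, positive):
    """Compare autorater labels with gold labels for one positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    # True positives, false positives, false negatives for the chosen class.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Scores(precision, recall, f1)


# Hypothetical sample: human gold labels vs. LLM autorater labels.
gold = ["pass", "fail", "pass", "pass", "fail"]
auto = ["pass", "pass", "pass", "fail", "fail"]
scores = score_against_gold(gold, auto, positive="pass")
print(f"precision={scores.precision:.2f} recall={scores.recall:.2f} f1={scores.f1:.2f}")
```

In practice these numbers would likely be computed per label and per language, then compared with the human baseline before a launch decision.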
Project Details
- Schedule: Part-Time (Set your own hours within project milestones).
- Location: 100% Remote (Must be currently based in Japan).
- Language: Native fluency in Japanese and professional fluency in English.
- Employment Type: Freelance / Independent Contractor.
Candidate Profile
- Educational Foundation: Bachelor’s, Master’s, or PhD in Computer Science, Data Science, Computational Linguistics, or a related analytical field.
- Prompt Engineering Mastery: 4+ years of experience tuning LLMs for strict, structured outputs, complex classification, and few-shot learning.
- Analytical Power: High proficiency in identifying error patterns and using SQL or data analytics tools to monitor performance.
- Technical Agility: Fast learner capable of mastering proprietary internal tools and "Goose API" style interfaces with minimal oversight.
Preferred Technical Skills
- Familiarity with shadowbot monitoring and disagreement tracking between human and LLM ratings.
- Hands-on experience with Chain-of-Thought (CoT) prompting and APO systems.
- Deep linguistic expertise, including a strong understanding of semantics and formal logic.
- Proven ability to draft high-level Launch Certification Documentation.