Alignerr · STEM & research
ML researcher designing evaluation tasks for frontier AI
Listed on Alignerr as “Machine Learning Evaluation Specialist”
What this actually is
You design problems that stump current AI models, evaluate AI reasoning against the correct answer, write rubrics, and provide expert feedback. This is often the highest-paid category because the expertise pool is small. The platform title (Machine Learning Evaluation Specialist) reflects the rate band and the expertise required, not the day-to-day work.
Can you do this on your visa?
- F-2 / F-4 / F-5 / F-6: open.
- E-1 to E-7: needs concurrent-employment permit.
- D-2 / D-4 students: S-3 permit, 20 hr/week cap.
- D-10 / D-8: case by case.
Korean tax on USD income
First 5 years in Korea: foreign-source income is taxed only if it is remitted into Korea. After year 5: worldwide income is taxed.
Original posting from Alignerr
Machine Learning Evaluation Specialist
About the Role
What if your years of hard-earned research expertise could directly shape the future of AI? We're looking for domain experts with deep machine learning knowledge to design evaluation challenges that push state-of-the-art AI systems to their limits - the kind of problems only a true specialist could craft.
Your work won't sit in a drawer. It directly influences how the next generation of AI models are measured, trained, and improved.
- Organization: Alignerr
- Type: Hourly Contract
- Location: Remote
- Commitment: 10-40 hours/week
---
What You'll Do
- Design complex, original machine learning problems rooted in your area of domain expertise
- Create evaluation tasks that demand advanced knowledge well beyond standard ML pipelines
- Draw from your own research experience to craft challenges that genuinely test highly capable AI models
- Write clear problem statements, define evaluation criteria, and establish gold-standard solutions
- Assess AI-generated solutions for correctness, creativity, and methodological rigor
- Document problem difficulty, required domain knowledge, and expected failure modes
- Collaborate asynchronously with a global team of researchers and engineers
---
Who You Are
- Graduate-level expertise (MS or PhD preferred) in a scientific or technical discipline that intersects with machine learning
- Strong working knowledge of ML methods - model selection, feature engineering, evaluation metrics, and pipeline design
- Deep familiarity with active, open research problems in your field
- A sharp eye for where general ML knowledge breaks down and specialized domain insight becomes essential
- Experience publishing or conducting original research is highly valued
- Excellent written communication - you can articulate complex, nuanced problems with precision and clarity
- Self-motivated and energized by intellectually demanding, independent work
---
Example Domains
We welcome experts from a wide range of fields, including but not limited to:
- Computational biology, genomics, or bioinformatics
- Climate science and environmental modeling
- Medical imaging and healthcare ML
- Materials science and computational chemistry
- Astrophysics and signal processing
- Natural language processing for low-resource or specialized corpora
- Robotics, control theory, or reinforcement learning in complex environments
- Financial modeling and quantitative analysis
If your domain sits at the frontier of ML research, we want to hear from you.
---
Why Join Us
- Work at the cutting edge - your challenges help define the boundaries of what AI can and cannot do
- Make a real impact - your expertise directly shapes AI safety and evaluation research
- Full autonomy - work on your own schedule, from anywhere in the world
- Flexible commitment - scale hours up or down based on your availability
- Ongoing opportunity - strong contributors are considered for contract extensions and deeper research involvement
- Build your profile - establish yourself as a contributor to frontier AI development alongside top research labs
Quoted from Alignerr’s public listing on 2026-06-02. We don’t edit platform copy; honest framing is in the title and the “what this actually is” block above.
More on this platform
About Alignerr
Owned by Labelbox. 450+ active roles across STEM, coding, audio, language, and general categories. Public role listings; you apply per role.
Is Alignerr legit? Our review
4.6/5 on Trustpilot, 2.2/5 on Glassdoor. Both are real. Here's what the divergence means, what the work actually is, who pays well, and how Korean tax + visa rules apply.