Freelance Bilingual AI Evaluator: Spanish (Mexico)

Overview
Uber AI Solutions is looking for highly analytical freelancers to evaluate and refine AI-generated content. In this opportunity, you’ll bridge the gap between English instructions and Mexican Spanish media to ensure AI models are accurate, helpful, and culturally relevant.

Uber AI Solutions is a marketplace connecting skilled freelancers with Generative AI researchers and product teams working on cutting-edge AI systems. We partner with independent contractors to support large-scale model evaluation, alignment, and quality initiatives.

This is a remote, contract-based position with flexible hours, focused on AI alignment and evaluation work.

The opportunity

Project scope: Evaluate audio, video, and text prompts and responses in Spanish.
Evaluation: Compare AI responses side-by-side to determine which is superior based on truthfulness, completeness, and insightfulness.
Analysis: Write detailed justifications in English explaining why one response is better than another across several dimensions, including logic and accuracy.

What this project offers

Flexibility: Work from anywhere at any time. Tasks are available 24/7.
Earning Potential: Competitive hourly pay.
Commitment: We expect a commitment of 5–40 hours per week.
Time per task: Roughly 10–25 minutes.

What you need to be eligible

Native-level fluency: Native-level command of both English and Mexican Spanish, including cultural nuances and local slang.
Analytical mindset: Obsessive about details. If a model misses a negative constraint or a formatting rule, you spot it instantly.
Research skills: You're willing to fact-check claims and verify dates or people rather than assuming the AI is correct.
Legal residency: You must be a current legal US resident.
Technical requirements: A personal laptop and a stable internet connection.

Why this work matters

Your contributions directly impact the quality, trustworthiness, and reliability of Generative AI systems used at scale. By improving how AI models evaluate and present information, your work helps advance responsible AI development and real-world AI performance.