AI Agent Evaluation Analyst

4 weeks ago


Biñan, Philippines Mindrift Full time

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

Who We're Looking For

We’re looking for curious and intellectually proactive contributors: individuals who double‑check assumptions and play devil’s advocate. Comfortable with ambiguity and complexity? Prefer an async, remote, flexible opportunity? Interested in learning how modern AI systems are tested and evaluated? This role is ideal.

Flexible Project‑Based Opportunity

  • Analysts, researchers, or consultants with strong critical thinking skills
  • Students (senior undergrads / grad students) looking for an intellectually interesting gig
  • People open to a part‑time and non‑permanent opportunity

About the Project

We are hiring QAs for autonomous AI agents for a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you will balance quality assurance, research, and logical problem‑solving. This opportunity is ideal for those who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.
What You'll Be Doing

  • Review evaluation tasks and scenarios for logic, completeness, and realism
  • Identify inconsistencies, missing assumptions, or unclear decision points
  • Help define clear expected behaviors (gold standards) for AI agents
  • Annotate cause‑effect relationships, reasoning paths, and plausible alternatives
  • Think through complex systems and policies as a human would to ensure agents are tested properly
  • Work closely with QA, writers, or developers to suggest refinements or edge‑case coverage

How to Get Started

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements

  • Excellent analytical thinking: reason about complex systems, scenarios, and logical implications
  • Strong attention to detail: spot contradictions, ambiguities, and vague requirements
  • Familiarity with structured data formats: read JSON/YAML
  • Ability to assess scenarios holistically: identify missing or unrealistic elements that might break
  • Good communication and clear writing in English to document findings

We also value applicants who have:

  • Experience with policy evaluation, logic puzzles, case studies, or structured scenario design
  • Background in consulting, academia, olympiads (logic/math/informatics), or research
  • Exposure to LLMs, prompt engineering, or AI‑generated content
  • Familiarity with QA or test‑case thinking (edge cases, failure modes)
  • Understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.)

Benefits

  • Get paid for your expertise, with rates up to $47/hour depending on skills, experience, and project needs
  • Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Participate in an advanced AI project and gain valuable experience to enhance your portfolio
  • Influence how future AI models understand and communicate in your field of expertise


  • Software Engineer

    3 weeks ago


    Biñan, Philippines Mxv Full time

    DevRev DevRev’s AgentOS, purpose-built for SaaS companies, comprises three modern CRM apps for support, product, and growth teams. It connects end users, sellers, support, product people, and developers, reducing 9 business apps and converging 6 teams onto a common platform. Unlike horizontal CRMs, DevRev takes a blank canvas approach to collaboration, AI,...


  • Biñan, Philippines Moxie Full time

    This range is provided by Moxie. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Base pay range: $11.00/hr - $15.00/hr. At Moxie, we empower ambitious aesthetic entrepreneurs to build profitable, independent practices without burnout, overwhelm, or guesswork. In just a few years, we’ve grown from an...

  • Data Engineer

    3 days ago


    Biñan, Philippines Pro5.ai Full time

    About the Role: We’re seeking a Data Engineer to build and maintain data pipelines supporting AI agents for real estate and construction applications. You’ll play a key role in ensuring reliable data flows, integrations, and preprocessing frameworks that power advanced GenAI systems. What You'll Do: A high-impact opportunity within an early‑stage,...

  • Senior AI Engineer

    1 week ago


    Biñan, Philippines EXUS Full time

    EXUS is a global technology company specializing in debt collections software for financial services and utilities. Our enterprise SaaS platform is used in over 50 countries worldwide, delivering measurable improvements in collections, compliance, and operational efficiency. With 20+ years of experience and a product recognized by Gartner as best-in-class, we...


  • Biñan, Philippines Invisible Expert Marketplace Full time

    Overview: Join to apply for the Fulah Language Specialist - AI Trainer role at Invisible Expert Marketplace. Responsibilities: Review and annotate Fulah content for training data quality. Assess AI-generated outputs for accuracy, fluency, and cultural appropriateness. Identify and document error patterns, collaborating with the team to refine prompts,...


  • Biñan, Philippines Western Digital Full time

    Company Description At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible. At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we’ve been doing just that—our technology...


  • Biñan, Philippines TalentPop App Full time

    2 days ago. Join Our Team as a Training & Development Specialist! This is your chance to join a winning team and one of the fastest-growing companies in the eCommerce Ecosystem! Job Responsibilities: Host training classes via Zoom, discussing topics in the training...

