Prompt Engineer

About Firework

Join Firework – Where Innovation Meets Impact

Firework is revolutionizing connected commerce with the world’s most advanced and largest AI-powered video commerce platform, trusted by global brands and leading retailers. We bring the energy of in-store experiences online, transforming how businesses engage, convert, and build lasting customer relationships.

At Firework, you’ll be part of a high-growth, team-centric environment where innovation thrives and collaboration fuels success. Having raised over $235M to date, led by investors such as SoftBank Vision Fund 2, and operating at a global scale, we offer unparalleled opportunities to work cross-functionally, solve complex challenges, and drive meaningful impact in the future of connected digital commerce.

If you’re curious, ambitious, and energized by big ideas, Firework is the place to grow, lead, and shape the next era of online shopping—together.

Summary

We are seeking a highly creative and technically curious AI Prompt Engineer to help develop and optimize prompts that drive performance from large language models. In this role, you’ll serve as the bridge between human intent and machine output, designing effective inputs to elicit high-quality, reliable, and ethical responses from generative AI systems.

What you’ll be doing

  • Stay on top of prompt engineering techniques, research, and best practices
  • Develop and curate golden datasets for prompt evaluation and regression testing across modalities, ensuring long-term quality control and reproducibility
  • Design, test, and refine prompts to support a wide range of generative AI applications, including not only chat but also audio synthesis, avatar animation, lip-sync alignment, and product image generation
  • Document best practices and create reusable prompt templates to support internal stakeholders, improving prompt consistency, clarity, and alignment across teams
  • Refactor existing prompts to follow best-practice approaches
  • Collaborate with cross-functional teams to integrate AI-driven features into real-world product experiences, ensuring prompts are aligned with user needs, system constraints, and business goals
  • Build and maintain prompt libraries with clear versioning, metadata tagging, and usage patterns to support scalable and reusable development
  • Drive continuous improvement in prompt performance by using both automated metrics and human-in-the-loop evaluation pipelines
  • Contribute to and extend our internal evaluation framework by designing new evaluation flows, creating prompt-specific test cases, and defining metrics tailored to multi-modal output

You will have

  • Bachelor’s or Master’s degree in a STEM or related field
  • Practical experience working with large language models and/or multi-modal generative models (e.g., text-to-audio, text-to-image, video or avatar generation)
  • Familiarity with prompt techniques such as zero-shot, few-shot, chain-of-thought, tool usage, and retrieval-based augmentation
  • Strong analytical and linguistic intuition, with the ability to translate abstract goals into effective machine-readable instructions
  • Deep interest in language and communication systems, and how humans and machines can interact effectively through prompt-based interfaces
  • Ability to create and maintain curated evaluation datasets (“golden sets”) to support ongoing testing and performance benchmarking
  • Strong writing and communication skills, with the ability to explain prompt behavior, rationale, and trade-offs to technical and non-technical audiences

We’ll be excited if you have

  • Hands-on experience with Python or another scripting language of choice
  • Experience with Jupyter Notebooks, or LLM ops tools and libraries such as LangChain, LangFuse, PromptLayer, or vector search systems
  • Experience designing or working within evaluation pipelines, including human and automated evaluations, metric design, and result interpretation

Locations

This role is remote within the U.S. Being located in the Bay Area and able to work in a hybrid capacity is a plus.

 

Compensation

The anticipated pay range for this position is $10-$35 per hour, with final compensation determined by experience and geographic location.

 

Don’t hold back

We understand some candidates may see the above and not apply because they don’t meet all the qualifications. We encourage you to apply anyway; we often find talented candidates who fit other opportunities we have, and we look for potential, not just what you did in the past.

As an equal employment opportunity employer, we are a diverse team that strives for an inclusive environment for all. We prohibit discrimination and harassment of any kind based on race, color, sex, religion, sexual orientation, national origin, age, disability, genetic information, pregnancy, or any other protected characteristic as outlined by federal, state, or local laws.