Jigsaw

Jigsaw is an applied research lab scaling the creation of simulated environments.

The final primitive for superintelligence is learning through simulated environments. That is how models became strong in coding and math. The next frontier is the rest of human labor and interests.

Today, model progress is constrained by a shortage of realistic learning environments. RL environment creation remains largely hand-built and labor-intensive, with each new domain requiring bespoke world, task, and eval design. The supply of new environments is growing too slowly to meet demand from frontier labs, neo-labs, and experimental open-source teams.

We train foundation models and build infrastructure for world generation, task authorship, and model evaluation, enabling high-fidelity environments that transfer human expertise into model capability.

Over time, Jigsaw aims to become the infrastructure layer for creating, serving, and accessing simulated environments across many domains. Our long-term ambition is to power the world’s most extensive repository of environments and simulations for fine-tuning and evaluating models.

Darwin

Jigsaw’s platform for scalable environment creation.

Copilot

With Darwin's Copilot, domain experts describe scenarios in natural language. Copilot asks targeted questions to surface edge cases and domain nuance, then generates complete task specifications, evaluation criteria, and expected solutions. No coding or ML expertise required.

Copilot removes the engineering bottleneck in RL environment creation and makes human judgment legible to AI, at scale.

Runners + Verifiers

Darwin also provides the runtime and quality layer around environment creation: tools, execution, grading, QA, and verification.

Its runner infrastructure lets experts benchmark tasks across frontier and open-source models, inspect full agent traces in real time, and iterate on task design in a closed loop.

We use programmatic verifiers, LLM judges, and adversarial agent simulations to stress-test tasks and rubrics for reward hacking before they are used for training or evaluation.

Pascal

Jigsaw’s foundation model for world generation.

It is trained on hand-built environments, curated domain corpora, and feedback from Darwin's QA pipeline to generate complete, internally consistent simulated worlds.

Pascal creates the artifacts, interfaces, tools, agents, and other world elements needed to model real-world workflows. Teams can fine-tune their own hosted version of Pascal to generate worlds aligned to internal processes, while retaining control over sensitive data.

As Darwin expands into more environments, its authoring and QA feedback improve Pascal, which in turn increases the realism of the worlds we can generate.

Market and Traction

We are fulfilling a multi-six-figure contract with a top-3 frontier lab and are working with research and domain experts from S&P 500 companies, Big 4 accounting firms, and gaming studios to develop new environments and expand Darwin and Pascal’s coverage across high-value workflows.

Today, the primary market for RL environments is frontier research labs. We believe this market will expand to enterprises, neo-labs, and independent teams hosting or customizing models for internal workflows and domain-specific use cases.

Jigsaw is building toward a world where simulated environments become core infrastructure for post-training, evaluation, and deployment. Our long-term ambition is to power the broadest repository of environments across domains. Post-training is how models are adapted to the real world, and the environments that shape that process will become a foundational resource in AI.

Team

Our team includes researchers from Stanford AI Lab and USC who have published evaluation research at ACL, KDD, and NeurIPS, alongside operators who’ve built process automation at S&P 500 companies.

Miguel M

Jongbin W

Giovanni G

Jack D

Scaling RL environments across domains requires both frontier research taste and deep understanding of model capability in the real world. Jigsaw was built at that intersection.

Jigsaw is backed by a16z speedrun, with advisors and angels from Anthropic, SpaceX, and Stanford AI Lab.

Scaling worlds to unlock real-world intelligence.
