Benchmarks
Tasks that test paper reading, derivation checking, assumptions, and useful failure modes.
AI4Theory focuses on benchmarks, formalization paths, and workflow tools for theoretical physics.
Formalization paths: claims, definitions, equations, and dependencies made inspectable and checkable.
Workflow tools: tools for literature mapping, reference work, equation work, and research triage.