Open-source self-hosted web tool for evaluating Agent Skills with rubric scores, Deep Review, and improvement suggestions.
Updated May 17, 2026 - TypeScript
AdaRubric: Adaptive Dynamic Rubric Evaluator for Agent Trajectories
Reward model engineering harness for evolutionary rubric search, deployable RM artifacts, online scoring, and RL experiment lineage.
Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric
A Claude Code skill that adds a rubric-based eval layer to any agent project. Framework-agnostic: generates rubric, test cases, judge prompt, and harness. Returns a weighted score plus a judge-leniency signal.
Export grades from assignments that use advanced grading methods, in Excel format
Rubric-driven AI homework grading system built as a Claude Code Skill. Score student submissions with CoT reasoning, bias mitigation, and PDCA quality cycle.
Context-compensation scaffold for LLM evaluation prompts: disclose, gate on evidence, hedge on thin
AskBench: LLM question-asking/clarification benchmark & dataset with evaluation and training code (paper: arXiv 2602.11199).
Customize and manage rubric templates, and quickly grade HTML/PDF files
Universal quality evaluation plugin for Claude Code: 7-dimension scoring (correctness, completeness, adherence, efficiency, safety), configurable rubrics, threshold blocking, auto-hooks & /judge command.
An Apps Script that generates a Google Sheet for importing selected learning targets into a Google Classroom assignment.
(Findings of ACL 2025) TabXEval: an exhaustive, explainable rubric + two-phase framework (TabAlign → TabCompare) for table evaluation with TabXBench.