BC-Bench Guide
Open Source · GitHub Pages · EN & ES

A step-by-step guide to evaluate coding agents on real Microsoft Dynamics 365 Business Central tasks using the BC-Bench framework.

🐛 101 real bugs
🤖 Multi-agent
📊 Statistical metrics
📚 5 chapters
🌐 EN · ES

Choose your language / Elige tu idioma

English
Start the guide in English
Español
Empezar la guía en español

What is BC-Bench?

BC-Bench is an open-source benchmarking framework by Microsoft for evaluating coding agents (Claude Code, GitHub Copilot CLI, and others) on real-world Business Central (AL) development tasks. It includes:

Guide Contents

#  | English                  | Español
1  | Introduction             | Introducción
2  | Setup & First Evaluation | Setup y Primera Evaluación
3  | Agent Configuration      | Configuración de Agentes
4  | Baselines & VM Scripts   | Baselines y Scripts VM
5  | Results & Analysis       | Resultados y Análisis

Quick Start

# Install BC-Bench
gh repo fork microsoft/BC-Bench --clone && cd BC-Bench
uv python install && uv sync --all-groups

# Explore the dataset
uv run bcbench dataset list
uv run bcbench dataset view microsoft__BCApps-4822

# Run your first evaluation (patch only, no container needed)
uv run bcbench run claude microsoft__BCApps-4822 --category bug-fix --model claude-sonnet-4-6
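
To repeat the evaluation step for more than one dataset entry, the command above can be wrapped in a small loop. The following is only a sketch: it reuses the exact `bcbench run` invocation from the Quick Start. The `run_eval` helper and the `MODEL` and `DRY_RUN` variables are hypothetical names introduced here for illustration, not part of BC-Bench. By default it only prints the commands it would run, so you can check them before launching anything.

```shell
#!/bin/sh
# Sketch: repeat the Quick Start evaluation command for several entry IDs.
# Assumption: IDs are passed as arguments (for example, ones taken from
# `uv run bcbench dataset list`). run_eval, MODEL and DRY_RUN are
# hypothetical names used only in this example.
set -eu

MODEL="${MODEL:-claude-sonnet-4-6}"   # model name from the Quick Start
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands, 0 = execute them

run_eval() {
  id="$1"
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command instead of running it.
    echo "uv run bcbench run claude $id --category bug-fix --model $MODEL"
  else
    uv run bcbench run claude "$id" --category bug-fix --model "$MODEL"
  fi
}

# Evaluate every ID given on the command line, one after another.
for id in "$@"; do
  run_eval "$id"
done
```

Saved as, say, `run_many.sh`, it could be called as `sh run_many.sh microsoft__BCApps-4822` to preview the command, and with `DRY_RUN=0` set in the environment to actually run it.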