Data Modeling

Tables, fields, enums, keys, FlowFields, CalcFormulas, and table extensions

Report generated: February 14, 2026 at 8:13 PM

Benchmark data: Feb 8, 2026 – Feb 13, 2026

Models: 12
Tasks: 10
Pass Rate: 73.3%

Model Rankings

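Chart: pass@1 and pass@3 (additional) per model
Kimi K2.5 (t0.1): pass@1 73%, pass@3 80.0%
Deepseek V3.2 (t0.1): pass@1 70%, pass@3 additional 10%, total 80.0%
Glm 5 (t0.1): pass@1 50%, pass@3 additional 10%, total 60.0%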

Model Performance

anthropic/claude-opus-4-6

Runs:3
pass@1:80.0%
pass@3:80.0%
Consistency:100.0%
1st: 20, 2nd: 4, Failed: 2, 8/10 passed
Temperature:0.1
Thinking:-
Tokens/run:16,945
Cost/run:$0.26

Known Shortcomings (6)

  • reserved-keyword-as-parameter-name 1x
  • cross-join-dataitem-link 1x
  • incomplete-procedure-body 1x
  • flowfield-calcfields-requirement 1x
  • parse-failure 1x
+1 more

anthropic/claude-sonnet-4-5-20250929

Runs:3
pass@1:80.0%
pass@3:80.0%
Consistency:100.0%
1st: 21, 2nd: 3, Failed: 2, 8/10 passed
Temperature:0.1
Thinking:-
Tokens/run:17,364
Cost/run:$0.16

Known Shortcomings (8)

  • multiline-string-literals 1x
  • query-filter-element-syntax 1x
  • jsonobject-get-method-signature 1x
  • cross-join-dataitem-link-constraints 1x
  • reserved-keyword-as-variable-name 1x
+3 more

anthropic/claude-opus-4-5-20251101@thinking=50000

Runs:3
pass@1:80.0%
pass@3:80.0%
Consistency:100.0%
1st: 21, 2nd: 3, Failed: 2, 8/10 passed
Temperature:0.1
Thinking:50,000
Tokens/run:17,769
Cost/run:$0.28

Known Shortcomings (8)

  • page-extension-with-table-extension 1x
  • reserved-keyword-as-parameter-name 1x
  • dictionary-iteration-syntax 1x
  • empty-or-malformed-code-generation 1x
  • temporary-table-parameter-handling 1x
+3 more

openrouter/minimax/minimax-m2.5

Runs:3
pass@1:76.7%
pass@3:80.0%
Consistency:90.0%
1st: 19, 2nd: 4, Failed: 2, 8/10 passed
Temperature:0.1
Thinking:-
Tokens/run:22,846
Cost/run:$0.16

Known Shortcomings (18)

  • interface-definition-syntax 2x
  • text-char-conversion-copystr 1x
  • page-object-definition 1x
  • event-subscriber-attribute-syntax 1x
  • page-extension-and-table-extension-generation 1x
+13 more

openrouter/moonshotai/kimi-k2.5

Runs:3
pass@1:73.3%
pass@3:80.0%
Consistency:90.0%
1st: 19, 2nd: 3, Failed: 2, 8/10 passed
Temperature:0.1
Thinking:-
Tokens/run:43,778
Cost/run:$0.40

Known Shortcomings (3)

  • event-subscriber-parameter-syntax 1x
  • page-extension-cardpageid-override 1x
  • parse-failure 1x

openrouter/deepseek/deepseek-v3.2

Runs:3
pass@1:70.0%
pass@3:80.0%
Consistency:70.0%
1st: 12, 2nd: 9, Failed: 2, 8/10 passed
Temperature:0.1
Thinking:-
Tokens/run:20,054
Cost/run:$0.15

Known Shortcomings (18)

  • dictionary-clear-method 1x
  • application-area-in-page-extension-field 1x
  • multiline-string-literals 1x
  • page-extension-cardpageid-override 1x
  • errorinfo-custom-dimensions-api 1x
+13 more

openai/gpt-5.2-2025-12-11@thinking=high

Runs:3
pass@1:70.0%
pass@3:70.0%
Consistency:100.0%
1st: 21, Failed: 3, 7/10 passed
Temperature:0.1
Thinking:high
Tokens/run:14,987
Cost/run:$0.12

Known Shortcomings (10)

  • interface-definition-syntax 2x
  • table-field-caption-property 1x
  • query-object-syntax 1x
  • query-crossjoin-syntax 1x
  • parse-failure 1x
+5 more

gemini/gemini-3-pro-preview

Runs:3
pass@1:70.0%
pass@3:70.0%
Consistency:100.0%
1st: 21, Failed: 3, 7/10 passed
Temperature:0.1
Thinking:-
Tokens/run:108,154
Cost/run:$0.13

Known Shortcomings (9)

  • multiline-string-literals 1x
  • inherent-permissions-syntax 1x
  • query-crossjoin-column-datasource 1x
  • complete-codeunit-generation 1x
  • yaml-parsing-string-manipulation 1x
+4 more

openrouter/x-ai/grok-code-fast-1

Runs:3
pass@1:70.0%
pass@3:70.0%
Consistency:100.0%
1st: 18, 2nd: 3, Failed: 3, 7/10 passed
Temperature:0.1
Thinking:-
Tokens/run:120,390
Cost/run:$1.59

Known Shortcomings (12)

  • query-object-syntax 2x
  • multiline-string-literals 1x
  • page-extension-cardpageid-override 1x
  • json-api-methods 1x
  • recordref-fieldref-dynamic-manipulation 1x
+7 more

openrouter/qwen/qwen3-max-thinking

Runs:3
pass@1:66.7%
pass@3:70.0%
Consistency:90.0%
1st: 11, 2nd: 9, Failed: 3, 7/10 passed
Temperature:0.1
Thinking:-
Tokens/run:18,777
Cost/run:$0.13

Known Shortcomings (12)

  • option-field-optionmembers-required 2x
  • enum-frominteger-syntax 1x
  • list-iteration-pattern 1x
  • variant-type-argument-and-interface-definition 1x
  • json-object-api-methods 1x
+7 more

openrouter/qwen/qwen3-coder-next

Runs:3
pass@1:56.7%
pass@3:60.0%
Consistency:90.0%
1st: 11, 2nd: 6, Failed: 4, 6/10 passed
Temperature:0.1
Thinking:-
Tokens/run:18,970
Cost/run:$0.13

Known Shortcomings (19)

  • codeunit-generation-empty-output 5x
  • interface-definition-syntax 3x
  • query-object-syntax 2x
  • initvalue-vs-defaultvalue 1x
  • text-trim-method-unavailable 1x
+14 more

openrouter/z-ai/glm-5

Runs:3
pass@1:50.0%
pass@3:60.0%
Consistency:80.0%
1st: 12, 2nd: 3, Failed: 4, 6/10 passed
Temperature:0.1
Thinking:-
Tokens/run:36,231
Cost/run:$0.30

Known Shortcomings (17)

  • list-dictionary-of-interface-clear-method 1x
  • event-subscriber-event-name 1x
  • al-string-literal-escaping 1x
  • query-object-syntax 1x
  • fluent-api-return-self-codeunit 1x
+12 more

Task Results Matrix

N/M = passed N of M runs

| Task | Description | Claude Opus 4.6 | Claude Sonnet 4.5 | Claude Opus 4.5 (50K) | Minimax M2.5 | Kimi K2.5 | Deepseek V3.2 | GPT-5.2 | Gemini 3 Pro | Grok Code Fast 1 | Qwen3 Max Thinking | Qwen3 Coder Next | Glm 5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CG-AL-E001 | Create a simple AL table called "Product Category" with ID 70000. The table should have the following fields: Code (Code[20], primary key); Description (Text[100]); Active (Boolean, default true); Created Date (Date) | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 2/3 | 3/3 | 3/3 | 3/3 | 3/3 | 0/3 | 3/3 |
| CG-AL-E003 | Create a simple AL enum called "Priority Level" with ID 70000. The enum should have the following values: Low (value 0); Medium (value 1); High (value 2); Critical (value 3) | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 1/3 |
| CG-AL-E004 | Create a table extension called "Item Extension" with ID 70000 that extends the standard Item table. Add the following new fields: Warranty Period (Integer, representing months); Supplier Rating (Option with values: Not Rated, Bronze, Silver, Gold, Platinum); Last Maintenance Date (Date); Special Instructions (Text[250]) | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 2/3 |
| CG-AL-E031 | Create a table called "CG Subscription Plan" with ID 70031. | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 |
| CG-AL-E045 | Create a table called "Vehicle Log" with ID 70045. | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 2/3 | 3/3 | 3/3 | 3/3 | 3/3 | 2/3 | 0/3 |
| CG-AL-H002 | Create two tables to demonstrate FlowField with CalcFormula: | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 2/3 | 3/3 | 3/3 |
| CG-AL-H004 | Create an enum and codeunit that demonstrate correct enum ordinal handling. | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 |
| CG-AL-M003 | Create a complex table called "Sales Contract" with ID 70002 that includes comprehensive validation. Fields should include: Contract No. (Code[20], primary key, auto-generated); Customer No. (Code[20], with TableRelation to Customer); Start Date and End Date (Date fields with validation); Contract Value (Decimal with minimum value validation); Status (Option: Draft, Active, Suspended, Terminated, Closed); Payment Terms (Code[10] with TableRelation) | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 |
| CG-AL-M006 | Create an advanced table extension called "Advanced Customer Extension" with ID 70001 that extends the Customer table. | 3/3 | 3/3 | 3/3 | 2/3 | 1/3 | 2/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 |
| CG-AL-M112 | Create two tables: | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 |
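
The per-model summary figures appear to follow from the N/M cells in the matrix. The sketch below is a minimal Python illustration under assumed definitions that the report does not state: pass@1 as passed runs over total runs, pass@3 as the share of tasks with at least one passing run, and Consistency as the share of tasks where all runs agree (all pass or all fail). The function name `summarize` and the dictionary keys are hypothetical, chosen only for this example.

```python
def parse_cell(cell):
    """Split an 'N/M' matrix cell into (passed_runs, total_runs)."""
    passed, runs = cell.split("/")
    return int(passed), int(runs)


def summarize(cells):
    """Summarize one model's column of the matrix.

    Assumed definitions (inferred, not stated by the report):
      pass@1      = passed runs / total runs
      pass@3      = tasks with at least one passing run / tasks
      consistency = tasks where every run had the same outcome / tasks
    """
    results = [parse_cell(c) for c in cells]
    tasks = len(results)
    total_passed = sum(p for p, _ in results)
    total_runs = sum(r for _, r in results)
    solved = sum(1 for p, _ in results if p > 0)
    stable = sum(1 for p, r in results if p in (0, r))
    return {
        "pass@1": round(100 * total_passed / total_runs, 1),
        "pass@3": round(100 * solved / tasks, 1),
        "consistency": round(100 * stable / tasks, 1),
    }


# Kimi K2.5 column, in task order from CG-AL-E001 to CG-AL-M112.
kimi = ["3/3", "3/3", "3/3", "3/3", "3/3", "3/3", "3/3", "0/3", "1/3", "0/3"]
print(summarize(kimi))
```

Under these assumptions the Kimi K2.5 column gives 73.3, 80.0 and 90.0, which matches the pass@1, pass@3 and Consistency values listed for that model, so the definitions are at least consistent with the reported data.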