The Weighted AI Impact on Software Engineering Productivity

Most sources highlight big AI gains for simple tasks in software. But when those gains are weighted across the full work scope of an engineering-heavy product, using common sense and figures from actual experience, they shrink to the margins.

Common Internet Article Approach

Internet sources more often than not evaluate the benefits of LLMs for software engineering in isolation: on experienced developers, repetitive tasks, documentation, and so on.

| Scenario | Productivity Impact |
|---|---|
| Experienced dev on familiar codebases | ≈ 19% slower |
| Repetitive tasks or documentation (e.g., via Copilot) | 30–50% faster |
| Onboarding/refactoring/new features (Cursor at Sisense) | 2×–3× faster |
| Anecdotal MVP or greenfield work | 3–4× faster |
| General: controlled experiment (Copilot) [1] | ≈ 56% faster |

[1]: https://arxiv.org/abs/2302.06590. The experiment consisted of creating an HTTP server with 3 endpoints: a TODO application with rudimentary CRUD logic and no database. It took ~71 minutes to finish with Copilot and ~161 minutes without it. That is a task that would take an experienced middle NodeJS engineer 10–15 minutes at most, with no AI.
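
For scale, here is a minimal sketch of roughly the kind of server the experiment describes: an in-memory TODO CRUD service with three endpoints and no database. It assumes Express; the route names and payload shapes are my illustration, not taken from the paper.

```typescript
// A TODO CRUD server with three endpoints and no database,
// approximating the experiment's task. Assumes Express;
// routes and payload shapes are illustrative.
import express from "express";

interface Todo {
  id: number;
  title: string;
  done: boolean;
}

const app = express();
app.use(express.json());

let nextId = 1;
const todos: Todo[] = [];

// Endpoint 1: list all todos.
app.get("/todos", (_req, res) => {
  res.json(todos);
});

// Endpoint 2: create a todo.
app.post("/todos", (req, res) => {
  const todo: Todo = { id: nextId++, title: String(req.body.title ?? ""), done: false };
  todos.push(todo);
  res.status(201).json(todo);
});

// Endpoint 3: update a todo's completion state.
app.put("/todos/:id", (req, res) => {
  const todo = todos.find((t) => t.id === Number(req.params.id));
  if (!todo) {
    res.sendStatus(404);
    return;
  }
  todo.done = Boolean(req.body.done ?? true);
  res.json(todo);
});

app.listen(3000);
```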

Weighted Impact on Entire Product

But any marketable software endeavor spans 15–20 life stages, from ideation and requirements through delivery and support, so the figures above do not show us the full picture.

The following table lists the product-life-cycle work types, from requirements to a marketable product launch, with weights for a 24+ month web software product taken from the start through multiple completed deliveries of major functionality: a greenfield product, not a product in its maintenance stage.

Each work type is assigned an informed guess of the AI tools' impact according to various internet sources (I do not discuss the sources' credibility here; just check them against common sense).

The conclusion: for such products, the AI impact is marginal, within the margin of error.

Later I may explore different types of products, where the simple repetitive tasks at which AI excels carry higher weights. But those would no longer be engineering-heavy projects: perhaps the maintenance phase of an engineering-heavy digital project, or some other kind of digital project.

| Work Type | Weight (%) | Impact | Effect (percentage points) |
|---|---|---|---|
| Requirements & Design (domain modeling, OAS, UX flows, architecture) | 20 | ~0% (neutral) | 0.0 |
| Boilerplate / Setup (initial project, configs, infra stubs) | 1 | ~200% faster (≈3×) | +2.0 |
| Scaffolding (routes, stubs, adapters, templates) | 2 | ~150% faster (≈2.5×) | +3.0 |
| Core Business Logic Implementation (DDD domain work) | 30 | ~20% slower (review/validation overhead) | −6.0 |
| Integration / ATDD Development (tests, fixtures, acceptance automation) | 30 | ~1% faster (fixtures/help) | +0.3 |
| Refactoring (architecture correction, module reshaping) | 1 | ~100% faster (targeted assists) | +1.0 |
| Onboarding (per-engineer ramp-up) | 3 | ~100% faster (code explain, search) | +3.0 |
| Documentation (code comments, ADRs, guides) | 3 | ~40% faster | +1.2 |
| Bugfixing / Debugging (defects, hotfixes, regressions) | 10 | ~0% (net neutral over time) | 0.0 |
| Total | 100 | | +4.5 |
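
For transparency, here is the table's own arithmetic as a small script: each row's effect is its weight (share of total effort) multiplied by the quoted speedup, and the rows sum to the bottom line. This reproduces the table's model exactly as stated; it is not a time-savings calculation.

```typescript
// Recompute the weighted net impact from the table above.
// Effect per row = weight (share of total effort) × quoted speedup.
const workTypes: { name: string; weight: number; speedup: number }[] = [
  { name: "Requirements & Design", weight: 0.20, speedup: 0.0 },
  { name: "Boilerplate / Setup", weight: 0.01, speedup: 2.0 },
  { name: "Scaffolding", weight: 0.02, speedup: 1.5 },
  { name: "Core Business Logic", weight: 0.30, speedup: -0.2 },
  { name: "Integration / ATDD", weight: 0.30, speedup: 0.01 },
  { name: "Refactoring", weight: 0.01, speedup: 1.0 },
  { name: "Onboarding", weight: 0.03, speedup: 1.0 },
  { name: "Documentation", weight: 0.03, speedup: 0.4 },
  { name: "Bugfixing / Debugging", weight: 0.10, speedup: 0.0 },
];

const net = workTypes.reduce((sum, w) => sum + w.weight * w.speedup, 0);
console.log(`Weighted net impact: ${(net * 100).toFixed(1)}%`); // ≈ +4.5%
```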

Engineering-Light Products

The question is: are there digital products of the same size (~24+ months of life) that require a significantly smaller weight of engineering effort, giving more weight, and hence more productivity gain, to AI tools? Here is the review. We still see that for such products, AI usage over the product's life stays marginal.

Rules of Thumb

Here is a quick overview of rules of thumb for where AI can provide gains.

When to buy/configure (not AI-generate)

  • Use ready-made mature platforms/templates when they cover ≥70–80% of requirements with config (CMS, Shopify, Retool/Appsmith, Supabase templates, etc.).
  • Core needs are CRUD, dashboards, content ops, workflows, standard auth/RBAC, and common integrations.

What to code (the only parts worth building)

  • The delta: truly differentiating workflows/features the template can’t express.
  • Thin adapters to existing systems (idempotent jobs, webhooks, import/export); see the sketch after this list.
  • Minimal policy/permission extensions the platform lacks.
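
As an illustration of such a thin adapter, here is a sketch of an idempotent webhook handler: a redelivered event is acknowledged but never reapplied. The names and the in-memory deduplication store are assumptions for the example; production code would persist processed IDs durably.

```typescript
// Sketch of a thin, idempotent webhook adapter: the kind of custom
// "delta" code worth writing by hand. Names are illustrative.
const processed = new Set<string>(); // assumption: in production, a durable store

interface WebhookEvent {
  id: string; // unique delivery ID from the upstream platform
  type: string;
  payload: unknown;
}

function handleOrderEvent(event: WebhookEvent): void {
  // Idempotency: a redelivered event is acknowledged but not reapplied.
  if (processed.has(event.id)) return;
  processed.add(event.id);

  // Apply the side effect exactly once per logical event.
  if (event.type === "order.created") {
    // ... enqueue a fulfillment job, sync to an internal system, etc.
  }
}
```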

Where AI still helps (low risk, real value)

  • Data migration & transforms (one-offs with checksums/validations); sketched in code after this list.
  • Glue snippets inside the platform (validators, small automations, SQL).
  • Bulk content ops (localized copy, SEO meta, docs drafts) with human review.
  • Tests for custom deltas (smoke/AT checks around your extensions).
  • Explainers/docs/runbooks and PR descriptions.
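
A sketch of the checksum/validation idea for one-off migrations, under assumed row shapes: transform the rows, then verify counts and record content hashes of source and target so a re-run or silent drift can be detected before cutover.

```typescript
// One-off migration with checksums/validations. Row shapes and the
// name-splitting transform are assumptions for the example.
import { createHash } from "node:crypto";

interface SourceRow { id: number; full_name: string }
interface TargetRow { id: number; firstName: string; lastName: string }

function transform(row: SourceRow): TargetRow {
  const [firstName, ...rest] = row.full_name.trim().split(/\s+/);
  return { id: row.id, firstName: firstName ?? "", lastName: rest.join(" ") };
}

function checksum(rows: object[]): string {
  // Order-independent check: hash the sorted JSON of each row.
  const canonical = rows.map((r) => JSON.stringify(r)).sort().join("\n");
  return createHash("sha256").update(canonical).digest("hex");
}

const source: SourceRow[] = [{ id: 1, full_name: "Ada Lovelace" }];
const migrated = source.map(transform);

// Validate row counts, then record both checksums; comparing them
// against a previous run detects input drift or a changed transform.
if (migrated.length !== source.length) throw new Error("row count mismatch");
console.log("source checksum:", checksum(source));
console.log("target checksum:", checksum(migrated));
```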

Red flags / anti-patterns

  • “Let’s AI-generate our own CMS/e-commerce/admin suite.”
  • Heavy customization fighting the platform (you’ll own a fork forever).
  • Skipping acceptance criteria because “AI is fast.”

Quick decision rule (24+ mo projects)

  • High template fit (≥4 of 6): data model, workflows, permissions, theming, integrations, SLAs/compliance → Use platform, code only the delta; AI for glue/migration/docs (see the sketch after this list).
  • Medium fit (2–3): platform + small plugins; keep exit plan.
  • Low fit (≤1): it’s genuinely custom → engineered build; expect AI’s net impact ≈ error margin.
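
The same rule as a tiny scoring function, purely as a sketch of the heuristic; the six dimensions and the thresholds come straight from the list above.

```typescript
// The quick decision rule as code. Dimension names are my labels
// for the six fit criteria listed above; this is a heuristic, not
// a real estimation tool.
type FitDimension =
  | "dataModel" | "workflows" | "permissions"
  | "theming" | "integrations" | "slaCompliance";

function recommend(fits: FitDimension[]): string {
  const score = new Set(fits).size; // count distinct matching dimensions
  if (score >= 4) return "Use platform; code only the delta; AI for glue/migration/docs";
  if (score >= 2) return "Platform + small plugins; keep an exit plan";
  return "Genuinely custom: engineered build; expect AI net impact ≈ error margin";
}

console.log(recommend(["dataModel", "workflows", "permissions", "theming"]));
```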

Bottom line: Don’t shift work to create more boilerplate just to exploit AI. Use the proven template to ship, and apply AI surgically for toil around the edges—not to recreate the platform.