Methodology
How we measure AI-built app security
Every statistic on this site comes from ShipSafe’s own independent, read-only scans of publicly deployed apps. We read the code and probe the live app from the public internet, and we report data exposure only when an actual anonymous request returned data, never on a guess. Below is each study, its sample size, and exactly what we count.
The studies
Cursor
67% had at least one critical-severity finding
sample: 100 deployed apps
Lovable
89% had a Supabase table readable with no login (RLS off)
sample: 50 deployed apps
Bolt.new
80% shipped at least one route with no authentication
sample: deployed apps
Replit
77% had at least one critical-severity finding
sample: 30 deployed apps
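The widely quoted 67% figure comes from the Cursor study specifically. In the two studies that report this metric, the share of apps with at least one critical finding was 67% (Cursor) and 77% (Replit). The Lovable and Bolt.new figures measure different issues and are not directly comparable to it.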
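To make the "at least one critical-severity finding" metric concrete, here is a minimal sketch of how such a per-app rate could be computed. It is an illustration, not ShipSafe's actual pipeline: the function name `critical_rate`, the `(app, severity)` tuple shape, and the sample data are all hypothetical.

```python
from typing import Iterable, Tuple


def critical_rate(findings: Iterable[Tuple[str, str]], sample_size: int) -> float:
    """Percentage of sampled apps with at least one critical finding.

    Each app counts at most once, however many critical findings it has.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # A set de-duplicates apps, so repeated findings on one app count once.
    apps_with_critical = {app for app, severity in findings if severity == "critical"}
    return 100.0 * len(apps_with_critical) / sample_size


# Hypothetical findings for a 4-app sample (not real study data).
findings = [
    ("app-a", "critical"),
    ("app-a", "critical"),  # second critical finding on the same app
    ("app-b", "high"),
    ("app-c", "critical"),
]
print(f"{critical_rate(findings, sample_size=4):.0f}%")  # 50%
```

Dividing by the sample size rather than by the number of findings keeps the figure a statement about apps: two of the four hypothetical apps have a critical finding, so the rate is 50% even though three critical findings were recorded.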
How a scan runs
- Read-only and non-destructive: GET requests only, bounded, never writing or deleting.
- Both surfaces: the code (or repo) for logic bugs, and the deployed app from the public internet.
- Data exposure is proven, not inferred: we count it only when a single anonymous read against a Supabase table or storage bucket, or a Firebase database, actually returned rows.
- Critical-severity follows the scanner's CWE/OWASP-mapped scale; a finding is counted once per app.
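The "proven, not inferred" rule above can be sketched for the Supabase table case. This is a hedged illustration, not ShipSafe's scanner: it assumes the anonymous request goes to Supabase's REST endpoint (`/rest/v1/<table>`) with the project's public anon key, as a browser client would send it. The function names, the 10-second timeout, and the `limit=1` bound are assumptions made for the example.

```python
import json
import urllib.request
from urllib.parse import quote


def build_probe(project_url: str, anon_key: str, table: str) -> urllib.request.Request:
    """Build one bounded, read-only GET against a table's REST endpoint."""
    base = project_url.rstrip("/")
    # limit=1 keeps the read bounded: one row is enough to prove exposure.
    url = f"{base}/rest/v1/{quote(table, safe='')}?select=*&limit=1"
    headers = {"apikey": anon_key, "Authorization": f"Bearer {anon_key}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def is_proven_exposure(status: int, body: bytes) -> bool:
    """True only if the anonymous read actually returned at least one row."""
    if status != 200:
        return False
    try:
        rows = json.loads(body)
    except ValueError:  # not valid JSON (or not decodable text)
        return False
    return isinstance(rows, list) and len(rows) > 0


def probe(project_url: str, anon_key: str, table: str, timeout: float = 10.0) -> bool:
    """Send the single GET; any network or HTTP error counts as 'not proven'."""
    request = build_probe(project_url, anon_key, table)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_proven_exposure(response.status, response.read())
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False
```

Note the conservative defaults in this sketch: an empty list, an error status, or a timeout is never reported as exposure, which matches the principle that only a response containing rows counts.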
Scope and limits
These are point-in-time, automated scans of a sample of apps. They describe how often a class of issue appears across AI-built apps, not the security of any single app. A passing scan is not a guarantee, a penetration test, or a compliance audit. New code introduces new risk, so re-scan after changes.