AI coding personalities influence the quality of generated code
In August 2025, Sonar published a groundbreaking report titled "The Coding Personalities of Leading LLMs," shedding light on the distinctive coding styles of five prominent large language models (LLMs). These unique personalities, according to the report, significantly impact the security, reliability, and maintainability of the code they generate.
Key Findings
Distinct Model Archetypes
Each of the five models evaluated (Anthropic's Claude Sonnet 4 and Claude 3.7 Sonnet, OpenAI's GPT-4o, Meta's Llama 3.2-vision:90b, and OpenCoder-8B) has a measurable coding style, or personality, that influences how it structures and documents code, how verbose it is, and how complex its output becomes.
Security Implications
All models generate code with significant security flaws; common issues include hard-coded credentials and path traversal vulnerabilities. Severity varies: over 70% of Llama 3.2-vision:90b's vulnerabilities are rated at the ‘blocker’ level, versus 62.5% for GPT-4o and nearly 60% for Claude Sonnet 4. Stronger problem-solving performance also tends to come with more severe bugs: Claude Sonnet 4's improved pass rate, for example, was accompanied by a 93% rise in high-severity bugs.
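To make these two flaw classes concrete, here is a minimal Python sketch. It is illustrative only, not code from the report; the file-reading scenario and function names are invented:

```python
# Illustrative only: a hypothetical file handler showing the two flaw
# categories the report highlights, plus a common mitigation.
import os

API_KEY = "sk-live-123456"  # hard-coded credential: secrets belong in a vault or env var

def read_user_file(base_dir: str, filename: str) -> bytes:
    # Path traversal: a filename like "../../etc/passwd" escapes base_dir
    # because the joined path is never validated.
    with open(os.path.join(base_dir, filename), "rb") as f:
        return f.read()

def read_user_file_safely(base_dir: str, filename: str) -> bytes:
    # Mitigation: resolve the path and confirm it stays inside base_dir.
    target = os.path.realpath(os.path.join(base_dir, filename))
    if not target.startswith(os.path.realpath(base_dir) + os.sep):
        raise ValueError("path escapes the allowed directory")
    with open(target, "rb") as f:
        return f.read()
```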
Reliability and Maintainability
The models share strong syntactic reliability: a high percentage of their generated code compiles and runs successfully, and Claude Sonnet 4 achieves the highest functional correctness with a 95.57% HumanEval score. However, over 90% of the issues in generated code across all models are ‘code smells’, symptoms of poor structure and design that degrade maintainability and accumulate technical debt.
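As a hedged illustration of what a ‘code smell’ means in practice (the example below is invented, not drawn from the report), consider logic that runs correctly, and so passes functional benchmarks, yet resists change:

```python
# Illustrative only: a typical "code smell" of the kind static analysis flags.

def price(kind, amount):
    # Smell: magic numbers and duplicated branches instead of a lookup table.
    if kind == "book":
        return amount * 0.9
    elif kind == "ebook":
        return amount * 0.9
    elif kind == "audiobook":
        return amount * 0.85
    return amount

DISCOUNTS = {"book": 0.9, "ebook": 0.9, "audiobook": 0.85}

def price_clean(kind: str, amount: float) -> float:
    # Same behavior, expressed as data: one place to change when rates move.
    return amount * DISCOUNTS.get(kind, 1.0)
```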
Impact of Coding Personality on Use
Understanding each LLM's personality helps developers and organizations apply models more safely and effectively. Because models vary in verbosity, complexity, and documentation style, choosing the appropriate model for a given coding task and rigorously verifying AI-generated code are crucial to mitigating risk.
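As one lightweight illustration of such verification (a sketch only, and no substitute for a full static-analysis pipeline such as Sonar's products), a pre-review script along these lines could flag hard-coded secrets in AI-generated Python:

```python
# Illustrative only: scans Python source for one flaw class the report
# describes, i.e., string literals assigned to secret-looking variable names.
import ast
import sys

SUSPICIOUS = ("password", "passwd", "secret", "api_key", "token")

def find_hardcoded_secrets(source: str) -> list[int]:
    """Return line numbers of assignments like API_KEY = "..."."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            for target in node.targets:
                if isinstance(target, ast.Name) and any(
                        word in target.id.lower() for word in SUSPICIOUS):
                    hits.append(node.lineno)
    return hits

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            for line in find_hardcoded_secrets(f.read()):
                print(f"{path}:{line}: possible hard-coded secret")
```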
Other Findings
- Generative AI coding models have strengths and weaknesses that track their coding styles. For example, Claude 3.7 Sonnet is dubbed "the balanced predecessor" for its capable benchmark pass rate and high comment density, even though it still produces a high proportion of "Blocker"-severity vulnerabilities.
- Claude Sonnet 4 is labeled "the senior architect" for its high benchmark pass rate and its verbose, complex coding style.
- The models often introduce resource leaks, such as file streams that are never closed, because they lack contextual awareness of how the surrounding application manages its resources (see the sketch after this list).
- Path traversal and injection vulnerabilities are the most common flaws across the evaluated LLMs; for Claude Sonnet 4 they account for 34.04 percent of its vulnerabilities.
- The models emit hard-coded secrets, such as passwords, partly because the same flaw appears throughout their training data.
- Open-source models are not exempt: over 60% of OpenCoder-8B's vulnerabilities are rated at the ‘blocker’ level.
- Software generated by models such as Meta's Llama 3.2-vision:90b can contain common flaws like path traversal vulnerabilities, compromising the security of the application as a whole.
- To keep AI-produced software secure, reliable, and maintainable, developers should weigh each model's coding personality, insist on clear and readable AI-generated code, and enforce strict verification processes supported by automated analysis tools.
- Sonar argues that understanding the unique characteristics of AI-generated code is necessary for software developers to ensure code security, reliability, and maintainability.
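The unclosed-stream pattern described above looks roughly like this in Python. This is a hedged sketch for illustration; the report does not publish example code, and the function names here are invented:

```python
# Illustrative only: the resource-leak pattern attributed to LLM output,
# and the idiomatic fix.

def count_lines_leaky(path: str) -> int:
    # Leak: if readlines() raises, or on Python runtimes without prompt
    # reference counting, the file handle is never reliably closed.
    f = open(path)
    return len(f.readlines())

def count_lines(path: str) -> int:
    # Fix: a context manager guarantees the handle is closed on every path.
    with open(path) as f:
        return sum(1 for _ in f)
```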
Conclusion
While all leading LLMs possess strong coding capabilities, their unique personalities shape distinct patterns of output that affect the security posture, reliability, and long-term maintainability of AI-generated software. Developers should adopt a “trust but verify” approach supplemented with governance tools to harness each model’s strengths while managing its risks.