Security & OWASP Inside RSI: Why Self-Modifying AI Needs Vulnerability Detection

A self-modifying AI system carries a security risk that no static codebase has: it can introduce vulnerabilities not through deliberate action but through statistical pattern-matching. RSI (Recursive Self-Improvement) generates new code by learning from existing code. If the existing codebase contains examples of SQL string interpolation, unsanitized HTML output, or shell commands constructed from user input, even if those examples are buried in comments or test fixtures, RSI's pattern-matching may replicate those patterns in the code it generates. The security analysis component (Component 8 of the RSI Safety System) exists to catch these replications before they enter production.

This is a fundamentally different threat model from the one that traditional static analysis tools address. A traditional SAST tool assumes that a human wrote the code and scans for known-bad patterns. Component 8 of the RSI Safety System does the same scanning, but with the understanding that the code was generated by an AI that may have absorbed insecure patterns from training data — and that the AI runs continuously, potentially introducing vulnerabilities at a rate no human code review process could keep pace with.

"When an AI system modifies its own code, it can inadvertently introduce security vulnerabilities — not through malice but through pattern-matching to known coding styles that happen to be insecure. The security analysis runs BEFORE any RSI modification is allowed to proceed."

Component 8: Scope and Scale

Component 8 is implemented in 700+ lines of code and passes all 10 of its tests. Its documentation, SECURITY_ANALYSIS_IMPLEMENTATION.md, runs to 1,200+ lines, making it the longest documentation file in the entire RSI Safety System. The length is deliberate: in a system that can modify itself autonomously, security cannot be an afterthought, and the documentation reflects the care with which the security analysis was designed.

Metric	Value	Note
Lines of Code	700+	Component 8
Tests Passing	10/10	100% pass rate
Detection Patterns	60+	regex-based
Doc Lines	1,200+	longest in RSI system

The 8 API Endpoints

Component 8 exposes a complete security analysis API under the /api/code-intelligence/security-analysis/ namespace:

POST /api/code-intelligence/security-analysis/analyze
GET  /api/code-intelligence/security-analysis/results/:analysisId
POST /api/code-intelligence/security-analysis/batch
GET  /api/code-intelligence/security-analysis/stats
POST /api/code-intelligence/security-analysis/dependency-scan
GET  /api/code-intelligence/security-analysis/vulnerabilities
POST /api/code-intelligence/security-analysis/taint-trace
GET  /api/code-intelligence/security-analysis/owasp-coverage

The /analyze endpoint accepts a code snippet or file path and returns a full vulnerability report. The /batch endpoint accepts multiple files simultaneously, enabling RSI to scan all files in a proposed modification set before committing any of them. The /taint-trace endpoint performs a targeted taint analysis from a specified source to a specified sink. The /owasp-coverage endpoint returns which OWASP Top 10 categories are covered by the current detection pattern set.
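As a rough illustration of how a caller might prepare requests for these endpoints, the sketch below only builds request descriptors and sends nothing. The endpoint paths come from the list above; the payload shape (`files`, `path`, `content`) and the helper names are assumptions, since the request format is not specified here.

```javascript
// Base path taken from the endpoint list above.
const BASE = '/api/code-intelligence/security-analysis';

// Build (but do not send) a request descriptor for the /batch endpoint.
// ASSUMPTION: the payload shape { files: [{ path, content }] } is hypothetical.
function buildBatchRequest(files) {
  return {
    method: 'POST',
    url: `${BASE}/batch`,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ files }),
  };
}

// Build the URL for fetching a stored report by its analysis id.
function resultUrl(analysisId) {
  return `${BASE}/results/${encodeURIComponent(analysisId)}`;
}
```

A caller could pass the descriptor to any HTTP client; keeping construction separate from sending makes the request easy to inspect before a modification set is submitted.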

OWASP Top 10 Coverage

The security analysis is organized around OWASP Top 10 vulnerabilities — the industry-standard taxonomy of the most critical web application security risks. Coverage across the OWASP categories:

Vulnerability Class	CWE	Detection Method	RSI Severity
SQL Injection	CWE-89	String interpolation in query context	CRITICAL
Cross-Site Scripting (XSS)	CWE-79	Unsanitized output to HTML context	HIGH
OS Command Injection	CWE-78	exec/spawn with user-controlled data	CRITICAL
Path Traversal	CWE-22	../ in file path construction	HIGH
Hardcoded Credentials	CWE-798	API keys, passwords in source	CRITICAL
Insecure Randomness	CWE-338	Math.random() for security context	MEDIUM
Prototype Pollution	CWE-1321	Object.assign with user-controlled data	HIGH
ReDoS	CWE-1333	Catastrophic backtracking regex patterns	MEDIUM

Each vulnerability finding includes the CWE ID and OWASP category for standards compliance. This matters for two reasons: it enables automated reporting in standards-compliant formats (SAST report formats, compliance dashboards), and it creates a vocabulary that security reviewers can reason about independently of the specific implementation details.
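One way to picture a finding that carries its standards metadata is a small lookup keyed by vulnerability class. The CWE IDs and severities below are copied from the table above; the class identifiers, the `makeFinding` helper, and the record shape are assumptions for illustration, and the OWASP category field is left out because the exact mapping is not spelled out here.

```javascript
// CWE IDs and severities copied from the coverage table.
// ASSUMPTION: the keys and the record shape are hypothetical.
const VULNERABILITY_CLASSES = {
  'sql-injection': { cwe: 'CWE-89', severity: 'CRITICAL' },
  'xss': { cwe: 'CWE-79', severity: 'HIGH' },
  'command-injection': { cwe: 'CWE-78', severity: 'CRITICAL' },
  'path-traversal': { cwe: 'CWE-22', severity: 'HIGH' },
  'hardcoded-credentials': { cwe: 'CWE-798', severity: 'CRITICAL' },
  'insecure-randomness': { cwe: 'CWE-338', severity: 'MEDIUM' },
  'prototype-pollution': { cwe: 'CWE-1321', severity: 'HIGH' },
  'redos': { cwe: 'CWE-1333', severity: 'MEDIUM' },
};

// Attach the CWE ID and severity to a raw detection.
function makeFinding(classId, file, line) {
  const meta = VULNERABILITY_CLASSES[classId];
  if (!meta) throw new Error(`Unknown vulnerability class: ${classId}`);
  return { classId, cwe: meta.cwe, severity: meta.severity, file, line };
}
```

Because every finding resolves to a CWE ID through one table, a report exporter or compliance dashboard can consume findings without knowing which regex produced them.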

The 60+ Detection Patterns

The 60+ detection patterns are organized by vulnerability class. SQL injection detection patterns cover string template literals in database query contexts, string concatenation with variable data in query strings, and unparameterized query construction. XSS patterns cover template literal injection into HTML, innerHTML assignment with user-controlled data, and document.write with dynamic content.
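The SQL injection and XSS patterns described above could look something like the following line-based sketch. These four regexes and their ids are illustrative assumptions, not Component 8's actual pattern set, and a line-based scan like this will miss multi-line constructs.

```javascript
// ASSUMPTION: illustrative patterns only; ids and regexes are hypothetical.
const PATTERNS = [
  {
    id: 'sql-template-literal',
    cwe: 'CWE-89',
    // query(`... ${value} ...`): interpolation inside the SQL text
    regex: /\.(?:query|execute)\s*\(\s*`[^`]*\$\{/,
  },
  {
    id: 'sql-string-concat',
    cwe: 'CWE-89',
    // query('...' + value): SQL text built by concatenation
    regex: /\.(?:query|execute)\s*\(\s*["'][^"']*["']\s*\+/,
  },
  {
    id: 'xss-innerhtml',
    cwe: 'CWE-79',
    // innerHTML assigned anything other than a plain string literal
    regex: /\.innerHTML\s*\+?=(?!=)(?!\s*["'][^"']*["']\s*;?\s*$)/,
  },
  {
    id: 'xss-document-write',
    cwe: 'CWE-79',
    // document.write called with anything other than a plain string literal
    regex: /\bdocument\.write(?:ln)?\s*\((?!\s*["'][^"']*["']\s*\))/,
  },
];

// Scan source text line by line and report every pattern hit.
function scan(source) {
  const findings = [];
  source.split('\n').forEach((text, index) => {
    for (const { id, cwe, regex } of PATTERNS) {
      if (regex.test(text)) findings.push({ id, cwe, line: index + 1 });
    }
  });
  return findings;
}
```

In this sketch a parameterized call such as `db.query('... WHERE id = ?', [userId])` matches neither SQL pattern, because the SQL text is a plain literal with no interpolation or concatenation.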

Command injection patterns are particularly important for an AI platform with autonomous agents. The organisms (KAALI, ALICE, UNI) execute shell commands as part of their autonomy — running tests, managing files, querying the filesystem. If RSI generates code that constructs shell commands from user input, the command injection vulnerability is potentially exploitable by any user of the platform. The detection patterns flag any use of exec, spawn, or execSync where the command string contains variables that could be user-controlled.

Why Command Injection is Especially Dangerous on an AI Platform

Autonomous agents that execute shell commands are a necessary feature of a self-improving system. They are also a high-value target for command injection. An attacker who can inject shell commands into an AI organism's execution context has effectively gained code execution on the host. Component 8's command injection patterns are therefore treated as CRITICAL severity — any RSI-generated code that constructs shell commands from user-controlled data is blocked unconditionally.
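A minimal sketch of the command injection check might treat any exec, execSync, spawn, or spawnSync call whose first argument is not a static string as suspect. The helper name and both regexes are assumptions; the check is deliberately conservative, works on a single line, and does not resolve whether `exec` actually refers to Node's child_process module.

```javascript
// Capture the call name and the text of the first argument.
const SHELL_CALL = /\b(exec|execSync|spawn|spawnSync)\s*\(\s*([^,)]*)/;
// A static command: a quoted literal, or a template literal with no ${...}.
const STATIC_STRING = /^(?:'[^']*'|"[^"]*"|`[^`$]*`)$/;

// ASSUMPTION: hypothetical helper, not Component 8's real detector.
// Returns null when the command is a static string, otherwise a description.
function findCommandInjection(line) {
  const match = SHELL_CALL.exec(line);
  if (!match) return null;
  const firstArgument = match[2].trim();
  if (STATIC_STRING.test(firstArgument)) return null;
  return { call: match[1], argument: firstArgument, severity: 'CRITICAL' };
}
```

Under this heuristic, a call like `spawn('ls', ['-la', dir])` passes because the command itself is fixed and the variable travels in the argument array, which is also the usual remediation: prefer `spawn` or `execFile` with an argument array over building a shell string.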

Taint Tracking: From Source to Sink

Pattern matching catches known-bad code shapes. Taint tracking catches a broader class of vulnerabilities: those where user-controlled data flows through multiple assignment and transformation steps before reaching a dangerous sink. A vulnerability that traverses four variable assignments before reaching a database query cannot be detected by a single-line pattern. Taint tracking follows the data flow.

The taint sources are the entry points for user-controlled data in a Node.js/Express application: req.body, req.query, req.params, and req.headers. Data from these sources is marked as "tainted." Every assignment from a tainted variable propagates the taint to the new variable. Every function call that receives a tainted argument propagates the taint to the function's return value if the function does not sanitize its input.

The canonical SQL injection taint flow:

Taint tracking: SQL injection example (JavaScript)

// Source: req.body.userId (tainted)
const userId = req.body.userId;              // taint propagates

// Intermediate: assignment (taint follows)
const query = `SELECT * FROM users WHERE id = ${userId}`;  // DANGER

// Sink: database query (CRITICAL: SQL injection)
db.query(query);  // ← tainted data reaches DB sink unsanitized

Each hop in the taint chain is recorded. The taint report produced by Component 8 includes: the original taint source (line number, variable name), the full chain of assignments (with line numbers), and the sink where the tainted data arrives. This chain is the evidence that a human reviewer needs to understand the vulnerability and design the correct fix — whether parameterized queries, input validation, or output encoding.
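The source-to-sink chain described above can be sketched with a toy, line-based tracker. It is a simplification under stated assumptions: it only understands single-line `const`/`let`/`var` assignments, treats `db.query` as the only sink, inspects only the first argument of the sink, and all names (`traceTaint`, `identifiers`, the report fields) are hypothetical.

```javascript
// Taint sources named in the text: req.body, req.query, req.params, req.headers.
const SOURCE = /\breq\.(?:body|query|params|headers)\b/;
// "const x = <expr>;" with an optional trailing // comment.
const ASSIGNMENT = /^\s*(?:const|let|var)\s+([A-Za-z_$][\w$]*)\s*=\s*(.+?);?\s*(?:\/\/.*)?$/;
// Only the first argument (the SQL text) of db.query(...) is treated as the sink.
const SINK = /\bdb\.query\s*\(\s*([^,)]*)/;

// Identifiers that appear as code: plain string literals are dropped and
// template literals contribute only their ${...} parts.
function identifiers(expr) {
  const code = expr
    .replace(/'[^']*'|"[^"]*"/g, ' ')
    .replace(/`[^`]*`/g, (tpl) =>
      [...tpl.matchAll(/\$\{([^}]*)\}/g)].map((m) => m[1]).join(' '));
  return code.match(/[A-Za-z_$][\w$]*/g) || [];
}

// Walk the source top to bottom, recording each hop of the taint chain.
function traceTaint(source) {
  const tainted = new Map(); // variable name -> chain of hops that tainted it
  const reports = [];
  source.split('\n').forEach((text, index) => {
    const line = index + 1;
    const assignment = ASSIGNMENT.exec(text);
    if (assignment) {
      const [, name, expr] = assignment;
      if (SOURCE.test(expr)) {
        tainted.set(name, [{ line, variable: name, kind: 'source' }]);
      } else {
        const parent = identifiers(expr).find((id) => tainted.has(id));
        if (parent) {
          tainted.set(name, [
            ...tainted.get(parent),
            { line, variable: name, kind: 'assignment' },
          ]);
        }
      }
    }
    const sink = SINK.exec(text);
    if (sink) {
      const argument = identifiers(sink[1]).find((id) => tainted.has(id));
      if (argument) {
        reports.push({ sink: 'db.query', sinkLine: line, chain: tainted.get(argument) });
      }
    }
  });
  return reports;
}
```

Run against the three statements of the canonical flow, this sketch reports one finding whose chain has two hops (the `userId` source and the `query` assignment) ending at the `db.query` sink, while a parameterized call with the same `userId` in the values array produces no report.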

Sanitization Recognition

Taint tracking without sanitization recognition produces false positives: it would flag code that correctly sanitizes user input before using it. Component 8 includes a library of recognized sanitization functions that break the taint chain:

Recognized Sanitization Breaks

SQL: parameterized queries (db.query('SELECT ... WHERE id = ?', [userId])), ORM methods (.findById(userId)).
XSS: DOMPurify, escapeHtml, sanitize-html.
Path: path.resolve with validation, path.normalize + absolute path check.
Command: shell-escape, input validation against allowlist.

If tainted data passes through a recognized sanitization function before reaching a sink, the taint chain is broken and no vulnerability is reported.

The sanitization library is intentionally conservative: only well-known, widely-used sanitization functions are recognized. Custom sanitization functions are not automatically trusted — they require explicit allowlisting in the security analysis configuration. This prevents a common category of vulnerability where developers implement their own sanitization that is subtly incomplete.
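The allowlisting rule can be sketched as a small registry: a fixed set of recognized sanitizers plus whatever the configuration adds explicitly. The call-site spellings below (`DOMPurify.sanitize`, `escapeHtml`, `sanitizeHtml`, `shellescape`) and the `createSanitizerRegistry` helper are assumptions about how these libraries might appear in code, not Component 8's real configuration format.

```javascript
// ASSUMPTION: call names as they might appear at call sites.
const BUILT_IN_SANITIZERS = ['DOMPurify.sanitize', 'escapeHtml', 'sanitizeHtml', 'shellescape'];

// A single call with no nested parentheses, e.g. "escapeHtml(comment)".
const SINGLE_CALL = /^\s*([A-Za-z_$][\w$.]*)\s*\(([^()]*)\)\s*;?\s*$/;

function createSanitizerRegistry(customAllowlist = []) {
  const recognized = new Set([...BUILT_IN_SANITIZERS, ...customAllowlist]);
  return {
    // True only when the whole expression is one call to a recognized sanitizer.
    breaksTaint(expression) {
      const call = SINGLE_CALL.exec(expression);
      return call !== null && recognized.has(call[1]);
    },
  };
}
```

Requiring the whole expression to be a single recognized call keeps the sketch conservative: `escapeHtml(a) + b` does not break the chain, because `b` may still be tainted, and a custom function such as `myClean` is trusted only after it is passed in the allowlist.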

npm Audit Integration

Code-level vulnerabilities are only one dimension of the security surface. Dependencies introduce a second dimension: known CVEs in npm packages. An RSI modification that is itself clean may run on a dependency chain that contains a critical CVE, making the modification dangerous even though the generated code is correct.

Before any RSI modification is permitted, the dependency scan runs npm audit and parses the output. Critical CVEs in direct or transitive dependencies block RSI regardless of the modification's own security status. High CVEs generate a warning that must be acknowledged before proceeding. The dependency scan also checks for packages that have been deprecated or withdrawn from npm — another common source of supply-chain vulnerabilities.

npm Audit Integration Logic

CRITICAL CVE in any dependency → RSI blocked.
HIGH CVE in any dependency → Warning, requires explicit acknowledgment.
MODERATE CVE → Logged, RSI proceeds.
LOW CVE → Logged silently.

The threshold for blocking is conservative by design: a CRITICAL CVE in a transitive dependency may not be directly exploitable, but RSI cannot make that judgment accurately and conservatism is correct here.
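The gating logic above can be sketched as a function over the parsed output of `npm audit --json`. It assumes the report exposes per-severity counts under `metadata.vulnerabilities`, as recent npm versions do; the function name, the `acknowledgedHigh` option, and the action labels are hypothetical.

```javascript
// Decide the dependency gate from a parsed `npm audit --json` report.
// ASSUMPTION: action names and the acknowledgedHigh option are hypothetical.
function dependencyGate(auditReport, { acknowledgedHigh = false } = {}) {
  const counts = (auditReport.metadata && auditReport.metadata.vulnerabilities) || {};
  const critical = counts.critical || 0;
  const high = counts.high || 0;
  const logged = { moderate: counts.moderate || 0, low: counts.low || 0 };

  if (critical > 0) {
    return { action: 'BLOCK', reason: `${critical} critical advisories`, logged };
  }
  if (high > 0 && !acknowledgedHigh) {
    return { action: 'AWAIT_ACKNOWLEDGMENT', reason: `${high} high advisories`, logged };
  }
  // Moderate and low advisories are only logged; RSI proceeds.
  return { action: 'PROCEED', reason: 'no blocking advisories', logged };
}
```

Reading the summary counts rather than individual advisories keeps the gate simple; a fuller version would presumably also list which packages triggered the block so a reviewer can act on it.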

Severity Classification and RSI Gating

The severity classification system has four levels — CRITICAL, HIGH, MEDIUM, LOW — with specific RSI gating behavior at each level:

Severity	RSI Gating
CRITICAL	BLOCKS
HIGH	BLOCKS
MEDIUM	WARNS
LOW	LOGS

For code-level findings, CRITICAL and HIGH block RSI: the modification cannot proceed until the vulnerability is remediated. MEDIUM findings generate a warning that must be acknowledged but do not block RSI. LOW findings are logged for review but require no immediate action. (Dependency CVEs follow the separate npm audit thresholds described earlier, where only CRITICAL blocks.) This tiering is calibrated to the risk profile of each severity level: a SQL injection (CRITICAL) introduced by an autonomous AI cannot wait for the next code review cycle, whereas a use of Math.random() in a non-security context (LOW) can be addressed during normal development.
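The four-level gating can be expressed as a lookup plus a "strictest action wins" rule. The mapping follows the levels described above; the function name, the `ALLOW` result for an empty finding list, and the choice to fail closed on an unrecognized severity are assumptions.

```javascript
// Severity-to-action mapping as described in the text.
const ACTION_BY_SEVERITY = { CRITICAL: 'BLOCK', HIGH: 'BLOCK', MEDIUM: 'WARN', LOW: 'LOG' };
// Actions ordered from least to most strict.
const STRICTNESS = ['ALLOW', 'LOG', 'WARN', 'BLOCK'];

// Return the strictest action required by any finding.
// ASSUMPTION: unknown severities fail closed, and no findings means ALLOW.
function gateModification(findings) {
  let action = 'ALLOW';
  for (const finding of findings) {
    const next = ACTION_BY_SEVERITY[finding.severity] || 'BLOCK';
    if (STRICTNESS.indexOf(next) > STRICTNESS.indexOf(action)) action = next;
  }
  return { action, requiresAcknowledgment: action === 'WARN' };
}
```

A single HIGH finding among many LOW ones is enough to block, which matches the intent that the gate reflects the worst finding rather than an average.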

Why Security Analysis Is Inside RSI, Not External

A reasonable engineering choice would be to run security analysis as an external CI/CD step, scanning the entire codebase after each commit. Component 8 is inside RSI's modification pipeline rather than outside it — this is a deliberate architectural decision with specific justification.

An external CI/CD security scan runs after the code is committed. If RSI commits a modification with a SQL injection vulnerability, the external scan catches it, but the modification is already in the repository. Rolling it back requires a revert commit, which RSI may have already built upon. The state management problem compounds quickly when a self-modifying system generates multiple modifications per hour.

"The security analysis is a prerequisite for self-modification. RSI generates code by pattern-matching to existing code. If RSI has seen insecure patterns in the codebase, it might replicate those patterns. The security analysis catches this before it enters production."

By running inside RSI's pre-modification pipeline, Component 8 prevents vulnerable code from ever entering the repository. The modification is analyzed before it is written to disk, before it is staged, before it is committed. If the analysis finds a CRITICAL vulnerability, the modification is discarded entirely and RSI generates an alternative approach. The repository remains clean.
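The ordering described above, analysis first and disk write only afterwards, can be sketched as a thin wrapper around two injected functions. Both `analyze` and `write` are placeholders supplied by the caller, and treating CRITICAL and HIGH as the blocking levels follows the gating described earlier; none of these names come from the actual RSI pipeline.

```javascript
// Analyze the proposed content in memory; write it only if nothing blocks.
// ASSUMPTION: analyze(content) returns findings with a `severity` field,
// and write(path, content) persists the modification.
function applyModification(modification, { analyze, write }) {
  const findings = analyze(modification.content);
  const blocking = findings.filter(
    (finding) => finding.severity === 'CRITICAL' || finding.severity === 'HIGH'
  );
  if (blocking.length > 0) {
    // Nothing has touched the disk, so there is nothing to revert.
    return { applied: false, blocking };
  }
  write(modification.path, modification.content);
  return { applied: true, blocking: [] };
}
```

Because the write is the last step, a rejected modification leaves no trace in the repository, which is the property an after-the-fact CI scan cannot provide.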

The Pattern-Matching Vulnerability Specific to AI

The threat model for Component 8 is worth stating precisely because it is genuinely different from the threat model for SAST tools in a typical development pipeline. A human developer who writes a SQL injection vulnerability typically does so by mistake: they were not thinking about parameterized queries at that moment. Such a developer has no systematic tendency to produce SQL injections; they produce one here and there, and code review usually catches them.

An AI system that learns from code has a different failure mode. If the training corpus or the existing codebase contains instances of SQL string interpolation — even in test files, migration scripts, or examples in comments — the AI's pattern-matching may assign non-negligible probability to string interpolation as the "normal" way to construct database queries. When RSI generates a new database query, it may reach for string interpolation not because of a momentary lapse but because that pattern has a learned prior probability from the code it has seen.

The AI Pattern-Replication Risk

This is not a hypothetical risk. Language models routinely produce SQL string interpolation, unescaped HTML output, and hardcoded API keys when generating code examples, because these patterns appear in training data. An AI system that generates code from a corpus containing any such examples will occasionally produce them. Component 8 treats this as a near-certainty at scale: with RSI generating hundreds of modifications, some percentage will contain learned-insecure patterns, and the security analysis is the gate that has to catch them before they are written.

The 1,200-Line Documentation Standard

The SECURITY_ANALYSIS_IMPLEMENTATION.md documentation at 1,200+ lines is the most detailed documentation in the RSI Safety System. This length reflects a deliberate standard: security components in a self-modifying system must be documented with enough detail that any engineer can understand what is being checked, what is being missed, and what assumptions the analysis makes.

The documentation covers: the full list of 60+ detection patterns with examples of code that triggers each pattern and code that does not; the taint tracking algorithm including edge cases; the sanitization library with justifications for each recognized function; the severity classification criteria; the RSI gating logic; and a section on known limitations — patterns that the system does not yet detect, false positive scenarios, and conditions under which the security analysis may be bypassed intentionally.

The known limitations section is particularly important for honesty. Component 8 does not detect: second-order SQL injection (where tainted data is stored and later retrieved unsanitized), timing-based vulnerabilities, cryptographic weaknesses in algorithm selection, or SSRF (Server-Side Request Forgery) without explicit domain allowlist checking. These gaps are documented not to minimize the system's capability but to be precise about what it guarantees and what it does not.

Security as the Most Important RSI Prerequisite

Among the safety components of the RSI system (performance regression detection, dependency validation, security analysis, and others), security analysis receives the longest documentation and the most conservative gating thresholds. This emphasis reflects the consequence structure: a performance regression can be measured and reverted. A security vulnerability that ships to production can be exploited before it is detected, and the consequences of exploitation (data exposure, privilege escalation, user harm) cannot be undone by reverting a commit.

The 700+ lines of implementation and 1,200+ lines of documentation represent the engineering investment appropriate to that consequence structure. Component 8 is not the most complex component in the RSI system. It is the most consequential, and the documentation and implementation scale accordingly.