Introduction
Information disclosure occurs when an application unintentionally exposes sensitive data to users. This is not always the result of a direct attack; often it is a side effect of poor design, verbose error handling, or incomplete configuration.
The leaked data can take many forms. It might include personal information about other users, sensitive business data such as revenue figures or unreleased features, or technical details about the underlying infrastructure. Even seemingly harmless information, such as software versions, internal IP addresses, or stack traces, can provide attackers with the context they need to identify and exploit other vulnerabilities.
Testing for Information Disclosure
Information disclosure can surface anywhere in an application. Finding these leaks requires both passive and active techniques across multiple layers of the stack.
Passive Reconnaissance
Start by searching for information that has already been leaked outside the application’s direct control.
- Search engines and archives: Look for exposed documents or old content indexed by search engines and services like the Wayback Machine.
- Code repositories: Search GitHub, GitLab, and public paste sites for source code, configuration files, or internal documentation that was unintentionally published.
- Subdomain enumeration: Discover subdomains that may host development environments, staging sites, or administrative panels with weaker security controls.
Active Discovery
Once the application has been mapped, actively check for hidden or unlinked resources.
- Fuzzing: Brute-force subdomains, URL paths, parameters, and API endpoints. A hidden endpoint or parameter may return raw data that should not be exposed.
- Directory listings: Check whether directory indexing is enabled on web servers. An open directory may expose configuration files, backups, or logs.
- Known files: Check for standard files such as robots.txt, sitemap.xml, .htaccess, or crossdomain.xml. These files can reveal the existence of hidden paths or access policies.
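As a rough illustration of the known-files check, the sketch below pulls the paths named in a robots.txt body and turns them into candidate URLs for manual review. It assumes the file has already been fetched; the function names, the SAMPLE content, and the example.test host are hypothetical and exist only for this example.

```python
from urllib.parse import urljoin


def paths_from_robots(robots_txt: str) -> list[str]:
    """Return the unique paths named in Allow/Disallow rules, in file order."""
    paths: list[str] = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # Only Allow/Disallow rules carry paths; an empty value names nothing.
        if field.strip().lower() in ("allow", "disallow") and value:
            if value not in paths:
                paths.append(value)
    return paths


def candidate_urls(base_url: str, robots_txt: str) -> list[str]:
    """Join each discovered path onto the target's base URL."""
    return [urljoin(base_url, path) for path in paths_from_robots(robots_txt)]


# Hypothetical robots.txt content used only for illustration.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/  # old dumps
Allow: /public/
Disallow:
Sitemap: https://example.test/sitemap.xml
"""

if __name__ == "__main__":
    for url in candidate_urls("https://example.test", SAMPLE):
        print(url)
```

Paths that a site asks crawlers to skip are not necessarily sensitive, so the output is best treated as a list of places to look rather than as findings.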
Error Handling Analysis
Applications often leak sensitive information through error messages.
- Trigger errors intentionally: Submit malformed input, invalid parameters, or unexpected data types to provoke error responses.
- Analyze verbose errors: Look for stack traces, database schema details, internal file paths, or variable names in error messages. These details provide insight into the application’s architecture and dependencies.
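One way to triage many error responses is to scan each body for patterns that typically signal a verbose error. The sketch below is a minimal example under that assumption; the pattern names and regular expressions are illustrative choices, not an exhaustive or authoritative list, and they will produce both false positives and misses.

```python
import re

# Illustrative indicators only; extend these for the stack under test.
LEAK_PATTERNS = {
    "python_traceback": re.compile(r"Traceback \(most recent call last\)"),
    "java_stack_frame": re.compile(r"\bat [\w.$]+\([\w$]+\.java:\d+\)"),
    "sql_error": re.compile(r"SQL syntax|ORA-\d{5}|SQLSTATE\[", re.IGNORECASE),
    "unix_file_path": re.compile(r"(?:/[\w.-]+){2,}\.(?:php|py|rb|java|js)\b"),
}


def find_leaks(body: str) -> list[str]:
    """Return the names of all indicators that match the response body."""
    return [name for name, pattern in LEAK_PATTERNS.items() if pattern.search(body)]


if __name__ == "__main__":
    # Hypothetical error body of the kind a malformed parameter might provoke.
    sample = (
        "Warning: mysqli_query(): You have an error in your SQL syntax "
        "in /var/www/html/app/db.php on line 42"
    )
    print(find_leaks(sample))
```

Any response that matches deserves a manual read, since the surrounding text often reveals more than the matched fragment.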
Source Code Inspection
The client-side portion of the application is fully visible to an attacker and should be treated as part of the attack surface.
- HTML comments: Developers sometimes leave debugging notes, internal URLs, or credentials in HTML comments.
- JavaScript files: Minified or bundled scripts may still contain hardcoded API keys, internal endpoints, or logic that reveals how the backend operates.
- Backup files: Check for source code backups created by editors (e.g., index.php~, index.php.bak) or version control artifacts.
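The checks above lend themselves to simple tooling. The following sketch, which uses only Python's standard html.parser module, collects HTML comments from a page that has already been downloaded and generates backup filenames to probe. The helper names are hypothetical, and only the ~ and .bak suffixes come from the text; .old and .orig are added as assumptions about other common editor or patch leftovers.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every non-empty HTML comment fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.comments: list[str] = []

    def handle_comment(self, data: str) -> None:
        text = data.strip()
        if text:
            self.comments.append(text)


def extract_comments(html: str) -> list[str]:
    """Return the comments found in an HTML document, in document order."""
    parser = CommentCollector()
    parser.feed(html)
    parser.close()
    return parser.comments


def backup_candidates(filename: str) -> list[str]:
    """Return backup names to request for a known file.

    "~" and ".bak" are the examples from the text; ".old" and ".orig"
    are assumed extras for illustration.
    """
    return [filename + suffix for suffix in ("~", ".bak", ".old", ".orig")]


if __name__ == "__main__":
    page = "<html><!-- TODO: remove staging URL --><body><p>Hi</p></body></html>"
    print(extract_comments(page))
    print(backup_candidates("index.php"))
```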
Version Control Exposure
A .git or .svn directory left in the web root can expose the entire project history.
- Direct access: Attempt to access /.git/ or /.svn/ directly. If directory listing is enabled, it may be possible to download the repository.
- Specialized tools: Tools like git-dumper can reconstruct a full Git repository from an exposed .git directory, revealing source code, commit messages, and potentially credentials.
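Even when directory listing is disabled, individual files inside .git may still be served. A common first probe is to request /.git/HEAD and check whether the response looks like a real HEAD file rather than a custom error page. The sketch below covers only that content check and assumes the body was fetched separately; the function name is hypothetical, and it assumes 40-character SHA-1 object names for the detached-HEAD case.

```python
import re

# A HEAD file is normally either a symbolic ref or, when detached, a commit hash.
_SYMBOLIC_REF = re.compile(r"ref: refs/\S+")
_DETACHED_SHA1 = re.compile(r"[0-9a-f]{40}")  # assumes SHA-1 object names


def looks_like_git_head(body: str) -> bool:
    """Return True if the response body has the shape of a .git/HEAD file."""
    text = body.strip()
    return bool(_SYMBOLIC_REF.fullmatch(text) or _DETACHED_SHA1.fullmatch(text))


if __name__ == "__main__":
    print(looks_like_git_head("ref: refs/heads/main\n"))
    print(looks_like_git_head("<html><body>Not Found</body></html>"))
```

Checking the body rather than only the status code helps avoid false positives from servers that answer every path with a 200 response.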
Remediation
Preventing information disclosure is challenging because it can arise from many different sources: code, configuration, infrastructure, or even human error.
- Define what is sensitive: Everyone involved in the system must understand what information is considered sensitive. Without a clear definition, it is impossible to protect it consistently.
- Audit code and configuration: Regularly review source code for hardcoded secrets, debugging endpoints, and verbose error handling. Ensure that debugging features are disabled in production.
- Harden third-party components: All third-party technology integrated into the system must be properly configured. Default configurations often expose unnecessary information through headers, error pages, or administrative interfaces.
- Minimize error verbosity: Configure applications to return generic error messages to users while logging detailed errors internally for debugging purposes.
- Restrict access to metadata: Disable directory listing, remove version control directories from the web root, and ensure that backup files are not served by the web server.
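The advice on error verbosity can be sketched as a wrapper that logs the full exception internally and returns only a generic message to the caller. This is a minimal, framework-independent illustration; safe_call, the response dictionary shape, and the incident_id field are assumptions made for the example, not part of any particular framework.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


def safe_call(handler, *args, **kwargs) -> dict:
    """Run a request handler and hide exception details from the response."""
    try:
        return {"status": 200, "body": handler(*args, **kwargs)}
    except Exception:
        # The incident ID lets support staff find the log entry
        # without exposing the exception text to the user.
        incident_id = uuid.uuid4().hex
        logger.exception("Unhandled error, incident %s", incident_id)
        return {
            "status": 500,
            "body": {
                "error": "An internal error occurred.",
                "incident_id": incident_id,
            },
        }


if __name__ == "__main__":
    def broken_handler():
        # Hypothetical failure whose message contains sensitive details.
        raise RuntimeError("connection failed for db_user at /srv/app/db.py")

    print(safe_call(broken_handler)["body"]["error"])
```

The stack trace still reaches the application log through logger.exception, so debugging information is preserved while the response body stays generic.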