Add “Common Pitfalls in Input Validation” Section to Strengthen Practical Guidance #2231
sujalavnelavai wants to merge 11 commits into
Add modern MFA attack patterns (vendor-neutral) to MFA Cheat Sheet
This pull request adds a new “Common Pitfalls in Input Validation” section to the Input Validation Cheat Sheet, as discussed in Issue OWASP#2228. The goal of this update is to highlight the mistakes that developers frequently make even when they believe they are validating input correctly. These pitfalls come up repeatedly in real-world assessments, code reviews, and incident investigations, so documenting them helps readers avoid subtle but high-impact vulnerabilities.

The new section covers practical issues such as relying only on client-side validation, using blacklists instead of whitelists, skipping validation after deserialization, trusting internal services too much, unsafe regular expressions, Unicode normalization problems, and assumptions about JSON input. Each point is written in a clear, human tone and includes references to relevant OWASP, NIST, CISA, and Unicode guidance where appropriate.

This addition fits naturally after the “Implementing Input Validation” section and before the allowlisting/denylisting discussion. It strengthens the cheat sheet by giving readers a realistic understanding of where input validation often fails in practice, complementing the existing guidance on how to implement it correctly.
cc @mackowski do you like these?
- Regular expressions for any other structured data covering the whole input string `(^...$)` and **not** using "any character" wildcard (such as `.` or `\S`)
- Denylisting known dangerous patterns can be used as an additional layer of defense, but it should supplement - not replace - allowlisting, to help catch some commonly observed attacks or patterns without relying on it as the main validation method.
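
As an illustration of the whole-string regular expression bullet above, here is a minimal Python sketch. The "product code" format and the names `PRODUCT_CODE` and `is_valid_product_code` are hypothetical, chosen only for this example; they are not part of the cheat sheet.

```python
import re

# Hypothetical allowlist for a product code such as "AB-1234".
# The format is an assumption made only for this illustration.
PRODUCT_CODE = re.compile(r"[A-Z]{2}-[0-9]{4}")


def is_valid_product_code(value: str) -> bool:
    """Accept the value only if the entire string matches the allowlist."""
    # fullmatch anchors at both ends, so a trailing newline or appended
    # text cannot slip through the way it can with `$` and re.match.
    return PRODUCT_CODE.fullmatch(value) is not None
```

The pattern uses explicit character classes (`[0-9]` rather than `\d`, and no `.` wildcard), so non-ASCII digits and unexpected characters are rejected by construction.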
## Common Pitfalls in Input Validation

This section is placed immediately before the existing ### Allowlist vs Denylist section, and several of the pitfalls below duplicate guidance already in this document. Consider merging these into the relevant existing sections rather than creating a parallel structure that readers will encounter twice.
### Relying Only on Client-Side Validation

Client-side checks (like JavaScript validation or HTML5 rules) are helpful for user experience, but they’re not security controls. Anyone can bypass them by turning off scripts, modifying requests, or using tools like curl or Burp. The server must always validate the final input. [OWASP Testing Guide](https://owasp.org/www-project-web-security-testing-guide/)

This duplicates the existing guidance already present later in this document. Consider removing this pitfall and adding a cross-reference to that section instead.
### Using Blacklists Instead of Whitelists

Trying to block “bad” input with a blacklist almost always fails. Attackers can use encoding tricks, alternate characters, or new payloads you didn’t think of. It’s much safer to define what *good* input looks like and only allow that. [NIST SP 800‑53 SA-11](https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final)

This duplicates the ### Allowlist vs Denylist section that immediately follows this new block — the existing section covers this more thoroughly with examples. Recommend removing and letting the existing section carry this point.
### Skipping Validation After Deserialization

Data that looked safe before serialization can become unsafe once it’s parsed again. Formats like JSON, XML, and protobuf can hide unexpected structures or types. Always validate *after* deserialization, when you know exactly what the data looks like. [OWASP Deserialization Cheat Sheet](https://cheatsheetseries.owasp.org/)

The link text says "OWASP Deserialization Cheat Sheet" but the URL points to the cheat sheet series homepage (https://cheatsheetseries.owasp.org/). Please link directly to https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html.
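
For readers of this pitfall, a minimal Python sketch of what "validate after deserialization" could look like. The payload shape (`account_id`, `amount`), the limits, and the function name `parse_transfer` are assumptions made for illustration only.

```python
import json


def parse_transfer(raw: str) -> dict:
    """Deserialize first, then validate the parsed structure and types."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass

    # Hypothetical schema: exactly two fields, nothing extra.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"account_id", "amount"}:
        raise ValueError("unexpected or missing fields")

    account_id, amount = data["account_id"], data["amount"]

    # Check the type before the content: a list or number here would
    # otherwise reach code that assumes a string.
    if not isinstance(account_id, str):
        raise ValueError("account_id must be a string")
    if not (account_id.isascii() and account_id.isalnum() and len(account_id) <= 32):
        raise ValueError("account_id has an invalid format")

    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise ValueError("amount must be an integer")
    if not 0 < amount <= 1_000_000:
        raise ValueError("amount out of range")

    return {"account_id": account_id, "amount": amount}
```

The checks run on the parsed objects rather than on the raw text, which is the point of the pitfall: only after parsing is it known whether `amount` is a number, a boolean, or a nested structure.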
### Not Validating File Uploads or Filenames

File uploads are a huge attack surface. You need to validate the file type, size, extension, and even the filename. Attackers can sneak in traversal sequences or special characters that cause trouble later. [OWASP File Upload Cheat Sheet](https://cheatsheetseries.owasp.org/)

Same issue as above — the link says "OWASP File Upload Cheat Sheet" but resolves to the homepage. Please link to https://cheatsheetseries.owasp.org/cheatsheets/File_Upload_Cheat_Sheet.html.
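
To make the filename, extension, and size checks in this pitfall concrete, here is a minimal Python sketch. The allowed extensions, the size cap, and the name `validate_upload` are assumptions for illustration. The sketch does not inspect file content, which would need a separate check.

```python
import os
import re

# Assumed policy values, chosen only for this illustration.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".pdf"}
MAX_BYTES = 5 * 1024 * 1024
# The stem may contain only letters, digits, underscore, and hyphen,
# which excludes path separators, dots, and control characters.
SAFE_STEM = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")


def validate_upload(filename: str, size: int) -> str:
    """Return a normalized filename, or raise ValueError if it is unsafe."""
    if size <= 0 or size > MAX_BYTES:
        raise ValueError("file size not allowed")

    stem, ext = os.path.splitext(filename)
    if ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")

    # Rejects traversal ("../x"), double extensions ("shell.php"),
    # and empty or overlong names.
    if SAFE_STEM.fullmatch(stem) is None:
        raise ValueError("filename not allowed")

    return stem + ext.lower()
```

Many deployments go further and discard the client-supplied name entirely in favor of a server-generated one; the allowlist above is only one possible approach.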
### Ignoring Nested JSON or Large Arrays

Attackers often hide malicious data deep inside nested objects or huge arrays. Validation needs to walk the entire structure and enforce limits at every level.

No citation provided. The point is valid (depth and size limits matter for billion-laughs-style attacks) but needs a reference to be mergeable.
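
The depth and size limits discussed here can be sketched in a few lines of Python. The specific limits and the names `check_limits` and `load_bounded_json` are hypothetical; suitable values depend on the application.

```python
import json

# Assumed limits, chosen only for this illustration.
MAX_BYTES = 64 * 1024   # reject oversized payloads before parsing
MAX_DEPTH = 10          # maximum nesting of objects and arrays
MAX_ITEMS = 1000        # maximum members in any one object or array


def check_limits(node, depth=1):
    """Walk the parsed structure and enforce limits at every level."""
    if isinstance(node, (dict, list)):
        if depth > MAX_DEPTH:
            raise ValueError("nesting too deep")
        if len(node) > MAX_ITEMS:
            raise ValueError("too many items")
        children = node.values() if isinstance(node, dict) else node
        for child in children:
            check_limits(child, depth + 1)


def load_bounded_json(raw: bytes):
    """Parse JSON only if it stays within the size and depth limits."""
    if len(raw) > MAX_BYTES:
        raise ValueError("payload too large")
    try:
        data = json.loads(raw)
    except RecursionError as exc:
        # Extremely deep nesting can exhaust the parser's own recursion.
        raise ValueError("nesting too deep") from exc
    check_limits(data)
    return data
```

The byte-length check runs before parsing so that an oversized body is never handed to the parser at all; the structural walk then covers what a flat size limit cannot.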
### Overlooking Unicode Normalization

Unicode can be tricky. Different characters can look identical or behave unexpectedly, which can let attackers slip past naive filters. Normalizing input (like using NFC) helps avoid these issues. [Unicode Security Considerations](https://unicode.org/reports/tr36/)

"Normalizing input (like using NFC) helps avoid these issues" is an oversimplification — Unicode TR36 (which you cite) explicitly warns that NFC normalization can itself be exploited in certain contexts. The correct guidance is that normalization must happen before validation, not as a standalone fix. Please revise.
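
The ordering raised in the review comment above (normalize first, then validate) can be sketched in Python as follows. The choice of NFKC, the username rule, and the name `canonical_username` are assumptions for illustration; the appropriate normalization form depends on the application.

```python
import re
import unicodedata

# Hypothetical username rule, chosen only for this illustration.
USERNAME = re.compile(r"[A-Za-z0-9_]{3,20}")


def canonical_username(raw: str) -> str:
    """Normalize first, then validate the normalized form."""
    # NFKC is an assumption here: it folds compatibility characters
    # such as fullwidth letters into their ASCII equivalents.
    normalized = unicodedata.normalize("NFKC", raw)

    # Validation runs on the normalized string, so the value that is
    # checked is the same value that later code will see.
    if USERNAME.fullmatch(normalized) is None:
        raise ValueError("invalid username")

    # Return the normalized form; storing `raw` would reintroduce
    # the mismatch between what was validated and what is used.
    return normalized
```

Normalization on its own does not reject anything, which is why the allowlist check still follows it: characters that survive normalization, such as a zero-width space, are caught by the pattern rather than by `normalize`.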
### Forgetting to Validate Before Logging or Storing Data

Unvalidated input written to logs or databases can lead to log injection, stored XSS, or parsing errors later. Validation should happen before the data goes anywhere, logs included.

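
For the log injection case specifically, a common mitigation is to encode values at the point where the log entry is written, independently of any input validation. A minimal Python sketch follows; the function name `encode_for_log` and the `\u{...}` escape style are assumptions made for illustration, not a standard format.

```python
def encode_for_log(value: str) -> str:
    """Escape characters that could forge or split a log entry."""
    out = []
    for ch in value:
        if ch == "\\":
            # Escape the escape character so the encoding is unambiguous.
            out.append("\\\\")
        elif ch.isprintable():
            out.append(ch)
        else:
            # CR, LF, tabs, and other non-printable characters are written
            # as a visible escape instead of being emitted raw.
            out.append(f"\\u{{{ord(ch):x}}}")
    return "".join(out)
```

With this in place, a value such as `"alice\r\nINFO forged entry"` is written on a single line with the line break visible as `\u{d}\u{a}`, so it cannot masquerade as a separate log record.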
"Validation should happen before the data goes anywhere—logs included" conflates input validation with output encoding. Log injection is an output encoding problem — you escape log entries on the way out, not reject input on the way in. Please revise to clarify the distinction.
This commit updates the “Common Pitfalls in Input Validation” section based on reviewer feedback. The changes remove duplicated guidance, correct citation links, refine technical explanations, and align the content with existing OWASP standards.

Key updates:

- Removed pitfalls that duplicated existing sections and added cross‑references to “Server‑Side Validation” and “Allowlist vs Denylist.”
- Corrected URLs for the Deserialization and File Upload Cheat Sheets to point directly to the specific documents.
- Added a primary reference (OWASP API Security Top 10 – API4:2023) to support the pitfall on nested JSON and large arrays.
- Revised the Unicode normalization guidance to reflect the recommendations in Unicode TR36, emphasizing normalization before validation and clarifying that normalization alone is not a security control.
- Updated the logging pitfall to correctly distinguish input validation from output encoding, referencing the OWASP Logging Cheat Sheet.
- Ensured all remaining pitfalls are concise, accurate, and consistent with the structure and tone of the existing cheat sheet.

These changes fully address the reviewer comments and improve the clarity, accuracy, and maintainability of the new section.
Thank you for the detailed review. I’ve applied all the requested changes:

- Removed the pitfalls that duplicated existing guidance and added cross‑references to the relevant sections.
- Updated the Deserialization and File Upload Cheat Sheet links to point directly to the correct pages.
- Added a primary reference (OWASP API Security Top 10 – API4:2023) for the nested JSON and large array pitfall.
- Revised the Unicode normalization explanation to align with Unicode TR36, clarifying that normalization must occur before validation and is not a standalone control.
- Updated the logging pitfall to correctly distinguish input validation from output encoding, following the OWASP Logging Cheat Sheet.
- Fixed formatting and citation placement across all updated sections.

Please let me know if any further adjustments are needed.
mackowski left a comment
Links are not working or are old
### Trusting Internal APIs or Microservices Too Much

It’s common for internal services to skip validation because they’re “inside the perimeter.” But modern attacks often target internal trust boundaries. Every service, internal or external, needs proper validation. [CISA Zero Trust Guidance](https://www.cisa.gov/zero-trust-maturity-model)

This is an archived link. The page states: "Archived Content. In an effort to keep CISA.gov current, the archive contains outdated information that may not reflect current policy or programs."
### MFA Fatigue / Push Abuse

Attackers repeatedly trigger push notifications to overwhelm users until one is approved accidentally or out of frustration. This technique has been widely documented in real-world intrusions. [Microsoft](https://www.microsoft.com/en-us/security/blog/2022/09/12/defending-against-mfa-fatigue-attacks/)
### Token Theft

Session cookies, refresh tokens, or other post-authentication tokens can be stolen and reused, allowing attackers to bypass MFA entirely. NIST highlights the risks of session hijacking and the importance of binding authentication to the client. [NIST SP 800‑63B](https://pages.nist.gov/800-63-3/sp800-63b.html)

Old link "This revision of NIST SP 800-63 has been superseded by NIST SP 800-63-4 as of August 1, 2025. Please refer to those documents for the current guidelines."
### Reverse-Proxy Phishing Kits

Reverse-proxy phishing frameworks can intercept MFA codes or session tokens in real time by sitting between the user and the legitimate service. CISA recommends phishing-resistant MFA specifically to mitigate these attacks. [CISA](https://www.cisa.gov/news-events/alerts/2022/10/31/implementing-phishing-resistant-mfa)

The page does not exist ("Page Not Found").
### Cloud MFA Misconfigurations

Legacy protocols, weak conditional access rules, or incomplete MFA enforcement can leave gaps that attackers exploit. Misconfigurations are a leading cause of MFA bypass in cloud environments. [Microsoft](https://www.microsoft.com/en-us/security/blog/2021/10/07/5-identity-attack-vectors-and-how-to-prevent-them/)

Updated the section "Trusting Internal APIs or Microservices Too Much" with the reference.
Corrected the lint error.
Hi @mackowski, just a quick clarification. The Modern MFA Attack Patterns content appearing inside this PR was accidental. Earlier, both the Input Validation PR and the MFA PR were created from the same branch, so GitHub automatically combined the changes. I have now created a separate, clean PR specifically for the Modern MFA Attack Patterns section, with the correct base and branch. That PR contains only the MFA updates. This PR (#2231) should now be treated as Input Validation only. Thanks for the detailed review and guidance.
Thank you for the review and guidance, @mackowski. I have added updated references to the sections in Common Pitfalls. If any changes are required, please let me know.
I want to make sure the content I’m adding is accurate because both “Common Pitfalls in Input Validation” and “Modern MFA Attack Patterns” are sensitive topics that influence secure software development. While refining these sections, I realised several modern references have moved, changed URLs, or no longer exist. I’m being careful not to include outdated or misleading sources. I would appreciate guidance on preferred or authoritative references for these topics so I can align with OWASP standards and ensure the content is reliable. Thank you for your support. I want to make sure I contribute high‑quality material.
Added reference under the section "Assuming JSON Input Is Automatically Safe".
This update adds a new “Common Pitfalls in Input Validation” section to the Input Validation Cheat Sheet, as discussed in Issue #2228. The goal is to highlight the real‑world mistakes that developers frequently make even when they believe they are validating input correctly. These issues appear repeatedly during assessments, code reviews, and incident investigations, so documenting them helps readers avoid subtle but high‑impact vulnerabilities.
The new section covers practical pitfalls such as relying only on client‑side validation, using blacklists instead of whitelists, skipping validation after deserialization, trusting internal services too much, unsafe or overly complex regular expressions, Unicode normalization issues, assumptions about JSON input, and missing validation before logging or storing data. Each point is written in a clear, human tone and includes references to relevant OWASP, NIST, CISA, and Unicode guidance.
This section fits naturally after the “Implementing Input Validation” heading and before the allowlisting/denylisting discussion. It strengthens the cheat sheet by giving readers a realistic understanding of where input validation commonly fails in practice, complementing the existing guidance on how to implement it correctly.
This PR fixes issue #2228.