Add “Common Pitfalls in Input Validation” Section to Strengthen Practical Guidance #2231

Open
sujalavnelavai wants to merge 11 commits into OWASP:master from sujalavnelavai:input-validation-pitfalls

@sujalavnelavai

Contributor

This update adds a new “Common Pitfalls in Input Validation” section to the Input Validation Cheat Sheet, as discussed in Issue #2228. The goal is to highlight the real‑world mistakes that developers frequently make even when they believe they are validating input correctly. These issues appear repeatedly during assessments, code reviews, and incident investigations, so documenting them helps readers avoid subtle but high‑impact vulnerabilities.

The new section covers practical pitfalls such as relying only on client‑side validation, using blacklists instead of whitelists, skipping validation after deserialization, trusting internal services too much, unsafe or overly complex regular expressions, Unicode normalization issues, assumptions about JSON input, and missing validation before logging or storing data. Each point is written in a clear, human tone and includes references to relevant OWASP, NIST, CISA, and Unicode guidance.

This section fits naturally after the “Implementing Input Validation” heading and before the allowlisting/denylisting discussion. It strengthens the cheat sheet by giving readers a realistic understanding of where input validation commonly fails in practice, complementing the existing guidance on how to implement it correctly.

You're A Rockstar

Thank you for submitting a Pull Request (PR) to the Cheat Sheet Series.

🚩 If your PR is related to grammar/typo mistakes, please double-check the file for other mistakes in order to fix all the issues in the current cheat sheet.

Please make sure that for your contribution:

  • In case of a new Cheat Sheet, you have used the Cheat Sheet template.
  • All the markdown files do not raise any validation policy violation, see the policy.
  • All the markdown files follow these format rules.
  • All your assets are stored in the assets folder.
  • All the images used are in the PNG format.
  • Any references to websites have been formatted as [TEXT](URL)
  • You verified/tested the effectiveness of your contribution (e.g., the defensive code proposed is really an effective remediation? Please verify it works!).
  • The CI build of your PR passes, see the build status here.

Scope and sourcing (required)

  • This PR is focused: it modifies a single cheat sheet, or a small coordinated set, and the scope is described in the PR body.
  • Every technical claim, recommendation, or threat assertion added in this PR is supported by a primary source (RFC, NIST, OWASP standard, vendor documentation, peer-reviewed research) linked inline as [text](URL).
  • I have read each source I cite and confirm it actually supports the claim. I have not relied on summaries, hearsay, or model-generated citations.

If your PR is related to an issue, please finish your PR text with the following line:

This PR fixes issue #2228.

AI Tool Usage Disclosure (required for all PRs)

Please select exactly one of the following options. PRs that leave this section blank will be closed.

  • I have NOT used any AI tool to generate the contents of this PR.
  • I used AI assistance only to help draft and refine the wording for this section. All content was manually reviewed, edited, and validated by me. The LLM used is Microsoft Copilot, and the prompt used was: “Help draft and refine the wording for a concise, human‑tone PR title and description. I added a new ‘Common Pitfalls in Input Validation’ section to the Input Validation Cheat Sheet based on Issue #2228 (Add a "Common Pitfalls" section to the Input Validation Cheat Sheet). The prompt should help produce a clear explanation of what was added, why it matters, and how it improves the cheat sheet, written in a natural and professional style.” I have independently verified every citation and technical claim against the cited sources.

Thank you again for your contribution 😃

Add modern MFA attack patterns (vendor-neutral) to MFA Cheat Sheet
This pull request adds a new “Common Pitfalls in Input Validation” section to the Input Validation Cheat Sheet, as discussed in Issue OWASP#2228. The goal of this update is to highlight the mistakes that developers frequently make even when they believe they are validating input correctly. These pitfalls come up repeatedly in real-world assessments, code reviews, and incident investigations, so documenting them helps readers avoid subtle but high-impact vulnerabilities.

The new section covers practical issues such as relying only on client-side validation, using blacklists instead of whitelists, skipping validation after deserialization, trusting internal services too much, unsafe regular expressions, Unicode normalization problems, and assumptions about JSON input. Each point is written in a clear, human tone and includes references to relevant OWASP, NIST, CISA, and Unicode guidance where appropriate.

This addition fits naturally after the “Implementing Input Validation” section and before the allowlisting/denylisting discussion. It strengthens the cheat sheet by giving readers a realistic understanding of where input validation often fails in practice, complementing the existing guidance on how to implement it correctly.

jmanico previously approved these changes Jun 16, 2026

jmanico commented Jun 16, 2026

Member

cc @mackowski do you like these?

- Regular expressions for any other structured data covering the whole input string `(^...$)` and **not** using "any character" wildcard (such as `.` or `\S`)
- Denylisting known dangerous patterns can be used as an additional layer of defense, but it should supplement - not replace - allowlisting, to help catch some commonly observed attacks or patterns without relying on it as the main validation method.
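
The whole-string matching rule in the first bullet above can be sketched in Python as follows. This is only an illustration under stated assumptions: the ZIP code pattern and the `is_valid_zip` helper are hypothetical and do not come from the cheat sheet. It uses `re.fullmatch` because, in Python, `$` also matches just before a trailing newline, so a pattern written as `^...$` and applied with `re.match` would accept `"12345\n"`.

```python
import re

# Hypothetical allowlist pattern: 5 digits with an optional +4 suffix.
# Explicit character classes are used instead of "." or "\S".
ZIP_PATTERN = re.compile(r"[0-9]{5}(?:-[0-9]{4})?")


def is_valid_zip(value: str) -> bool:
    """Return True only if the entire string matches the allowlist pattern."""
    # fullmatch requires the match to span the whole string, so trailing
    # payloads and trailing newlines are rejected.
    return ZIP_PATTERN.fullmatch(value) is not None
```

A call such as `is_valid_zip("12345; DROP TABLE users")` is rejected because the extra text falls outside the pattern.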

## Common Pitfalls in Input Validation

Collaborator

This section is placed immediately before the existing ### Allowlist vs Denylist section, and several of the pitfalls below duplicate guidance already in this document. Consider merging these into the relevant existing sections rather than creating a parallel structure that readers will encounter twice.


### Relying Only on Client-Side Validation

Client-side checks (like JavaScript validation or HTML5 rules) are helpful for user experience, but they’re not security controls. Anyone can bypass them by turning off scripts, modifying requests, or using tools like curl or Burp. The server must always validate the final input. [OWASP Testing Guide](https://owasp.org/www-project-web-security-testing-guide/)

Collaborator

This duplicates the existing guidance already present later in this document. Consider removing this pitfall and adding a cross-reference to that section instead.


### Using Blacklists Instead of Whitelists

Trying to block “bad” input with a blacklist almost always fails. Attackers can use encoding tricks, alternate characters, or new payloads you didn’t think of. It’s much safer to define what *good* input looks like and only allow that. [NIST SP 800‑53 SA-11](https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final)

Collaborator

This duplicates the ### Allowlist vs Denylist section that immediately follows this new block — the existing section covers this more thoroughly with examples. Recommend removing and letting the existing section carry this point.


### Skipping Validation After Deserialization

Data that looked safe before serialization can become unsafe once it’s parsed again. Formats like JSON, XML, and protobuf can hide unexpected structures or types. Always validate *after* deserialization, when you know exactly what the data looks like. [OWASP Deserialization Cheat Sheet](https://cheatsheetseries.owasp.org/)

Collaborator

The link text says "OWASP Deserialization Cheat Sheet" but the URL points to the cheat sheet series homepage (https://cheatsheetseries.owasp.org/). Please link directly to https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html.
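
For context on the pitfall under review, validating after deserialization might look like the sketch below. The order schema, the field names `item_id` and `quantity`, and the limits are all hypothetical assumptions chosen for illustration. They are not taken from the PR or the cheat sheet.

```python
import json


def parse_order(raw: str) -> dict:
    """Parse a JSON order, then validate the parsed structure (hypothetical schema)."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad syntax

    # Validate the shape only after parsing, when the real types are known.
    if not isinstance(data, dict):
        raise ValueError("order must be a JSON object")
    if set(data) != {"item_id", "quantity"}:
        raise ValueError("unexpected or missing fields")

    item_id, quantity = data["item_id"], data["quantity"]
    if not (isinstance(item_id, str) and item_id.isascii()
            and item_id.isalnum() and len(item_id) <= 32):
        raise ValueError("invalid item_id")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise ValueError("quantity must be an integer")
    if not 1 <= quantity <= 100:
        raise ValueError("quantity out of range")

    # Return only the validated fields rather than the raw parsed object.
    return {"item_id": item_id, "quantity": quantity}
```

The type checks matter because a payload such as `{"item_id": "abc123", "quantity": "2"}` is valid JSON yet carries a string where the application expects a number.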


### Not Validating File Uploads or Filenames

File uploads are a huge attack surface. You need to validate the file type, size, extension, and even the filename. Attackers can sneak in traversal sequences or special characters that cause trouble later. [OWASP File Upload Cheat Sheet](https://cheatsheetseries.owasp.org/)

Collaborator

Same issue as above — the link says "OWASP File Upload Cheat Sheet" but resolves to the homepage. Please link to https://cheatsheetseries.owasp.org/cheatsheets/File_Upload_Cheat_Sheet.html.
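
As a rough illustration of the filename, extension, and size checks named in the pitfall, consider the following sketch. The allowed extensions, the size limit, and the `is_acceptable_upload` helper are hypothetical policy choices, not values from the cheat sheet. Content-type inspection and storage under a server-generated name are deliberately left out.

```python
import re
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = {".png", ".pdf"}   # hypothetical policy
MAX_UPLOAD_BYTES = 5 * 1024 * 1024      # hypothetical 5 MiB limit

# Allowlist for the name itself: no path separators, no leading dot,
# at most 100 characters in total.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,99}")


def is_acceptable_upload(filename: str, size_bytes: int) -> bool:
    """Check the size, filename characters, and extension of an upload."""
    if not 0 < size_bytes <= MAX_UPLOAD_BYTES:
        return False
    # Traversal sequences such as "../" fail here because "/" is not allowed.
    if SAFE_NAME.fullmatch(filename) is None:
        return False
    return PurePosixPath(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

This sketch only looks at the final extension, so a name like `shell.php.png` still passes. A real deployment would pair these checks with the other controls the File Upload Cheat Sheet describes.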


### Ignoring Nested JSON or Large Arrays

Attackers often hide malicious data deep inside nested objects or huge arrays. Validation needs to walk the entire structure and enforce limits at every level.

Collaborator

No citation provided. The point is valid (depth and size limits matter for billion-laughs-style attacks) but needs a reference to be mergeable.
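
The depth and size limits discussed here could be enforced along these lines. This is a minimal sketch: `MAX_DEPTH`, `MAX_ITEMS`, `MAX_BYTES`, and both function names are hypothetical, and suitable values depend on the API.

```python
import json

MAX_DEPTH = 5        # hypothetical nesting limit
MAX_ITEMS = 100      # hypothetical per-container limit
MAX_BYTES = 10_000   # hypothetical payload limit


def check_limits(value, depth: int = 1) -> None:
    """Walk the parsed structure and enforce limits at every level."""
    if isinstance(value, (dict, list)):
        if depth > MAX_DEPTH:
            raise ValueError("nesting too deep")
        if len(value) > MAX_ITEMS:
            raise ValueError("too many items")
        children = value.values() if isinstance(value, dict) else value
        for child in children:
            check_limits(child, depth + 1)


def load_bounded_json(raw: str):
    """Reject oversized payloads before parsing, then check the structure."""
    if len(raw.encode("utf-8")) > MAX_BYTES:
        raise ValueError("payload too large")
    try:
        data = json.loads(raw)
    except RecursionError as exc:
        # Extremely deep nesting can exhaust the parser's recursion budget.
        raise ValueError("nesting too deep") from exc
    check_limits(data)
    return data
```

Checking the byte length first keeps the parser from doing work on a payload that would be rejected anyway.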


### Overlooking Unicode Normalization

Unicode can be tricky. Different characters can look identical or behave unexpectedly, which can let attackers slip past naive filters. Normalizing input (like using NFC) helps avoid these issues. [Unicode Security Considerations](https://unicode.org/reports/tr36/)

Collaborator

"Normalizing input (like using NFC) helps avoid these issues" is an oversimplification — Unicode TR36 (which you cite) explicitly warns that NFC normalization can itself be exploited in certain contexts. The correct guidance is that normalization must happen before validation, not as a standalone fix. Please revise.
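
The ordering described in this comment, normalize first and validate second, can be sketched as follows. The username allowlist and the `canonical_username` helper are hypothetical. The choice of NFKC rather than NFC is an assumption made for this identifier-style example, and other kinds of data may call for a different form.

```python
import re
import unicodedata

# Hypothetical allowlist applied to the normalized form.
USERNAME = re.compile(r"[a-z0-9_]{3,20}")


def canonical_username(raw: str) -> str:
    """Normalize first, then validate the normalized value."""
    # Validation must see the same form that will be stored and compared.
    normalized = unicodedata.normalize("NFKC", raw)
    if USERNAME.fullmatch(normalized) is None:
        raise ValueError("invalid username")
    # Callers use the returned value, never the raw input.
    return normalized
```

Normalization alone does not make the input safe. Fullwidth `ａｄｍｉｎ` folds to `admin` and is then judged by the allowlist, while a Cyrillic `а` survives normalization unchanged and is rejected only because the allowlist runs afterwards.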


### Forgetting to Validate Before Logging or Storing Data

Unvalidated input written to logs or databases can lead to log injection, stored XSS, or parsing errors later. Validation should happen before the data goes anywhere—logs included.

Collaborator

"Validation should happen before the data goes anywhere—logs included" conflates input validation with output encoding. Log injection is an output encoding problem — you escape log entries on the way out, not reject input on the way in. Please revise to clarify the distinction.
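
The distinction drawn in this comment can be illustrated with a small output-encoding helper applied at the moment a value is written to a log. The `encode_for_log` function and its length cap are hypothetical. This is one possible sketch, not the approach of the OWASP Logging Cheat Sheet.

```python
import logging

logger = logging.getLogger("auth")


def encode_for_log(value: str, max_len: int = 200) -> str:
    """Escape non-printable characters so a value cannot forge log lines."""
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")  # keep the escaped output unambiguous
        elif ch.isprintable():
            out.append(ch)
        else:
            # Newlines, carriage returns, and other control or separator
            # characters become visible escapes such as \n or \u2028.
            out.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(out)[:max_len]


def log_failed_login(username: str) -> None:
    # Encoding happens on the way out; the input itself is not rejected here.
    logger.warning("login failed for user=%s", encode_for_log(username))
```

With this in place, a username containing a newline followed by a fake log entry is written as a single line with a literal `\n` in it.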

This commit updates the “Common Pitfalls in Input Validation” section based on reviewer feedback. The changes remove duplicated guidance, correct citation links, refine technical explanations, and align the content with existing OWASP standards.

Key updates:

  • Removed pitfalls that duplicated existing sections and added cross‑references to “Server‑Side Validation” and “Allowlist vs Denylist.”
  • Corrected URLs for the Deserialization and File Upload Cheat Sheets to point directly to the specific documents.
  • Added a primary reference (OWASP API Security Top 10 – API4:2023) to support the pitfall on nested JSON and large arrays.
  • Revised the Unicode normalization guidance to reflect the recommendations in Unicode TR36, emphasizing normalization before validation and clarifying that normalization alone is not a security control.
  • Updated the logging pitfall to correctly distinguish input validation from output encoding, referencing the OWASP Logging Cheat Sheet.
  • Ensured all remaining pitfalls are concise, accurate, and consistent with the structure and tone of the existing cheat sheet.

These changes fully address the reviewer comments and improve the clarity, accuracy, and maintainability of the new section.
@sujalavnelavai

Contributor Author

Thank you for the detailed review. I’ve applied all the requested changes:

Removed the pitfalls that duplicated existing guidance and added cross‑references to the relevant sections.

Updated the Deserialization and File Upload Cheat Sheet links to point directly to the correct pages.

Added a primary reference (OWASP API Security Top 10 – API4:2023) for the nested JSON and large array pitfall.

Revised the Unicode normalization explanation to align with Unicode TR36, clarifying that normalization must occur before validation and is not a standalone control.

Updated the logging pitfall to correctly distinguish input validation from output encoding, following the OWASP Logging Cheat Sheet.

Fixed formatting and citation placement across all updated sections.

Please let me know if any further adjustments are needed.

@mackowski left a comment

Collaborator

The links are broken or outdated.


### Trusting Internal APIs or Microservices Too Much

It’s common for internal services to skip validation because they’re “inside the perimeter.” But modern attacks often target internal trust boundaries. Every service—internal or external—needs proper validation. [CISA Zero Trust Guidance](https://www.cisa.gov/zero-trust-maturity-model)

Collaborator

This is an archived link. The page displays this notice: "Archived Content: In an effort to keep CISA.gov current, the archive contains outdated information that may not reflect current policy or programs."


### MFA Fatigue / Push Abuse

Attackers repeatedly trigger push notifications to overwhelm users until one is approved accidentally or out of frustration. This technique has been widely documented in real-world intrusions. [Microsoft](https://www.microsoft.com/en-us/security/blog/2022/09/12/defending-against-mfa-fatigue-attacks/)

Collaborator

This URL returns a 404.


### Token Theft

Session cookies, refresh tokens, or other post-authentication tokens can be stolen and reused, allowing attackers to bypass MFA entirely. NIST highlights the risks of session hijacking and the importance of binding authentication to the client. [NIST SP 800‑63B](https://pages.nist.gov/800-63-3/sp800-63b.html)

Collaborator

Outdated link. The page states: "This revision of NIST SP 800-63 has been superseded by NIST SP 800-63-4 as of August 1, 2025. Please refer to those documents for the current guidelines."


### Reverse-Proxy Phishing Kits

Reverse-proxy phishing frameworks can intercept MFA codes or session tokens in real time by sitting between the user and the legitimate service. CISA recommends phishing-resistant MFA specifically to mitigate these attacks. [CISA](https://www.cisa.gov/news-events/alerts/2022/10/31/implementing-phishing-resistant-mfa)

Collaborator

The page does not exist ("Page Not Found").


### Cloud MFA Misconfigurations

Legacy protocols, weak conditional access rules, or incomplete MFA enforcement can leave gaps that attackers exploit. Misconfigurations are a leading cause of MFA bypass in cloud environments. [Microsoft](https://www.microsoft.com/en-us/security/blog/2021/10/07/5-identity-attack-vectors-and-how-to-prevent-them/)

Collaborator

404

Updated the "Trusting Internal APIs or Microservices Too Much" section with the reference.
Corrected the lint error.
@sujalavnelavai

Contributor Author

Hi @mackowski — just a quick clarification.

The Modern MFA Attack Patterns content appearing inside this PR was accidental. Earlier, both the Input Validation PR and the MFA PR were created from the same branch, so GitHub automatically combined the changes.

I have now created a separate, clean PR specifically for the Modern MFA Attack Patterns section, with the correct base and branch. That PR contains only the MFA updates.

This PR (#2231) should now be treated as Input Validation only.

Thanks for the detailed review and guidance.

@sujalavnelavai

Contributor Author

Thank you for the review and guidance @mackowski.

I have added updated references to the sections under Common Pitfalls.

If any changes are required, please let me know.

@sujalavnelavai

Contributor Author

I want to make sure the content I’m adding is accurate because both “Common Pitfalls in Input Validation” and “Modern MFA Attack Patterns” are sensitive topics that influence secure software development.

While refining these sections, I realised several modern references have moved, changed URLs, or no longer exist. I’m being careful not to include outdated or misleading sources.

I would appreciate guidance on preferred or authoritative references for these topics so I can align with OWASP standards and ensure the content is reliable.

Thank you for your support — I want to make sure I contribute high‑quality material.

Added a reference under the "Assuming JSON Input Is Automatically Safe" section.

3 participants