[ENG-158] feat: Abuse protection for OTP based login#3632

Open
praffq wants to merge 19 commits into develop from ENG-158-otp-abuse-protection

Conversation

@praffq

@praffq praffq commented May 4, 2026

Contributor

Proposed Changes

  • Enhancements to enforce stricter control over OTP

Associated Issue

  • Link to issue here, explain how the proposed solution will solve the reported issue/ feature request.

Architecture changes

  • Remove this section if not used

Merge Checklist

  • Tests added/fixed
  • Update docs in /docs
  • Linting Complete
  • Any other necessary step

Only PRs with test cases included and passing lint and test pipelines will be reviewed

@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins

Summary by CodeRabbit

  • New Features

    • Per-phone resend limits, per-OTP verification caps, and atomic validation of the latest unused OTP
    • Successful OTP login now returns an access token and consumes the OTP
  • Chores

    • Configurable OTP policies (send window, send cap, verify attempts, failures, lockout, validity) and DB tracking for failed attempts
    • Periodic cleanup to expire old OTPs; example env vars updated
  • Tests

    • End-to-end tests covering send/login flows, rate limits, lockout, retries, and edge cases

@praffq praffq requested a review from a team as a code owner May 4, 2026 08:32

@coderabbitai

coderabbitai Bot commented May 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

OTP flow refactored to add per-phone failed-attempt aggregation and sliding lockout, windowed send rate limits, transactional OTP verification with row locks, a send lock class, nightly cleanup task, model field/index migration, settings/env entries for throttling, and comprehensive tests for these behaviors.

Changes

OTP Lockout & Throttling System

| Layer | File(s) | Summary |
|---|---|---|
| Configuration & Policy | `config/settings/base.py`, `.env.example` | Replaced repeat-based OTP settings with windowed throttling and lockout: added OTP_SEND_WINDOW_MINUTES, OTP_MAX_SENDS_PER_WINDOW, OTP_MAX_VERIFY_ATTEMPTS, OTP_MAX_FAILURES, OTP_LOCKOUT_MINUTES, OTP_VALIDITY_MINUTES and updated example env variables. |
| Data Model & Schema | `care/facility/models/patient.py`, `care/facility/migrations/0485_patientmobileotp_failed_attempts_and_more.py`, `care/facility/migrations/0486_rename_patientmobileotp_mobileotp.py` | Added failed_attempts (PositiveSmallIntegerField) to PatientMobileOTP; migration adds the field and two conditional indexes on (phone_number, -created_date) and (phone_number, -modified_date) filtered by active rows; updated migration dependency. |
| Concurrency Control | `care/emr/locks/otp.py` | New OTPSendLock class (subclass of Lock) that builds a per-phone Redis lock key lock:otp_send:{phone_number} with configurable timeout. |
| Core Verification & Send Logic | `care/emr/api/otp_viewsets/login.py` | Added failure_count(phone_number), which aggregates failed_attempts within the lockout window. send now rejects when failure_count >= OTP_MAX_FAILURES, enforces the send cap by counting OTPs created in the send window (no longer requires is_used=False), and marks older OTPs used before creating a new one. login checks failure_count, runs in transaction.atomic() with select_for_update on the latest unused OTP, consumes the OTP on match and returns a token, increments failed_attempts on mismatch, invalidates the OTP at verify-attempt or failure thresholds, and raises field-specific validation when max verify attempts are reached. BaseOTPType.render_content converted to a @classmethod. |
| Background Maintenance | `care/emr/tasks/cleanup_expired_otps.py`, `care/emr/tasks/__init__.py` | New Celery task cleanup_expired_otps computes a cutoff using OTP_LOCKOUT_MINUTES, soft-deletes (marks deleted=True) expired OTP rows, logs the count, and is scheduled nightly via task registration. |
| Tests | `care/emr/tests/test_otp_login.py` | New OTPLoginFlowTests exercising send/login flows and edge cases: OTP creation, successful login/token, reuse prevention, per-OTP invalidation after wrong attempts, partial-failure recovery, new-OTP semantics, per-phone lockout and sliding unlock, send-window rate limits (counts include used OTPs; aged OTPs excluded), only latest unused OTP validated, expired/no-OTP cases, and overrides for verify-attempt behavior. |
| Dev infra | `docker-compose.local.yaml` | Mounts and pip-installs a local booking notifications plugin at dev startup for backend and celery services. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

waiting-for-review

Suggested reviewers

  • vigneshhari
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 9.68% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description is incomplete. It contains only a vague statement about 'enhancements to enforce more strict control' without explaining the actual mechanisms, rationale, or how the solution addresses the issue. | Expand the description to explain what abuse protection mechanisms were implemented (throttling, lockouts, attempt limits), link the associated issue, and describe how these changes mitigate OTP abuse. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: implementing abuse protection mechanisms for OTP-based login with throttling, lockout windows, and verification attempt limits. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codecov

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 83.83838% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.64%. Comparing base (73af000) to head (19761fc).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| `care/emr/api/otp_viewsets/login.py` | 83.07% | 9 Missing and 2 partials ⚠️ |
| `care/emr/tasks/cleanup_expired_otps.py` | 76.92% | 3 Missing ⚠️ |
| `care/emr/tasks/__init__.py` | 50.00% | 1 Missing ⚠️ |
| `config/settings/deployment.py` | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3632      +/-   ##
===========================================
+ Coverage    77.53%   77.64%   +0.11%     
===========================================
  Files          479      481       +2     
  Lines        22998    23056      +58     
  Branches      2379     2384       +5     
===========================================
+ Hits         17831    17902      +71     
+ Misses        4613     4600      -13     
  Partials       554      554              

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
care/emr/api/otp_viewsets/login.py (1)

71-102: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The send throttle is still raceable.

sent_otps.count() and the subsequent insert are not serialized per phone number, so concurrent requests can all observe the same count and create more rows than OTP_MAX_SENDS_PER_WINDOW. For a DDoS-protection change, that leaves the easiest bypass right on the send path. Please move this to an atomic counter/lock keyed by phone number (Redis/cache is probably the least painful option here).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/otp_viewsets/login.py` around lines 71 - 102, The current
send-throttle is raceable because you call
PatientMobileOTP.objects.filter(...).count() then insert; replace that with an
atomic counter keyed by phone number (e.g., cache key "otp_send:{phone_number}")
using your Redis/django cache: perform an atomic cache.incr(key) (if new, set
TTL to settings.OTP_SEND_WINDOW_MINUTES*60) and treat the returned value as the
current window count; if it exceeds settings.OTP_MAX_SENDS_PER_WINDOW, decrement
(cache.decr) and raise the ValidationError instead of proceeding; only create
PatientMobileOTP (otp_obj) after the successful incr check, and if SMS/send
fails rollback by decrementing the counter so failed sends don’t consume quota.
Ensure you reference PatientMobileOTP, otp_obj,
settings.OTP_MAX_SENDS_PER_WINDOW and settings.OTP_SEND_WINDOW_MINUTES when
implementing.
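
For illustration, the increment-then-check pattern described in the prompt above could look roughly like the sketch below. It is not the PR's code: `WindowCounter` is a hypothetical in-memory stand-in for an atomic cache counter (such as a Redis-backed `cache.incr` with a TTL), and `send_otp`, `SendLimitExceeded` and the `deliver` callback are names invented for the example.

```python
import threading
import time


class WindowCounter:
    """Hypothetical in-memory stand-in for an atomic cache counter with a TTL."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}  # key -> (count, expires_at)

    def incr(self, key):
        """Atomically increment and return the count, starting a new window if needed."""
        with self._lock:
            now = self._clock()
            count, expires_at = self._entries.get(key, (0, now + self._window))
            if now >= expires_at:  # previous window has elapsed
                count, expires_at = 0, now + self._window
            count += 1
            self._entries[key] = (count, expires_at)
            return count

    def decr(self, key):
        """Give back one unit of quota (used for rollback)."""
        with self._lock:
            if key in self._entries:
                count, expires_at = self._entries[key]
                self._entries[key] = (max(count - 1, 0), expires_at)


class SendLimitExceeded(Exception):
    """Raised when a phone number exceeds its send cap for the window."""


def send_otp(counter, phone_number, max_sends, deliver):
    """Reserve quota first, then deliver; roll back if the cap is hit or delivery fails."""
    key = f"otp_send:{phone_number}"
    if counter.incr(key) > max_sends:
        counter.decr(key)
        raise SendLimitExceeded(phone_number)
    try:
        deliver(phone_number)
    except Exception:
        counter.decr(key)  # failed sends should not consume quota
        raise
```

Because the count is reserved before any row is created, two concurrent requests can no longer both observe the same pre-insert count. Note that this sketch uses a fixed window that starts at the first send, which behaves slightly differently from counting rows created in the trailing `OTP_SEND_WINDOW_MINUTES`.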
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 52-59: The failure_count function (PatientMobileOTP.failure_count)
incorrectly enforces lockout by summing failed_attempts on rows with recent
modified_date, which lets failures expire per-row rather than by actual failure
time; change the design to enforce a true lockout window by adding a
lockout_until (DateTimeField) on PatientMobileOTP (or a per-phone aggregate
model) that is set when failed_attempts reaches settings.OTP_MAX_FAILURES and
cleared after lockout expires, then update the login logic to check
lockout_until before allowing attempts; alternatively, if you prefer per-attempt
history, store individual attempt timestamps and compute failures by counting
attempts within the last settings.OTP_LOCKOUT_MINUTES instead of summing
modified_date on rows—update functions that reference failure_count and any
unlock logic to use lockout_until or the per-attempt timestamp count.
- Around line 71-79: The current send-cap check in the login view (the sent_otps
queryset) excludes used OTPs by filtering is_used=False, allowing a client to
request-and-consume OTPs indefinitely; update the check to count all OTPs for
that phone_number within the time window (remove the is_used=False filter or
otherwise include both used and unused entries) so sent_otps.count() enforces
OTP_MAX_SENDS_PER_WINDOW correctly; keep the same variable names (sent_otps) and
the same time window logic using PatientMobileOTP and
settings.OTP_SEND_WINDOW_MINUTES/OTP_MAX_SENDS_PER_WINDOW.

In `@care/facility/migrations/0485_patientmobileotp_failed_attempts.py`:
- Around line 11-17: The migration currently only adds the failed_attempts
column but not the database indexes needed by hot OTP paths; update
care/facility/migrations/0485_patientmobileotp_failed_attempts.py to also add
two composite indexes on the patientmobileotp table — one on ["phone_number",
"created_date"] and another on ["phone_number", "modified_date"] — by adding
migrations.AddIndex(...) entries (using models.Index with fields and explicit
unique index names like "patientmobileotp_phone_created_idx" and
"patientmobileotp_phone_modified_idx") so the queries in
care/emr/api/otp_viewsets/login.py that filter/order by phone_number +
created_date/modified_date are supported efficiently.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 589baaa and aff7eda.

📒 Files selected for processing (5)
  • care/emr/api/otp_viewsets/login.py
  • care/emr/tests/test_otp_login.py
  • care/facility/migrations/0485_patientmobileotp_failed_attempts.py
  • care/facility/models/patient.py
  • config/settings/base.py

Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/facility/migrations/0485_patientmobileotp_failed_attempts.py Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
care/emr/api/otp_viewsets/login.py (1)

52-59: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

failure_count() still doesn't model a real lockout window.

This is still the same row-level window problem from the earlier review: you sum all failed_attempts from rows whose latest modified_date is recent, so failures expire in chunks per row instead of by actual attempt time. Slightly worse, the successful-login save on Line 126 refreshes modified_date, which can keep old failures “fresh” after a success. This needs lockout_until or per-attempt timestamps rather than modified_date as a proxy.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@care/emr/api/otp_viewsets/login.py` around lines 52 - 59, failure_count() is
using PatientMobileOTP.modified_date and per-row failed_attempts which creates a
row-level expiry window and is broken when modified_date is updated on success;
change the design to track attempts or an explicit lockout timestamp: add either
a separate OTPAttempt model with a timestamp for each failed attempt and rewrite
failure_count() to count attempts in the time window, or add a lockout_until
DateTimeField on PatientMobileOTP and update it on failure/success; update the
code paths that currently touch modified_date (e.g., the successful-login save
near the current success handler) to stop refreshing modified_date or to
set/clear lockout_until instead so old failures don’t remain “fresh.”
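
As a rough illustration of the per-attempt-timestamp option mentioned above, the sketch below counts failures by when each one happened, so they age out individually rather than per row. `FailureTracker` is a hypothetical in-memory class, not a model from this PR; a real implementation would presumably persist attempts (for example in a separate attempts table).

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailureTracker:
    """Hypothetical in-memory tracker: one timestamp per failed attempt."""

    def __init__(self, max_failures, lockout):
        self.max_failures = max_failures
        self.lockout = lockout  # timedelta, e.g. OTP_LOCKOUT_MINUTES
        self._failures = defaultdict(deque)

    def _prune(self, phone_number, now):
        """Drop attempts that have aged out of the lockout window."""
        attempts = self._failures[phone_number]
        while attempts and attempts[0] <= now - self.lockout:
            attempts.popleft()
        return attempts

    def record_failure(self, phone_number, now):
        self._prune(phone_number, now).append(now)

    def failure_count(self, phone_number, now):
        return len(self._prune(phone_number, now))

    def is_locked(self, phone_number, now):
        return self.failure_count(phone_number, now) >= self.max_failures
```

With this shape a successful login has no effect on the count, since nothing depends on `modified_date`, and the lock lifts as soon as the oldest failure in the window expires.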
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 71-77: The current read-then-write using PatientMobileOTP
(sent_otps.count() vs later insert) is racy; wrap the check+insert in an atomic,
phone-scoped critical section. Use Django's transaction.atomic() and obtain a
phone-specific lock before re-checking the count: either create a small sentinel
model (e.g., OTPSendLock with phone_number PK) and call get_or_create(...) then
OTPSendLock.objects.select_for_update() to lock that row, or use a Redis-backed
cache.lock for the phone number; inside that locked transaction re-query
PatientMobileOTP for the window, enforce settings.OTP_MAX_SENDS_PER_WINDOW, and
only then create the new PatientMobileOTP row. Ensure you reference
PatientMobileOTP, sent_otps, and settings.OTP_MAX_SENDS_PER_WINDOW when making
the guarded re-check and insert.
- Around line 116-126: The current logic in the login view (PatientMobileOTP
handling) filters is_used=False before ordering so an older unused OTP can be
validated later; change the flow in the login handler that uses PatientMobileOTP
so you first SELECT the newest OTP for the phone (use
.select_for_update().filter(phone_number=data.phone_number).order_by("-created_date").first()
without pre-filtering is_used), then compare only against that newest row
(otp_object.otp == data.otp) and if it matches mark that row is_used=True and
save, and also atomically invalidate any older unused OTP rows for that phone
(e.g., a bulk update on PatientMobileOTP.filter(phone_number=..., is_used=False,
created_date__lt=otp_object.created_date).update(is_used=True,
modified_date=...)) so older OTPs cannot be accepted later; keep the
select_for_update lock to prevent races during this update.
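
The "newest OTP only" rule from the second inline comment can be sketched without the ORM as follows. `OtpRow` and `verify_latest` are hypothetical names used only for this example; in the view, the same steps would run inside `transaction.atomic()` with the row lock the comment describes.

```python
from dataclasses import dataclass


@dataclass
class OtpRow:
    """Hypothetical stand-in for a PatientMobileOTP row."""

    phone_number: str
    otp: str
    created_at: int  # increasing stand-in for created_date
    is_used: bool = False


def verify_latest(rows, phone_number, submitted):
    """Accept the submitted code only if it matches the newest row for the phone."""
    candidates = [r for r in rows if r.phone_number == phone_number]
    if not candidates:
        return False
    # Pick the newest row WITHOUT pre-filtering on is_used, so that an older
    # unused OTP can never become the comparison target.
    newest = max(candidates, key=lambda r: r.created_at)
    if newest.is_used or newest.otp != submitted:
        return False
    # Consume the match and invalidate every older row for this phone.
    for row in candidates:
        row.is_used = True
    return True
```

The key difference from filtering `is_used=False` first is that a consumed newest OTP makes verification fail outright instead of silently falling back to an older code.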

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between aff7eda and 45b86b9.

📒 Files selected for processing (1)
  • care/emr/api/otp_viewsets/login.py

Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py
Comment thread care/emr/locks/otp.py Fixed

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (3)
care/emr/locks/otp.py (1)

6-9: 💤 Low value

Consider calling super().__init__() instead of manually setting attributes.

Bypassing the parent's __init__ works today, but if Lock.__init__ ever adds setup logic, OTPSendLock will silently miss it. Delegating to the base class is a bit more future-proof.

♻️ Suggested refactor
 class OTPSendLock(Lock):
     def __init__(self, phone_number, timeout=settings.LOCK_TIMEOUT):
-        self.key = f"lock:otp_send:{phone_number}"
-        self.timeout = timeout
+        super().__init__(f"otp_send:{phone_number}", timeout)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@care/emr/locks/otp.py` around lines 6 - 9, OTPSendLock currently bypasses the
parent Lock.__init__ by setting self.key and self.timeout directly; change
OTPSendLock.__init__ to call super().__init__ and pass or compute the same key
and timeout so any future setup in Lock.__init__ runs (i.e., have
OTPSendLock.__init__ compute key = f"lock:otp_send:{phone_number}" and call
super().__init__(key, timeout=settings.LOCK_TIMEOUT) or equivalent using the
timeout argument).
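
A minimal sketch of the delegation being suggested is shown below. The `Lock` class here is a hypothetical stand-in; its constructor signature and the assumption that it prepends `lock:` to the key are guesses made for the example, not the project's actual base class, so the key argument would need adjusting to whatever the real `Lock.__init__` expects.

```python
class Lock:
    """Hypothetical stand-in for the project's Lock base class."""

    def __init__(self, key, timeout=5):
        # Assumption: the base class namespaces every key with "lock:".
        self.key = f"lock:{key}"
        self.timeout = timeout


class OTPSendLock(Lock):
    """Per-phone lock that lets the base class own key and timeout setup."""

    def __init__(self, phone_number, timeout=5):
        super().__init__(f"otp_send:{phone_number}", timeout)
```

Any setup later added to the base `__init__` is then picked up by `OTPSendLock` automatically.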
care/emr/tests/test_otp_login.py (1)

24-24: 💤 Low value

Minor: DEV_OTP is duplicated between test and implementation.

The value "45612" is hardcoded in both login.py:102 and here. If someone updates one without the other, tests will fail with somewhat cryptic "Invalid OTP" errors. A constant import or a settings value might be slightly cleaner, but this is admittedly not the end of the world.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@care/emr/tests/test_otp_login.py` at line 24, Tests duplicate the DEV_OTP
value ("45612") used in the implementation which causes brittle failures;
instead of hardcoding DEV_OTP in care/emr/tests/test_otp_login.py, import the
DEV_OTP constant from the implementation module (the module that defines DEV_OTP
and the send()/login logic) and use that imported constant in the test so both
test and implementation reference the single source of truth (update
test_otp_login to import DEV_OTP rather than redefining it).
care/emr/api/otp_viewsets/login.py (1)

93-98: 💤 Low value

Catching bare Exception for SMS errors.

This works, but catching a more specific exception type (if the SMS library exposes one) would prevent accidentally swallowing unrelated errors. That said, external service calls are notoriously varied in what they throw, so this is understandable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@care/emr/api/otp_viewsets/login.py` around lines 93 - 98, The except block in
the OTP send flow (the try/except around sending OTP in the login viewset in
login.py that currently uses "except Exception as e") is too broad; replace it
by catching the specific exception(s) thrown by your SMS client (e.g.,
SmsClientError, SmsDeliveryError, or requests.exceptions.RequestException) and
handle those by logging and returning the 400 Response, and for any other
unexpected exceptions either re-raise them or let them propagate; also switch
logger.error(e) to logger.exception(...) (or include the error details) to
capture the traceback. Ensure you update the import to bring in the SMS library
exception class and modify the except clause from "except Exception as e" to
"except <SpecificSmsException> as e" (with a fallback re-raise/propagate for
non-SMS errors).
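
The narrowing suggested above might look like the following sketch. `SmsDeliveryError` is a hypothetical exception type (the SMS client's real exception class is not shown in this thread), and `send_otp_sms` with its `(status, body)` return value is invented for the example rather than taken from the viewset.

```python
import logging

logger = logging.getLogger(__name__)


class SmsDeliveryError(Exception):
    """Hypothetical error type; the real SMS client's exception may differ."""


def send_otp_sms(send, phone_number, otp):
    """Return (status_code, body); only SMS delivery failures are handled here."""
    try:
        send(phone_number, otp)
    except SmsDeliveryError:
        # logger.exception records the traceback along with the message.
        logger.exception("Failed to send OTP")
        return 400, {"detail": "Error sending OTP"}
    # Any other exception type propagates to the caller unchanged.
    return 200, {"detail": "OTP sent"}
```

Unrelated bugs (for example a `KeyError` in template rendering) then surface as real errors instead of being reported to the client as an SMS failure.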
ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 45b86b9 and c8eb2e3.

📒 Files selected for processing (5)
  • care/emr/api/otp_viewsets/login.py
  • care/emr/locks/otp.py
  • care/emr/tests/test_otp_login.py
  • care/facility/migrations/0485_patientmobileotp_failed_attempts_and_more.py
  • care/facility/models/patient.py

@praffq praffq requested a review from sainak May 6, 2026 07:18
Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py
Comment thread care/emr/api/otp_viewsets/login.py
Comment thread config/settings/base.py
Comment thread care/facility/models/patient.py Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
care/emr/api/otp_viewsets/login.py (2)

53-60: 💤 Low value

Consider filtering deleted=False to match the partial index.

The failure_count query doesn't exclude soft-deleted rows, but the partial index pmo_phone_modified_active_idx only covers rows where deleted=False. The planner may still use the index for the initial scan, but including deleted=False in the filter would make intent clearer and ensure consistency after the cleanup task runs.

♻️ Proposed fix
     def failure_count(self, phone_number: str) -> int:
         since = care_now() - timedelta(minutes=settings.OTP_LOCKOUT_MINUTES)
         total = PatientMobileOTP.objects.filter(
             phone_number=phone_number,
             modified_date__gte=since,
             failed_attempts__gt=0,
+            deleted=False,
         ).aggregate(total=Sum("failed_attempts"))["total"]
         return total or 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@care/emr/api/otp_viewsets/login.py` around lines 53 - 60, The failure_count
function on PatientMobileOTP should explicitly exclude soft-deleted rows to
match the partial index pmo_phone_modified_active_idx; update the QuerySet in
failure_count (method name: failure_count, model: PatientMobileOTP) to include
deleted=False in the filter so the aggregate uses the same predicate as the
partial index and remains consistent after cleanup tasks run.

73-78: 💤 Low value

Same consideration: add deleted=False to match the partial index.

The sent_otps query uses created_date which has the partial index pmo_phone_created_active_idx conditioned on deleted=False. Adding the filter would ensure the index is fully utilized.

♻️ Proposed fix
             sent_otps = PatientMobileOTP.objects.filter(
                 created_date__gte=(
                     care_now() - timedelta(minutes=settings.OTP_SEND_WINDOW_MINUTES)
                 ),
                 phone_number=data.phone_number,
+                deleted=False,
             )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@care/emr/api/otp_viewsets/login.py` around lines 73 - 78, The sent_otps query
on PatientMobileOTP should include the deleted=False filter so the partial index
pmo_phone_created_active_idx (on created_date where deleted=False) is used;
update the PatientMobileOTP.objects.filter call that builds sent_otps in
login.py to add deleted=False alongside created_date__gte and phone_number to
ensure the DB uses the partial index.
care/emr/tasks/cleanup_expired_otps.py (1)

20-22: ⚡ Quick win

Filter should exclude already-deleted rows.

The current query will repeatedly update rows that are already soft-deleted on every scheduled run. While not incorrect, it's a bit wasteful—especially as the table grows.

♻️ Proposed fix
-    count = PatientMobileOTP.objects.filter(created_date__lt=cutoff).update(
+    count = PatientMobileOTP.objects.filter(created_date__lt=cutoff, deleted=False).update(
         deleted=True
     )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@care/emr/tasks/cleanup_expired_otps.py` around lines 20 - 22, The query that
sets deleted=True on old OTPs is re-updating already soft-deleted rows; update
the filter used in PatientMobileOTP.objects.filter(created_date__lt=cutoff) to
exclude already-deleted records (e.g., add deleted=False or
.exclude(deleted=True)) so only non-deleted rows are updated; adjust the
statement that assigns to count accordingly (PatientMobileOTP and count variable
are the relevant symbols to change).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 93-98: The current except block in the login view (around the OTP
send logic in the login viewset) catches a bare Exception; change it to catch
the specific exceptions that the SMS stack can raise—TemplateDoesNotExist,
ImproperlyConfigured, and boto3.exceptions.ClientError—log each error with
logger.error including the exception instance and context, and return the
existing Response for failure; you can keep a final generic except Exception as
e only to re-raise or log/return a 500 if absolutely needed, but do not swallow
all exceptions silently in the except Exception block.

---

Nitpick comments:
In `@care/emr/api/otp_viewsets/login.py`:
- Around line 53-60: The failure_count function on PatientMobileOTP should
explicitly exclude soft-deleted rows to match the partial index
pmo_phone_modified_active_idx; update the QuerySet in failure_count (method
name: failure_count, model: PatientMobileOTP) to include deleted=False in the
filter so the aggregate uses the same predicate as the partial index and remains
consistent after cleanup tasks run.
- Around line 73-78: The sent_otps query on PatientMobileOTP should include the
deleted=False filter so the partial index pmo_phone_created_active_idx (on
created_date where deleted=False) is used; update the
PatientMobileOTP.objects.filter call that builds sent_otps in login.py to add
deleted=False alongside created_date__gte and phone_number to ensure the DB uses
the partial index.

In `@care/emr/tasks/cleanup_expired_otps.py`:
- Around line 20-22: The query that sets deleted=True on old OTPs is re-updating
already soft-deleted rows; update the filter used in
PatientMobileOTP.objects.filter(created_date__lt=cutoff) to exclude
already-deleted records (e.g., add deleted=False or .exclude(deleted=True)) so
only non-deleted rows are updated; adjust the statement that assigns to count
accordingly (PatientMobileOTP and count variable are the relevant symbols to
change).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3e1c8922-c1c5-4b78-8d38-993edb272a3b

📥 Commits

Reviewing files that changed from the base of the PR and between c8eb2e3 and 516ab65.

📒 Files selected for processing (8)
  • .env.example
  • care/emr/api/otp_viewsets/login.py
  • care/emr/tasks/__init__.py
  • care/emr/tasks/cleanup_expired_otps.py
  • care/emr/tests/test_otp_login.py
  • care/facility/migrations/0485_patientmobileotp_failed_attempts_and_more.py
  • care/facility/models/patient.py
  • config/settings/base.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • care/emr/tasks/__init__.py
  • config/settings/base.py

Comment thread care/emr/api/otp_viewsets/login.py Outdated

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@config/settings/base.py`:
- Around line 525-529: The OTP repeat window is passed as a positional arg to
datetime.timedelta (interpreted as days) in care/emr/api/otp_viewsets/login.py,
so settings.OTP_REPEAT_WINDOW (documented/housed as hours) becomes days; change
the timedelta construction to use a keyword hours=... (e.g.,
timedelta(hours=settings.OTP_REPEAT_WINDOW)) wherever settings.OTP_REPEAT_WINDOW
is used to compute the repeat window (search for usages in the OTP rate-limiting
logic / functions in login.py), ensure the value is an int before passing, and
update any related tests or comments that assumed days.
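
The unit mix-up described here is easy to reproduce with the standard library alone. The sketch below uses a window of 6, the default number of hours quoted later in this thread, as an illustrative value.

```python
from datetime import timedelta

OTP_REPEAT_WINDOW = 6  # intended to mean hours

# The first positional argument of timedelta is days, not hours.
as_positional = timedelta(OTP_REPEAT_WINDOW)
# Passing the unit by keyword makes the intent explicit.
as_keyword = timedelta(hours=int(OTP_REPEAT_WINDOW))

assert as_positional == timedelta(days=6)
assert as_keyword == timedelta(hours=6)
assert as_positional == 24 * as_keyword
```

With the positional form, the rate-limit window is 24 times longer than the setting's name suggests.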

In `@docker-compose.local.yaml`:
- Around line 12-21: The compose currently forces plugin wiring by always
setting ADDITIONAL_PLUGS and mounting ../care_booking_notifications_be then
unconditionally running pip install -e /plugs/care_booking_notifications_be in
the entrypoint; make this opt-in by: only setting ADDITIONAL_PLUGS when a new
env var (e.g. ENABLE_BOOKING_NOTIFICATIONS) is truthy, avoid mounting
../care_booking_notifications_be unless that env var is set, and change the
entrypoint step that runs pip install -e /plugs/care_booking_notifications_be to
first check the env var and/or the path exists before attempting installation
(guarding the pip install and exec of scripts/start-dev.sh accordingly) so local
startup won’t fail if the sibling repo is absent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 462ddc76-be47-46ca-9ce7-76e92c3af36b

📥 Commits

Reviewing files that changed from the base of the PR and between 516ab65 and 74dac3d.

📒 Files selected for processing (7)
  • care/emr/api/otp_viewsets/login.py
  • care/emr/tasks/cleanup_expired_otps.py
  • care/emr/tests/test_otp_login.py
  • care/facility/migrations/0486_rename_patientmobileotp_mobileotp.py
  • care/facility/models/patient.py
  • config/settings/base.py
  • docker-compose.local.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • care/facility/models/patient.py
  • care/emr/api/otp_viewsets/login.py
  • care/emr/tests/test_otp_login.py

Comment thread config/settings/base.py
Comment thread docker-compose.local.yaml Outdated
Comment thread care/emr/api/otp_viewsets/login.py Outdated
raise Throttled(detail="Too many failed login attempts. Try again later.")

with OTPSendLock(data.phone_number):
sent_otps = MobileOTP.objects.filter(

@nandkishorr nandkishorr Jun 8, 2026

Contributor


Can't we reuse send_otp() here?

@praffq praffq requested a review from nandkishorr June 8, 2026 19:57
Comment thread care/emr/tasks/cleanup_expired_otps.py Outdated
Soft-deletes MobileOTP rows older than the lockout window
"""
cutoff = care_now() - timedelta(minutes=settings.OTP_LOCKOUT_MINUTES)
count = MobileOTP.objects.filter(created_date__lt=cutoff).update(deleted=True)

Member


Let's hard delete; there is no point in keeping these.

Comment thread care/emr/api/otp_viewsets/login.py Outdated
expired = False
with transaction.atomic():
otp_object = (
MobileOTP.objects.select_for_update()

Member


Let's use Redis locks instead of select_for_update.
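
A rough sketch of what a per-phone lock of this kind could look like, assuming only a "set if absent, with expiry" primitive (the behaviour Redis provides through `SET ... NX` with a TTL). `InMemoryCache`, `otp_verify_lock`, and `LockNotAcquired` are hypothetical names for illustration; this is not the project's `care.utils.lock.Lock`.

```python
import threading
import time
import uuid
from contextlib import contextmanager


class InMemoryCache:
    """Stand-in for a Redis-backed cache; add() is set-if-absent with a TTL."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def add(self, key, value, timeout):
        now = time.monotonic()
        with self._guard:
            _, expires = self._data.get(key, (None, 0.0))
            if expires > now:
                return False  # another holder still owns the key
            self._data[key] = (value, now + timeout)
            return True

    def release(self, key, value):
        # Delete only if we still own the key, so an expired holder
        # cannot release a lock that someone else has since taken.
        with self._guard:
            if self._data.get(key, (None, 0.0))[0] == value:
                del self._data[key]


class LockNotAcquired(Exception):
    pass


@contextmanager
def otp_verify_lock(cache, phone_number, timeout=5):
    key = f"otp_verify_lock:{phone_number}"
    token = uuid.uuid4().hex
    if not cache.add(key, token, timeout):
        raise LockNotAcquired(key)
    try:
        yield
    finally:
        cache.release(key, token)
```

Unlike `select_for_update`, a lock like this does not hold a database row lock or require an open transaction, but it does rely on the TTL being longer than the critical section.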

return Response({"otp": "generated"})

try:
send_otp(data.phone_number, otp_type=ResetPasswordOTP)

Member


Why are the validations removed?

Contributor Author


The validations have been moved into send_otp.

Comment thread care/emr/locks/otp.py Fixed
@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Contributor

Greptile Summary

This PR hardens OTP-based login and password-reset with per-phone send-rate limits, per-OTP verification caps, a sliding-window failure lockout, OTP validity windows, atomic invalidation of previous OTPs on re-send, and a nightly cleanup task — a meaningful security improvement over the prior implementation.

  • send_otp now enforces OTPSendLock, failure-count gating, and send-rate limiting before creating a new OTP, while the login and confirm endpoints gain OTPVerifyLock serialization with per-OTP failed_attempts tracking and phone-level lockout.
  • Six new env-driven settings (OTP_SEND_WINDOW_MINUTES, OTP_MAX_SENDS_PER_WINDOW, OTP_MAX_VERIFY_ATTEMPTS, OTP_MAX_FAILURES, OTP_LOCKOUT_MINUTES, OTP_VALIDITY_MINUTES) replace hardcoded values, and a comprehensive test suite is added covering happy paths, lockout, and rate-limit edge cases.
  • Several concurrency and atomicity gaps remain unaddressed from earlier review rounds (transaction safety in send_otp, cross-endpoint lock scope, TOCTOU in confirm, cleanup window vs. password-reset send window, and the IS_PRODUCTION OTP-randomness gate) and should be resolved before merge.
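
To make the send-rate limit in the summary above concrete, here is a small sketch of a sliding-window check. The constants mirror the roles of `OTP_SEND_WINDOW_MINUTES` and `OTP_MAX_SENDS_PER_WINDOW`, but their values are illustrative assumptions, and `can_send` is a hypothetical helper operating on a plain list of send timestamps rather than a queryset.

```python
from datetime import datetime, timedelta

SEND_WINDOW = timedelta(minutes=10)  # illustrative, not the project's default
MAX_SENDS_PER_WINDOW = 3             # illustrative, not the project's default


def can_send(sent_at, now):
    """True if fewer than MAX_SENDS_PER_WINDOW sends fall inside the window."""
    window_start = now - SEND_WINDOW
    recent = [t for t in sent_at if t >= window_start]
    return len(recent) < MAX_SENDS_PER_WINDOW


now = datetime(2026, 1, 1, 12, 0)
# Three sends in the last ten minutes: the cap is reached.
burst = [now - timedelta(minutes=m) for m in (5, 4, 3)]
# One of the three sends is older than the window, so it no longer counts.
spread = [now - timedelta(minutes=m) for m in (20, 4, 3)]
```

Here `can_send(burst, now)` is false and `can_send(spread, now)` is true, because the window slides with `now` instead of resetting at fixed boundaries.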

Confidence Score: 2/5

Not safe to merge — the OTP authentication path still has multiple concurrency holes and an atomicity gap that can leave users unable to log in or allow a concurrent request to consume an OTP twice.

The two highest-risk files, login.py and reset_password.py, both have confirmed defects on the authentication hot path: the send_otp function performs a two-step mark-and-create without a transaction so a create-time failure silently invalidates all existing OTPs; the confirm endpoint releases OTPVerifyLock with the matched OTP still unused, giving a concurrent request a window to consume the same credential twice; the cleanup task's retention window is shorter than the password-reset send window, allowing the send-rate cap to be bypassed after midnight; and the OTP-randomness gate keys on IS_PRODUCTION rather than USE_SMS, so a staging server with USE_SMS=True delivers the hardcoded string '45612' over SMS. These are independent issues, each affecting a different part of the auth flow.
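
The double-use window described above closes when the check and the "mark used" write happen inside the same critical section. The sketch below shows that shape with a `threading.Lock` standing in for the per-phone verify lock; `OTPRecord`, `verify_and_consume`, and the attempt cap value are hypothetical and only illustrate the ordering.

```python
import threading
from dataclasses import dataclass


@dataclass
class OTPRecord:
    code: str
    is_used: bool = False
    failed_attempts: int = 0


MAX_VERIFY_ATTEMPTS = 5  # illustrative value


def verify_and_consume(record, submitted, lock):
    """Check and consume the OTP while holding the lock."""
    with lock:
        if record.is_used or record.failed_attempts >= MAX_VERIFY_ATTEMPTS:
            return False
        if submitted != record.code:
            record.failed_attempts += 1
            return False
        # Mark the OTP used before the lock is released, so a concurrent
        # request that acquires the lock next sees is_used=True.
        record.is_used = True
        return True
```

A second call with the same correct code returns `False`, because the first call consumed the record before releasing the lock.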

care/emr/api/otp_viewsets/login.py (atomicity gap in send_otp, cross-lock scope of failure_count, OTP-randomness gate), care/users/api/otp_viewset/reset_password.py (TOCTOU in confirm), and care/emr/tasks/cleanup_expired_otps.py (cleanup window narrower than password-reset send window).

Important Files Changed

| Filename | Overview |
| --- | --- |
| care/emr/api/otp_viewsets/login.py | Core OTP send/login logic rewritten with rate-limiting and lockout; contains multiple concurrency and atomicity gaps flagged in prior review rounds that remain unaddressed. |
| care/users/api/otp_viewset/reset_password.py | Password-reset confirm endpoint now has per-OTP failure cap and lockout check, but the matched OTP is still not marked used inside the lock before releasing it, leaving a TOCTOU window for concurrent double-use. |
| care/emr/tasks/cleanup_expired_otps.py | New Celery task hard-deletes expired OTP rows; cleanup window (OTP_LOCKOUT_MINUTES) is narrower than the password-reset send window (OTP_REPEAT_WINDOW hours), allowing the send-rate counter to be bypassed after midnight. |
| care/emr/locks/otp.py | Two new lock classes (OTPSendLock / OTPVerifyLock) with distinct cache-key namespaces; the separate namespaces mean the send and verify paths don't share a lock, which contributes to a cross-endpoint race already flagged in review. |
| care/emr/tests/test_otp_login.py | Comprehensive new test suite for OTP login flow covering happy paths, per-OTP cap, phone lockout, send rate limits, and OTP validity windows; hardcoded DEV_OTP '45612' acknowledges the staging/dev OTP concern. |
| care/emr/tests/test_otp_reset_password_api.py | Adds tests for OTP attempt capping and multi-user retry behavior in the reset-password confirm endpoint; good coverage of the new failure-tracking logic. |
| config/settings/base.py | Replaces hardcoded OTP constants with six new env-driven settings for send window, send cap, verify attempts, failures, lockout, and OTP validity; defaults are reasonable. |
| care/facility/models/patient.py | Adds failed_attempts field and two partial indexes on (phone_number, -created_date) and (phone_number, -modified_date); model and migration are consistent. |
| care/facility/migrations/0486_mobileotp_failed_attempts_and_more.py | Auto-generated migration adds failed_attempts field and two partial indexes matching the model Meta; no issues. |
| care/emr/tasks/__init__.py | Registers the new cleanup_expired_otps task to run at midnight; no issues. |
| config/settings/deployment.py | Adds SMS_BACKEND env var; trivial one-line addition, no issues. |
| .env.example | Documents new SMS_BACKEND and six OTP rate-limiting env vars with defaults; informational, no issues. |

Reviews (7): Last reviewed commit: "Merge branch 'develop' into ENG-158-otp-..."

Comment thread care/emr/tasks/cleanup_expired_otps.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py Outdated
Comment thread care/emr/api/otp_viewsets/login.py
Comment thread care/emr/locks/otp.py
from care.utils.lock import Lock


class OTPSendLock(Lock):
Comment thread care/emr/locks/otp.py Dismissed
Comment on lines +19 to +20
cutoff = care_now() - timedelta(minutes=settings.OTP_LOCKOUT_MINUTES)
count, _ = MobileOTP.objects.filter(modified_date__lt=cutoff).delete()

Contributor


P1 security Cleanup window too narrow for password-reset send rate limit

The cleanup task deletes rows where modified_date < care_now() - OTP_LOCKOUT_MINUTES (default 60 min), but ResetPasswordOTP.send_window() returns timedelta(hours=OTP_REPEAT_WINDOW) — 6 hours by default. Any reset-password OTP whose modified_date is more than 60 minutes old but still within the 6-hour send window will be hard-deleted, making sent_otps.count() undercount the actual OTPs issued that day.

Concrete bypass: send 9 reset-password OTPs before 11 pm; the midnight cleanup deletes them (modified_date < midnight − 60 min = 11 pm); at 12:01 am the created_date__gte = now − 6 h window still covers that slot but the rows are gone, so 10 more OTPs can be sent — totalling 19 instead of the intended 10 per 6 hours. The fix is to retain rows until the end of the longest applicable send window, e.g. max(OTP_LOCKOUT_MINUTES, OTP_REPEAT_WINDOW_MINUTES).
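
A small sketch of the suggested fix, using the defaults quoted in this comment (60-minute lockout, 6-hour repeat window). `cleanup_cutoff` is a hypothetical helper; the point is only that the retention period is the longest window any rate-limit query still reads from.

```python
from datetime import datetime, timedelta

OTP_LOCKOUT_MINUTES = 60  # default quoted in the comment above
OTP_REPEAT_WINDOW = 6     # hours, default quoted in the comment above


def cleanup_cutoff(now):
    """Rows older than the returned timestamp are safe to delete."""
    retention = max(
        timedelta(minutes=OTP_LOCKOUT_MINUTES),
        timedelta(hours=OTP_REPEAT_WINDOW),
    )
    return now - retention


midnight = datetime(2026, 6, 17, 0, 0)
sent_at = datetime(2026, 6, 16, 22, 30)  # a reset-password OTP sent at 10:30 pm

narrow_cutoff = midnight - timedelta(minutes=OTP_LOCKOUT_MINUTES)  # 11:00 pm
wide_cutoff = cleanup_cutoff(midnight)                             # 6:00 pm
```

With the narrow cutoff the 10:30 pm row is deleted at midnight even though the 6-hour send window still covers it; with the wider cutoff it is retained and keeps counting toward the cap.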

Comment thread care/emr/api/otp_viewsets/login.py
@praffq praffq requested a review from nandkishorr June 17, 2026 21:18
Comment thread care/emr/api/otp_viewsets/login.py
@vigneshhari

Member

Can you confirm that the reset-password flow also has abuse protection?

Comment thread care/users/api/otp_viewset/reset_password.py
Comment thread care/emr/api/otp_viewsets/login.py
4 participants