chore: allow running and linting docs notebooks #493
Signed-off-by: Matt Kornfield <mkornfield@nvidia.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID:
📒 Files selected for processing (10)
✅ Files skipped from review due to trivial changes (3)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 Walkthrough
Adds bootstrap reminders, docs helper targets, a Python snippet linter with tests, and a Fern notebook runner with tests and usage docs.
Changes
Documentation tooling
Sequence Diagram(s)
sequenceDiagram
participant Makefile
participant Lint as lint_python_snippets.py
participant Ast as ast.parse
participant Ty as ty
Makefile->>Lint: uv run --frozen python docs/_scripts/lint_python_snippets.py DOCS_PATH
Lint->>Ast: parse fenced Python snippets
Lint->>Ty: uv run --frozen ty check
Ty-->>Lint: diagnostics
Lint-->>Makefile: exit code
sequenceDiagram
participant Makefile
participant Runner as run_notebooks.py
participant Notebook as NotebookSelection
participant Executor as nmp.testing.notebooks.execute_notebook
Makefile->>Runner: uv run --frozen python docs/fern/scripts/run_notebooks.py $(ARGS) $(DOCS_PATH)
Runner->>Runner: select_notebooks(paths)
Runner->>Notebook: resolve notebook/source pairs
Runner->>Executor: execute selected notebook
Executor-->>Runner: result
Runner-->>Makefile: exit code
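The first diagram's "parse fenced Python snippets" step can be sketched roughly as below. This is a minimal illustration, not the code in lint_python_snippets.py: the names `SNIPPET_RE` and `check_snippets` are hypothetical, and the real script also runs `ty check`, which is omitted here.

```python
import ast
import re

# Build the fence marker programmatically so this example can itself live in a fenced block.
FENCE = "`" * 3

# Hypothetical pattern: an opening fence tagged python/py, a lazy body, a closing fence.
SNIPPET_RE = re.compile(
    rf"^{FENCE}(?:python|py)\b[^\n]*\n(?P<body>.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def check_snippets(markdown: str) -> list[tuple[int, str]]:
    """Return (document_line, message) for every snippet that fails ast.parse."""
    errors = []
    for match in SNIPPET_RE.finditer(markdown):
        # 1-based document line on which the snippet body starts.
        first_line = markdown.count("\n", 0, match.start("body")) + 1
        try:
            ast.parse(match.group("body"))
        except SyntaxError as exc:
            # Map the snippet-relative line back to the document line.
            errors.append((first_line + (exc.lineno or 1) - 1, exc.msg))
    return errors


doc = "\n".join(
    [
        "# Title",
        FENCE + "python",
        "x = 1",
        FENCE,
        "",
        FENCE + "python",
        "def broken(:",
        "    pass",
        FENCE,
    ]
)
errors = check_snippets(doc)
```

Here the second snippet fails to parse, and the reported line number points at the line inside the document rather than the line inside the snippet.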
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/_scripts/lint_python_snippets.py`:
- Around line 260-263: The temp file naming in PreparedTypeCheckFile creation is
vulnerable to collisions because the sanitized doc_path-based name in the
temp_path construction can overlap for different documents, and prefix-based
matching can misroute diagnostics when temp paths share prefixes. Update the
path generation and lookup logic around PreparedTypeCheckFile, temp_path, and
the diagnostic mapping code in the affected snippet to use a collision-proof
unique identifier per document and exact temp_path equality when resolving
diagnostics, so each source document maps to only its own temporary file and
line mapping.
In `@docs/AGENTS.md`:
- Around line 15-16: The Makefile examples in the docs use the wrong variable
name, so update the documented invocations for docs-check-python-snippets and
docs-run-notebook to use DOCS_PATH instead of DOC_PATH. Keep the wording aligned
with the actual Makefile contract and ensure the examples match the target names
so users can copy them without hitting the empty-variable guard.
In `@docs/fern/scripts/README.md`:
- Around line 84-86: The README examples for make docs-run-notebook use the
wrong variable name, so the commands won’t pick up the notebook path. Update
both example invocations in README to use DOCS_PATH, matching the
docs-run-notebook target and the variable referenced by Makefile; use the
docs-run-notebook command examples as the anchor for the fix.
In `@docs/fern/scripts/run_notebooks.py`:
- Around line 247-250: The MDX preprocessing in run_notebooks() happens before
the per-notebook failure handling, so a materialize_mdx_as_markdown() error can
stop the whole batch. Move the .mdx conversion inside the existing try block for
each notebook selection so include expansion or file I/O failures are caught,
the notebook is marked failed, and processing continues for the remaining items.
Use the existing run_notebooks flow and the selection.path / run_path handling
to keep the logic localized.
- Around line 33-36: Both notebook URL regexes are too strict because they only
allow a single path segment after blob/, so branch refs with slashes do not
match. Update COLAB_NOTEBOOK_RE and FERN_NOTEBOOK_RE in run_notebooks.py to
accept multi-segment refs before the docs/... notebook path, while still
capturing the notebook path in the existing named path group. Keep the rest of
the matching behavior unchanged so notebook links from .mdx pages continue to
resolve.
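The failure-isolation point raised for run_notebooks.py can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: `run_selected`, `prepare`, and `execute` are hypothetical stand-ins for the real loop, materialize_mdx_as_markdown(), and the notebook executor.

```python
from pathlib import Path
from typing import Callable, Iterable


def run_selected(
    paths: Iterable[Path],
    prepare: Callable[[Path], Path],
    execute: Callable[[Path], None],
) -> dict[Path, str]:
    """Run every item; record failures instead of aborting the batch."""
    failures: dict[Path, str] = {}
    for path in paths:
        try:
            # Preprocessing sits inside the try block, so an include-expansion
            # or file I/O error marks only this item as failed.
            run_path = prepare(path) if path.suffix == ".mdx" else path
            execute(run_path)
        except Exception as exc:  # broad on purpose: any failure is per-item
            failures[path] = f"{type(exc).__name__}: {exc}"
    return failures


executed: list[Path] = []


def fake_prepare(path: Path) -> Path:
    # Hypothetical preprocessing step that fails for one input.
    if path.name == "broken.mdx":
        raise OSError("include not found")
    return path.with_suffix(".md")


failures = run_selected(
    [Path("broken.mdx"), Path("ok.mdx"), Path("nb.ipynb")],
    fake_prepare,
    executed.append,
)
```

With the conversion inside the try block, the broken .mdx page is reported as a failure while the remaining two items still run.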
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 00a5b4ad-15c8-4326-902a-b5046d93d3e0
📒 Files selected for processing (7)
Makefile
docs/AGENTS.md
docs/_scripts/lint_python_snippets.py
docs/_scripts/test_lint_python_snippets.py
docs/fern/scripts/README.md
docs/fern/scripts/run_notebooks.py
docs/fern/scripts/test_run_notebooks.py
    safe_name = re.sub(r"[^A-Za-z0-9_.-]+", "_", str(doc_path.with_suffix("")))
    temp_path = temp_dir / f"{safe_name}.py"
    temp_path.write_text("\n".join(source_lines), encoding="utf-8")
    return PreparedTypeCheckFile(doc_path=doc_path, temp_path=temp_path, line_mapping=tuple(line_mapping))
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Make temp paths collision-proof and match diagnostics exactly.
Flattened sanitized names can collide (a/b.md vs a_b.md), overwriting temp files and mappings. Prefix matching can also map diagnostics to the wrong doc when temp paths share prefixes.
Proposed fix
+import hashlib
 import argparse

-    safe_name = re.sub(r"[^A-Za-z0-9_.-]+", "_", str(doc_path.with_suffix("")))
-    temp_path = temp_dir / f"{safe_name}.py"
+    safe_stem = re.sub(r"[^A-Za-z0-9_.-]+", "_", doc_path.with_suffix("").name)[:80] or "snippet"
+    digest = hashlib.sha256(str(doc_path.resolve()).encode("utf-8")).hexdigest()[:12]
+    temp_path = temp_dir / f"{safe_stem}-{digest}.py"

     for line in output.splitlines():
-        matched_prepared: PreparedTypeCheckFile | None = None
-        matched_temp_path: Path | None = None
-        for temp_path, prepared in temp_to_prepared.items():
-            if line.startswith(str(temp_path)):
-                matched_prepared = prepared
-                matched_temp_path = temp_path
-                break
-
-        if matched_prepared is None or matched_temp_path is None:
+        match = TY_OUTPUT_RE.match(line)
+        if match is None:
             if line.strip():
                 unmatched_lines.append(line)
             continue
-
-        match = TY_OUTPUT_RE.match(line)
-        if match is None:
-            results[matched_prepared.doc_path].append(
-                line.replace(str(matched_temp_path), str(matched_prepared.doc_path))
-            )
+
+        matched_prepared = temp_to_prepared.get(Path(match.group(1)))
+        if matched_prepared is None:
+            if line.strip():
+                unmatched_lines.append(line)
             continue

Also applies to: 327-351
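To make the two failure modes concrete, here is a small self-contained sketch. The helper names `flattened_name` and `hashed_name` are hypothetical wrappers around the expressions quoted in this comment, and the diagnostic line format and `LINE_RE` are assumptions for illustration, not the actual `ty` output format or the script's TY_OUTPUT_RE.

```python
import hashlib
import re
from pathlib import Path


def flattened_name(doc_path: Path) -> str:
    """Naming scheme under review: flatten the whole path into one file name."""
    safe_name = re.sub(r"[^A-Za-z0-9_.-]+", "_", str(doc_path.with_suffix("")))
    return f"{safe_name}.py"


def hashed_name(doc_path: Path) -> str:
    """Proposed scheme: readable stem plus a digest of the resolved path."""
    safe_stem = re.sub(r"[^A-Za-z0-9_.-]+", "_", doc_path.with_suffix("").name)[:80] or "snippet"
    digest = hashlib.sha256(str(doc_path.resolve()).encode("utf-8")).hexdigest()[:12]
    return f"{safe_stem}-{digest}.py"


first, second = Path("a/b.md"), Path("a_b.md")
collides = flattened_name(first) == flattened_name(second)  # both become "a_b.py"
distinct = hashed_name(first) != hashed_name(second)

# Prefix matching: the first temp path that is a prefix of the line wins,
# even when the line belongs to a longer path that shares the prefix.
temp_to_doc = {
    "/tmp/snippets/a.py": "a.md",
    "/tmp/snippets/a.pyi.py": "a.pyi.md",
}
line = "/tmp/snippets/a.pyi.py:3:1: error: example diagnostic"  # assumed format
prefix_hit = next(doc for temp, doc in temp_to_doc.items() if line.startswith(temp))

# Exact matching: extract the path from the line, then look it up by equality.
LINE_RE = re.compile(r"^(?P<path>[^:]+):\d+:\d+:")  # hypothetical pattern
exact_hit = temp_to_doc.get(LINE_RE.match(line).group("path"))
```

In this sketch the prefix lookup attributes the diagnostic to a.md, while the exact lookup resolves it to a.pyi.md.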
COLAB_NOTEBOOK_RE = re.compile(
    r"https://colab\.research\.google\.com/github/[^/]+/[^/]+/blob/[^/]+/(?P<path>docs/[^)\"'\s]+\.ipynb)"
)
FERN_NOTEBOOK_RE = re.compile(r"colabUrl=[\"'].*?/blob/[^/]+/(?P<path>docs/[^\"']+\.ipynb)[\"']")
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Allow slashes in Colab refs.
Both regexes assume the blob/<ref>/... segment has no /. Branch names like docs-allow-running-linting/mck will not match, so .mdx pages that rely on the linked notebook path fail to resolve.
Suggested fix
 COLAB_NOTEBOOK_RE = re.compile(
-    r"https://colab\.research\.google\.com/github/[^/]+/[^/]+/blob/[^/]+/(?P<path>docs/[^)\"'\s]+\.ipynb)"
+    r"https://colab\.research\.google\.com/github/[^/]+/[^/]+/blob/.+?/(?P<path>docs/[^)\"'\s]+\.ipynb)"
 )
-FERN_NOTEBOOK_RE = re.compile(r"colabUrl=[\"'].*?/blob/[^/]+/(?P<path>docs/[^\"']+\.ipynb)[\"']")
+FERN_NOTEBOOK_RE = re.compile(r"colabUrl=[\"'].*?/blob/.+?/(?P<path>docs/[^\"']+\.ipynb)[\"']")
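A quick way to check the suggested change is to run both Colab patterns against a ref with and without a slash. The sketch below copies the two regexes from this comment; the URLs, the `example-org/example-repo` names, and the `notebook_path` helper are made up for illustration.

```python
import re

OLD_RE = re.compile(
    r"https://colab\.research\.google\.com/github/[^/]+/[^/]+/blob/[^/]+/(?P<path>docs/[^)\"'\s]+\.ipynb)"
)
NEW_RE = re.compile(
    r"https://colab\.research\.google\.com/github/[^/]+/[^/]+/blob/.+?/(?P<path>docs/[^)\"'\s]+\.ipynb)"
)

# Hypothetical URLs: one with a plain ref, one with a slash in the branch name.
BASE = "https://colab.research.google.com/github/example-org/example-repo/blob/"
simple_url = BASE + "main/docs/guide/intro.ipynb"
slashed_url = BASE + "docs-allow-running-linting/mck/docs/guide/intro.ipynb"


def notebook_path(pattern: re.Pattern, url: str):
    """Return the captured notebook path, or None when the URL does not match."""
    match = pattern.search(url)
    return match.group("path") if match else None


old_simple = notebook_path(OLD_RE, simple_url)
old_slashed = notebook_path(OLD_RE, slashed_url)
new_simple = notebook_path(NEW_RE, simple_url)
new_slashed = notebook_path(NEW_RE, slashed_url)
```

One caveat worth noting: because `.+?` is lazy, a ref that itself contains a `/docs/` segment would have the capture start at that first `docs/`, so the extracted path would include part of the ref.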
tylersbray left a comment
Approving but do the code rabbit stuff please.
Signed-off-by: Matt Kornfield <mkornfield@nvidia.com>
Summary by CodeRabbit
New Features
make docs-check-python-snippets to lint Python fenced-code snippets in Markdown/MDX.
make docs-run-notebook to run Fern documentation notebooks with marker-based selection.
Bug Fixes
Documentation
Tests