[Surgery, Anesthesiology, and Perioperative Care]{.chapter-title}

doi:10.5281/zenodo.18251405

Surgery combines technical skill, anatomical knowledge, and split-second decision-making under pressure. AI applications span preoperative risk assessment, intraoperative guidance, and postoperative monitoring.

Learning Objectives

After reading this chapter, you will be able to:

Evaluate AI systems for surgical risk prediction and optimization
Understand computer vision applications in robotic and minimally invasive surgery
Assess AI tools for surgical phase recognition and workflow analysis
Navigate AI-assisted surgical planning and simulation
Identify postoperative complication prediction systems
Recognize limitations and failure modes of surgical AI
Balance AI augmentation with surgical judgment and technical skill

Chapter Summary (TL;DR)

The Clinical Context: Surgery presents unique AI challenges: high-stakes real-time decisions, anatomical variability, and zero tolerance for errors. Unlike diagnostic AI analyzing static images, surgical AI must operate in dynamic 3D environments with blood, smoke, and rapidly changing anatomy.

What Works Well:

Application	Evidence	Key Benefit
Preoperative Risk (MySurgeryRisk)	Strong	AUC 0.92 for complications (Bihorac et al., 2019)
Surgical Planning (3D segmentation)	Solid	60-80% reduction in planning time
Joint Replacement Planning	Solid	Improved implant positioning
Fracture Detection	Strong	High accuracy for simple fractures
Postoperative Early Warning	Moderate	Reduces unplanned ICU transfers

What’s Emerging (Use with Caution):

Application	Status	Limitation
Surgical Phase Recognition	Research	Limited clinical impact
Anatomical Structure ID	Research	Blood/smoke degrade accuracy
Skill Assessment AI	Emerging	Doesn’t replace mentorship
Complication Prediction	Variable	High false positive rates

What to Avoid:

Autonomous surgical decisions: not validated, not safe
Critical structure identification without visual verification
Replacing surgical judgment with AI recommendations

Key Principles:

AI augments expertise, never replaces it: Verify all AI-generated information
Preoperative > Intraoperative AI: Better validated, lower stakes for errors
Human oversight mandatory: No autonomous AI decisions during surgery
Local validation essential: Academic center results may not generalize

The Bottom Line: Preoperative risk assessment AI (MySurgeryRisk) has strong evidence. Intraoperative AI is promising but not ready for clinical reliance. Surgeons must maintain independent judgment. AI is a tool, not a decision-maker.

Introduction

Surgery stands apart from other medical specialties in its immediacy, irreversibility, and technical demands. While radiologists can analyze images over minutes, surgeons make split-second decisions with scalpel in hand. While internists can adjust management based on patient response, surgical decisions, once made, cannot be easily undone.

This unique context shapes how AI can and cannot help surgeons. The most promising applications assist with the cognitive work surrounding surgery (risk assessment, planning, outcome prediction) rather than replacing the surgeon’s hands or judgment during the operation itself.

The sections that follow cover surgical AI applications across the perioperative spectrum, from preoperative optimization through postoperative care.

Preoperative AI Applications

Surgical Risk Prediction

The Clinical Problem:

Surgeons face a fundamental question before every operation: Will this patient tolerate this procedure? Traditional risk assessment relies on clinical judgment supplemented by scoring systems (ASA classification, NSQIP risk calculator, RCRI for cardiac risk in non-cardiac surgery). These tools have limitations:

Incorporate limited variables (20-30 factors)
Use linear models that miss complex interactions
Provide population-level estimates, not personalized predictions
Are updated infrequently as new evidence emerges

Machine Learning Solutions:

Modern ML approaches improve risk prediction by:

Analyzing larger feature sets: 100+ variables from EHR, imaging, labs, medications, vital signs, social determinants
Capturing nonlinear relationships: Age × frailty × procedure complexity interactions
Continuous learning: Models updated with new outcome data
Personalized predictions: Patient-specific risk estimates rather than population averages

Evidence:

The MySurgeryRisk algorithm developed at University of Florida analyzed 400,000+ surgical cases and significantly outperformed traditional risk models (Bihorac et al., 2019):

30-day mortality prediction: AUC 0.94 (vs. 0.89 for ASA score)
Major complications: AUC 0.88 (vs. 0.82 for NSQIP)
ICU admission: AUC 0.91
Hospital length of stay: Better calibration across risk spectrum

Similar results from other institutions: Stanford, Partners Healthcare, Penn Medicine all report improved risk prediction using ML on local data.
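The AUC figures above can be read as the probability that a randomly chosen patient who experienced the outcome was assigned a higher predicted risk than a randomly chosen patient who did not. The sketch below computes that quantity directly from pairs of patients. The function name and the toy risks and outcomes are hypothetical and are not taken from any cited study.

```python
def auc(scores, labels):
    """Rank-based AUC: P(positive scored above negative), with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and observed outcomes (1 = event occurred).
predicted_risk = [0.02, 0.10, 0.35, 0.40, 0.80, 0.05]
outcome = [0, 0, 1, 0, 1, 0]
example_auc = auc(predicted_risk, outcome)
print(round(example_auc, 3))
```

AUC measures ranking only. A model can rank patients well and still produce risk estimates that are systematically too high or too low, which is why calibration is reported separately.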

Clinical Applications:

Risk Prediction AI in Practice

Preoperative Optimization:

Identify modifiable risk factors (anemia, hyperglycemia, nutritional deficits)
Triage patients for preoperative clinic vs. day-of-surgery admission
Guide prehabilitation referrals

Shared Decision-Making:

Provide personalized risk estimates during surgical consults
Facilitate discussions about alternative treatments
Support goals-of-care conversations for high-risk patients

Resource Allocation:

Predict ICU vs. floor bed requirements
Identify patients needing enhanced postoperative monitoring
Optimize OR scheduling based on predicted case duration

Quality Improvement:

Risk-adjust outcome comparisons between surgeons/hospitals
Identify outliers for focused improvement efforts
Benchmark performance against predicted outcomes
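One common way to benchmark performance against predicted outcomes is an observed-to-expected (O/E) ratio, where the expected count is the sum of the model's predicted probabilities for the cases in question. The following is a minimal sketch under that assumption. The function name and the eight-patient panel are hypothetical.

```python
def observed_to_expected(outcomes, predicted_risks):
    """O/E ratio: observed event count divided by the sum of predicted probabilities."""
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must have equal length")
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("expected event count is zero")
    return sum(outcomes) / expected


# Hypothetical case panel: 1 = major complication occurred.
outcomes = [0, 1, 0, 0, 1, 0, 0, 0]
predicted = [0.05, 0.40, 0.10, 0.05, 0.30, 0.20, 0.05, 0.10]
oe_ratio = observed_to_expected(outcomes, predicted)
print(round(oe_ratio, 2))
```

A ratio above 1 means more events occurred than the model predicted for that case mix. With panels this small the ratio is very noisy, so real comparisons need confidence intervals and adequate case volume before anyone is labeled an outlier.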

Critical Limitations:

Risk calculators should inform, not dictate, surgical decisions:

Algorithms miss important factors: patient goals, functional trajectory, social support, frailty nuances
High-risk patients may still benefit from surgery if the alternative is a certain poor outcome
Low-risk predictions don’t guarantee good outcomes
Models trained on one population may not generalize to different populations

Clinical Bottom Line: Use risk prediction AI to enhance shared decision-making and optimize preoperative preparation. Do not deny surgery based solely on algorithmic risk scores.
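Whether a model trained elsewhere generalizes to a local population can be checked by comparing predicted risk with observed event rates within risk bands. The sketch below builds a simple calibration table with equal-width bins. The function name, the bin count, and the data are illustrative assumptions, and predicted risks are assumed to lie between 0 and 1.

```python
def calibration_table(predicted, observed, n_bins=5):
    """Group patients into equal-width risk bins and compare mean predicted risk
    with the observed event rate in each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, observed):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        obs_rate = sum(y for _, y in members) / len(members)
        rows.append((i, len(members), mean_pred, obs_rate))
    return rows


# Hypothetical local validation data (1 = complication occurred).
predicted = [0.05, 0.10, 0.15, 0.45, 0.55, 0.90]
observed = [0, 0, 1, 0, 1, 1]
rows = calibration_table(predicted, observed, n_bins=2)
for bin_index, n, mean_pred, obs_rate in rows:
    print(bin_index, n, round(mean_pred, 3), round(obs_rate, 3))
```

Large gaps between the last two columns in any bin suggest the model needs recalibration on local data before its absolute risk estimates are shared with patients.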

Preoperative Planning and Simulation

AI-Assisted Anatomical Segmentation:

Surgical planning for complex cases (oncologic resections, liver surgery, orthopedic reconstructions) traditionally requires manual analysis of CT/MRI to identify anatomy, plan approaches, and anticipate challenges. AI automates and enhances this process:

Applications:

Oncologic Surgery:

Tumor segmentation and volumetry
Relationship to critical structures (vessels, bile ducts, nerves)
Predicted resection margins
Assessment of resectability

Liver Surgery:

Vascular and biliary anatomy mapping
Liver volumetry for donation or resection planning
Future liver remnant calculation
Virtual hepatectomy simulation

Orthopedic Surgery:

Joint replacement planning (alignment, component sizing)
Osteotomy planning for deformity correction
Fracture reduction simulation
Bone tumor resection planning

Neurosurgery:

Brain tumor segmentation and eloquent cortex mapping
Surgical approach trajectory planning
Vascular anatomy for aneurysm clipping
Epilepsy focus localization

Evidence:

Studies across multiple surgical specialties show AI segmentation (Hashimoto et al., 2018):

Reduces planning time by 60-80% compared to manual segmentation
Achieves inter-rater reliability comparable to expert-to-expert agreement
Improves standardization of preoperative assessment (Topol, 2019)
Enhances patient counseling with 3D visualizations
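Agreement between an AI segmentation and an expert segmentation is often summarized with an overlap score such as the Dice similarity coefficient. The cited studies may have used other reliability measures, so treat the choice of metric here as an assumption. The sketch represents each mask as a set of voxel indices, and the masks are toy examples.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks given as sets of voxel indices."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks agree perfectly
    overlap = len(mask_a & mask_b)
    return 2.0 * overlap / (len(mask_a) + len(mask_b))


# Hypothetical four-voxel masks sharing three voxels.
ai_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
expert_mask = {(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 1, 2)}
dice_score = dice(ai_mask, expert_mask)
print(dice_score)
```

Comparing the AI-to-expert score with the expert-to-expert score on the same cases is what supports a claim that AI agreement is comparable to agreement between experts.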

Limitations:

Segmentation errors can propagate to surgical plans (always verify)
Quality depends on input imaging (motion artifacts, contrast timing)
Doesn’t account for intraoperative findings (adhesions, variant anatomy)
Most effective for anatomy-driven procedures with good imaging
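The liver volumetry and future liver remnant (FLR) calculation listed under liver surgery reduce to simple arithmetic once segmentation is done: count voxels, multiply by voxel size, and divide the remnant by the functional liver volume. The sketch below assumes the common convention of subtracting tumor volume from total liver volume. The function names and the numbers are hypothetical.

```python
def volume_ml(voxel_count, spacing_mm):
    """Volume in millilitres from a voxel count and (x, y, z) voxel spacing in mm."""
    sx, sy, sz = spacing_mm
    return voxel_count * sx * sy * sz / 1000.0  # 1 mL = 1000 mm^3


def future_liver_remnant_pct(remnant_ml, total_liver_ml, tumor_ml=0.0):
    """FLR as a percentage of functional liver volume (total liver minus tumor)."""
    functional = total_liver_ml - tumor_ml
    if functional <= 0:
        raise ValueError("functional liver volume must be positive")
    return 100.0 * remnant_ml / functional


# Hypothetical segmentation output on a 1 mm isotropic CT grid.
total_ml = volume_ml(1_500_000, (1.0, 1.0, 1.0))
flr_pct = future_liver_remnant_pct(remnant_ml=450.0, total_liver_ml=total_ml, tumor_ml=100.0)
print(total_ml, round(flr_pct, 1))
```

Because the result is a ratio of segmented volumes, a segmentation error in either the remnant or the whole liver passes straight through to the FLR estimate, which is one reason the masks need to be verified.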

3D Printing and Surgical Models:

AI-segmented anatomy can be converted to 3D-printed models for:

Pre-surgical rehearsal of complex cases
Patient education and consent
Trainee education
Custom surgical guides and implants

Clinical Impact: Mixed. Some studies show reduced operative time and improved outcomes for complex cases; others show no benefit beyond surgeon confidence. Cost and workflow integration remain barriers to widespread adoption.

Intraoperative AI Applications

Computer Vision in Minimally Invasive Surgery

The laparoscope and robotic camera create continuous video streams, ideal data for computer vision AI. Applications range from documentation to real-time guidance, with varying degrees of validation and clinical readiness.

Surgical Phase Recognition:

What it does: AI analyzes surgical video and identifies the current phase (e.g., “dissection of gallbladder from liver bed” in laparoscopic cholecystectomy).

How it works: Deep learning models trained on annotated surgical videos learn to recognize instrument configurations, anatomical landmarks, and surgeon actions characteristic of each phase.

Performance:

Accuracy >90% for laparoscopic cholecystectomy (Twinanda et al., 2017)
Works across multiple procedures (bariatric, colorectal, gynecologic)
Real-time capability (15-30 frames/second)
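Per-frame predictions tend to flicker between phases, so a simple post-processing step is to smooth them over time before measuring accuracy. The sketch below applies a centered majority vote and computes frame-level accuracy. It illustrates the idea only and is not the method of the cited models. The phase names and labels are hypothetical.

```python
from collections import Counter


def smooth_phases(frame_labels, window=3):
    """Replace each frame's label with the most common label in a centered window."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo, hi = max(0, i - half), min(len(frame_labels), i + half + 1)
        counts = Counter(frame_labels[lo:hi])
        top = max(counts.values())
        winners = [label for label, c in counts.items() if c == top]
        # On ties, keep the original label if it is among the winners.
        smoothed.append(frame_labels[i] if frame_labels[i] in winners else winners[0])
    return smoothed


def frame_accuracy(predicted, truth):
    """Fraction of frames whose predicted phase matches the annotated phase."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical annotations and raw model output with two isolated misclassifications.
truth = ["prep"] * 5 + ["dissect"] * 6
raw = ["prep", "prep", "dissect", "prep", "prep",
       "dissect", "dissect", "dissect", "prep", "dissect", "dissect"]
smoothed = smooth_phases(raw, window=3)
print(frame_accuracy(raw, truth), frame_accuracy(smoothed, truth))
```

A centered window looks at future frames, so it suits retrospective video review. A real-time system would have to use only past frames, at the cost of some delay in recognizing a phase change.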

Potential applications:

Context-aware instrument tracking
Automated surgical documentation
OR efficiency analysis
Surgical skill assessment
Adverse event detection

Current status: Primarily a research tool. Clinical deployment is limited because phase recognition alone doesn’t provide actionable guidance. Surgeons already know which phase they’re in.

Future potential: Phase recognition is foundational for more advanced applications (predictive alerts, context-aware instrument suggestions).

Anatomical Structure Recognition:

The promise: Computer vision identifies critical anatomy (bile ducts, ureters, vessels) to prevent surgical injury.

The reality: This is extraordinarily difficult and not yet clinically reliable.

Why it’s hard:

Visual variability: Blood, smoke, retraction, lighting changes, cautery artifacts
Anatomical variants: Textbook anatomy is the exception, not the rule
Dynamic deformation: Tissue moves, stretches, changes appearance continuously
Occlusion: Critical structures often partially hidden
Context-dependence: What looks like ureter may be vessel or adhesion band

Current evidence:

Research systems demonstrate:

70-85% accuracy for identifying major structures in ideal conditions
Performance degrades significantly with bleeding, inflammation, obesity
False positives and false negatives both occur at unacceptable rates

Critical safety concern:

Surgeons cannot rely on AI to definitively identify critical structures. Visual confirmation, tactile feedback, anatomical knowledge, and methodical dissection remain essential. AI suggesting “safe to divide this structure” is not acceptable with current technology.

More promising near-term application:

Warning systems: AI detecting absence of expected structures (“ureter not identified in expected location, double-check before dividing anything”) may be safer than positive identification. Alert surgeons to uncertainty rather than provide false confidence.
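The warning-system idea can be sketched as a function that reports only what it has failed to find or is unsure about, and never declares a structure safe. Everything in this example is hypothetical: the detection format (structure name mapped to confidence), the 0.9 threshold, and the message wording.

```python
def uncertainty_alerts(expected_structures, detections, min_confidence=0.9):
    """Return warnings for expected structures that are missing or low-confidence.

    The function only raises concerns; it never asserts that a structure is safe to divide.
    """
    alerts = []
    for name in expected_structures:
        conf = detections.get(name)
        if conf is None:
            alerts.append(f"{name} not identified in expected location; verify before dividing")
        elif conf < min_confidence:
            alerts.append(f"{name} identified with low confidence ({conf:.2f}); verify before dividing")
    return alerts


# Hypothetical model output for one video frame.
expected = ["ureter", "iliac artery"]
missing_alerts = uncertainty_alerts(expected, {"iliac artery": 0.95})
low_conf_alerts = uncertainty_alerts(expected, {"ureter": 0.60, "iliac artery": 0.95})
print(missing_alerts)
print(low_conf_alerts)
```

An empty result means only that the system has nothing to flag. It is not a confirmation of anatomy, and the interface should avoid presenting it as one.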

Augmented Reality Surgical Navigation

AR systems overlay preoperative imaging onto the surgeon’s view of the operative field, enhancing visualization and precision.

Applications:

Spine Surgery:

Real-time visualization of screw trajectories
Pedicle screw placement guidance
Reduces fluoroscopy exposure
FDA-cleared systems widely used

Neurosurgery:

Tumor localization during resection
Trajectory planning for deep lesions
Registration of preoperative MRI to intraoperative anatomy

Liver Surgery:

Overlay of vascular anatomy on liver surface
Guides parenchymal transection planes
Helps identify tumor location in real-time

Evidence:

Spine surgery: Multiple studies show AR navigation improves screw placement accuracy (98%+ correct positioning vs. 90-95% with fluoroscopy alone) and reduces radiation exposure (Mason et al., 2014).

Neurosurgery: AR reduces targeting errors, but brain shift (tissue deformation after opening the dura) remains a significant challenge. Intraoperative imaging updates are required for accuracy.

Liver surgery: Registration accuracy (aligning preoperative imaging to surgical field) degrades with tissue deformation. Useful for initial approach planning but less reliable as resection progresses.

Critical Limitation: Registration Errors

AR requires precise alignment of imaging to patient anatomy. Registration errors (2-5mm typical) can be clinically significant, especially for small structures or narrow safety margins. Surgeons must verify AR guidance against direct visualization and anatomical knowledge.
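Registration error can be quantified by comparing where the overlay places identifiable landmarks with where those landmarks are actually found, then checking the worst-case error against the safety margin for the structure at risk. The sketch below assumes landmark coordinates in millimetres in a shared frame. The function names, coordinates, and margins are hypothetical.

```python
import math


def registration_errors_mm(overlay_points, measured_points):
    """Euclidean distance (mm) between each overlay landmark and its measured position."""
    return [math.dist(a, b) for a, b in zip(overlay_points, measured_points)]


def exceeds_margin(errors_mm, safety_margin_mm):
    """True if the worst landmark error is at least as large as the safety margin."""
    return max(errors_mm) >= safety_margin_mm


# Hypothetical landmark pairs: overlay position vs. position confirmed intraoperatively.
overlay = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
measured = [(3.0, 4.0, 0.0), (10.0, 0.0, 2.0)]
errors = registration_errors_mm(overlay, measured)
print(errors, exceeds_margin(errors, safety_margin_mm=4.0))
```

Using the maximum rather than the mean is a deliberately conservative choice: an overlay that is accurate on average can still be off by more than the margin at the one location that matters.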

AI in Robotic Surgery

Current State: No Autonomy

Despite the “robotic surgery” terminology, da Vinci and similar systems are teleoperated tools, not autonomous robots. The surgeon controls every movement. AI plays a minimal role in current clinical robotic systems.

Emerging AI Applications:

Surgical Skill Assessment:

AI analyzes instrument paths, economy of motion, smoothness
Provides objective feedback for training
Correlates with surgical experience and patient outcomes (Gumbs et al., 2021)
Used in residency training programs
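Motion metrics such as path length and economy of motion can be computed directly from recorded instrument tip positions. The sketch below uses one simple definition of economy of motion (straight-line displacement divided by distance travelled). Published skill-assessment systems may define it differently, and the trajectories here are hypothetical.

```python
import math


def path_length(points):
    """Total distance travelled along a sequence of 3D instrument tip positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def economy_of_motion(points):
    """Ratio of straight-line displacement to distance travelled (1.0 = perfectly direct)."""
    total = path_length(points)
    if total == 0:
        return 1.0  # the instrument did not move
    return math.dist(points[0], points[-1]) / total


# Hypothetical trajectories (mm) reaching the same target by different routes.
direct = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
wandering = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(path_length(direct), economy_of_motion(direct))
print(path_length(wandering), round(economy_of_motion(wandering), 3))
```

Metrics like these describe how the instrument moved, not whether the movement was the right one, so they complement rather than replace expert review.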

Tremor Filtering:

Robot compensates for physiologic tremor
Standard feature, not novel AI (rule-based filtering)
Improves precision for microsurgical tasks

Autonomous Task Execution (Research Only):

The STAR (Smart Tissue Autonomous Robot) performed supervised autonomous bowel anastomosis in pigs (Shademan et al., 2016). This proof-of-concept demonstrated technical feasibility but:

Not FDA-approved
Not tested in humans
Requires perfect conditions: no bleeding, adhesions, or unexpected anatomy
Slower than human surgeons
Monitoring surgeon must be ready to intervene instantly

Variability of human anatomy, tissue properties, and intraoperative findings far exceeds AI’s ability to safely respond without human judgment. Fully autonomous robotic surgery remains research, not reality.

More realistic future: Semi-autonomous assistance for repetitive sub-tasks (suturing, tissue dissection in clear planes) under continuous surgeon supervision.

Postoperative AI Applications

Complication Prediction

Surgical Site Infection (SSI) Prediction:

ML models predict SSI risk using:

Patient factors (diabetes, obesity, smoking, immunosuppression)
Operative characteristics (duration, complexity, contamination class)
Intraoperative variables (glucose control, normothermia, antibiotic timing)
Postoperative factors (drain output, pain scores)

Evidence: Modest improvements over clinical judgment alone (AUC 0.75-0.80 vs. 0.70-0.72).

Limitations:

High false positive rates (30-40%) limit actionability
Shouldn’t guide prophylactic antibiotic decisions (risk of resistance)
Best use: Enhanced surveillance for high-risk patients

Postoperative Delirium:

Prediction models incorporating preoperative cognitive assessment, anesthesia factors, and postoperative medications identify high-risk patients for:

Non-pharmacologic prevention (reorientation, sleep hygiene, family presence)
Avoidance of deliriogenic medications
Enhanced monitoring

Evidence: Better than clinical intuition, but delirium remains multifactorial and incompletely preventable.

Anastomotic Leak Prediction:

ML models analyzing postoperative labs (CRP trajectory), vital signs, and clinical notes can identify leak risk earlier than clinical suspicion alone.

Challenge: Rare outcomes (1-5% incidence) make model training difficult and false positive rates high.
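The link between rare outcomes and false positives follows from Bayes' rule: at low prevalence, even a fairly accurate model produces mostly false alarms. The sketch below computes positive predictive value from sensitivity, specificity, and prevalence. The 80% sensitivity and 90% specificity are illustrative assumptions, not figures from a published leak model.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a flagged patient truly has the outcome (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical model performance at a 3% leak incidence.
leak_ppv = positive_predictive_value(sensitivity=0.80, specificity=0.90, prevalence=0.03)
print(round(leak_ppv, 3))
```

Under these assumptions roughly one alert in five corresponds to a true leak, so about four of every five flagged patients would be worked up unnecessarily.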

Deterioration Monitoring

AI systems analyzing continuous vitals, lab trends, nursing documentation, and medication administration can detect patterns predicting clinical deterioration 6-12 hours before conventional early warning scores.

Applications:

Postoperative hemorrhage
Respiratory failure
Sepsis
Cardiac events

Evidence: Detection performance generally good, but high false positive rates create alert fatigue (similar to sepsis prediction challenges discussed in Emergency Medicine) (Wong et al., 2021; Beam & Kohane, 2018).

Best Implementation: Integrate AI alerts with rapid response team protocols and ensure alerts are actionable (not just “patient is high-risk”) (Topol, 2019).

Surgical Quality and Education

Video-Based Surgical Assessment

AI analysis of surgical videos enables objective skill assessment and quality improvement.

Applications:

Skill Scoring:

Objective assessment of technical performance
Identifies specific errors (tissue trauma, bleeding, inefficiency)
Provides quantitative feedback for training

Evidence: AI scores correlate strongly with expert human assessment and predict surgical outcomes (Gumbs et al., 2021).

Benefits for surgical education:

Objective feedback supplements subjective faculty evaluation
Tracks skill progression over time
Identifies specific areas needing improvement
Benchmarks against peer performance

Quality Improvement:

Retrospective review of complications to identify technical factors
Process improvement for OR efficiency
Standardization of surgical techniques

Challenges:

Privacy and medicolegal concerns about routine recording
Surgeon resistance to surveillance
Doesn’t capture decision-making quality (only technical execution)
Storage and analysis infrastructure requirements

Natural Language Processing for Operative Notes

AI extraction of structured data from operative notes enables:

Quality Metrics:

Automated calculation of process measures (antibiotic timing, VTE prophylaxis)
Complication detection from dictated notes
Adherence to surgical best practices

Registry Auto-Population:

Reduces manual data entry burden for NSQIP, VASQIP, other registries
Improves data completeness and accuracy

Clinical Decision Support:

Extraction of critical operative details for downstream care (mesh type in hernia repair, prosthesis in joint replacement)

Evidence: High accuracy (>95%) for structured data elements. Challenges remain for nuanced surgical findings and judgment-based assessments.
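For well-structured elements, extraction can be as simple as pattern matching, which is part of why accuracy is high for those fields. The sketch below pulls mesh type and antibiotic timing from a dictated sentence using regular expressions. The vocabulary lists, field names, and sample note are hypothetical, and production systems generally rely on trained NLP models rather than hand-written patterns.

```python
import re

# Illustrative vocabularies only; a real system would use a maintained terminology.
MESH_PATTERN = re.compile(r"\b(polypropylene|polyester|biologic)\s+mesh\b", re.IGNORECASE)
ABX_PATTERN = re.compile(
    r"\b(cefazolin|cefoxitin|vancomycin)\b.*?(\d+)\s*min(?:utes)?\s+(?:before|prior to)\s+incision",
    re.IGNORECASE,
)


def extract_fields(note):
    """Extract mesh type and antibiotic timing from free text; None when not found."""
    mesh = MESH_PATTERN.search(note)
    abx = ABX_PATTERN.search(note)
    return {
        "mesh_type": mesh.group(1).lower() if mesh else None,
        "antibiotic": abx.group(1).lower() if abx else None,
        "antibiotic_min_before_incision": int(abx.group(2)) if abx else None,
    }


note = ("Cefazolin 2 g IV was given 25 minutes before incision. "
        "The defect was repaired with polypropylene mesh.")
fields = extract_fields(note)
print(fields)
```

Patterns like these do not handle negation ("no mesh was used") or narrative judgment, which mirrors the limitation noted above for nuanced findings.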

Specialty-Specific Applications

Different surgical specialties face unique challenges and opportunities for AI integration:

General Surgery

Hernia recurrence risk prediction
Cholecystectomy difficulty scoring
Bile duct injury prevention (research phase)

Orthopedic Surgery

Fracture detection AI (high accuracy for simple fractures)
Joint replacement planning and component sizing
Spinal navigation systems (FDA-cleared)
Ligament injury diagnosis from MRI

Neurosurgery

Brain tumor segmentation for resection planning
Epilepsy focus localization
Surgical navigation systems
Intraoperative tumor margin assessment (research)

Cardiac Surgery

Surgical risk models (STS score enhanced with ML)
Intraoperative echocardiography interpretation
ICU outcome prediction

Thoracic Surgery

Lung nodule characterization from CT
Surgical approach selection (VATS vs. thoracotomy)
Lymph node metastasis prediction

Vascular Surgery

AAA rupture risk prediction
Vascular anatomy segmentation
Endovascular procedure planning

Plastic Surgery

Breast reconstruction outcome prediction
Aesthetic outcome simulation
Flap viability monitoring (research)

Critical Limitations and Risks

Why Surgical AI Must Be Approached With Particular Caution

Immediacy of Harm: Unlike diagnostic errors that can be caught through physician review, intraoperative AI errors cause immediate, potentially irreversible patient harm.

Complexity of Surgical Judgment: Surgery requires integration of visual, tactile, and proprioceptive information with anatomical knowledge, pattern recognition from thousands of prior cases, and real-time adaptation to unexpected findings. AI doesn’t replicate this.

Medicolegal Implications: If a surgeon follows AI guidance and causes injury, liability is clear: the surgeon is responsible. If a surgeon ignores an AI warning and causes injury, plaintiff’s attorneys will argue that the warning was ignored. This creates defensive pressure to over-rely on AI even when clinical judgment suggests otherwise.

Technology Failure Modes: Computer vision fails with blood, smoke, optical artifacts. ML models fail with out-of-distribution inputs (unusual anatomy, rare findings). Risk models fail when patient circumstances differ from training data.

Trust Calibration: Surgeons must neither over-trust (following AI suggestions without verification) nor under-trust (ignoring useful AI alerts). Achieving appropriate calibration is difficult (Char et al., 2020).

Regulatory and Medicolegal Considerations

FDA Regulation of Surgical AI

Surgical planning software: Class II (510(k) clearance)
Surgical navigation systems: Class II (moderate-risk devices)
Autonomous surgical robots: Would be Class III (PMA required)
Risk calculators: Often considered clinical decision support (no FDA oversight)

Medicolegal Principles

Surgeons remain legally responsible for AI-assisted decisions. Key documentation practices:

Informed consent should mention AI use when material to patient decision
Documentation should note AI tools used and how output was interpreted
Malpractice risk if AI recommendation followed without independent verification

The Liability Dilemma

Following AI that’s wrong: Surgeon liable for not exercising independent judgment
Ignoring AI that’s right: Plaintiff attorneys argue surgeon ignored available technology
Best practice: Document independent verification of AI outputs, explain clinical reasoning when overriding AI recommendations

Evidence-Based Guidelines for Surgical AI Adoption

Recommendations for Surgeons and Surgical Departments

Before Adopting Any Surgical AI:

Demand evidence: Prospective validation studies in diverse populations, not just retrospective accuracy metrics (Nagendran et al., 2020)
Understand training data: Was the model trained on cases like yours? (Procedure types, patient populations, institutional practices) (Beam & Kohane, 2018)
Know the failure modes: How does the system fail? What are the error rates? What happens with unusual cases? (Vabalas et al., 2019)
Assess workflow integration: Does this fit your existing workflow or require disruptive changes?
Clarify liability: What does your malpractice carrier say about using this AI? What does hospital legal counsel advise?
Verify regulatory status: Is this FDA-cleared? For what specific indication?
Evaluate cost-effectiveness: Does the benefit justify the cost (both financial and cognitive/workflow burden)?

Safe Implementation Practices:

Pilot testing: Start with low-stakes applications, expand carefully based on performance
Parallel validation: Run AI alongside current practice, compare results before replacing current approach
Defined oversight: Clear protocols for who reviews AI outputs and how discrepancies are resolved
Incident reporting: Systems to capture AI errors or near-misses
Ongoing validation: Monitor real-world performance, don’t assume initial validation persists indefinitely
User training: Ensure all users understand AI capabilities, limitations, and appropriate use
Informed consent: Discuss AI use with patients when material to their decision-making
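Parallel validation produces paired judgments on the same patients: what the AI flagged and what current practice flagged. One way to summarize their agreement beyond chance is Cohen's kappa. The choice of kappa, the function name, and the ten-patient example are assumptions made for illustration.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two binary raters (1 = flagged high risk, 0 = not flagged)."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    p_a, p_b = sum(ratings_a) / n, sum(ratings_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return 1.0  # both raters gave the same single label to every case
    return (observed - expected) / (1 - expected)


# Hypothetical pilot: AI flag vs. clinician flag on the same ten patients.
ai_flags = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
clinician_flags = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohens_kappa(ai_flags, clinician_flags)
print(round(kappa, 3))
```

Agreement says nothing about which side is right. The discordant cases are the ones worth reviewing against actual outcomes before the AI replaces or changes the current approach.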

Red Flags (Avoid These AI Systems):

Claims of autonomous surgical decision-making
Black-box models with no explanation of predictions
Lack of prospective validation studies
Vendors unwilling to disclose training data characteristics
No mechanism for reporting errors or failures
Regulatory status unclear or misrepresented
Pressure to adopt without adequate evaluation period

Professional Society Guidelines on AI in Surgery

ACS Leadership on AI in Surgery (2024-2025)

The American College of Surgeons has established significant AI infrastructure:

Leadership:

Dr. Genevieve Melton-Meaux appointed as inaugural Chief Health Informatics Officer (2024)
Practicing colorectal surgeon and director of the Center for Learning Health System Sciences at University of Minnesota

Educational Programs:

“Artificial Intelligence and Machine Learning: Transforming Surgical Practice and Education” - online course available since 2023
Clinical Congress sessions addressing ethical and regulatory AI considerations

Strategic Direction: The ACS emphasizes that surgeons must take the lead in integrating AI, defining how it affects their practice, and influencing what good patient care means. If surgeons don’t step up, what defines successful surgery will be decided by others.

AI Applications Recognized by ACS

The ACS recognizes three primary AI categories transforming surgical practice:

Ambient AI: Automated documentation of surgical encounters and procedures
Prediction tools: Perioperative risk assessment and outcome prediction
Research and writing solutions: Literature review, manuscript preparation assistance

NSQIP and Risk Prediction

The ACS National Surgical Quality Improvement Program (NSQIP) Surgical Risk Calculator represents one of the most validated AI-adjacent tools in surgery:

Developed from outcomes data on millions of surgical patients
Provides patient-specific risk predictions for major complications
Continuously updated with new outcome data
Endorsed by ACS as a shared decision-making tool

SAGES Guidelines

The Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) has engaged with AI particularly in:

Computer vision for surgical field analysis
Real-time anatomical structure identification during laparoscopic procedures
Surgical video analysis for quality improvement and training

Implementation Note: SAGES emphasizes that AI in the OR must be validated for the specific surgical context and population before clinical deployment.

Future Directions

Realistic Near-Term Progress (2-5 years)

Routine integration of ML risk calculators into preoperative clinics
Expanded use of AI surgical planning for complex cases
Video-based quality feedback becoming standard in training
Better postoperative monitoring with AI-augmented early warning systems

Medium-Term Possibilities (5-10 years)

Improved real-time anatomical recognition (still with human verification required)
Context-aware intraoperative decision support (suggestions, not autonomous action)
Personalized surgical technique optimization based on patient anatomy
Semi-autonomous robotic assistance for specific sub-tasks under continuous human supervision

Long-Term Speculation (10+ years)

Highly accurate real-time tissue characterization (pathology-level information intraoperatively)
Predictive models anticipating surgical course and complications with high accuracy
Integration of multi-omic patient data into surgical decision-making
Robotic systems handling increasing proportions of routine surgical tasks (still under surgeon control)

Unlikely Despite Hype

Fully autonomous robotic surgery without surgeon in the loop
AI replacing surgical judgment for complex, high-stakes decisions
Elimination of surgical complications through AI

Conclusion

Surgery is fundamentally a human activity requiring manual skill, real-time judgment, and adaptation to unique patient circumstances. AI can enhance the cognitive work surrounding surgery (risk assessment, planning, quality improvement) and may eventually provide useful intraoperative information. But the surgeon’s hands, eyes, judgment, and responsibility remain central.

The most successful surgical AI applications will be those that respect the complexity of surgery, acknowledge uncertainty transparently, augment rather than replace expertise, and prioritize patient safety over technological impressiveness.

Surgeons should embrace AI as a powerful adjunct while maintaining the healthy skepticism, independent verification, and personal accountability that define good surgical practice.

Check Your Understanding

Scenario 1: AI Risk Calculator Overestimates Surgical Risk

You’re a colorectal surgeon evaluating an 82-year-old woman with Stage III colon cancer. She’s otherwise healthy: active, independent ADLs, no major comorbidities, ECOG 0.

AI surgical risk calculator (MySurgeryRisk) estimates:

30-day mortality risk: 18%
Major complication risk: 45%
Recommendation: “High risk - consider non-operative management”

Traditional ACS NSQIP calculator estimates:

30-day mortality: 3.2%
Major complication: 12%

Patient’s oncologist refers to you stating “AI says surgery too risky. Recommend palliative chemo only.”

Your clinical assessment: Patient is good surgical candidate. Age alone shouldn’t preclude curative surgery. Frailty assessment normal. Cardiopulmonary exam reassuring.

Decision point: Do you:

A. Follow AI recommendation, refer to medical oncology for palliative chemotherapy
B. Override AI, recommend surgery based on your clinical judgment

Answer 1: What explains the discrepancy between AI and traditional calculators?

AI model likely over-weighted age without considering:

Functional status: Patient is ECOG 0, independent, not frail
Comorbidity burden: Minimal comorbidities despite age 82
Fitness indicators: Normal cardiopulmonary reserve

Potential AI training bias:

If AI trained on data where older patients had higher complication rates, model may learn “age 80+ = high risk” without distinguishing fit vs. frail
Age as a proxy for frailty: Age correlates with frailty in training data, but this patient defies that correlation

Traditional NSQIP calculator:

Uses validated risk factors (ASA class, functional status, comorbidities)
May better account for physiologic age vs. chronologic age

Answer 2: What are the liability implications of each choice?

Choice A (Follow AI, decline surgery):

Plaintiff argument (if patient dies from untreated cancer):

Surgeon inappropriately deferred to AI algorithm
Failed to exercise independent clinical judgment
Denied patient potentially curative treatment based on flawed AI estimate
Standard of care requires surgeons to assess patient individually, not defer to algorithm

Legal precedent: Multiple cases in which physicians were found liable for following decision support tools that contradicted their clinical judgment

Choice B (Override AI, proceed with surgery):

If patient has major complication or dies:

Plaintiff argument:

Surgeon ignored AI warning of 18% mortality risk
Proceeded with high-risk surgery against AI recommendation
Reckless disregard for patient safety

Defense argument:

AI is decision support tool, not substitute for clinical judgment
Surgeon’s assessment (frailty, functional status, cardiopulmonary reserve) more accurate than AI age-based estimate
Standard of care requires individualized assessment, not algorithmic adherence
Traditional NSQIP calculator (validated, widely used) supported decision
Patient underwent informed consent understanding risks

Likely outcome: Defense verdict if surgeon documented thorough clinical assessment, explained AI discrepancy, obtained informed consent discussing both AI and traditional estimates.

Answer 3: How should you handle this AI-clinical judgment conflict?

Appropriate approach:

Investigate AI discrepancy
- Review AI inputs: What features drove high-risk estimate?
- Compare with traditional validated tools (NSQIP, RCRI)
- Consult surgical colleagues: Would they operate on this patient?
Comprehensive clinical assessment
- Gait speed, grip strength (frailty markers)
- Cardiopulmonary exercise testing if available
- Geriatric assessment
- Functional status (independent vs. dependent ADLs)
Multidisciplinary discussion
- Present case at tumor board
- Geriatric surgery consult if available
- Anesthesia risk assessment
Transparent informed consent
- Discuss both AI and traditional risk estimates with patient
- Explain why estimates differ (age vs. physiologic status)
- Present alternatives (surgery, chemotherapy alone, observation)
- Document: “AI calculator estimated 18% mortality; however, clinical assessment suggests patient is physiologically fit. Traditional NSQIP calculator estimates 3.2% mortality. Discussed both estimates with patient. Patient understands risks, chooses surgery.”
Document clinical reasoning
- “AI risk calculator estimates high risk primarily based on age 82. However, patient demonstrates excellent functional status (ECOG 0, independent ADLs, normal gait speed), minimal comorbidities, normal cardiopulmonary reserve. Traditional NSQIP calculator estimates mortality 3.2%. Clinical judgment: patient is appropriate surgical candidate. AI estimate likely over-weighted chronologic age without adequate consideration of physiologic fitness.”
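
The first step above (investigating the discrepancy) can be sketched as a simple check that flags cases where the AI estimate and the traditional calculator diverge enough to warrant structured review. This is a minimal illustration, not a validated rule: the function name and the 2-fold `ratio_limit` are hypothetical choices, and the two risk values are the ones from this scenario.

```python
def estimates_disagree(ai_risk: float, reference_risk: float,
                       ratio_limit: float = 2.0) -> bool:
    """Return True when two risk estimates differ by more than ratio_limit-fold.

    ratio_limit is a hypothetical cutoff chosen for illustration only.
    """
    low, high = sorted((ai_risk, reference_risk))
    if low == 0:
        return high > 0
    return high / low > ratio_limit


# Values from the scenario: AI estimate 18%, NSQIP estimate 3.2%.
needs_review = estimates_disagree(ai_risk=0.18, reference_risk=0.032)
print("Flag for structured review:", needs_review)
```

A flag of this kind only triggers the review steps listed above; it says nothing about which estimate is closer to the truth.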

Lesson: AI risk calculators are tools to inform, not dictate, surgical decisions. When AI conflicts with clinical judgment and validated traditional tools, surgeon must exercise independent assessment. Age alone should not preclude surgery in fit older adults. Document thorough reasoning when overriding AI recommendations.

Scenario 2: Intraoperative AI Misidentifies Critical Anatomy

You’re performing robotic-assisted partial nephrectomy for small renal mass using da Vinci Xi with integrated AI “Surgical Intelligence” system.

AI system features:

Real-time anatomical labeling (kidney, renal artery, renal vein, ureter, tumor)
Proximity alerts when instruments near critical structures
Augmented reality overlay on surgical view

Intraoperative event:

During hilar dissection, AI labels renal artery and renal vein on display. You prepare to clamp renal artery for tumor excision.

Your visual assessment: Structure labeled “renal artery” appears larger than expected, bluish tint, pulsations not prominent.

Uncertainty: Is this truly renal artery or is AI mislabeling renal vein as artery?

Decision point: Do you:

A. Trust AI label, clamp structure labeled “renal artery”
B. Pause, verify anatomy manually before clamping

You choose: Option B (pause and verify)

Manual verification: Doppler ultrasound confirms that the structure labeled “renal artery” is actually the renal vein. The true renal artery lies 2 mm posterior, unlabeled by the AI.

If you had clamped based on AI label: Would have clamped renal vein, not artery → inadequate ischemic control → bleeding during tumor excision, potential need for total nephrectomy.

Answer 1: Why did the AI mislabel critical anatomy?

AI computer vision failure modes:

Anatomical variation: This patient had variant renal vascular anatomy (early branching, aberrant vessel course)
- AI trained on typical anatomy
- Variants (present in 20-30% of patients) not well-represented in training data
Tissue appearance similarity: Renal artery and vein can appear similar on video (both red/pink, both tubular)
- AI relies on position, caliber, pulsatility
- In variant anatomy, typical positional relationships disrupted
Partial occlusion: Surgical manipulation may have partially occluded the artery → reduced pulsations → AI failed to recognize the true artery and left it unlabeled
Confidence threshold: AI may have been 60-70% confident (below human comfort level) but still displayed label without uncertainty indication

Answer 2: What are the liability implications if you had clamped the wrong vessel?

If you clamped renal vein instead of artery:

Immediate consequences:

Inadequate tumor ischemia → bleeding during excision
Potential renal vein thrombosis
May require total nephrectomy instead of partial
Patient loses kidney function unnecessarily

Malpractice analysis:

Plaintiff argument:

Surgeon blindly followed AI labeling without manual verification
Failed to exercise fundamental surgical principle: verify anatomy before clamping/cutting
Fell below standard of care by deferring anatomical judgment to AI
Patient lost kidney due to surgeon’s inappropriate reliance on technology

Defense argument:

AI was marketed as “surgical intelligence” system
Reasonable to rely on technology validated by manufacturer, FDA-cleared
Anatomical variation not surgeon’s fault
Damage was not from negligence, but from AI error

Likely outcome:

Plaintiff verdict likely: Courts hold surgeons to personal anatomical verification standard
“AI told me to” is NOT valid defense
Fundamental principle: surgeon must personally verify anatomy before irreversible action
FDA clearance of AI tool does not absolve surgeon of personal responsibility

Illustrative case: In Smith v. Hospital (hypothetical but representative), a surgeon relied on a navigation system for spine surgery and placed a pedicle screw in the wrong location, causing nerve injury. The court ruled the surgeon liable despite the navigation system error: “technology augments but does not replace surgeon’s duty to verify.”

Answer 3: What are the appropriate use principles for intraoperative AI?

Surgical AI as “junior resident”:

AI suggestions are hypotheses, not facts
- AI labels = “This might be renal artery”
- Surgeon verifies = “I confirm this is renal artery”
Verify before irreversible action
- Before clamping, cutting, coagulating: manual confirmation
- Use additional tools: Doppler, manual palpation, ICG angiography, direct visualization
Heightened skepticism in variant anatomy
- If anatomical landmarks don’t match expected positions
- If AI labels conflict with visual assessment
- If patient has known anatomical variants (duplicated vessels, horseshoe kidney)
Demand uncertainty quantification
- AI should display confidence levels
- “Renal artery (92% confident)” vs. “Renal artery (60% confident)”
- Low confidence → require additional verification
Continuous cross-checking
- Compare AI labels with your visual assessment at each step
- If discrepancy, investigate before proceeding
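
The uncertainty-quantification principle above can be sketched as an overlay label that always shows the model's confidence and marks low-confidence labels explicitly. This is an illustrative sketch only: `StructureLabel`, `render_label`, and the 0.90 cutoff are hypothetical and do not describe any vendor's system.

```python
from dataclasses import dataclass

# Hypothetical cutoff: labels below this confidence get an explicit warning.
VERIFY_BELOW = 0.90


@dataclass
class StructureLabel:
    name: str
    confidence: float  # model probability in [0, 1]


def render_label(label: StructureLabel) -> str:
    """Format an overlay label that always displays the confidence level."""
    text = f"{label.name} ({label.confidence:.0%} confident)"
    if label.confidence < VERIFY_BELOW:
        text += " - LOW CONFIDENCE: additional verification required"
    return text


print(render_label(StructureLabel("Renal artery", 0.92)))
print(render_label(StructureLabel("Renal artery", 0.60)))
```

Even a high-confidence label remains a hypothesis under the principles above; the warning only escalates the amount of verification, it never removes the need for it.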

Institutional safeguards:

Training requirements
- Surgeons using AI-augmented systems must complete training on:
  - AI failure modes
  - When to trust vs. verify AI
  - Manual verification techniques
Quality assurance
- Review cases where AI labeling was incorrect
- Share at M&M conferences
- Track AI error rates by anatomy type, procedure
Documentation
- When AI labeling conflicts with surgeon assessment, document:
  - “AI labeled [structure] as [label]; however, manual verification with [Doppler/ICG/palpation] confirmed [correct identity]”
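
The quality assurance step above (tracking AI error rates by anatomy type and procedure) could be implemented along the following lines. The record fields and the sample log entries are hypothetical; a real audit would draw on the institution's own case documentation.

```python
from collections import defaultdict


def error_rates(cases):
    """Return {(procedure, anatomy): share of AI labels contradicted by verification}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for case in cases:
        key = (case["procedure"], case["anatomy"])
        totals[key] += 1
        if case["ai_label"] != case["verified_identity"]:
            errors[key] += 1
    return {key: errors[key] / totals[key] for key in totals}


# Hypothetical audit log entries, for illustration only.
cases = [
    {"procedure": "partial nephrectomy", "anatomy": "renal hilum",
     "ai_label": "renal artery", "verified_identity": "renal vein"},
    {"procedure": "partial nephrectomy", "anatomy": "renal hilum",
     "ai_label": "renal artery", "verified_identity": "renal artery"},
    {"procedure": "partial nephrectomy", "anatomy": "ureter",
     "ai_label": "ureter", "verified_identity": "ureter"},
]

for (procedure, anatomy), rate in sorted(error_rates(cases).items()):
    print(f"{procedure} / {anatomy}: {rate:.0%} of labels incorrect")
```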

Lesson: Intraoperative AI is assistive, not authoritative. Surgeons remain responsible for anatomical identification regardless of AI labels. Verify critical anatomy manually before irreversible actions. “Trust but verify” is insufficient. Standard should be “Verify independently, AI assists.”

Scenario 3: Postoperative AI Alert Fatigue

You’re surgical quality director implementing AI-based early warning system (Rothman Index, commercial product) for postoperative complication detection.

AI system: Analyzes vital signs, lab values, nursing assessments every 15 minutes. Generates alert when patient predicted to be at increased risk for:

Sepsis
Respiratory failure
Acute kidney injury
Need for ICU transfer

Month 1 performance:

Alerts generated: 847 alerts across 320 postoperative patients (2.6 alerts per patient)
True positives: 23 patients developed complications flagged by AI
False positives: 824 alerts did not correspond to actual complications
False positive rate (share of alerts that were false, 824/847): 97.3%
Positive predictive value: 2.7%
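
The Month 1 figures follow directly from the alert counts. The short check below reproduces them from the numbers reported above; the variable names are illustrative.

```python
# Month 1 counts as reported in the scenario.
total_alerts = 847
true_positive_alerts = 23
false_positive_alerts = 824
patients = 320

# The two alert categories should account for every alert.
assert true_positive_alerts + false_positive_alerts == total_alerts

ppv = true_positive_alerts / total_alerts
false_alert_share = false_positive_alerts / total_alerts
alerts_per_patient = total_alerts / patients

print(f"PPV: {ppv:.1%}")
print(f"Share of alerts that were false: {false_alert_share:.1%}")
print(f"Alerts per patient: {alerts_per_patient:.1f}")
```

Note that both percentages are computed per alert, so they describe what a nurse experiences at the bedside rather than a per-patient error rate.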

Clinical impact:

Nursing staff overwhelmed by alerts
Most alerts dismissed as “AI crying wolf”
Alert fatigue setting in (nurses ignoring alerts)

Week 4 critical event:

62-year-old man, post-colectomy day 2
AI generates alert at 2 AM: “High risk for sepsis - recommend immediate evaluation”
Night nurse dismisses alert (patient appears stable, vital signs acceptable)
No physician notification
6 AM: Patient found hypotensive (BP 82/45), tachycardic (HR 128), altered mental status
Diagnosis: Anastomotic leak with peritonitis and sepsis
Patient requires emergent return to OR, ICU care
Prolonged hospital stay, family files complaint: “Why wasn’t the AI alert acted on?”

Answer 1: What caused the alert fatigue?

High false positive rate driven by:

Low disease prevalence
- True complication rate: ~7% of post-op patients
- AI optimized for high sensitivity (catches 23/25 true complications = 92% sensitivity)
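- But: PPV here is counted per alert, not per patient. At 7% prevalence, 92% sensitivity and 85% specificity would give a patient-level PPV of roughly 32%; because the system re-scores every patient every 15 minutes, false alerts accumulate across evaluations, and the observed per-alert PPV is only 2.7% (23/847)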
Threshold calibration
- AI vendor set low threshold to maximize sensitivity (fear of missing complications)
- Resulted in extreme false positive burden
Lack of clinical context
- AI analyzes physiologic data only
- Does not know: patient just returned from 2-hour physical therapy session (explains elevated HR), patient received fluid bolus (explains improved BP trends), patient had expected postoperative fever
Poor alarm design
- All alerts same priority level
- No distinction between “mild concern” vs. “urgent evaluation needed”
- No incorporation of clinical trajectories (improving vs. worsening trends)
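
How strongly PPV depends on prevalence can be sketched with Bayes' rule. The patient-level inputs below (7% prevalence, 92% sensitivity, 85% specificity) are the figures quoted above; the 0.5% per-evaluation prevalence is a hypothetical value, chosen only to illustrate what happens when a system scores patients repeatedly and most individual evaluations do not precede a complication.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Patient-level figures quoted in the text.
patient_level = ppv(prevalence=0.07, sensitivity=0.92, specificity=0.85)

# Hypothetical per-evaluation prevalence (assumption, not from the scenario).
per_evaluation = ppv(prevalence=0.005, sensitivity=0.92, specificity=0.85)

print(f"Patient-level PPV: {patient_level:.1%}")
print(f"Per-evaluation PPV: {per_evaluation:.1%}")
```

With the same sensitivity and specificity, the PPV falls from about one in three at the patient level to about one in thirty per evaluation, which is the order of magnitude of the 2.7% observed per alert in Month 1.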

Alert fatigue:

824 false positives → nurses learn “AI alerts usually wrong”
Cognitive bias: When 97.3% of alerts are false, dismissing alerts becomes learned behavior
The 23 true positives get lost in noise

Answer 2: Who is liable for the missed anastomotic leak?

Potentially both hospital and individual nurse:

Hospital institutional liability:

Plaintiff argument:

Hospital deployed AI system with 97.3% false positive rate
Created alert fatigue environment where critical alerts ignored
Failed to calibrate system before clinical deployment
Should have monitored alert fatigue, intervened when nurses began dismissing alerts

Nursing liability:

Plaintiff argument:

Nurse dismissed AI alert without evaluating patient
Failed to notify physician of high-risk alert
Did not document why alert was dismissed
Fell below nursing standard of care

Defense argument (nursing):

97.3% false positive rate meant 97 of every 100 alerts were false
Nurse made reasonable judgment based on clinical assessment (patient appeared stable)
Hospital created untenable alert burden
Individual nurse cannot be expected to thoroughly evaluate an average of 2.6 alerts per patient when nearly all of them are false

Likely outcome:

Shared liability: Hospital bears primary responsibility for deploying poorly calibrated system
Individual nurse may bear some liability for not documenting the assessment and not notifying the physician

Answer 3: How should AI early warning systems be implemented safely?

System calibration:

Acceptable false positive rate
- Target PPV ≥10-15% (not 2.7%)
- May require reducing sensitivity from 92% → 70-75%
- Trade-off: Fewer complications are caught, but each alert that fires is more likely to be real
Tiered alert system
- Low priority (informational): “Monitor patient closely”
- Medium priority (nursing assessment): “Evaluate patient within 1 hour”
- High priority (physician notification): “Urgent evaluation needed, notify MD immediately”
- Reserve high-priority alerts for PPV >30%
Clinical context integration
- Suppress alerts during expected post-op recovery (first 24 hours)
- Incorporate clinical context (patient just ambulated, received fluid bolus, normal post-op fever)
- Trend analysis (worsening vs. stable vs. improving)
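
The tiered alert scheme above can be sketched as a mapping from an alert's estimated PPV to a priority level. The 30% cutoff for high priority comes from the text; using 10% as the floor for medium priority is an assumption borrowed from the target PPV range, and `alert_tier` is a hypothetical name. The sketch also assumes an estimated PPV per score band is available from local calibration data.

```python
# Required action for each tier, taken from the tier descriptions above.
TIER_ACTIONS = {
    "high": "Urgent evaluation needed, notify MD immediately",
    "medium": "Evaluate patient within 1 hour",
    "low": "Monitor patient closely",
}


def alert_tier(estimated_ppv: float) -> str:
    """Assign a priority tier from the estimated PPV of an alert."""
    if estimated_ppv > 0.30:   # cutoff stated in the text
        return "high"
    if estimated_ppv >= 0.10:  # assumed floor for medium priority
        return "medium"
    return "low"


for estimate in (0.45, 0.12, 0.03):
    tier = alert_tier(estimate)
    print(f"PPV {estimate:.0%}: {tier} priority -> {TIER_ACTIONS[tier]}")
```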

Workflow integration:

Alert response protocol
- High-priority alert → Mandatory nursing assessment within 15 minutes + physician notification
- Document: “AI alert reviewed. Patient assessed. Findings: [stable vs. concerning]. Action: [continued monitoring vs. physician notified].”
Feedback loop
- Track AI alert accuracy
- Monthly review: How many alerts were true positives?
- Adjust thresholds based on performance
Human oversight
- Nurse or physician reviews AI alerts, decides which require action
- AI does not page physician directly (human gatekeeper)
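
The feedback loop above (review alert accuracy, then adjust thresholds) amounts to sweeping candidate thresholds over locally reviewed data and reading off sensitivity and PPV at each. The sketch below uses made-up scores and outcomes purely for illustration; the function names are hypothetical, and a real recalibration would use the institution's own adjudicated alerts.

```python
def sweep_thresholds(scores, outcomes, thresholds):
    """Return (threshold, sensitivity, ppv) for alerts defined as score >= threshold."""
    total_pos = sum(outcomes)
    results = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= t and y == 0)
        sensitivity = tp / total_pos if total_pos else 0.0
        ppv = tp / (tp + fp) if (tp + fp) else 0.0
        results.append((t, sensitivity, ppv))
    return results


def lowest_threshold_meeting_ppv(results, target_ppv):
    """Pick the first threshold (assumes ascending order) whose PPV meets the target."""
    for t, sensitivity, ppv in results:
        if ppv >= target_ppv:
            return t, sensitivity, ppv
    return None


# Toy data (hypothetical): risk scores and whether a complication occurred.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
outcomes = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]

results = sweep_thresholds(scores, outcomes, thresholds=[0.1, 0.5, 0.7, 0.9])
for t, sensitivity, ppv in results:
    print(f"threshold {t}: sensitivity {sensitivity:.0%}, PPV {ppv:.0%}")
print(lowest_threshold_meeting_ppv(results, target_ppv=0.6))
```

Choosing the lowest threshold that meets the PPV target keeps as much sensitivity as possible, which mirrors the trade-off described under system calibration.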

Quality monitoring:

Track alert fatigue
- Monitor alert dismissal rates
- If >80% of alerts dismissed without assessment → system is failing
- Survey staff on alert burden monthly
Audit missed complications
- For every complication, determine: Did AI alert? Was alert acted on?
- If multiple complications missed due to dismissed alerts → pause system, recalibrate
Continuous improvement
- Vendor partnership: Provide feedback on false positives
- Request threshold adjustment or better risk stratification
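
Alert fatigue tracking can be sketched as a periodic check of the dismissal rate against the 80% limit given above. The log format and the field name `dismissed_without_assessment` are hypothetical; an actual implementation would depend on what the alerting system and EHR record.

```python
# From the text: more than 80% of alerts dismissed without assessment
# indicates the system is failing.
DISMISSAL_LIMIT = 0.80


def dismissal_rate(alert_log) -> float:
    """Share of alerts dismissed without a documented patient assessment."""
    if not alert_log:
        return 0.0
    dismissed = sum(1 for alert in alert_log if alert["dismissed_without_assessment"])
    return dismissed / len(alert_log)


def system_is_failing(alert_log) -> bool:
    """True when the dismissal rate exceeds the limit."""
    return dismissal_rate(alert_log) > DISMISSAL_LIMIT


# Hypothetical review period: 9 of 10 alerts dismissed without assessment.
log = [{"dismissed_without_assessment": True}] * 9 + [{"dismissed_without_assessment": False}]
print(f"Dismissal rate: {dismissal_rate(log):.0%}, failing: {system_is_failing(log)}")
```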

Lesson: AI early warning systems can improve outcomes only if positive predictive value is high enough to avoid alert fatigue. A system with 97% false positive rate creates more harm (ignored alerts, missed complications) than benefit. Implementation requires careful calibration, tiered alerts, clinical context, and continuous monitoring. “High sensitivity” is not enough. PPV must be clinically actionable (≥10-15% minimum).