10  Obstetrics and Gynecology

Tip: Learning Objectives

Obstetrics and gynecology encompasses reproductive health across the lifespan, from adolescence through menopause, including pregnancy and childbirth. AI applications span prenatal screening, fetal monitoring, cervical cancer screening, and surgical planning. This chapter examines evidence-based AI tools in women’s health. You will learn to:

  • Evaluate AI systems for fetal monitoring and pregnancy risk prediction
  • Understand AI applications in prenatal screening and ultrasound interpretation
  • Assess AI tools for cervical and breast cancer screening
  • Navigate ethical challenges specific to maternal-fetal medicine
  • Identify failure modes in obstetric AI (high-stakes, two-patient scenarios)
  • Recognize equity concerns in women’s health AI
  • Apply evidence-based frameworks for OBGYN AI adoption

Essential for obstetricians, gynecologists, maternal-fetal medicine specialists, midwives, and women’s health providers.

The Clinical Context: OBGYN AI faces unique challenges: two patients (mother and fetus) with potentially competing interests; significant health disparities by race and ethnicity, with maternal mortality 2-3x higher for Black women (petersen2019racial?); a medico-legal environment in which obstetric malpractice rates are among the highest of any specialty; and societal attitudes toward reproductive autonomy. AI applications must navigate these complexities while addressing real clinical needs.

Key AI Applications in Obstetrics:

10.0.1 1. Fetal Monitoring and Interpretation

⚠️ Electronic Fetal Monitoring (EFM) Interpretation:

Clinical problem: Cardiotocography (CTG) monitoring is near-universal in US labor management, but interpretation is highly subjective. Inter-observer agreement is low, with κ=0.30-0.50 (blackwell2011interobserver?).
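The κ statistic quoted above corrects raw agreement for the agreement two readers would reach by chance. The sketch below computes Cohen's κ for two readers; the ten tracing classifications are invented for illustration and are not drawn from the cited study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of tracings given the same category.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return float("nan")  # undefined when both raters use one category only
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical three-tier category labels for ten tracings (illustrative only).
reader_1 = ["I", "II", "II", "I", "III", "II", "I", "II", "II", "I"]
reader_2 = ["I", "II", "I", "I", "II", "II", "II", "II", "III", "I"]

kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the readers agree on 6 of 10 tracings, yet κ is only about 0.31 because 42% agreement would be expected by chance alone.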

AI approaches:

  • Automated CTG interpretation
  • Fetal heart rate pattern recognition
  • Prediction of fetal acidemia/hypoxia

Evidence:

  • Multiple ML systems developed (comert2019prognostic?; zhao2019computer?)
  • Sensitivity 85-95% for detecting fetal compromise
  • Specificity variable (60-80%)
  • Published in Computers in Biology and Medicine (comert2019prognostic?)

Major limitations:

  • High false positive rates → increased cesarean sections
  • No RCT showing improved outcomes (Apgar scores, cord pH, neonatal morbidity)
  • Risk of automation bias (providers over-relying on AI categorizations)

Cochrane review conclusion:

  • Insufficient evidence that computer analysis of CTG improves perinatal outcomes (grivell2015antenatal?)
  • May increase intervention rates without clear benefit

⚠️ Verdict: Not yet ready for routine clinical use. Requires prospective trials demonstrating improved maternal/neonatal outcomes, not just pattern recognition accuracy.

10.0.2 2. Preterm Birth Prediction

⚠️ ML Models for Preterm Birth Risk:

Clinical problem: Preterm birth affects about 10% of US births (martin2021births?) and is a leading cause of neonatal morbidity and mortality.

Traditional screening:

  • Cervical length ultrasound
  • Fetal fibronectin testing
  • Clinical history

AI enhancement:

  • Integrate EHR data (demographics, medical history, labs, medications)
  • Predict spontaneous preterm birth at <37, <34, and <28 weeks

Evidence:

  • ML models achieve AUC 0.70-0.80 for preterm birth prediction (fergus2021prediction?)
  • Better than individual risk factors alone, but the improvement is modest
  • Published in PLOS ONE (fergus2021prediction?)
  • External validation shows performance degradation (mailath-pokorny2023machine?)

Limitations:

  • Positive predictive value is low (10-20%) because prevalence is low
  • Unclear how to act on predictions (progesterone and cerclage are effective only in specific subgroups)
  • Social determinants of health (stress, racism, housing instability) are not captured in the EHR
  • Racial disparities in preterm birth (Black women have about 1.5x the rate) (march2015born?) are not explained by medical factors
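The link between low prevalence and low PPV noted above follows directly from Bayes' rule. The sketch below is illustrative only: the sensitivity and specificity are an assumed operating point, not figures reported by the cited models, and the two lower prevalences are hypothetical stand-ins for rarer outcomes such as earlier gestational-age cutoffs.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule: P(outcome occurs | model flags the patient)."""
    for name, value in (
        ("sensitivity", sensitivity),
        ("specificity", specificity),
        ("prevalence", prevalence),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Assumed operating point (hypothetical, chosen only for illustration).
ASSUMED_SENSITIVITY = 0.60
ASSUMED_SPECIFICITY = 0.70

ppv_by_prevalence = {
    prevalence: positive_predictive_value(
        ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prevalence
    )
    for prevalence in (0.10, 0.05, 0.01)
}
for prevalence, ppv in ppv_by_prevalence.items():
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

At 10% prevalence this operating point gives a PPV of about 18%, so more than four of every five flagged women would not deliver preterm, and the PPV falls further as the outcome becomes rarer.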

⚠️ Verdict: Research promising but clinical utility limited by low PPV and lack of effective interventions for most high-risk women. Does not address root causes (structural racism, social determinants).

10.0.3 3. Prenatal Genetic Screening and Ultrasound

Cell-Free DNA Screening (NIPT) Enhanced Reporting:

Application: AI analysis of sequencing data for aneuploidy detection

  • Trisomy 21, 18, 13 detection
  • Sex chromosome abnormalities
  • Microdeletion syndromes (emerging)

Evidence:

  • NIPT sensitivity >99% for Trisomy 21 (norton2015cellfree?)
  • Published in NEJM (norton2015cellfree?)
  • AI improves detection of rare aneuploidies and mosaicism (egbert2022cellfree?)

Important caveats:

  • NIPT is screening, not diagnostic (amniocentesis/CVS for confirmation)
  • False positives occur (especially for low-prevalence conditions)
  • Incidental findings (maternal malignancy) require counseling infrastructure
  • Equity concern: expensive ($500-2000), not always covered by insurance

Verdict: Well-validated screening tool. AI enhancements improve detection but counseling about limitations essential.

⚠️ Automated Fetal Ultrasound Analysis:

Applications:

  • Nuchal translucency measurement
  • Fetal biometry (head circumference, abdominal circumference, femur length)
  • Automated views and measurements for the anatomic survey
  • Placental localization

Evidence:

  • AI-assisted biometry reduces measurement time and variability (namburete2015fully?)
  • Anatomic landmark detection 85-95% accuracy in research settings (baumgartner2017sononet?)
  • Published in Medical Image Analysis (namburete2015fully?)

Limitations:

  • Dependent on image quality (maternal habitus, fetal position)
  • Rare anomalies may be missed
  • Does not replace skilled sonographer/perinatologist interpretation
  • Liability concerns: who is responsible for missed anomalies?

⚠️ Verdict: Useful for standardization and efficiency, but cannot replace human expertise for comprehensive fetal assessment.

10.0.4 4. Maternal Risk Prediction

Preeclampsia Prediction Models:

Traditional screening:

  • First trimester: maternal factors + PAPP-A + PlGF
  • Fetal Medicine Foundation algorithm

AI enhancement:

  • ML models integrating clinical + biochemical + ultrasound data
  • Predict early-onset (<34 weeks) vs. late-onset preeclampsia

Evidence:

  • ML models reach AUC 0.85-0.90 for early-onset preeclampsia (tan2018screening?)
  • Better discrimination than traditional models
  • Published in Ultrasound in Obstetrics & Gynecology (tan2018screening?)
  • Prospective validation ongoing (ASPRE trial derivatives)

Clinical utility:

  • Aspirin prophylaxis reduces preeclampsia risk by 50-60% in high-risk women, with the clearest benefit for preterm preeclampsia (rolnik2017aspirin?)
  • Better risk stratification could identify the women most likely to benefit from prophylaxis
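Why risk stratification matters for prophylaxis can be shown with a number-needed-to-treat calculation. The sketch below assumes the relative risk reduction is constant across risk strata, which is a simplification, and the baseline risks are hypothetical strata rather than values from the cited trial.

```python
import math


def number_needed_to_treat(baseline_risk, relative_risk_reduction):
    """Patients treated per case prevented, rounded up to a whole patient."""
    if not 0.0 < baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be in (0, 1]")
    if not 0.0 < relative_risk_reduction <= 1.0:
        raise ValueError("relative_risk_reduction must be in (0, 1]")
    # Absolute risk reduction = untreated risk minus treated risk.
    absolute_risk_reduction = baseline_risk * relative_risk_reduction
    return math.ceil(1.0 / absolute_risk_reduction)


# Assumption: a 60% relative reduction (upper end of the range quoted above),
# applied uniformly. The baseline risks are hypothetical risk strata.
ASSUMED_RRR = 0.60
nnt_by_risk = {
    risk: number_needed_to_treat(risk, ASSUMED_RRR)
    for risk in (0.01, 0.05, 0.10)
}
for risk, nnt in nnt_by_risk.items():
    print(f"baseline risk {risk:.0%}: treat about {nnt} women per case prevented")
```

Under these assumptions the number needed to treat falls from about 167 at a 1% baseline risk to 17 at 10%, which is the practical argument for targeting prophylaxis at women a model identifies as high risk.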

Verdict: Promising. Improved risk stratification could enable targeted prevention. Requires prospective validation and cost-effectiveness analysis.

⚠️ Postpartum Hemorrhage (PPH) Prediction:

Challenge: PPH is a leading cause of maternal mortality globally and is difficult to predict.

AI approaches:

  • Real-time prediction during labor
  • Integrate vitals, labs, medications, obstetric factors

Evidence:

  • ML models predict PPH with AUC 0.70-0.75 (venkatesh2020machine?)
  • Modest improvement over clinical risk factors
  • Published in American Journal of Obstetrics & Gynecology (venkatesh2020machine?)

Limitations:

  • Many PPH cases occur in low-risk women (unpredictable)
  • Interventions (uterotonics, surgical preparedness) are already standard
  • Alert fatigue if predictions are not actionable

⚠️ Verdict: Research ongoing. Clinical benefit uncertain given current prophylactic practices.

AI Applications in Gynecology:

10.0.5 5. Cervical Cancer Screening

AI-Assisted Cytology and HPV Testing:

Traditional screening:

  • Pap smear cytology
  • HPV DNA testing
  • Co-testing strategies (ASCCP guidelines)

AI enhancement:

  • Automated cytology interpretation
  • HPV genotyping risk stratification
  • Colposcopy image analysis

Evidence:

  • Automated Pap cytology: sensitivity 85-95% for HSIL (bao2023artificial?)
    • Reduces false negatives by 10-20%
    • Published in Cancer Cytopathology (bao2023artificial?)
  • AI colposcopy: detects CIN2+ with sensitivity 90-95% (hu2019automated?)
    • Published in Journal of Lower Genital Tract Disease (hu2019automated?)
    • Particularly valuable in low-resource settings lacking cytopathology infrastructure

WHO pilot studies:

  • AI-based visual inspection with acetic acid (VIA) screening in low- and middle-income countries
  • Sensitivity comparable to expert clinicians (xue2020artificial?)
  • Could improve access to screening where resources are limited

Verdict: Well-validated adjunct to cervical cancer screening. Particularly promising for improving access in under-resourced settings.

10.0.6 6. Breast and Ovarian Cancer Risk Prediction

Breast Cancer Risk Models Enhanced with AI:

Traditional models: Gail, Tyrer-Cuzick, BRCAPRO

AI enhancement:

  • Integrate mammographic density, SNPs, family history, reproductive factors
  • Polygenic risk scores

Evidence:

  • ML models reach AUC 0.65-0.70 for breast cancer risk prediction (yala2019deep?)
  • Published in Radiology (yala2019deep?)
  • Identify women who may benefit from enhanced screening (MRI, tomosynthesis)

Verdict: Useful for personalized screening recommendations. Should complement, not replace, shared decision-making.

⚠️ Ovarian Cancer Early Detection:

Challenge: Ovarian cancer is often diagnosed at a late stage, and existing screening strategies (CA-125, ultrasound) have high false positive rates.

AI approaches:

  • Multimarker panels analyzed with ML
  • Ultrasound-based ovarian mass characterization

Evidence:

  • O-RADS (Ovarian-Adnexal Reporting and Data System) with AI improves malignancy prediction (cao2022ovary?)
  • Reduces unnecessary surgeries for benign masses
  • Published in Radiology (cao2022ovary?)

⚠️ Limitation: No screening strategy (including AI-enhanced) shown to reduce ovarian cancer mortality in average-risk women. USPSTF recommends against screening (henderson2018screening?).

⚠️ Verdict: Useful for characterizing known masses, but NOT for general population screening.

10.0.7 7. Surgical Planning in Gynecology

Endometriosis Detection and Mapping:

AI applications:

  • MRI-based endometriosis detection
  • Surgical planning for deep infiltrating endometriosis
  • Predicting surgical complexity

Evidence:

  • Deep learning models detect endometriomas with 90-95% accuracy (andres2020mri?)
  • Published in European Radiology (andres2020mri?)
  • Helps surgeons plan their approach and counsel patients about surgical complexity

Myomectomy/Hysterectomy Planning:

  • AI analysis of fibroid size, location, vascularity
  • Predicts surgical approach (laparoscopic vs. abdominal)
  • Estimates blood loss risk

Evidence: Limited but promising feasibility studies

Verdict: Useful adjunct for complex surgical planning. Does not replace clinical judgment but improves standardization.

Equity and Racial Bias in OBGYN AI:

Warning: Maternal Health Disparities and Algorithmic Bias

Documented Disparities in Maternal Outcomes:

  1. Maternal mortality:
    • Black women are 2-3x more likely to die from pregnancy-related causes (petersen2019racial?)
    • Published in MMWR by CDC (petersen2019racial?)
    • Persists across education and income levels
    • Driven by structural racism, implicit bias, access to quality care
  2. Preterm birth:
    • Black women have a 50% higher rate of preterm birth (march2015born?)
    • Not explained by traditional medical risk factors
    • Chronic stress from racism implicated (weathering hypothesis)
  3. Cesarean section rates:
    • Black and Hispanic women have higher cesarean rates after controlling for medical indications (edmonds2013racial?)

How AI Can Worsen Disparities:

Training Data Bias:

  • Most datasets overrepresent white, insured women
  • Minority women are underrepresented or their data are incomplete
  • Algorithms learn patterns that don’t generalize

Examples:

  • Preterm birth prediction models trained predominantly on white women may underperform in Black women
  • Preeclampsia risk calculators may misclassify Hispanic women (different biomarker distributions)
  • Fetal monitoring algorithms may have different accuracy across racial groups (not well studied)

Pulse Oximetry Bias:

  • Overestimates oxygen saturation in dark skin by 2-3% (sjoding2020racial?)
  • AI relying on pulse ox data inherits this measurement bias
  • Affects intrapartum monitoring of mothers and neonates

Mitigation Strategies:

  • Require diverse datasets for training and validation
  • Stratify performance reporting by race/ethnicity
  • Address social determinants of health in models
  • Engage community members in AI development
  • Monitor for bias after deployment
  • Invest in addressing root causes of disparities, not just prediction
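Stratified performance reporting, as recommended above, can be as simple as computing sensitivity and specificity separately for each group instead of pooling them. The sketch below is a minimal illustration; the group labels and the toy records are hypothetical, and a real audit would also need confidence intervals and adequate sample sizes per group.

```python
from collections import defaultdict


def stratified_performance(records):
    """Per-group sensitivity and specificity.

    `records` is an iterable of (group, y_true, y_pred) with 0/1 labels.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, y_true, y_pred in records:
        if y_true and y_pred:
            counts[group]["tp"] += 1
        elif y_true:
            counts[group]["fn"] += 1
        elif y_pred:
            counts[group]["fp"] += 1
        else:
            counts[group]["tn"] += 1

    report = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        report[group] = {
            "n": positives + negatives,
            # None marks a metric that cannot be computed for this group.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return report


# Hypothetical toy data: (group, outcome occurred, model flagged).
records = (
    [("group_a", 1, 1)] * 3 + [("group_a", 1, 0)] * 1
    + [("group_a", 0, 0)] * 5 + [("group_a", 0, 1)] * 1
    + [("group_b", 1, 1)] * 2 + [("group_b", 1, 0)] * 2
    + [("group_b", 0, 0)] * 4 + [("group_b", 0, 1)] * 2
)

report = stratified_performance(records)
for group, metrics in sorted(report.items()):
    print(group, metrics)
```

In this toy data the pooled sensitivity is 5/8, which hides the gap between 0.75 in one group and 0.50 in the other. That gap is the kind of difference a pooled headline metric conceals.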

Ethical Challenges Specific to Maternal-Fetal Medicine:

Important: Two-Patient Dilemmas in Obstetric AI

Maternal-Fetal Conflict:

  • AI recommendations may optimize fetal outcomes at maternal expense (e.g., early cesarean delivery)
  • Maternal autonomy must be preserved
  • Cannot treat fetus as independent patient against maternal wishes

Example Scenario:

  • AI predicts 30% risk of stillbirth if pregnancy continues
  • Recommends immediate cesarean delivery at 35 weeks
  • Mother prefers expectant management to avoid surgery and prematurity risks
  • Ethically: Mother’s decision prevails (informed refusal)
  • Legally: Cannot compel cesarean delivery

Informed Consent Challenges:

  • How to communicate AI-generated risk predictions?
  • Uncertainty in predictions (confidence intervals rarely provided)
  • Risk of coercing decisions through statistical intimidation
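One way to make the uncertainty behind a point estimate visible is to report an interval for the observed event rate among patients the model scored similarly. The sketch below uses the Wilson score interval; the counts (12 events among 40 similar patients) are hypothetical, and this is only one of several reasonable ways to express predictive uncertainty.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical validation data: among 40 patients given a similar score,
# 12 experienced the outcome (an observed rate of 30%).
low, high = wilson_interval(events=12, n=40)
print(f"observed 30%, 95% interval {low:.0%} to {high:.0%}")
```

With only 40 comparable patients, an estimate of 30% is compatible with a true rate anywhere from roughly 18% to 45%, which is a very different message for counseling than a bare "30% risk".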

Liability Concerns:

  • If AI predicts complication and physician doesn’t act, malpractice risk?
  • If AI is wrong and intervention causes harm, who is liable?
  • Documentation burden: must explain AI role and rationale for agreement/disagreement

Reproductive Autonomy:

  • Prenatal screening AI may influence pregnancy termination decisions
  • Access to screening differs by geography, insurance (equity)
  • Disability rights advocates concerned about selective termination

Principles for Ethical Obstetric AI:

  1. Maternal autonomy paramount
  2. AI provides information, not directives
  3. Shared decision-making framework
  4. Transparent communication of uncertainty
  5. Respect diverse values and preferences
  6. Address disparities, don’t worsen them

Clinical Guidelines for OBGYN AI Adoption:

Tip: ACOG Principles for AI in Women’s Health

Before Adopting OBGYN AI:

  1. Demand robust validation:
    • Prospective studies showing improved outcomes (not just prediction accuracy)
    • Validation in diverse populations (race, ethnicity, SES, geography)
    • Transparent reporting of performance by subgroups
  2. Assess impact on maternal autonomy:
    • Does this support or constrain patient decision-making?
    • How will recommendations be communicated?
    • Can patients decline AI-assisted care?
  3. Evaluate equity implications:
    • Will this widen or narrow disparities?
    • Is it accessible regardless of insurance/geography?
    • Does training data reflect population diversity?
  4. Consider medico-legal landscape:
    • What does malpractice carrier advise?
    • Documentation requirements?
    • Informed consent necessary?
  5. Ensure multidisciplinary review:
    • Obstetricians, midwives, nurses, patients, ethicists
    • Diverse perspectives on benefits and risks

Safe Implementation:

  1. Pilot testing in low-stakes scenarios first
  2. Enhanced informed consent process
  3. Clear protocols for discordance (AI says X, clinician thinks Y)
  4. Systematic bias monitoring (outcomes by race/ethnicity)
  5. Patient feedback mechanisms
  6. Regular performance audits

Red Flags (Avoid These Systems):

❌ No validation in diverse populations
❌ Claims to replace clinical judgment in high-stakes decisions
❌ Black-box models without explanation
❌ Vendor resists equity audits
❌ Recommends interventions without evidence base
❌ Doesn’t account for patient preferences/values

Future Directions:

Near-Term (2-5 years):

  • Enhanced preeclampsia screening with targeted prevention
  • AI-assisted cervical cancer screening in low-resource settings
  • Standardized fetal biometry and anatomic survey
  • Improved endometriosis detection on MRI

Medium-Term (5-10 years):

  • Integration of social determinants of health into risk prediction
  • Real-time intrapartum decision support (with appropriate safeguards)
  • Personalized cesarean delivery risk counseling
  • AI-guided fertility treatment optimization

Long-Term (10+ years):

  • Predictive models for pregnancy complications incorporating genetics, environment, social factors
  • Continuous remote monitoring for high-risk pregnancies
  • AI-assisted robotic gynecologic surgery (surgeon-in-the-loop)

Unlikely Despite Hype:

  • AI replacing clinical judgment for delivery timing/mode
  • Fully automated prenatal diagnosis
  • Elimination of health disparities through AI alone (requires addressing root causes)

10.1 Conclusion

AI in obstetrics and gynecology must navigate unique ethical challenges: two-patient scenarios, profound health disparities, and reproductive autonomy. While AI shows promise for improving prenatal screening, cervical cancer detection, and risk stratification, deployment must center maternal autonomy, address rather than worsen disparities, and prove clinical benefit beyond prediction accuracy.

Obstetricians and gynecologists should demand robust evidence, transparent algorithms, and equity analyses before adopting AI systems. The goal is not just more accurate predictions, but healthier mothers and babies—especially those from communities bearing disproportionate burdens of maternal morbidity and mortality.

As the American College of Obstetricians and Gynecologists emphasizes: Technology must serve patients’ best interests and respect their autonomous decision-making (acog2021technology?).


10.2 References