Personal Statement

I am currently a PhD Candidate at Columbia University where I am advised by Prof. Noémie Elhadad, Chair of the Department of Biomedical Informatics (DBMI). For most of my PhD, I was also a Visiting Postgraduate Research Fellow at Harvard Medical School. My long-term goal is to pursue a career in academia as a professor. Building on two decades of domestic and international experience in clinical research and public health informatics, my research focuses on human-centered artificial intelligence (AI) and development of systematic, scalable data-driven approaches to promote health equity. My work usually examines and applies methods such as machine learning, natural language processing, and spatiotemporal analysis in addition to traditional biostatistics and epidemiology.

My scholarship is informed by a longstanding commitment to justice, ethics, diversity, and inclusion (JEDI). This is largely demonstrated by many years of service on JEDI-oriented research studies, committees, and working groups; capacity-building efforts devoted to resource-limited settings and marginalized communities; and voluntary civil service. Among these experiences, perhaps one of the more formative was my three-year term appointment as a Commissioner of Human Rights for the city of Cambridge, Massachusetts. During that time, I presided over hearings and facilitated conciliations regarding complaints of discrimination against protected classes related to housing, employment, education, and public accommodation. I also engaged the Massachusetts State Legislature on important policies such as pay equity, protections relative to domestic violence, and anti-discrimination legislation for sexual and gender minority populations (e.g., people who identify as lesbian, gay, bisexual, transgender, or queer).

I am particularly interested in using and interrogating multimodal data sources and the vast toolbox that computational learning offers to better understand, improve, and facilitate study of health in populations and communities that are marginalized. Generally, my research can be grouped into four primary domains: (i) ethical considerations in AI, clinical practice, and digital health; (ii) promoting health data equity and creating knowledge bases; (iii) elucidating health inequity and creating tools to facilitate further discovery; and (iv) enabling equitable learning health systems and precision health.

Scientific Contributions

Ethical considerations in AI, clinical practice, and digital health

My research on ethical considerations, perceptions, responsibilities, and implications in the digital age of healthcare often examines stakeholders and their relationships to one another, notions of data and algorithmic justice, and transparency and trust in clinicians and AI. It also involves encoding human values and goals into AI via a process known as alignment. Among these studies, our work on ethical considerations of clinical natural language processing explored both how it may contribute to healthcare biases and how it might be applied to improve equity in healthcare.

I also led a national survey that examined physician attitudes and practices related to caring acts that have been questioned as “inappropriate” or “unethical” crossing of professional-patient boundaries. We found that the medical profession is divided on appropriate boundary setting in relationships with patients, though a majority considered many caring practices acceptable. In collaboration with a high-level consortium that included the Massachusetts Medical Society, Institute for Healthcare Improvement, and Health Care For All, my work has also led to the creation of a consensus document regarding disclosure of medical errors and, when appropriate, taking responsibility and apologizing for any resulting harm.

Most recently, I collaborated with both my doctoral advisor and Prof. Sandra Soo-Jin Lee, chair of the Division of Ethics at Columbia University Irving Medical Center, on the development of a new graduate seminar, Interrogating Ethics and Justice in Digital Health, for which I helped design the curriculum and course material. We also received grant funding to write a book of educational case studies focused on JEDI in health-related AI. The case book is soon to be released for wide dissemination.

Selected works:

Promoting health data equity and creating knowledge bases

My work promoting data equity in healthcare seeks to improve documentation of demographics and the social determinants of health to support culturally competent care and research of particular interest to minoritized populations. While I often employ advanced data science methods to extract information from existing data sources (e.g., natural language processing of clinical narratives in the electronic health record [EHR]) or integrate community-level information from large public datasets (e.g., national survey data), my work has also involved expanding primary data collection during clinical care.

While co-chair of the Brigham and Women’s Hospital (BWH) LGBTQ and Allies Employee Resource Group, I led advocacy efforts to lobby Mass General Brigham (MGB), previously known as Partners Healthcare, to include sexual orientation and gender identity (SOGI) among the core set of demographics routinely collected in the EHR. Our campaign was successful, and I served on the MGB SOGI eCare Working Group that oversaw design and implementation plans impacting care for millions of patients. These efforts resulted in MGB becoming one of the first health systems in the US to collect SOGI in the EHR in a standardized, scalable fashion. Epic, the EHR vendor, would also later incorporate elements from our design into its baseline EHR platform. Subsequent research identified several independent patient, provider, and clinic-level predictors of SOGI documentation, which may be used to tailor targeted interventions to improve quality of care and use of these fields.

More recently, my work has encompassed development of scalable, reproducible methods to process, characterize, and monitor trends in the literature and compare them to real-world data on population groups. While this research largely focuses on using AI/ML to build and curate health equity knowledge bases, the methods are generalizable for use on any large body of literature, regardless of domain. My latest publication, in the high-impact journal Science Advances, mined nearly a quarter million scientific articles and insurance claims data on 42 million Americans to spotlight less well-studied conditions, topics, and populations in health disparities and minority health (HDMH) research. To support further investigation, I also created Health Disparities and Minority Health (HDMH) Monitor, a publicly available interactive dashboard and article repository generated from HDMH literature found in PubMed/MEDLINE.

Selected works:

Elucidating health inequity and creating tools to facilitate further discovery

My research also leverages real-world data sources to conduct comprehensive, large-scale characterization studies that spotlight the myriad factors influencing the health outcomes of people from diverse backgrounds. It also applies advanced data science techniques to uncover hidden areas of health inequity across intersectional dimensions of lived experience and group identity. For example, using insurance claims and EHR data for nearly 200 million Americans, we conducted a review of 112 acute and chronic diseases, highlighting systematic gender differences in patterns of disease diagnosis and suggesting that symptoms of disease are measured or weighed differently for women and men.

In another study, our investigation of patient terminations led to policy changes across a large academic health system, including a new mandate that clinicians can no longer “fire” patients for missing or failing to cancel clinical appointments, a practice that disproportionately harms people from historically marginalized groups. Prior to the change, “no-shows” were cited as the cause for termination in more than a third of cases. Notably, the majority of terminations were also formalized via EHR functionality that made it both possible and relatively easy to prevent patients from scheduling new appointments, exacerbating barriers to healthcare access.

Selected works:

Enabling equitable learning health systems and precision health

I am also eager to support efforts to transform care, both domestically and internationally, in service of the quintuple aim of improving population health, enhancing the care experience, reducing costs, addressing clinician burnout, and advancing health equity. While at the Harvard T.H. Chan School of Public Health (HSPH), I was a member of the Strategic Information division of the U.S. President’s Emergency Plan for AIDS Relief (PEPFAR) program, a $362 million grant to rapidly expand treatment and care programs for people living with HIV/AIDS in sub-Saharan Africa. In collaboration with clinical and other technical experts, I co-developed population-level dashboards and data-driven point-of-care tools to monitor and evaluate patient health in clinics and hospitals across Nigeria. To date, over 100,000 people have received life-saving HIV treatment and care at these sites. In the interest of strengthening health systems, I also supported data-related educational material development and training activities in Botswana, Nigeria, and Tanzania.

The first study for which I served as senior author elucidated common pitfalls that clinicians encounter when diagnosing patients, information that can then be used to develop early warning screens in the EHR. My research has also evaluated clinician- and patient-facing smart medical devices, such as an electronic pillbox system that combines alarms reminding patients to take their medication, automated sensor-based detection of pill removal, alerts to patients or caregivers by phone, email, or text if medications were not taken, and EHR-embedded adherence reports accessible by providers. More recent work has involved development and monitoring of safe, reliable, fair, and robust AI in healthcare, including deep learning models to support precision health and quality improvement during the mpox outbreak.

I have also extended the learning health system paradigm to public health, leveraging alternative data sources (e.g., health information exchanges and large publicly available datasets) to perform enhanced near real-time regional disease surveillance at a level of granularity, timeliness, and complexity not typically possible using conventional public health reporting practices. These methods were especially helpful in spotlighting inequity, including incongruence between testing patterns and rates of positivity for HIV and other STIs in certain populations and neighborhoods, suggesting disparities related to under- and over-testing.

Selected works: