Oct 11, 2025

Analysis of Couples’ Microbiomes and Health Outcomes

  • Siddharth Singh1
  • 1Indian Institute of Technology Indore
Protocol Citation: Siddharth Singh 2025. Analysis of Couples’ Microbiomes and Health Outcomes. protocols.io https://dx.doi.org/10.17504/protocols.io.n2bvjejqxgk5/v1
License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Protocol status: In development
We are still developing and optimizing this protocol
Created: October 11, 2025
Last Modified: October 11, 2025
Protocol Integer ID: 229563
Keywords: similar microbiomes across gut, microbiome, social microbiome, oral microbiome, similar microbiome, metagenomic study, life microbiome seeding, shotgun metagenomic, person microbial transmission, metagenome, pathway profiling, analysis of couple, genital sites than unrelated individual, microbial transmission, elevated partner similarity, strain sharing, couple, identifiable partner, linked analysis, partner similarity, cohabiting partner, associations with fertility, genital site, species profiling, family health, strong partner convergence on skin, analyses integrate fertility
Abstract
Background: Cohabiting partners share more similar microbiomes across gut, oral, skin, and genital sites than unrelated individuals, with metagenomic studies demonstrating measurable strain sharing (median ~12% gut; ~32% oral) and convergence that scales with duration of cohabitation. This “social microbiome” may influence reproductive, metabolic, and child health outcomes.
Objective: In-silico workflow for exploratory, couple-level, multi-site microbiome analysis using only public datasets (shotgun metagenomics, 16S supported), with emphasis on strain-resolved transmission, functional convergence, and associations with fertility, pregnancy, fetal development, and child health.
Methods: We harmonize public multi-site datasets with identifiable partner/household links (e.g., gut, oral, skin, genital) and rich metadata. Amplicon reads are reprocessed with a uniform QIIME 2/DADA2 pipeline; metagenomes undergo host depletion, species profiling (MetaPhlAn 4), and pathway profiling (HUMAnN 3). Strain sharing is quantified with StrainPhlAn/inStrain across prioritized taxa using stringent ANI/breadth thresholds to reduce false positives. Dyadic analytics include partner-vs-non-partner beta-diversity contrasts, permutation tests, mixed-effects models, and actor-partner interdependence models; functional similarity and resistomes are compared within couples; transmission and cross-site co-occurrence networks are reconstructed. Outcome-linked analyses integrate fertility and perinatal phenotypes when available.
Expected results: We anticipate (i) elevated partner similarity and strain sharing in gut and oral microbiomes; (ii) strong partner convergence on skin and measurable oral transfer tied to intimate behaviors; and (iii) couple-level functional convergence, with exploratory links to reproductive and family health.
Significance: This protocol operationalizes a reproducible framework to study the couple as the analytical unit, advancing hypotheses on person-to-person microbial transmission, co-adaptation, and their relevance for preconception care, BV recurrence prevention, fertility optimization, and early-life microbiome seeding.
Analysis of Couples’ Microbiomes and Health Outcomes
Introduction and Background
Couples who live together do more than share their lives; they also share their microbes. Cohabiting partners have been shown to exchange and harbor more similar microbiomes across various body sites (gut, skin, oral, and genital) compared to unrelated individuals [1][2]. This sharing arises through sustained close contact, a shared environment, and intimate behaviors. Cohabiting family members, especially spouses, have significantly more microbes in common than people from different households, with the strongest cohabitation effects observed on the skin microbiome, followed by the oral and gut microbiomes [1]. In one study, algorithms could even identify couples with ~86% accuracy based solely on skin microbiome similarity [2]. Such findings suggest that prolonged close interaction and a shared home lead partners to exchange microbial species and strains regularly.
Gut Microbiome Similarity in Couples: In the gut, sustained cohabitation can make partners’ microbial communities resemble each other. Research integrating microbiome data into the Wisconsin Longitudinal Study found that spouses had significantly more similar gut microbiota compositions and shared more bacterial taxa than either siblings or random unrelated pairs [3]. Notably, spouses’ microbiomes were more alike than even those of siblings, despite siblings sharing genetics and upbringing [3]. These similarities persisted after adjusting for diet, indicating that marital cohabitation per se influences the gut microbiome [3]. In the same study, married individuals also harbored greater gut microbial diversity and richness relative to those living alone, with the highest diversity seen in individuals reporting very close marital relationships [3]. This is intriguing given that high microbiome diversity is generally associated with health benefits, such as lower risk of gastrointestinal disorders [3]. It suggests that the intimate social and physical interaction of marriage might promote a more diverse gut ecosystem, potentially through exchange of microbes between partners or through indirect effects like shared diet and lifestyle [3]. In fact, cohabiting couples in that study had significantly higher microbiome diversity than unmarried individuals (p<0.01) [3], supporting the idea that living together enriches the gut microbiota. These results align with other findings that cohabitation (even in animals like pigs) can increase microbiome diversity [3].
Skin and Oral Microbiome Sharing: The skin microbiome is especially prone to partner influence. A landmark study swabbing 17 body sites in cohabiting couples found that partners’ skin microbiomes were much more similar to each other than expected by chance [2]. The most striking resemblance was on the feet, likely because partners walk barefoot on the same floors and share shower surfaces, facilitating microbial exchange [2]. In contrast, skin sites like the inner thighs were more strongly shaped by the person’s sex than by their partner (men’s thigh microbiomes resembled other men’s more than their female partner’s) [2]. Overall, though, the shared environment and direct skin contact in daily life cause partners to significantly influence each other’s skin microbial communities [2]. Similarly, in the oral cavity, intimate behaviors like kissing can rapidly transmit microbes. A 10-second intimate kiss can transfer on the order of 80 million bacteria between partners [4]. Frequent French kissing leads to couples developing a shared salivary microbiome over time [4]. One study showed that the more often a couple reported kissing, the more similar their saliva microbial profiles were, and certain bacteria (e.g. probiotic Lactobacillus introduced in one partner) could be detected in the other partner after a kiss [4]. Interestingly, while saliva sharing requires continuous exchange to remain similar, the tongue microbiota of long-term partners was more similar than that of random individuals even independent of recent kissing [4]. This suggests that beyond direct exchange, partners’ oral microbiomes may also converge due to shared diet, habits, and cohabitation environment [4]. Indeed, scientists have even hypothesized an evolutionary role for kissing in humans: to exchange microbiota (and viruses) as a form of immunological priming. 
One hypothesis posits that intimate kissing evolved to transmit cytomegalovirus (CMV) between partners before pregnancy, so that the woman gains adaptive immunity and the risk of dangerous fetal infection is reduced [4]. This illustrates a potential co-evolutionary link between human pair-bonding behaviors and microbiome sharing for the benefit of reproduction.
Microbial Transmission and Strain Sharing: Modern metagenomic studies have confirmed that cohabiting partners don’t just share similar species; they often share the same strains of microbes. A recent large-scale analysis of person-to-person microbiome transmission (including >800 individuals) found that within-household adults share significantly more gut bacterial strains with each other than with outsiders [5]. In fact, the rate of strain sharing between cohabiting partners was on par with that between parents and children or between siblings in the same home [5]. By contrast, people from different households (even in the same village) shared very few strains [5]. Moreover, when cohabitation ends, shared strains gradually wane: studies of adult twins showed that the longer they had lived apart, the fewer gut strains they still had in common [5]. Together, these findings indicate that living together facilitates frequent bidirectional transfer of microbes, and some transferred strains can persist long-term in partners’ microbiomes [5]. Certain bacterial species were identified as “highly transmitted” within households: specific Bifidobacterium and Bacteroides strains were efficiently spread between cohabitants [5]. Intriguingly, many of these same bacteria are also known to pass from mothers to infants, suggesting they are generally effective at finding niches in new hosts [5]. The persistence of shared strains underscores that partner-to-partner transmission is not just transient; cohabitation can lead to partners maintaining a common reservoir of microbes.
Health Implications of Couples’ Microbiomes: The convergence of microbiomes in couples opens up new questions about health. It has long been observed that married people tend to have better health outcomes and longevity than singles, partly due to lifestyle, social support, and stress reduction. Now, researchers speculate the microbiome could be an additional biological factor linking close relationships to health [3]. The 2019 gut microbiome study noted that the health benefits of marriage (e.g. lower inflammation and mortality) might coincide with the greater microbial diversity found in happily cohabiting couples [3]. A more diverse gut microbiome is generally considered beneficial, and in that study, couples reporting very close relationships had the most diverse microbiomes [3]. This raises the intriguing possibility that positive relationship dynamics can indirectly promote microbiome diversity (perhaps via stress buffering or shared healthy behaviors), which in turn could confer health resilience [3]. Conversely, sharing microbes also means sharing potential pathogens or dysbiosis: an unhealthy microbiome in one partner might adversely affect the other. Indeed, certain microbiome-mediated conditions have a clear dyadic aspect. A striking example is bacterial vaginosis (BV) in women, a dysbiosis of the vaginal microbiome often involving Gardnerella and other anaerobes. BV is not classified as a classic STI, but studies show male partners can harbor BV-associated bacteria on their penis and reintroduce them to the female partner after treatment [6]. A recent randomized trial demonstrated that treating the male partner with antibiotics alongside the female greatly reduced BV recurrence (35% recurrence within 12 weeks vs 63% when only the woman was treated) [6]. This indicates that the male genital microbiome can influence the female’s vaginal health, and that a couple-focused approach is essential to break the cycle of reinfection [6].
Such findings emphasize that for microbiome-related conditions, the unit of treatment may need to be the couple rather than just the individual.
Beyond infections, microbiome sharing may play a role in chronic health conditions that spouses often share. Partners frequently exhibit correlated weights and metabolic profiles, which could partly stem from a shared diet and microbiome. The gut microbiota is known to affect energy harvest and metabolism; thus, dysbiosis in one partner could be “contagious” in influencing obesity risk or metabolic disease in the other, compounding lifestyle factors. Although not yet conclusively proven, researchers are exploring whether an “obese” or pro-inflammatory microbiome can transmit metabolic effects between cohabiting humans, as has been shown in animal models. Similarly, the household microbiome environment influences children’s health: infants acquire their initial microbiome from the mother (through birth canal and breastfeeding), but subsequent colonization is shaped by exposure to the father, siblings, and home environment. Cohabiting couples create a microbial milieu that their offspring will inherit. In fact, the eLife study found that dog-owning couples shared microbes with their dog, and that having a dog increased microbial sharing between human family members (likely by acting as a carrier between people) [1]. This kind of rich microbial exchange in a family with pets has been associated with children developing more diverse gut microbiota and possibly fewer allergies [1][7]. Early-life exposure to diverse microbes (from parents, pets, and the home) is thought to train the immune system and may protect against allergies and asthma [8]. Thus, studying couples’ microbiomes is not only about the partners themselves, but also about how parental microbiomes co-influence the next generation’s health.
Couples’ Microbiome and Reproductive Success: A special focus of this protocol is the link between couples’ microbiomes and reproductive outcomes: fertility, pregnancy, fetal development, and child health. Emerging research suggests that the microbiomes of both male and female partners can affect conception and pregnancy. The female reproductive tract microbiome (especially the vaginal microbiome) is crucial for fertility and healthy pregnancy. Lactobacillus-dominated vaginal microbiota is associated with lower infection risk and better fertility outcomes, whereas vaginal dysbiosis (like BV) has been linked to infertility, IVF failure, and pregnancy complications [9]. What about the male partner’s role? Besides the BV example above, there is growing interest in the seminal microbiome and male fertility. Semen is not sterile; it carries bacteria that could ascend and affect the female tract or directly impact sperm function. A recent metagenomic analysis of infertile couples examined both partners’ genital microbiota, including semen and follicular fluid [9]. It found that women’s vaginal and follicular fluid communities were, as expected, dominated by Lactobacilli (in healthier profiles), while men’s penile and semen samples were more diverse and often contained skin and genital bacteria, including some associated with BV or urethral infections [9]. Interestingly, that study observed correlations between the partners’ microbiomes, hinting at microbial transmission or shared environmental influences on the genital tract [9]. However, direct inter-partner microbiota transfer in the genital tract appeared limited in that cross-sectional sampling [9]. Even so, the presence of BV-related bacteria in some male samples underscores a possible mechanism for how male microbiota impact female reproductive health [9].
Beyond local genital effects, the gut microbiome of each partner may influence systemic reproductive health, through metabolism of sex hormones or modulation of immune function. There is evidence that gut dysbiosis can affect estrogen levels (via the estrobolome) and contribute to conditions like polycystic ovary syndrome or endometriosis, which impact fertility [10]. In males, gut microbiota composition has been linked to testosterone and semen quality, suggesting a “gut-testis axis” [11]. Notably, paternal microbiome and diet before conception might have lasting effects on offspring via epigenetic programming of sperm. Animal studies show that altering a male’s gut microbiota (through diet or probiotics) can change his sperm epigenetic markers and influence embryo development and offspring health [12]. Obese male mice with dysbiotic guts have fathered pups with higher risks of metabolic disorders, presumably through sperm molecular changes rather than direct microbial transfer. These insights have led to the concept of a paternal microbiome impact on fetal development, sometimes termed a gut-germline axis. While human data is still nascent, a 2025 review highlights this as a frontier: understanding how a father’s gut and oral microbiota, in addition to his seminal microbiota, may affect pregnancy outcomes and long-term child health [12]. Taken together, these findings justify a comprehensive, couple-level approach to reproductive microbiome health, treating the couple as an integrated biological unit.
Goals and Rationale: Given the above, this protocol aims to explore in depth couples’ microbiomes and their co-evolution with health. We will leverage the wealth of public human microbiome data to examine how partners’ microbiomes interact and whether those interactions relate to outcomes in reproductive health (fertility, miscarriage, pregnancy outcomes), neonatal/fetal development, child health, and other health indicators. By focusing on metagenomics data (whole-genome shotgun sequencing where available), we can go beyond basic microbiome profiling to analyze strain sharing, functional genes, and potential microbial co-adaptations in couples. The overarching hypothesis is that sustained cohabitation and intimacy lead to measurable microbiome convergence in couples, which in turn may have co-evolved to impact their shared health and reproductive success. This protocol is deliberately exploratory across a broad range of health metadata, from metabolic and immune markers to fertility metrics, in order to uncover associations pertinent to couple microbiome dynamics. Ultimately, understanding these connections could inform interventions (preconception microbiome optimization in both partners, or synchronizing probiotic therapies among couples) to improve outcomes for parents and children.
Data Sources and Materials
This in-silico study will utilize existing, publicly available microbiome datasets, as no new wet-lab sampling or sequencing will be performed. Key data sources include:
Cohort Studies of Households/Couples: Foundational data from studies like Song et al. (2013) [1], which surveyed fecal, skin, and oral microbiomes in 60 cohabiting families (including couples), and Dill-McFarland et al. (2019) [3], which profiled gut microbes of hundreds of older adults including spouse pairs, will be obtained if available. These provide baseline information on microbiome similarity within vs. between households.
Targeted Couples Microbiome Studies: More specialized datasets will be included, such as:
Skin microbiome of couples: Ross et al. (2017, mSystems) sampled 17 skin sites from 10 cohabiting couples [2].
Oral microbiome and kissing: Kort et al. (2014, Microbiome) sequenced saliva and tongue samples from 21 couples before and after kissing [4]. Raw sequence data from this study will allow analysis of oral microbial sharing.
Gut microbiome and social networks: Any datasets linking gut microbiome with social relationship data (e.g. cohabitation status, relationship quality), such as the above WLS dataset [3] or the American Gut Project (which includes some self-reported family links), will be integrated to examine social determinants of the microbiome.
Reproductive Microbiome Data: We will leverage recent data on couples undergoing fertility treatments: the Microbiome of Infertile Couples project (Peric et al. 2023/2025) which performed metagenomic sequencing of vaginal swabs, semen, follicular fluid, and penile swabs in infertile heterosexual couples [9]. The authors have made data available (e.g. via Zenodo DOI:10.5281/zenodo.7885591), enabling us to analyze genital microbiome composition in couples and link to clinical fertility outcomes. Additionally, any available 16S rRNA or metagenomic data from vaginal microbiome studies that include male partner information (such as BV studies or longitudinal pregnancy cohorts) will be included. If a study of recurrent miscarriage or preterm birth collected microbiomes from both the woman and her partner, those data would be extremely relevant. We will search repositories like NCBI SRA, EMBL-ENA, and Qiita for keywords like “couples”, “partner”, “spouse” combined with “microbiome” or specific conditions.
Public Database of Strain Transmission: The Nature 2022 study on person-to-person strain transfer [5] assembled metagenomes from multiple populations (including cohabiting pairs and twins). If those sequencing data (or at least the reported strain-sharing metrics) are available, we will incorporate them to bolster our strain-level analysis.
Metadata and Health Outcomes: Alongside microbiome sequences, we will compile relevant metadata for each couple or individual from these studies. This may include: age, sex, BMI, diet, health status (e.g. presence of chronic diseases), relationship information (duration of cohabitation, closeness or relationship satisfaction scores if available [3]), fertility outcomes (e.g. pregnant vs not, IVF success rates, time-to-pregnancy), pregnancy outcomes (term vs preterm birth, birth weight, etc.), and child health follow-ups (if provided, such as development of allergies, growth metrics, etc.). We will also note any environmental factors like pet ownership from household studies, as these can influence microbiome sharing [1]. All data will be handled in compliance with database usage terms and any applicable ethical guidelines, focusing on de-identified, aggregate analysis.
Tools and Software: The computational analysis will be conducted with established microbiome bioinformatics tools. We will use QIIME 2 and/or Qiita for amplicon (16S) data processing and diversity analysis, and tools like MetaPhlAn 4 or Kraken2/Bracken for taxonomic profiling of shotgun metagenomic data. For functional profiling of metagenomes, we will employ HUMAnN 3 to quantify microbial gene pathways in each sample. To detect strain sharing, we plan to use strain-resolution methods such as StrainPhlAn (for species where genomes can be reconstructed from metagenomes) or k-mer based approaches to identify identical/similar strains between partners. Custom scripts in R/Python will be written for data integration, statistical analysis, and visualization. Key libraries will include the R packages vegan (for diversity stats) and lme4/metafor (for mixed models or meta-analyses combining datasets), and the Python package networkx (to create networks of microbial transmission among individuals). If needed, we will leverage public computing resources or cloud platforms for the large-scale sequence analyses (since metagenomic datasets can be sizable).
Before We Begin
Assemble a metadata spreadsheet
For each dataset, create a table with columns:
Sample ID, participant ID and body site (oral, vaginal, rectal, faecal/gut).
Health status / disease group.
Demographics (age, race/ethnicity, BMI).
Reproductive factors, including pregnancy stage, menstrual phase, contraceptive use (non‑hormonal, combined oral contraceptives, levonorgestrel intrauterine system), and oestradiol/progesterone levels.
Lifestyle variables, e.g. smoking, diet, sexual activity.
Sequencing platform (Illumina, PacBio) and 16S region or read length.
Accession numbers for raw data.
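A quick consistency check on the metadata table can catch problems before any sequence processing starts. The sketch below is only illustrative: the column names, the `couple_id` column (assumed here because couple-level analysis needs an explicit partner link), the list of allowed body sites, and the placeholder accession strings are all assumptions, not part of any dataset named in this protocol.

```python
import csv
import io

# Hypothetical column names for the harmonized table; adapt to your own studies.
REQUIRED_COLUMNS = ["sample_id", "participant_id", "couple_id", "body_site", "accession"]
# Assumed vocabulary of body-site labels.
ALLOWED_SITES = {"oral", "vaginal", "rectal", "gut", "skin", "penile"}


def validate_metadata(rows):
    """Return a list of human-readable problems found in the metadata rows."""
    problems = []
    seen_samples = set()
    for i, row in enumerate(rows, start=1):
        # Every required column must be present and non-empty.
        for col in REQUIRED_COLUMNS:
            if not (row.get(col) or "").strip():
                problems.append(f"row {i}: missing {col}")
        # Body-site labels must come from the agreed vocabulary.
        site = (row.get("body_site") or "").strip().lower()
        if site and site not in ALLOWED_SITES:
            problems.append(f"row {i}: unknown body_site '{site}'")
        # Sample IDs must be unique across the table.
        sample = (row.get("sample_id") or "").strip()
        if sample and sample in seen_samples:
            problems.append(f"row {i}: duplicate sample_id '{sample}'")
        if sample:
            seen_samples.add(sample)
    return problems


# Toy table with placeholder accessions (not real records).
EXAMPLE_CSV = """sample_id,participant_id,couple_id,body_site,accession
S1,P1,C1,gut,PLACEHOLDER_0001
S2,P2,C1,gut,PLACEHOLDER_0002
S2,P3,C2,nasal,
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
for problem in validate_metadata(rows):
    print(problem)
```

Running the check on the toy table flags the third row three times (missing accession, unknown body site, duplicate sample ID), which is the kind of issue that is far cheaper to fix in the spreadsheet than after profiling.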
Set up computational environment
Install the following software (latest versions recommended):
QIIME 2 for 16S amplicon processing (denoising with DADA2 or Deblur, taxonomy assignment).
HUMAnN 3 and MetaPhlAn 4 for species‑ and pathway‑level profiling of shotgun data (the latter uses an expanded marker catalogue and supports long reads).
R packages such as phyloseq, vegan, MaAsLin2, ANCOM‑BC for diversity analyses and differential‑abundance testing.
Python packages (pandas, numpy, scikit‑bio) for data wrangling and custom analyses.
CoNet, SpiecEasi or other network‑inference tools to explore cross‑site co‑occurrence networks.

All scripts should be version‑controlled (e.g., via Git) and documented. Use containers (e.g., Docker/Singularity) for reproducibility.
Analysis Workflow
Data Acquisition and Harmonization: We will download raw sequencing reads or processed tables from the selected studies. Because these data come from different projects, part of the protocol is harmonizing them for combined analysis. For 16S rRNA gene sequences, raw reads will be re-processed through a uniform pipeline (QIIME 2 with the same trimming and denoising parameters) to obtain comparable Amplicon Sequence Variant (ASV) tables. Metagenomic reads will be quality-checked with FastQC and trimmed with Trimmomatic, then profiled for species and strains. We will take care to separate datasets by body site, i.e., analyze gut microbiome sharing separately from skin or oral sharing, as the dynamics can differ by site [2]. We will also standardize metadata (e.g., ensure consistent units or categories for diet, and consistent binary vs continuous encoding of outcomes).
Search and download public datasets
1. Catalogue potential studies by searching ENA, SRA, MG‑RAST and Qiita for keywords related to your research question: e.g., “oral AND vaginal AND rectal microbiome”, “cervical cancer microbiome”, “pregnancy microbiome multi‑site”, “endometriosis rectal vaginal 16S”.
2. Screen studies by reading abstracts and methods to ensure that multiple body sites were sampled and that raw data are publicly available. Record accession numbers and sample counts.
3. Retrieve raw reads using appropriate tools (e.g., enaDataGet, prefetch/fasterq‑dump). Organize files into directories by study and body site.
4. Obtain metadata from supplementary tables or repository submissions. For the healthy body‑site study (PRJEB37731), use the provided Excel file to map sample IDs to subject, site and timepoint.
5. Update your metadata spreadsheet with all sample and subject information. Harmonize variable names across studies.
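Harmonizing variable names across studies (step 5) usually comes down to a lookup table of synonyms. The following is a minimal sketch under stated assumptions: the synonym maps are invented for illustration and will need to be extended for the column names each study actually uses.

```python
# Hypothetical synonym maps: study-specific name -> harmonized name.
COLUMN_SYNONYMS = {
    "host_subject_id": "participant_id",
    "subject": "participant_id",
    "sample_site": "body_site",
    "bodysite": "body_site",
    "body_mass_index": "bmi",
}
SITE_SYNONYMS = {"feces": "gut", "faecal": "gut", "stool": "gut", "saliva": "oral"}


def harmonize_record(record):
    """Rename columns and normalize body-site labels for one metadata record."""
    out = {}
    for key, value in record.items():
        # Normalize case and spacing before looking up synonyms.
        clean_key = key.strip().lower().replace(" ", "_")
        out[COLUMN_SYNONYMS.get(clean_key, clean_key)] = value
    if "body_site" in out:
        site = str(out["body_site"]).strip().lower()
        out["body_site"] = SITE_SYNONYMS.get(site, site)
    return out


print(harmonize_record({"Subject": "P1", "Sample Site": "Stool", "BMI": "24.1"}))
```

Keeping the maps in one place (and under version control) documents every renaming decision, which matters when datasets from different projects are pooled.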
Taxonomic and Diversity Profiling: For each individual sample, we will characterize the microbiome composition. This includes calculating alpha-diversity metrics (e.g. Shannon diversity, species richness) and beta-diversity distances between samples (using measures like Bray-Curtis and UniFrac for 16S data). At this stage, we will already check basic trends: do cohabiting partners have more similar microbiomes than random pairings? We will compute the intra-couple beta distance for each couple and compare it to the distribution of distances between all possible non-couple pairs in the same dataset. Prior studies show couples’ gut microbiota are significantly more alike than unrelated pairs [3]; we will verify this across our aggregated data. Statistical significance can be assessed with permutation tests (randomly shuffling partner labels) or paired comparisons (each couple vs random). We will also examine whether couples tend to cluster together in ordination plots (PCoA/nMDS), as an illustrative visualization [3]. If multiple time points are available (e.g. some studies sampled couples longitudinally), we will plot how microbial profiles may converge over time with cohabitation. Diversity metrics of individuals will be compared by cohabitation status: consistent with Dill-McFarland et al., we expect cohabiting individuals to have higher gut diversity than singles [3], an observation we will test on any dataset with both groups.
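The partner-label permutation test described above can be sketched in a few lines. This is an illustrative, pure-Python version under simplifying assumptions: abundance profiles are plain dictionaries, the distance is Bray-Curtis, and the toy profiles are invented. In practice the distances would come from QIIME 2 or vegan, and the shuffling would be restricted to samples from the same study and body site.

```python
import random
from statistics import mean


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance dicts (taxon -> abundance)."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    den = sum(a.get(t, 0.0) + b.get(t, 0.0) for t in taxa)
    return num / den if den else 0.0


def partner_permutation_test(profiles, couples, n_perm=999, seed=0):
    """One-sided test: are partners more similar than randomly re-paired individuals?

    profiles: dict sample_id -> abundance dict
    couples:  list of (sample_id_a, sample_id_b) pairs
    Returns (observed mean intra-couple distance, permutation p-value).
    """
    rng = random.Random(seed)
    left = [a for a, _ in couples]
    right = [b for _, b in couples]
    observed = mean(bray_curtis(profiles[a], profiles[b]) for a, b in couples)
    hits = 0
    for _ in range(n_perm):
        shuffled = right[:]
        rng.shuffle(shuffled)  # break the true partner links
        permuted = mean(
            bray_curtis(profiles[a], profiles[b]) for a, b in zip(left, shuffled)
        )
        if permuted <= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented): five couples, each sharing a couple-specific dominant taxon.
profiles = {}
couples = []
for i in range(5):
    profiles[f"c{i}_a"] = {f"taxon_{i}": 0.9, "shared_taxon": 0.1}
    profiles[f"c{i}_b"] = {f"taxon_{i}": 0.8, "shared_taxon": 0.2}
    couples.append((f"c{i}_a", f"c{i}_b"))

observed, p_value = partner_permutation_test(profiles, couples)
print(f"mean intra-couple Bray-Curtis = {observed:.3f}, permutation p = {p_value:.3f}")
```

Shuffling only one side of each pair keeps the number of pairs fixed, so the null distribution reflects random re-pairing of the same individuals rather than a change in sample composition.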
Preprocess sequencing data
This protocol assumes two data types: 16S amplicon sequences and shotgun metagenomes.
16S amplicon data
1. Import reads into QIIME 2 and demultiplex if necessary.
2. Denoise using DADA2 or Deblur. Set parameters according to read length (e.g., for Illumina V3-V4 2×250 bp reads, truncating at 240 bp may ensure overlap).
3. Assign taxonomy to amplicon sequence variants (ASVs) using a classifier trained on the same 16S region. Preferred references include SILVA (v138 or later) or Greengenes2; for oral samples, supplement with the Human Oral Microbiome Database.
4. Filter contaminants by removing ASVs classified as mitochondria/chloroplasts and low‑prevalence features. Use negative controls (if available) to identify reagent contaminants.
5. Export ASV tables and taxonomy for downstream analyses.
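The truncation choice in step 2 is a matter of simple arithmetic: after truncation, the forward and reverse reads must still overlap enough to be merged. The sketch below makes that check explicit. The 460 bp amplicon length is an assumed round figure for a V3-V4 amplicon (actual length depends on the primer pair and taxon), and the 12-base minimum overlap is used as an assumption that should match DADA2's default merge setting but is worth confirming for the installed version.

```python
def paired_overlap(amplicon_len, trunc_len_f, trunc_len_r):
    """Bases of overlap between the truncated forward and reverse reads."""
    return trunc_len_f + trunc_len_r - amplicon_len


def truncation_ok(amplicon_len, trunc_len_f, trunc_len_r, min_overlap=12):
    """True if the truncated reads still overlap by at least min_overlap bases."""
    return paired_overlap(amplicon_len, trunc_len_f, trunc_len_r) >= min_overlap


# Assumed amplicon length of 460 bp; both reads truncated at 240 bp.
print(paired_overlap(460, 240, 240))   # 20 bases of overlap
print(truncation_ok(460, 240, 240))    # True
print(truncation_ok(460, 230, 220))    # False: reads no longer meet
```

With only about 20 bases of overlap in the first case, there is little margin for longer amplicon variants, so it is worth inspecting the denoising statistics for the fraction of reads that merged.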
Shotgun metagenomic data
1. Host depletion and QC: Use KneadData or an equivalent pipeline to remove human reads (mapping to GRCh38) and common contaminants (e.g., PhiX).
2. Species and pathway profiling: Run MetaPhlAn 4 on host‑depleted reads to obtain species‑level relative abundances. Optionally, run HUMAnN 3 with the UniRef90 database to profile pathways. Use the VMGC and VIRGO catalogs for improved detection of vaginal species.
3. Strain analysis: For deeper exploration of specific taxa, tools like StrainPhlAn 4, inStrain, or StrainFacts can be applied to shotgun data. These can reveal within‑species variation and potential strain sharing across body sites, though coverage may be limited for low‑biomass samples.
4. Functional annotation: Map reads to gene catalogs (e.g., VIRGO) to identify virulence factors or metabolic functions relevant to your question.
Alpha and beta diversity
1. Compute alpha diversity (e.g., observed ASVs, Shannon index) for each sample using QIIME 2 or R. Compare diversity across body sites (oral vs. vaginal vs. rectal/gut) and disease groups using non‑parametric tests. Studies have reported that vaginal communities dominated by Lactobacillus are less diverse than those dominated by anaerobes, whereas the oral and gut microbiomes are generally more diverse.
2. Compute beta diversity matrices (Bray-Curtis, UniFrac) and visualize using principal coordinates analysis (PCoA). Expect clustering by body site and by individual. Use PERMANOVA to test whether disease state or menstrual phase significantly affects community composition.
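For reference, the two alpha-diversity measures named in step 1 can be computed directly from a vector of feature counts. This is a minimal sketch using the natural logarithm; tools differ in the logarithm base they report, so one base should be used consistently when values are compared. The count vectors are invented to mimic a community dominated by one taxon and an even community.

```python
import math


def observed_features(counts):
    """Number of features (e.g., ASVs) with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' (natural log) from a list of feature counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


dominated = [970, 10, 10, 10]   # invented: one taxon dominates
even = [250, 250, 250, 250]     # invented: four equally abundant taxa

for name, counts in [("dominated", dominated), ("even", even)]:
    print(name, observed_features(counts), round(shannon_index(counts), 3))
```

Both toy communities have four observed features, yet the even community has the higher Shannon index, which is why richness and evenness-aware metrics are reported side by side.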
Strain Sharing Analysis: Using high-resolution metagenomic data, we will identify shared strains between partners. For each species detected in both members of a couple, we will determine if they share the same strain or only the same genus/species. A shared strain can be inferred if the nucleotide variation within that microbe’s genome is nearly identical between the two samples (e.g., via StrainPhlAn phylogenies or by finding a high fraction of identical single-nucleotide variants). We will calculate a “strain-sharing rate” per couple, defined as the percentage of species in common for which the same strain (rather than different strains) is present [5]. These rates will be compared against background levels (e.g. unrelated individuals rarely share strains except very prevalent ones). We anticipate seeing that spouses have elevated strain sharing for a subset of gut species, akin to findings in large cohorts [5]. We will also compare strain-sharing across different relationships: e.g., do we replicate that partners, parents, and siblings all show higher sharing than non-relatives [5]? If the dataset permits, we’ll test if longer cohabitation correlates with more strains shared, which is biologically plausible and supported indirectly by twin studies of time apart [5]. The output of this step will be a catalogue of candidate transmissible microbes within households, highlighting which microbial strains appear to pass effectively between partners. We will pay special attention to strains of health relevance, such as Bifidobacterium strains (important for digestion and infant gut seeding) or Gardnerella strains (implicated in BV), that are shared between partners.
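The strain-sharing rate defined above reduces to a simple ratio once a per-species genetic distance between the two partners' strains is available. The sketch below assumes those distances have already been extracted (for example from strain-level phylogenies) and that a single hypothetical cut-off separates "same strain" from "different strain"; in practice the threshold is usually set per species. All distance values are invented.

```python
def strain_sharing_rate(distances, threshold):
    """Fraction of shared species for which both partners carry the same strain.

    distances: dict species -> normalized genetic distance between the two
               partners' dominant strains (only species profiled in both).
    threshold: distance at or below which two strains count as the same
               (a hypothetical single cut-off for illustration).
    Returns None when the pair has no species in common.
    """
    if not distances:
        return None
    same = sum(1 for d in distances.values() if d <= threshold)
    return same / len(distances)


# Invented distances for one couple across four species found in both partners.
couple_distances = {
    "species_A": 0.001,
    "species_B": 0.0005,
    "species_C": 0.04,
    "species_D": 0.06,
}
print(strain_sharing_rate(couple_distances, threshold=0.01))  # 0.5
```

Returning None for pairs without shared species keeps them out of downstream averages instead of counting them as zero sharing, which would bias comparisons against unrelated pairs.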
Functional and Resistome Profiling: Using the metagenomic data, we will profile the functional genes and pathways present in each partner’s microbiome. This addresses whether couples not only share species but also share functional traits (perhaps through different microbes performing similar roles). We’ll use HUMAnN 3 to quantify pathways (such as butyrate production and B-vitamin synthesis) in each sample. Then, analogous to taxonomic similarity, we will see if the metabolic pathway profile of partners is more similar than that of non-partners. It may be that even if specific species differ, couples’ microbiomes converge functionally due to their shared diet and environment. Additionally, we will profile the antibiotic resistance genes (resistome) in each microbiome using tools like AMRFinder or ARGs-OAP, to see if cohabitation leads to exchange of resistance genes. This has public health importance: if one partner’s gut acquires a resistant organism, does the other partner soon harbor it too? Any consistently shared functional elements will be noted as potential signs of co-adaptation.
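One simple way to compare resistomes within and between couples is a presence/absence overlap score. The sketch below uses the Jaccard index on sets of resistance-gene names and contrasts partner pairs with all other pairs. It is an illustration only: the gene sets are invented toy data, and a real analysis would start from the output of the chosen resistome profiler and would pair the comparison with a permutation test.

```python
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity between two collections of gene names."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def partner_vs_background(gene_sets, couples):
    """Mean Jaccard similarity for partner pairs and for all non-partner pairs.

    Assumes at least one partner pair and one non-partner pair exist.
    """
    partner_pairs = {frozenset(pair) for pair in couples}
    partner, background = [], []
    for x, y in combinations(sorted(gene_sets), 2):
        score = jaccard(gene_sets[x], gene_sets[y])
        if frozenset((x, y)) in partner_pairs:
            partner.append(score)
        else:
            background.append(score)
    return mean(partner), mean(background)


# Toy resistome profiles (invented) for two couples.
gene_sets = {
    "A1": {"tetW", "ermB"},
    "A2": {"tetW", "ermB", "cfxA"},
    "B1": {"tetQ"},
    "B2": {"tetQ", "cfxA"},
}
couples = [("A1", "A2"), ("B1", "B2")]
print(partner_vs_background(gene_sets, couples))
```

The same function works unchanged on sets of detected pathways, so functional and resistome similarity can be summarized with one consistent measure.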
Association with Health and Metadata: A major part of the analysis is correlating the couples’ microbiome features with health outcomes and other metadata. We will pursue a series of exploratory analyses:
Reproductive Outcomes: Within the infertile couples cohort [9], compare microbiomes of couples who achieved pregnancy vs those who did not (if follow-up data exist). We will look for microbial characteristics associated with success, e.g. higher Lactobacillus in both partners and absence of certain pathogens. We will also assess whether microbiome similarity between partners correlates with fertility. One hypothesis is that greater microbiome similarity (due to frequent intimacy) reflects a level of healthy interaction that could correlate with pregnancy, though too much similarity might also imply shared dysbiosis, so we will examine the direction of any association. If longitudinal samples are available (e.g. microbiome before and after an intervention or over the course of pregnancy), we will analyze those trajectories.
Female Gynecologic Health: Analyze the vaginal microbiome state in the context of the partner. Using data from BV or vaginal microbiome studies, test whether the male partner’s penile microbiome diversity or composition is associated with the female’s vaginal microbiome stability. Does a man harboring anaerobic bacteria correspond to his partner having recurrent dysbiosis? We may integrate recent trial data (like Vodstrcil et al. 2025) to see what male microbiome features (perhaps a high load of BV-associated bacteria on the penis) predicted BV recurrence [6]. These analyses could identify specific microbes whose presence in males is a risk factor for female gynecologic conditions.
Systemic Health Measures: If available, we will assess couples’ microbiomes in relation to metabolic and immune markers. In the WLS data, extensive metadata (diet, blood markers, disease history) exist [3]. We can examine whether a couple sharing certain microbiome traits corresponds with shared phenotypes, e.g., do couples with more similar gut microbiomes also have more similar BMIs or blood pressure? Prior research shows spouses often share lifestyle diseases; here we’ll add the microbiome dimension. We can also test whether microbiome “quality” (like high diversity or beneficial taxa) in one partner is linked to better health outcomes in the other. For example, if a husband has an anti-inflammatory gut microbiome, does the wife show lower inflammatory markers? Such cross-partner analyses can reveal whether microbiome-associated health effects extend beyond the individual.
Pregnancy and Child Outcomes: We will explore any data linking parental microbiome to child health. In the absence of a dedicated dataset following couples through pregnancy to child outcomes, we will combine evidence from literature: e.g., use the mother’s microbiome during pregnancy and father’s microbiome pre-conception to see if any known patterns emerge with reported child outcomes (like allergy development [8] or birth weight). If any cohort (like the HMP or a birth cohort study) includes paternal microbiome sampling, we will include that. Specifically, we plan to analyze the paternal gut microbiome features alongside maternal microbiome during pregnancy to look for associations with neonatal outcomes (even if measured indirectly via known risk factors, e.g. paternal obesity and infant birth weight, considering paternal microbiome as a mediator). We will also examine the overlap between strains in parents and those found in infants (using reported vertical transmission rates for reference).
For all associations, we will use appropriate statistical models. Many analyses will use paired statistical tests or mixed models, since partners are a paired unit. To test partner vs non-partner microbiome similarity, we treat couples as a blocking factor. For health outcomes, we may use linear mixed models that include random effects for couple identity (to account for shared environment) when looking at individual-level outcomes. In some cases, the dyad will be the unit of analysis (regressing a couple’s fertility outcome on their combined microbiome metrics). Machine learning techniques (random forests, etc.) may be applied to identify key microbial species or functions that together predict an outcome (such as IVF success or the couple’s disease status). However, given the exploratory nature and likely limited sample sizes in each category, statistical inference will be cautious and mainly hypothesis-generating.
Visualization and Interpretation: We will generate comprehensive visualizations to illustrate findings. These include scatter plots or heatmaps of partner vs partner microbial abundance (to see which microbes are strongly correlated within couples), network diagrams of microbial transmission (connecting individuals who share strains, expecting to see couples as tight clusters), and perhaps Sankey or alluvial plots showing the flow of microbes between partners, parents, children, and pets. We will also map any interesting co-evolutionary patterns: if we find that certain microbial genes are complementary between spouses (as a hypothetical, one partner’s microbiome extensively produces a compound that the other’s lacks), we will highlight that as a potential co-adaptation. Key results will be compared with existing literature: e.g., if couples with high microbiome similarity tend to be older or have lived together longer, that matches known phenomena [5]. If we observe that only couples with very “close” relationships (possibly measured by questionnaires in some studies) show significant microbiome convergence, we will note that as supportive evidence of psychosocial influences [3].
Analysis Workflow - Couple‑Level Multi‑Site Microbiome Studies
Building on the general multi‑site protocol, this section describes a detailed workflow to analyse the microbiomes of cohabiting couples across multiple body sites (gut, oral, skin, genital) using public data. It is designed for researchers aiming to investigate how partners’ microbial communities converge, which strains are exchanged, and how these microbial dynamics relate to reproductive health, fertility, pregnancy outcomes and overall well‑being. The workflow assumes that raw sequencing data (16S amplicon or shotgun metagenomics) and metadata are available from public repositories, and that couples can be identified via shared household IDs or partner labels.
Step 1 - Identify couple‑focused multi‑site datasets
Search for studies with partner or household information. Use ENA/SRA metadata to filter projects that collected samples from multiple body sites and recorded participant relationships. Datasets to prioritise include:
Wisconsin Longitudinal Study (WLS) gut microbiome dataset: integrates microbiota data into a long‑running social‑science cohort. Analysis of 94 spouse pairs and 83 sibling pairs showed that spouses share more bacterial taxa than siblings and unrelated individuals, and that married individuals harbour more diverse gut communities, particularly those reporting close relationships [3]. This dataset includes metadata on diet, lifestyle, relationship quality and health outcomes, making it suitable for dyadic analyses.
Person‑to‑person transmission landscape dataset: a comprehensive meta‑analysis that compiled 31 metagenomic studies with household and family relationships. It reported millions of strain‑sharing events and found that cohabiting couples share about 12 % of gut strains and 32 % of oral strains on average, whereas non‑cohabiting adults share very few strains. The dataset includes kinship information (mother‑offspring, partners, siblings) and allows for cross‑relationship comparisons [5].
Skin and oral couples datasets: studies like Ross et al. (2017) swabbed 17 skin sites and saliva from cohabiting couples, demonstrating strong skin microbiome convergence and moderate oral sharing. The kissing study by Kort et al. (2014) sequenced saliva and tongue microbiomes of couples before and after kissing, showing that kissing transfers tens of millions of bacteria and can make partners’ oral microbiomes more similar. These datasets, although smaller, provide high‑resolution timepoint data.
Infertile couples/metagenomic projects: the Microbiome of Infertile Couples cohort sequenced vaginal swabs, semen, penile swabs and follicular fluid from couples undergoing IVF. This dataset contains information on fertilisation success and can be used to investigate genital microbiome convergence and its effect on fertility outcomes.
Any pregnancy cohorts with paternal sampling: search for birth‑cohort studies that collected paternal stool or oral samples along with maternal and infant samples. Some birth cohorts measure paternal microbiome to study environmental influences on child health.
Confirm availability of partner identifiers. Many datasets mark partners through shared household IDs or “spouse” labels. Ensure that each sample can be linked to a specific individual and body site, and that partner relationships are clearly annotated (e.g., partner ID, date of sample, site).
Capture health and reproductive metadata. In addition to microbiome data, extract information on fertility status, pregnancy outcomes (e.g., time‑to‑pregnancy, miscarriages, preterm birth), hormonal assays, semen quality, and child health measures (birth weight, growth, allergies). Combine these variables into a unified metadata file to enable cross‑subject and cross‑couple analyses.
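The partner-identifier check described in this step can be sketched in a few lines. This is a minimal illustration, not part of the protocol's own tooling: the field names (`sample_id`, `individual_id`, `partner_id`, `site`) and the toy records are assumptions made for the example.

```python
# Hypothetical unified metadata: one record per sample (all field names are assumptions).
samples = [
    {"sample_id": "S1", "individual_id": "A", "partner_id": "B", "site": "gut"},
    {"sample_id": "S2", "individual_id": "B", "partner_id": "A", "site": "gut"},
    {"sample_id": "S3", "individual_id": "C", "partner_id": "D", "site": "oral"},
    {"sample_id": "S4", "individual_id": "D", "partner_id": None, "site": "oral"},
]


def check_partner_links(rows):
    """Return (couples, problems): reciprocal partner pairs and a list of issues found."""
    partner_of = {}
    problems = []
    for row in rows:
        ind, partner = row["individual_id"], row["partner_id"]
        if not row.get("site"):
            problems.append(f"{row['sample_id']}: missing body site")
        # The same individual must not be annotated with two different partners.
        if ind in partner_of and partner_of[ind] != partner:
            problems.append(f"{ind}: conflicting partner IDs")
        partner_of[ind] = partner

    couples = set()
    for ind, partner in partner_of.items():
        if partner is None:
            continue
        # Keep a couple only when the annotation is reciprocal.
        if partner_of.get(partner) == ind:
            couples.add(tuple(sorted((ind, partner))))
        else:
            problems.append(f"{ind}: partner {partner} does not link back")
    return sorted(couples), problems


couples, problems = check_partner_links(samples)
print(couples)
print(problems)
```

Running a check like this before any dyadic analysis makes it explicit which couples are usable and which annotations need manual review.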
Step 2 - Data harmonisation and preprocessing
Unify metadata across datasets. Create a relational database where each individual has a unique ID, a partner ID (if applicable), and links to all sample IDs across sites and timepoints. Include demographic variables (age, BMI, ethnicity), relationship variables (years cohabiting, relationship quality), and health outcomes. Use controlled vocabularies for body sites (e.g., gut, stool, rectal, oral, saliva, tongue, skin, vaginal, penile, semen). Flag samples taken at the same timepoint for both partners to facilitate paired analyses.
Standardise sequence processing by data type:
16S amplicon data: process all raw reads through a consistent pipeline as described earlier (QIIME 2 denoising with DADA2 or Deblur, taxonomy assignment with SILVA). For cross‑dataset comparability, consider re‑training the taxonomic classifier on the specific 16S region (e.g., V3-V4) and trimming reads to the same length. Export ASV abundance tables and taxonomy.
Shotgun metagenomic data: perform quality control with fastp and host depletion with KneadData. Use MetaPhlAn 4 for species profiling and HUMAnN 3 for functional pathways, using the same version of the database across datasets. For strain resolution, gather species of interest (e.g., Bifidobacterium, Bacteroides, Gardnerella) and use tools like StrainPhlAn or inStrain to call strains. Because couples often share environment and diet, set stringent thresholds for declaring “shared strains” (e.g., ≥99.999 % average nucleotide identity and ≥25 % genome coverage [14]) to reduce false positives. Retain intermediate files (mapped reads, marker sequences) to enable downstream analysis.
Address batch effects. Integrate data from multiple studies by transforming abundances into compositions (e.g., centred log‑ratio) and applying batch‑correction methods (ComBat, cross‑dataset scaling). Use internal controls (mock communities, negative controls) if available to calibrate across projects.
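As an illustration of the centred log-ratio (CLR) transform mentioned above, the following sketch transforms one sample's counts. The pseudocount of 0.5 used to handle zeros is an assumption of this example, not a value specified by the protocol; other zero-replacement strategies are equally defensible.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of one sample's feature counts.

    The pseudocount (an assumption of this sketch) avoids log(0) for absent features.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log is equivalent to dividing by the geometric mean.
    return [value - mean_log for value in logs]


sample = [120, 30, 0, 50]  # toy counts for four taxa
clr_values = clr(sample)
print([round(v, 3) for v in clr_values])
```

CLR values sum to zero within each sample, and Euclidean distance between CLR-transformed samples is the Aitchison distance used in Step 3.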
Step 3 - Pairwise similarity and diversity analyses
Calculate within‑couple vs. between‑couple diversity. For each body site, compute alpha diversity (Shannon, Simpson) and beta diversity (Bray-Curtis, Aitchison distances). Use paired statistical tests to compare partners’ microbial richness and diversity. Prior work showed that married individuals harbour microbial communities of greater diversity than those living alone [3]; check if this holds across your datasets.
Quantify pairwise similarity. For each site and dataset, calculate the dissimilarity between each individual and their partner and compare it to the distribution of dissimilarities for all non‑partner pairs. Use permutation tests: randomly pair individuals (while preserving dataset composition) 1000 times and compute the fraction of permutations where the random pair distance is lower than the observed partner distance. A low permutation p‑value indicates that partners are significantly more similar than expected by chance.
Assess relationship quality and cohabitation duration effects. If metadata include relationship quality (e.g., closeness ratings from the WLS) and years of cohabitation, model the association between these variables and microbiome similarity. The WLS study reported that couples reporting very close relationships had significantly more similar gut microbiota and greater diversity [3]; replicate this by regressing pairwise dissimilarity on relationship quality and adjusting for age, diet and other covariates. Use linear mixed models with random intercepts for dataset or household.
Cross‑body‑site comparisons. Within each individual, compute dissimilarities between body sites (e.g., gut-oral, gut-skin, gut-genital) and compare them to cross‑partner dissimilarities (e.g., husband gut vs. wife gut). This can reveal whether partners are more similar at certain sites than the sites are within the same person (contrasting intrapersonal vs. interpersonal variation). Prior evidence suggests that skin microbiomes of cohabiting partners are extremely similar, whereas gut microbiomes show moderate convergence; verifying this across datasets can yield insights into transmission routes.
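The partner versus non-partner permutation test from this step can be sketched as follows. The sketch assumes a single dataset and body site, uses the mean within-couple Bray-Curtis dissimilarity as the test statistic, and adds one to both numerator and denominator of the p-value so it is never exactly zero. These choices, the function names, and the toy profiles are assumptions of the example.

```python
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def partner_permutation_test(profiles, couples, n_perm=1000, seed=0):
    """Compare the mean partner distance with that of randomly re-paired individuals."""
    rng = random.Random(seed)
    firsts = [a for a, _ in couples]
    seconds = [b for _, b in couples]

    def mean_distance(pairs):
        return sum(bray_curtis(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    observed = mean_distance(list(zip(firsts, seconds)))
    hits = 0
    for _ in range(n_perm):
        shuffled = seconds[:]
        rng.shuffle(shuffled)  # break the true partner assignment
        if mean_distance(list(zip(firsts, shuffled))) <= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Toy abundance profiles for four couples (made-up numbers).
profiles = {
    "A1": [50, 30, 20, 0], "A2": [48, 32, 20, 0],
    "B1": [0, 10, 40, 50], "B2": [0, 12, 38, 50],
    "C1": [80, 0, 0, 20], "C2": [78, 0, 2, 20],
    "D1": [10, 70, 10, 10], "D2": [12, 68, 10, 10],
}
couples = [("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2")]
observed, p_value = partner_permutation_test(profiles, couples)
print(observed, p_value)
```

When several datasets are pooled, the shuffle would need to be restricted to individuals from the same dataset so that study-level batch differences do not inflate partner similarity.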
Step 4 - Strain‑level transmission analysis
Reconstruct species‑specific phylogenies. For each species of interest (e.g., Bifidobacterium longum, Bacteroides ovatus, Gardnerella vaginalis, Prevotella spp.), run StrainPhlAn across all samples. This will align clade‑specific marker genes and build a maximum‑likelihood tree. Use pairwise patristic distance to determine if a partner pair shares the same strain (very small branch length) versus distinct strains.
Compute genome‑wide similarity. Use inStrain on shotgun data mapped to reference genomes or metagenome‑assembled genomes. For each pair of samples, compute average nucleotide identity (ANI) and the breadth of genome coverage, then apply an identity threshold. Use the criteria from the cautionary article [14]: require ≥25 % genome coverage and ≥99.999 % ANI to define a transmission event. Note that cohabitation alone can produce high strain sharing due to shared diet or environment [14]; thus, compare partner strain sharing to background sharing among non‑partners.
Estimate strain‑sharing rates. For each couple, calculate the proportion of species present in both partners for which they share the same strain. Summarise across couples to obtain median strain‑sharing rates. In the person‑to‑person transmission study, cohabiting partners shared about 12 % of gut strains and 32 % of oral strains [5]; replicate this analysis on your data and test whether rates are higher in couples with longer cohabitation or greater intimacy.
Evaluate directionality. In longitudinal datasets (if available), identify the timepoint at which a strain first appears in one partner and later in the other. Assign potential directionality (e.g., wife→husband or husband→wife) based on temporal precedence. Use caution: retention and re‑introduction can make direction inference uncertain [14].
Assess cross‑household contamination. Calculate strain sharing among unrelated individuals within the same dataset (e.g., same neighbourhood or village). This provides a null distribution; the transmission landscape study found that non‑cohabiting adults shared almost no strains. If couples show significantly higher sharing than this baseline, it supports partner‑mediated transmission.
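A per-pair strain-sharing rate, using the thresholds stated in this step, might be computed as in the sketch below. The input layout (species mapped to an `(ani, coverage)` tuple) is a hypothetical summary format chosen for the example and does not correspond to the output columns of any particular tool. The species names and values are made up.

```python
ANI_MIN = 0.99999    # >= 99.999 % average nucleotide identity, as stated in the protocol
COVERAGE_MIN = 0.25  # >= 25 % genome coverage, as stated in the protocol


def strain_sharing_rate(comparisons):
    """Fraction of species present in both samples for which the same strain is called.

    `comparisons` maps species name -> (ani, coverage) for one pair of samples
    (hypothetical layout). Returns None when the pair has no species in common.
    """
    if not comparisons:
        return None
    shared = sum(
        1
        for ani, coverage in comparisons.values()
        if ani >= ANI_MIN and coverage >= COVERAGE_MIN
    )
    return shared / len(comparisons)


# Toy pairwise comparisons (made-up values).
partner_pair = {
    "Bifidobacterium longum": (0.999995, 0.60),        # passes both thresholds
    "Bacteroides ovatus": (0.9992, 0.80),              # ANI too low
    "Prevotella copri": (0.999999, 0.10),              # coverage too low
    "Faecalibacterium prausnitzii": (1.0, 0.40),       # passes both thresholds
}
unrelated_pair = {
    "Bifidobacterium longum": (0.9981, 0.55),
    "Bacteroides ovatus": (0.9975, 0.70),
}

partner_rate = strain_sharing_rate(partner_pair)
background_rate = strain_sharing_rate(unrelated_pair)
print(partner_rate, background_rate)
```

Here species that fail the coverage threshold stay in the denominator, which is a conservative choice; excluding them instead would be a reasonable alternative as long as the same rule is applied to partner and background pairs.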
Step 5 - Functional and network analyses
Functional convergence. From HUMAnN 3 pathway profiles, compute functional similarity between partners using Bray-Curtis distance. Test whether partners share similar functional capacities despite differences in species composition. Explore whether couples with high functional convergence have better metabolic or reproductive health (e.g., lower inflammation or higher semen quality). Build mixed models where partner pair functional similarity is the predictor and health outcomes (e.g., IVF success, pregnancy duration) are the response.
Cross‑site and cross‑partner correlation networks. Construct correlation matrices of taxa abundances between body sites and between partners. For example, correlate the husband’s gut abundance of Bifidobacterium with the wife’s vaginal abundance of Lactobacillus. Use non‑parametric correlations (Spearman) and correct for compositionality using methods like SparCC. Visualise networks with nodes representing taxa in each partner and site, and edges representing significant correlations. Identify taxa that act as “bridge” species connecting partners across sites. Determine whether certain microbes frequently co‑occur across partners and sites (e.g., Bacteroides shared in a gut-oral network). Such patterns might hint at shared sources (diet, environment) or direct transfer.
Transmission network reconstruction. Build a bipartite network of individuals and strains/species. For strains shared between partners, draw edges connecting the two individuals. Compute network metrics such as degree (number of shared strains) and clustering coefficient (propensity to form triads, e.g., partners sharing strains with their children or pets). Compare networks across different relationship types (partners vs. siblings) to see if partners form tighter clusters, as observed in the transmission landscape study where partners had high strain‑sharing rates.
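The transmission network described above can be illustrated by projecting the person-strain relationships onto persons and computing simple metrics. This is a sketch under stated assumptions: the strain labels and individuals are invented, and the weighted degree here is the sum of shared-strain counts over all of a person's contacts.

```python
from itertools import combinations

# Hypothetical input: individual -> set of strain identifiers (labels are made up).
strains = {
    "husband": {"s1", "s2", "s3"},
    "wife": {"s1", "s2", "s4"},
    "child": {"s2", "s4"},
    "neighbour": {"s9"},
}


def build_sharing_graph(strains_by_person):
    """Project the person-strain bipartite graph onto persons.

    Returns {person: {other_person: number_of_shared_strains}}.
    """
    graph = {person: {} for person in strains_by_person}
    for a, b in combinations(strains_by_person, 2):
        n_shared = len(strains_by_person[a] & strains_by_person[b])
        if n_shared:
            graph[a][b] = n_shared
            graph[b][a] = n_shared
    return graph


def local_clustering(graph, node):
    """Fraction of a node's neighbour pairs that are themselves connected (triads)."""
    neighbours = list(graph[node])
    if len(neighbours) < 2:
        return 0.0
    possible = len(neighbours) * (len(neighbours) - 1) / 2
    closed = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return closed / possible


graph = build_sharing_graph(strains)
# Degree: number of people a person shares at least one strain with.
degree = {person: len(nbrs) for person, nbrs in graph.items()}
# Weighted degree: shared-strain counts summed over all contacts.
weighted_degree = {person: sum(nbrs.values()) for person, nbrs in graph.items()}
print(degree)
print(weighted_degree)
print(local_clustering(graph, "husband"))
```

For real cohorts a graph library would be more convenient, but the plain-dictionary version makes the definition of each metric explicit.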
Step 6 - Association analyses with health and reproductive outcomes
Couple‑level outcome modelling. Create outcome variables at the couple level (e.g., time to conception, live birth, number of miscarriages, partner’s metabolic health score). Summarise microbiome features for each individual (diversity, specific taxa abundances, strain‑sharing count) and compute pairwise measures (difference or similarity). Fit regression models (linear, logistic, Cox proportional hazards) with the couple‑level outcome as the response and microbiome features as predictors. Include covariates such as age, BMI, smoking, diet, cohabitation duration and relationship quality. Use random intercepts for dataset or study. For example, test whether a high rate of vaginal-penile strain sharing increases the risk of BV recurrence or infertility.
Cross‑partner influence modelling. Use dyadic data analysis techniques (Actor-Partner Interdependence Models) to account for the fact that each partner’s microbiome may influence not only their own health (actor effect) but also their partner’s health (partner effect). Examine whether the husband’s gut microbiome diversity predicts the wife’s pregnancy outcome after controlling for her own microbiome. APIM models can partition direct and cross‑partner effects and include random dyad intercepts.
Mediation and moderation analyses. Explore whether the relationship between partner microbiome similarity and health outcomes is mediated by intermediate variables (e.g., immune markers, hormonal levels) or moderated by factors like physical intimacy frequency or shared diet. This can be done using structural equation modelling or causal mediation analysis.
Machine learning for pattern discovery. Train classification models (e.g., random forests, gradient boosting) to distinguish successful vs. unsuccessful pregnancies or high vs. low semen quality based on combined microbiome features from both partners. Use cross‑validation within each dataset and test generalisability across datasets. Identify the most informative taxa, pathways or strain‑sharing metrics driving model performance.
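To test generalisability across datasets as described above, one option is a leave-one-dataset-out split in which the couple is the unit of splitting, so both partners always land on the same side. The sketch below only generates the splits; model fitting is left out, and the record fields (`couple_id`, `dataset`) and study names are assumptions of the example.

```python
def leave_one_dataset_out(couples):
    """Yield (held_out_dataset, train_ids, test_ids), holding out each dataset in turn."""
    datasets = sorted({couple["dataset"] for couple in couples})
    for held_out in datasets:
        train = [c["couple_id"] for c in couples if c["dataset"] != held_out]
        test = [c["couple_id"] for c in couples if c["dataset"] == held_out]
        yield held_out, train, test


# Toy couple-level records (hypothetical identifiers).
couples = [
    {"couple_id": "c1", "dataset": "study_A"},
    {"couple_id": "c2", "dataset": "study_A"},
    {"couple_id": "c3", "dataset": "study_B"},
    {"couple_id": "c4", "dataset": "study_C"},
]

splits = list(leave_one_dataset_out(couples))
for held_out, train, test in splits:
    print(held_out, train, test)
```

A model that performs well only on held-in datasets is more likely to have learned study-specific batch signal than a transferable microbiome pattern.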
Step 7 - Co‑evolutionary and cross‑generational analysis
Evolutionary perspective on microbial sharing. Investigate whether microbes that are efficiently transmitted between partners (high strain‑sharing rates) are enriched for functions that could benefit reproduction or offspring health. Compare the functional profiles of shared vs. non‑shared strains and test whether shared strains include beneficial taxa like Bifidobacterium (important for infant gut colonisation) or pro‑inflammatory genera. Evaluate whether shared strains have genetic signatures of host adaptation.
Intergenerational transmission. If datasets include infants born to the couples, compare infants’ microbiomes with those of both parents. Determine the proportion of infant strains that match those of the mother vs. father, and assess whether paternal microbes contribute significantly to the infant microbiome (via skin or gut). In the transmission landscape study, maternal-infant pairs shared ~34 % of strains while partners shared ~12 %, highlighting how parental roles differ. Test whether paternal contribution is higher for species that are highly transmitted between partners (e.g., oral microbes via kissing).
Life‑course perspective. Use longitudinal data (if available) to track how partner microbiomes converge over time. Model the rate of convergence and relate it to lifestyle changes (e.g., cohabitation start date, pregnancy, dietary changes). Evaluate whether couples that maintain stable microbiome similarity have different health trajectories than those whose microbiomes diverge.
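For the intergenerational comparison in this step, the share of an infant's strains attributable to each parent can be tallied with set operations. This sketch assumes strain calls have already been reduced to comparable identifiers per individual; the identifiers below are invented.

```python
def attribute_infant_strains(infant, mother, father):
    """Split an infant's strains by which parent(s) carry the same strain.

    Returns fractions of the infant's strains, or None if the infant has none.
    """
    if not infant:
        return None
    n = len(infant)
    return {
        "mother_only": len((infant & mother) - father) / n,
        "father_only": len((infant & father) - mother) / n,
        "both_parents": len(infant & mother & father) / n,
        "neither": len(infant - mother - father) / n,
    }


# Toy strain sets (made-up identifiers).
infant = {"bl_1", "bb_2", "ec_7", "sa_3", "vp_5"}
mother = {"bl_1", "bb_2", "ec_7", "lc_4"}
father = {"ec_7", "sa_3", "sm_8"}

fractions = attribute_infant_strains(infant, mother, father)
print(fractions)
```

Strains carried by both parents cannot be attributed to either one from a single timepoint, which is why they are reported as a separate category rather than split between mother and father.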
Step 8 - Reproducibility and reporting
Open science practices. Document all code and analyses in Jupyter notebooks or R scripts and version control them. Provide a project metadata file describing each dataset, sample counts, body sites, sequencing technologies, and partner IDs.
Data sharing. Because the study relies on public data, cite all original datasets and abide by their usage terms. For derived results (e.g., strain‑sharing networks), deposit summary tables and network files in an open repository (Zenodo, figshare) with appropriate attribution.
Transparent limitations. Discuss the limitations of the data and analyses, including heterogeneity of sampling protocols, small sample sizes for certain relationship types, and confounding by shared environment and diet. Acknowledge that high strain‑sharing rates alone do not prove direct person‑to‑person transmission; instead, interpret results cautiously and consider complementary evidence (temporal precedence, cohabitation duration). Clarify that associations do not necessarily imply causation.
By following this comprehensive workflow, researchers can leverage publicly available datasets to explore how cohabitation and intimate relationships shape the microbiome across multiple body sites, identify transmissible microbes and their functions, and examine how these microbial dynamics relate to fertility, pregnancy, and family health. Such analyses deepen our understanding of the “social microbiome” and its potential role in human evolution and health.
Expected Findings and Significance
By executing this protocol, we expect to solidify several observations and potentially uncover new insights about couples’ microbiomes:
Extent of Microbiome Convergence: We anticipate confirming that cohabiting couples have a measurable increase in microbiome similarity across multiple body sites. This will be quantified in terms of shared species and strains. Prior work suggests spouses share ~25-50% of their gut microbial species, a higher fraction than two random people [13]. We expect to see that in our aggregated data, and to refine that estimate (e.g., does the shared fraction increase with each year of cohabitation?). We also expect to show that this convergence is site-dependent: skin microbiomes may converge the most [2], gut microbiomes moderately [3], and oral microbiomes perhaps only if high intimacy is present [4]. Genital microbiomes might converge least strongly (apart from transmitting infections), as suggested by limited partner overlap in the infertile couples study [9].
Microbial Transmission Networks: We foresee identifying specific bacteria that are frequently exchanged between partners. We might find that common gut commensals like Bacteroides, Bifidobacterium, or Faecalibacterium strains are often shared, supporting the idea that household members constantly swap these microbes [5]. On skin, we expect shared Staphylococcus strains, etc., and in the oral cavity, shared Streptococcus and anaerobes due to kissing. If our analysis is successful, we will be able to list particular strain clusters that show evidence of transmission, essentially cataloguing part of the “person-to-person transmissible microbiome.” This has evolutionary implications: microbes that can move between hosts may have co-evolved with human social behavior to spread within families or communities.
Links to Health Outcomes: While exploratory, we hope to find suggestive associations that generate new hypotheses. Some expected outcomes include:
Couples who both have a “healthy” microbiome profile (e.g. high fiber-degrading bacteria, high diversity) may report better general health metrics and possibly higher fertility. There is evidence that gut microbiota can modulate systemic inflammation and even hormone levels [11]. If both partners have favorable microbiomes, perhaps their combined metabolic health is improved, potentially aiding fertility.
Conversely, couples sharing a pathogenic or dysbiotic microbiome might show shared health burdens. If one partner carries an opportunistic pathogen like Campylobacter or E. coli with toxin genes, the other might also acquire it and both could have gut inflammation or diarrheal episodes. On a chronic level, partners who both have microbiomes indicative of dysbiosis (low diversity, high Proteobacteria, etc.) might have a harder time conceiving or could face pregnancy complications. We will look for any correlation between microbiome dysbiosis indices in couples and their history of miscarriages or IVF failure (if data allow).
Recurrent infections and interventions: As demonstrated with BV [6], one partner’s microbiome can reinstate an infection in the other. We expect our analysis to reinforce the need for dual-partner treatment approaches in conditions like BV, yeast infections, or even H. pylori (for which spouses often both need treatment). If possible, we will check our data for instances of both partners carrying the same potential pathogen or pathobiont.
Child microbiome and health: We may find that a child’s early microbiome more closely resembles the mother initially (due to birth), but over time the father’s microbes and the shared home microbes significantly contribute. If we have data of infants at 1 year old, we might detect bacteria that are not present in the mother but are present in the father or home environment, indicating horizontal acquisition. This could tie into known child health outcomes: e.g., high diversity exposure (both parents + maybe pet microbes) might correlate with lower allergy incidence [7]. While our focus is couples, including the downstream effect on children highlights the evolutionary and health significance of couple microbiome co-evolution.
If our analyses are sufficiently robust, we aim to propose a conceptual model of “couple microbiome co-evolution”. In this model, long-term partners would gradually align their microbiomes, and this alignment could confer certain adaptive advantages, such as synchronized microbial metabolisms that might stabilize household metabolic homeostasis, or shared benign microbes that outcompete pathogens (the idea that spouses might immunize each other by microbe exchange). On the flip side, this co-evolution model would also account for shared vulnerability, where an adverse microbiome change in one partner (due to antibiotics or illness) could perturb the other’s microbiome as well.
Reference