High Rate of Infection by Only Oncogenic Human Papillomavirus in Amerindians

The role of HPV type distribution on the disparity of cervical cancer (CC) incidence between human populations remains unknown. The incidence of CC in the Amazonas State of Venezuela is higher than the national average. In this study, we determined the diversity of known HPV types (the viral agent of CC) in Amerindian and mestizo women living in the Venezuelan Amazonas State. Understanding the ecological diversity of HPV in populations undergoing lifestyle transformations has important implication on public health measures for cervical cancer prevention.

sexual partners (Table 1 and Table S1). As expected, intestinal protozoa and helminthes were more prevalent in Amerindians than in mestizos ( Table 2 and Table S2), and there was a significant increase in Amerindian schooling, sexual contact with mestizo men, smoking, and reduction in parity (number of times a woman has given birth), with urbanization (Table 1 and Table S1). Amerindian women reported practicing only vaginal intercourse, while 28% of mestizo women reported additional practice of oral and/or anal sex.
The overall prevalence of cervical HPV in this study was 75% (74%, excluding cervical abnormalities; see below) and did not differ between urban groups (P Ͼ 0.05; Table 2,  Table S2, Fig. 2a, and Fig. S2a). There was a median of 1 to 2 HPV types per woman (Table 3 and Table S3; not different between groups; P Ͼ 0.05), and the differences in the frequency of single or multiple HPV infections were not significant between groups (P Ͼ 0.124; Table 2 and Table S2). In Amerindians, but not in mestizos, the prevalence of infections by exclusively high-risk HPVs was higher than infections with exclusively low-risk HPVs or with both HPV risk types (P ϭ 0.007; Fig. 2b and Fig. S2b).
A total of 23 HPV types were detected, of which 22 were from the cervix (Table S3). Alpha diversity was significantly higher in mestizos than in Amerindians from the FIG 1 Diagram of geographic locations included in this study. Sampling was performed at eight locations with different urbanization levels: five locations with low urbanization level (green), one location with medium urbanization level (orange), and two locations with high urbanization (blue). Distances to the urban town were 150 to 210 km (by road and river) for the medium and low urban-level communities. Most communities can be reached only by river; however, some low-level urban communities can be accessed by 1 to 2 days of trekking through the forest. The medium urban level community is located 190 km from an urban location (130 km by river and 60 km by road). The two high urban level communities are located 8 km from each other. The map was generated using Quantum GIS Geographic Information System v. 2.18.14 (https://www.qgis.org/en/site/).  31.1  31.3  29.2  26.7     No prevalence differences were found among Amerindian groups (P ϭ 0.540 by 2 test) or between Amerindians from the high urban group and mestizos (P ϭ 1.000 by 2 test). Unlike mestizos, Amerindian women showed higher prevalence of only high-risk HPV types in relation to low-risk HPV or both types (P ϭ 0.007 in the log linear model). The circles represent mean prevalence, and the bars show 95% confidence intervals (95% CIs). Prevalence that is statistically significantly (P Ͻ 0.050) different is indicated by a bar and asterisk. (c) Shannon diversity (Hill number q ϭ 1) of cervical HPV by urban groups, based on a rarefied/extrapolated sample size of 28 women. Amerindians for low and medium urban groups were significantly less diverse than mestizos. There was a nonsignificant tendency to increasing HPV diversity with urbanization. The solid line curve fraction (interpolation) corresponds to the actual number of women sampled. The dashed line corresponds to the estimated diversity (extrapolation). Curved shaded areas represent the 95% CIs estimated from the bootstrap (50 replications). Significant differences are reached when 95% CIs do not overlap. Different letters indicate significant differences. (d) Beta diversity analysis by urban groups. Median distance to the centroid using Sorensen dissimilarity index. No difference among or within a group's dispersion was observed (P Ͼ 0.05, PERMANOVA and permutation test for homogeneity of multivariate dispersions). (e) Heat map of prevalence of cervical HPV types. HPV18 and HPV39 of the ␣7 family showed the highest relative proportions. HPV L1 region sequences were used to generate a maximum likelihood tree rooted with theta HPV type (not shown). HPV families and their relative proportions (as a percentage; among only HPV-positive samples) are shown on the right. HPV68 and HPV73 were excluded from the tree, since the LiPA25 kit does not discriminate between these two types.

Urban Group (N subject) Ethnicity
lowest urban levels (based on subject-based groups in Fig. 2c, Table 3, and Table S4; community-based classification in Table S3 and Fig. S2c). The group differences in alpha diversity were mainly due to relative abundance rather than richness of HPV types (Table 3 and Tables S3 and S4). Beta diversity analysis using Sorensen dissimilarity index showed a nonsignificant tendency of increasing with urbanization (Fig. 2d, Fig. S2d, Table 3, and Table S3). For both classification approaches (subject-and communitybased urban groups), a hierarchical tree showed that mestizos segregated from Amerindian low and medium urban groups (see the data posted at https://figshare.com/ s/9bffb3ea746016f78b4e).
The most common HPV family was ␣7, followed by ␣9. HPV18 and HPV39 were the most prevalent cervical types ( Fig. 2e and Fig. S2e). Only six cervical HPVs, all of them high-risk types, were shared among women in the four groups (Fig. S3). A comparative analysis of body site HPV distribution in 16 women with at least one body site positive for HPV showed 15 viral types in the cervix (14 women), 6 in the introitus (10 women), 4 anal (7 women), and 7 oral (6 women) ( Table S5). The highest HPV prevalence and diversity was found in the cervix (P Ͻ 0.050; Table S5), and cooccurrence of any high-risk HPV or HPV18 in different body sites was low (Cohen's kappa coefficients of Յ0.26 and Յ0.37, respectively; Table S6).
Nine women presented cervical abnormalities, and they were all Amerindians with mostly high-risk HPV infections (Table 2, Table S2, and Table S7; see the data posted at https://doi.org/10.6084/m9.figshare.5579299.v1). The presence of cervical lesions in these women did not significantly change HPV diversity.

DISCUSSION
Urban groups segregated better using subject-based rather than community-based metrics of classification, likely because villages in transition are heterogeneous in the lifestyles of their individuals. Interestingly, the Amerindian women who lived in the town did not reach the high urban scores of mestizo women, reflecting a certain level of attachment to their traditional lifestyles.
Cervical HPV prevalence in this study is similar to that reported in high-risk populations (20,21) and higher than in other reports that used the same detection method, in Latin America (37 to 51%) (20,22), Europe (Ͻ20%) (23), and Japan (Ͻ20%) (24). This disparity in prevalence may be due to the high prevalence of anemia and intestinal helminthes, which may reduce HPV clearance (25), degree of isolation from the global HPV pool, etc., while for industrialized countries, vaccination is significantly reducing HPV infection (26,27). The results of more isolated Amerindians having lower HPV diversity than mestizos confirmed our hypothesis and are consistent not only with the Amerindian's higher isolation from the global viral pool but also with their lower genetic diversity. Amerindians descend from Asians who migrated east from Africa in successive genetic bottlenecks, thus only a fraction of the population-and gene pools-advanced (19,28). Across the urbanization gradient, Amerindians become more exposed to mestizos and increase their genetic diversity (mestizaje) as well as their exposure to the global viral pool. However, a previous study in isolated Yanomamis of Brazil (15) reported higher HPV diversity than in more urbanized Macuxi and Wapishana Amerindians. This contradiction might be affected by the HPV detection methods used or by the degree of real isolation of the studied populations. In this study, we used a sensitive hybridization method that recognizes 25 HPV types (29, 30) based on L1 gene, the most conserved region in the HPV genome (31). The sensitivity of the detection method could decrease if there was divergence of the HPV during the isolation of Amerindian groups in the last 12,000 to 24,000 years. However, the probability of new diversity seems low, based on the estimated 200,000 years of evolution for intratypic variation of HPV18 (32). The question of novel variants in Amerindians is beyond the scope of the present study that aimed at characterizing the known HPV types, but future metagenomic studies should address this important question.
In relation to the presence of HPV in multiple body sites, our study shows 33% oral HPV (which is higher than in other reports using the same detection method; e.g., 1.6% in Costa Rica [33]) and 30% anal prevalence (similar to that in other reports [22]). There was low cooccurrence of specific HPV types in different body sites, which might result from epithelial tropism (34,35) or site-related clearance (36) and may depend on sexual practices, such as nonvaginal sex (37), that are uncommon in our studied population. However, there can be extrasexual HPV transmission, such as self-transmitted to different body sites, or mother-child vertical transmission (38). The fact that the introitus site showed lower HPV prevalence than cervix (24 versus 75%, respectively) has implications when self-sampling is used for sample collection in population cervical HPV screenings.
Understanding the causes underlying the high incidence of CC in Amerindians is of crucial importance for decisions in public health interventions. While the same virulent types circulate among Amerindian and mestizo women, Amerindians showed higher prevalence of infections by the virulent types than infection by low-risk types or both. Amerindians in this study did show high HPV18 and HPV16, common virulent types in other human groups, but they also had high prevalence of a rare high-risk HPV type of the ␣7 family, HPV39, consistent with reports for Amerindians in the northern United States (39) and Central and South America (40). Its prevalence in this study shows a nonsignificant trend to decrease with urbanization. Regrettably, contemporary HPV vaccines do not include this virulent HPV39 highly prevalent in these populations.
That cervical abnormalities were found only in Amerindians, consistent with the epidemiological evidence of high CC incidence in this human group (12), suggests that infections by only oncogenic HPVs increase the risk of cervical abnormalities; this was reported before for squamous CC (41). Amerindian genetic variations in the immunerelevant HLA-B locus may also increase their susceptibility to colonization by oncogenic types (42,43). A high prevalence of only oncogenic HPV infections is consistent with the more efficient clearance of low-risk HPVs in relation to high-risk HPVs, which evade immune clearance, producing low virion yields (44,45), and thus, the factors that sustain the coexistence of different HPV risk types in mestizos are unclear. Coexistence of high-and low-risk HPVs has been associated with higher sexual partner turnover (46), although we did not find differences in the reported number of sexual partners. Definitely, more studies are needed to clarify the relative contribution of lifestyle and host genetic factors to the type of HPV infection and health risks. The results of this study are consistent with the association between high-risk HPVs and increased inflammation and risks of cervical lesions (41,47), and this is particularly serious in regions with precarious or nonexistent health services (48). Finally, the elimination of high-risk HPV types with the current vaccines is a promising scenario to reduce the dramatically high CC mortality in Venezuelan Amerindians. Studies that follow up the effects of the vaccines on the circulating HPV diversity, using metagenomic approaches (15) will be important for monitoring the evolution of HPV type virulence.

MATERIALS AND METHODS
Experimental design. This study included young adult, nonpregnant, healthy women from the Venezuelan Amazon. The women were from the following two groups: Piaroa Amerindian from villages in a spectrum of urbanization (from traditional to urban lifestyles) or urban mestizo. All experimental protocols were approved by SA Centro Amazónico de Investigación y Control de Enfermedades Tropicales Simón, Bolívar, Venezuela (SACAICET, IRB 78-2014), and University of Puerto Rico (IRB 1314-163).
Inclusion criteria. Women included in the study belonged to eight different villages in northern Amazonas State, Venezuela: one urban town, Puerto Ayacucho (state capital), one village in the periurban area, and six villages at the Orinoco Basin on the Sipapo River, Autana River, and Cuao River (Fig. 1). A total of 228 sexually active women attending a health evaluation were invited to participate, and 111 (82 women who self-identified as Amerindians with Piaroa ethnicity, appeared to be Piaroa Amerindians, and also spoke Piaroa language and 29 urban mestizos) aged 12 to 53 years were included in the study. We had received prior approval from the captain/leader to visit the villages. Informed consent was obtained from all participants and/or their legal guardians. Parental consent was requested for women less than 18 years old. Inclusion criteria included women at reproductive age who at the time of recruitment had none of the following: pregnancy, menses, bleeding in the last 24 h, sexually transmitted infection diagnosed in the last 2 months, antibiotics in the last month, vaginal douches in the last 24 h, sexual intercourse in the last 24 h, hysterectomy, diabetes, urinary incontinence, urinary tract infections, and HIV. Individuals excluded from the study (n Ͻ/itϾϭ 117) were mostly due to recent exposure to antibiotics or antiparasitic drugs (28%), menses (25%), postmenopausal (13%), pregnant (12%), urinary infections (8%), refusing to participate (4%), sexual contact in the last 24 h (3%), hysterectomy (2%), belonging to a different ethnicity (1%), diabetes (1%), and HIV (1%).
Surveys and urban classification. Each woman received two urbanization indices, one based on her individual exposure to urban practices (subject-based index) and another on her community urban level (community-based index) (see data posted at https://doi.org/10.6084/m9.figshare.5579299.v1). Subjectbased surveys included education, identification document (ID) possession, purchasing power, preservation of traditional practices, frequency of mobility to urbanized towns, level of environmental exposure (drinking water treatment, use of shoes, etc.), use and acceptance of Western medicine, and level of adoption of nontraditional diets (see data posted at https://doi.org/10.6084/m9.figshare.5579299.v1). Community-based urbanization survey included access to health, urban services (electricity, telephone, gas, and water), political representation, education, salaries, and language command (Spanish-Piaroa). This village survey was completed with the community captains, schoolteachers, or health workers (see data posted at https://doi.org/10.6084/m9.figshare.5579299.v1).
Categorical variables of the urbanization surveys were transformed into numeric values ranked between 0 and 1, with 1 being the highest level of urbanization (also reflecting the loss of traditional practices). Each indicator component was equally weighted, and its values were averaged using arithmetic means. Community-based groups included 111 women, but subject-based groups included only 91 women due to missing data in the surveys. Urban groups had similar sample sizes (Table 1 and  see Table S1 in the supplemental material). Community urban indices were categorized in three levels: low (scores below 0.33; n ϭ 24 women), medium (scores of Ͼ0.33 and Ͻ0.66; n ϭ 28 women), and high (scores above 0.66; n ϭ 30 women). Subject-based urbanization groups were built first, sorting in ascending order individual women scores and then grouping them in tertiles: the first group corresponds to the low urban group (n ϭ 22 women; scores of 0.22 to 0.37), the second group corresponds to the medium urban group (n ϭ 22 women; scores of 0.40 to 0.55), and the third group corresponds to the high urban group (n ϭ 23 women; scores of 0.56 to 0.77). Mestizo women had a high urban level by both classification approaches (n ϭ 29 to 24, respectively) (scores of 0.70 to 0.93 for subject-based groups).
Clinical history, sexual behavior, contraceptive usage, and hygiene practices were also recorded in a separate clinical survey (see the data posted at https://doi.org/10.6084/m9.figshare.5579299.v1). Surveys were coded without personal identifiers.
Samples. Swabs were taken by specialized health personnel, from cervix/fornix (referred to as cervix in the text) (N ϭ 111), introitus (N ϭ 18), anal (N ϭ 18), and oral (N ϭ 18) sites. DNA was extracted using Power Soil DNA kit (Mo Bio Laboratories Inc.) according to the manufacturer's instructions. The main concern about HPV detection methods is obtaining false-negative results, usually after not being able to extract/detect viral DNA in an HPV-positive sample. The Power Soil method involves an aggressive bead beating step and allowing good extraction of the viral DNA. Cervical smears were performed by an obstetrician-gynecologist using an endocervical brush and spatula, and biopsy specimens were taken and treatment was provided if indicated. Papanicolaou's stain was performed for the cytological analysis. Results were reported according to Bethesda 2001 classification system. A drop of blood was taken from fingers for in situ hemoglobin (using Easylife rapid test in peripheral blood) to detect anemia according to the WHO limits (49). Sera were transported for HIV, syphilis, and hepatitis B and C detection, processed at the Public Health Center of Puerto Ayacucho, Amazonas State, Venezuela. Fecal samples were taken, preserved using iodine-formaldehyde, and microscopically analyzed for the presence of intestinal protozoa and helminthes by microscopic methods.
High-Risk HPVs in Amerindians HPV genotyping. The approach used in this study, the SPF10 assay that amplifies 60 different known HPV strains with high sensitivity (29,30) and hybridizes the SPF10 PCR product on the LiPA25, was limited to 25 of the most relevant and prevalent known genotypes. A reverse hybridization method SPF 10 -PCR-LiPA25 system, version 1 (Labo Biomedical Products, Rijswijk, The Netherlands, based on licensed Innogenetics technology) (50), was used to detect HPV and typing 25 of the most common mucosa HPV types (types 6, 11, 16, 18, 31, 33, 34, 35, 39, 40, 42, 43, 44, 45, 51, 52, 53, 54, 56, 58, 59, 66, 68/73, 70, and 74). Briefly, 65-bp biotinylated amplicons from the highly conserved L1 gene region were generated using SPF10 primers. Amplified fragments were hybridized with a strip with specific oligonucleotide probes for each of the 25 HPV types. Visualization was performed by adding streptavidin-conjugated alkaline phosphatase to the hybrids formed, yielding a dark precipitate in a particular strip area that determines the specific HPV type. Negative and positive controls were included. We confirmed results of the highly sensitive method for HPV detection using the SPF 10 primers (29,30), repeating a subsample of replicate swabs from 10 women. This is a study performed in a non-HPV-vaccinated population, since HPV vaccines have not been included in the national vaccination program in Venezuela.
Statistical analysis. Principal-component analysis (PCA) for the villages and for women based on their urbanization indicator values were performed with the ggfortify package (51) in R (52). To visualize the urban groups for both types of classification, 95% confidence interval (95% CI) ellipses were drawn for community-based and subject-based group distributions (Fig. S1). Mean comparisons among urban group scores were performed with analysis of variance (ANOVA) and Tukey's test as a posthoc test (Fig. S1). Correlations between village-and subject-based urban scores among all populations and only including Amerindians were evaluated by a linear regression (Fig. S1).
Association between prevalence of HPV types, having only a high-risk or low-risk type or both risk types, and comparisons among single and multiple types and among body sites, were performed using log linear models and the contrast package (53) in R version 3.3.2 (52) to compute comparisons of the estimated regression coefficient. Comparisons of means with normal distribution (verified by QQ plot) were performed with ANOVA and Tukey's test as a posthoc test, and between Amerindians from high urbanization group and mestizo group with Student's t test. Means with nonnormal distribution were compared using the Kruskal-Wallis test. Proportion comparisons among groups were performed with Pearson's chi-squared test ( 2 test) with Yates' continuity correction if needed or Fisher exact test for count data. P values were adjusted for multiple comparisons by the Holm method (54) ( Table 1 and  Tables S1 and S2). P values of Ͻ0.05 were considered statistically significant.
Phylogenetic trees were built based on the HPV L1 region from sequences obtained from the PaVE database (58); MAFFT was used for the nucleotide alignment (59). The maximum likelihood method was used in PhyMLb (60). The tree was rooted with theta HPV type (not shown in the figure) ( Fig. 2e and Fig.  S2). The hierarchical tree for HPV and urban groups or individual women was built using hclust function from the R base "stats" package by the Spearman method and was visualized together with a heat map plotted with heatmap.2 from gplots (61) and RColorBrewer (62) from R packages.
Alpha and gamma HPV diversity were measured using three of the most typical used Hill's family of diversity (63-65) numbers or the effective number of types, order (q) 0, 1, and 2 (HPV type observed richness, exponential of Shannon entropy index, and inverse of Simpson concentration index) integrating rarefaction (interpolation) and extrapolation curves following Hsieh et al. (66) approach using iNEXT package (66) in R 3.3.2 version (52). Hill number of order 0 (q ϭ 0) counts all HPV types present in each group, Hill number of order 1 (q ϭ 1) can be interpreted as the number of common HPV types per group, and Hill number of order 2 (q ϭ 2) can be interpreted as the number of dominant HPV types. Alpha diversities were compared using the nonasymptotic and asymptotic analysis for incidence type data. For the nonasymptotic analysis, we compared groups at the same sample size (sample size based) or at the same level of sampling coverage (sample coverage based). The latter measures the proportion of the total number of individuals that belong to the HPV type detected in the sample and has been shown to better evaluate the magnitude of the diversity differences among groups than the traditional sample size-based comparison (64). The asymptotic analysis allowed estimating diversity when the accumulation curves reach the asymptote guided by the Chao2 estimator. Comparisons among groups were performed at the extrapolated diversity values. The 95% CI was estimated from the bootstrap method based on 50 replications. Significant differences were reached when the 95% CIs among groups did not overlap (Fig. 2c, Fig. S2, Table 3, and Tables S3 to S5).
Beta diversity was analyzed with the vegan (67) package in R (52). Beta diversity was measured using the nonparametric permutational multivariate analysis of variance (PERMANOVA) (68) to compared variance between groups with the variance within groups. Sorensen dissimilarity index matrix was built with betadiver function. The model calculates a pseudo F ratio that is tested for significance based on 999 permutations. A more robust analysis for within group dispersion (variance) comparison was performed with the permutation test for homogeneity of multivariate dispersions (10). The betadisper function was used to reduce the distances to the principal coordinate. The method computes the F statistic to compare median distances-to-centroids of each group. P value was generated with the permutest function based on 999 permutations. The plot function was used for the principal-coordinate analysis visualization (Fig. 2d, Fig. S2, and Tables S3 and S4).