Global Transcriptome Analysis of Aedes aegypti Mosquitoes in Response to Zika Virus Infection

Vector-borne viruses pose great risks to human health. Zika virus has recently emerged as a global threat, rapidly expanding its distribution. Understanding the interactions of the virus with mosquito vectors at the molecular level is vital for devising new approaches in inhibiting virus transmission. In this study, we embarked on analyzing the transcriptional response of Aedes aegypti mosquitoes to Zika virus infection. Results showed large changes in both coding and long noncoding RNAs. Analysis of these genes showed similarities with other flaviviruses, including dengue virus, which is transmitted by the same mosquito vector. The outcomes provide a global picture of changes in the mosquito vector in response to Zika virus infection.

viruses. Zika virus (ZIKV) has been the most recent mosquito-borne virus to emerge. While it was first reported in 1952 from Uganda (1), the virus has spread rapidly across the Pacific and the Americas in the last 10 years with recent outbreaks in South America (2). The clinical symptoms are variable, ranging from no or mild symptoms to severe neurological disorders such as microcephaly in infants born from infected mothers or Guillain-Barré syndrome in adults (reviewed in references 2 and 3). The virus is mainly transmitted among humans by the bites of mosquito species of the genus Aedes, in particular Aedes aegypti, when it takes a blood meal from infected individuals. The virus first infects the midgut cells of the mosquito and then disseminates into other tissues, finally reaching the salivary glands, where it continues to replicate and is eventually transmitted to other human hosts upon subsequent blood feeding events (4).
It is thought that infection by flaviviruses does not cause any detrimental pathological effects on the mosquito vectors (5), reflecting evolutionary adaptations of the viruses with mosquitoes through intricate interactions, which involve optimal utilization of host factors for replication and avoidance of overt antiviral responses. However, a number of studies have shown major transcriptomic changes in the mosquito vectors in response to flavivirus infection. These changes suggest regulation of a wide range of host genes involved in classical immune pathways, RNA interference (RNAi), metabolism, energy production, and transport (6)(7)(8)(9)(10)(11)(12)(13). In addition, mosquito small and long noncoding RNAs have also been shown to change upon flavivirus infection (14,15).
Recently, we showed that the microRNA (miRNA) profile of A. aegypti mosquitoes is altered upon ZIKV infection at different time points following infection (16). Here, we describe the transcriptional response of A. aegypti whole mosquitoes to ZIKV infection at the same time points postinfection. Consistent with previous studies of other arboviruses, we found that the abundance of a large number of genes was altered following ZIKV infection.

RESULTS AND DISCUSSION
A. aegypti transcriptome RNA sequencing (RNA-Seq) data analysis. RNA-Seq using Illumina sequencing technology was performed on poly(A)-enriched RNAs extracted from ZIKV-infected and noninfected A. aegypti mosquitoes at 2, 7, and 14 days postinfection (dpi). Total numbers of clean paired reads varied between 43,486,502 and 60,486,566 reads per library among the 18 sequenced RNA samples. More than 96% of reads mapped to the host genome with around 80% of counted fragments mapped to gene regions and 20% to intergenic areas of the genome (see Table S1 in the supplemental material).
Principal-component analysis (PCA) of the RNA-Seq data at each time point distributed all biological replicates of ZIKV-infected and noninfected samples in two distinct groups, although the differences were more subtle at 2 days postinfection, in which one of the ZIKV-infected biological replicates was relatively close to the control group ( Fig. 1).
Analysis and comparison of mRNA expression profiles of A. aegypti mosquitoes at different time points following ZIKV infection revealed that in total 1,332 genes had changes of 2-fold or more in either direction ( Fig. 2; details in Table S2). Among the three time points, the highest number of changes occurred at 7 dpi with 944 genes showing alteration in their transcript levels. The numbers of genes altered at 2 and 14 dpi were very close, 298 and 303, respectively (Fig. 3). These trends were anticipated, as we expected to see lower gene expression alteration at 2 dpi and 14 dpi due to the low level of infection in the mosquito body at 2 dpi and advanced stages of virus replication at 14 dpi, while at 7 dpi the virus is still at its proliferative stage, infecting various tissues of the mosquito. In a previous study that explored the effect of dengue virus type 2 (DENV-2) on the A. aegypti transcriptome using RNA-Seq (11), the number of genes altered was the highest at 4 dpi (151, combining carcass and midgut), compared to 1 dpi, which showed the lowest number of changes (40 genes) followed by 14 dpi (82 genes).
Comparison of the transcriptome profiles showed 18 overlapping genes among the three time points (Fig. 3; listed in Table 1). Twelve of these common genes were depleted, and only six were enriched, which were angiotensin-converting enzyme (AAEL009310), serine-type endopeptidase (AAEL001693), phosphoglycerate dehydrogenase (AAEL005336), cysteine dioxygenase (AAEL007416), and two hypothetical proteins. To validate the analysis of the RNA-Seq data, we used reverse transcriptionquantitative PCR (RT-qPCR) analysis of the 18 genes. Overall, all expression values showed consistency between the two methods and had a positive linear correlation (Pearson correlation; day 2, R 2 ϭ 0.7097, P Ͻ 0.0001; day 7, R 2 ϭ 0.8793, P Ͻ 0.0001; day 14, R 2 ϭ 0.9184, P Ͻ 0.0001) (Fig. 4).
Differentially abundant transcripts and comparisons with other flaviviruses. When concentrating on genes with 10-fold differential expression and statistical significance relative to control mosquitoes, 101, 54, and 17 genes showed changes at 2, 7, and 14 dpi, respectively (Table S2, dark blue font). After removal of hypothetical proteins, those with known functions are listed in Table 2. Interestingly, while the total number of genes showing differential abundance was higher at 7 dpi (Fig. 3), more genes showed 10-fold or greater changes at 2 dpi than at 7 dpi (101 versus 54, respectively). At 2 dpi, transcripts of eight genes were enriched with a metalloproteinase (AAEL011539) showing a 56-fold increase in abundance, a serine protease (AAEL013298) increasing 22-fold, and two trypsins (AAEL007601 and AAEL013707) with 19-and 10-fold increases, respectively. We also saw that two phosphatidylethanolamine-binding proteins, two cubulin proteins, and a cysteine-rich venom protein were altered at this time point. However, most strikingly, we observed suppression of 14 odorant binding proteins at 2 dpi, with several of these transcripts being massively reduced (around 800-fold) ( Table 2). Furthermore, other odorant binding protein transcripts were enhanced (by 2-fold or greater) at 7 and 14 dpi (Fig. S1), indicating that ZIKV may have the capacity to alter the behavior of the mosquito, potentially suppressing host-seeking in early stages of the infection and encouraging host-seeking when the mosquito is infectious. Dengue virus is known to alter host-seeking behaviors and feeding efficiency (17,18), and microarray analysis of mosquitoes with salivary gland infections found several odorant binding protein transcripts that were enriched in this late stage of infection (14 dpi) (19). Similarly, there is evidence that malaria parasites suppress the host-seeking tendencies of the mosquito early in infection but encourage host-seeking at later stages when the mosquito can transmit the parasite (20)(21)(22). The transcription patterns that we observed here with ZIKV are consistent with these observations from dengue and malaria infection of mosquitoes, but further behavioral studies are required to confirm this intriguing finding.  At 7 dpi, 34 genes showed enrichment of 10-fold or more, including clip-domain serine proteases, defensins, transferrins, hexamerin, C-type lectin, and serine proteases, which are implicated in immune responses. At this time point, only seven genes were depleted. The number of genes that were differentially expressed by 10-fold or more at 14 dpi was small, with eight genes showing enrichment and eight genes showing depletion. The highest enrichment (212-fold) was steroid receptor RNA activator 1 (AAEL015052), while peritrophin, attacin, and superoxide dismutase were among the depleted genes (Table 2).
Previous studies have shown alteration of mRNA transcript levels in A. aegypti mosquitoes infected with DENV and a couple of other flaviviruses. Using microarray analysis, Colpitts et al. found that 76 genes showed 5-fold or greater changes in DENV-infected mosquitoes over 1, 2, and 7 dpi (13). Their study, which also included responses of A. aegypti to West Nile virus (WNV) and yellow fever virus (YFV), found that commonly 20 and 15 genes were differentially enriched and depleted, respectively, between the three flaviviruses at day 1 postinfection. Considering utilization of two different techniques in the work of Colpitts et al. (microarray) and in this study  (RNA-Seq) and differences between the time points chosen, proper comparison of changes in transcript levels and fold changes cannot be done. However, when we mapped all the differentially expressed genes (2-fold or more) from the work of Colpitts et al. against our data (Table S2), we found that 364 genes from our study showed differential expression at least at one time point that overlaps the other three viral infections (Table S3).
In a follow-up study using the data from the above study (1, 2, and 7 dpi) (13), Londono-Renteria et al. identified 20 top differentially regulated transcripts in YFV-, DENV-, and WNV-infected A. aegypti mosquitoes (23). Out of these 20 genes, five of them were also found to be changed in ZIKV-infected mosquitoes in our study. These were the cysteine-rich venom proteins (AAEL005098, AAEL005090, AAEL000379, and AAEL000356) by about 9-, 18-, 25-, and 150-fold depletion at 2 dpi, respectively, and an unknown protein (AAEL013122) by 390-fold depletion at 2 dpi. While pairwise comparison is not really possible between the two studies, comparing data from 2 dpi showed that AAEL005090 (in the case of DENV), AAEL005098 and AAEL000356 (in the case of YFV and WNV, respectively), and AAEL013122 (in the case of DENV) changed in the same direction as in ZIKV infection. Another study also found a number of cysteine-rich venom proteins altered upon DENV infection of A. aegypti mosquitoes (11). Cysteine-rich venom proteins are secretory proteins that are mostly found in the fluids of animal venoms acting on ion channels (24). Londono-Renteria et al. found that among the cysteine-rich venom proteins, only AAEL000379 was enriched in DENVinfected mosquitoes and the rest did not change noticeably. Silencing the gene led to an increase in replication of DENV (23). Alteration of the cysteine-rich venom proteins commonly found in the case of different flaviviruses indicates their possible importance in replication of these viruses. Further studies are required to determine the role that these proteins play in ZIKV-infected mosquitoes specifically.
In another study with DENV-2 and A. aegypti in which deep sequencing of carcass, midgut, and salivary glands with one replicate per pooled sample was used, transcript levels of infected and noninfected tissues were compared at 1, 4, and 14 dpi, which showed differential abundance of 397 genes (11). We reanalyzed the raw data from the study using the same pipeline as we used for our study. While comparative analysis of the study with ours cannot properly be made due to differences in the samples (tissues versus whole mosquitoes) and times postinfection, in total, we found 199 genes commonly altered between DENV-2 and ZIKV infections, some with the same directional change in expression (Table S4).
A number of immune-related genes were mostly enriched at 7 dpi in ZIKV-infected mosquitoes. Toll was enriched only at 7 dpi by 2-fold. Twelve leucine-rich immune proteins were mostly enriched at 7 dpi by 4-to 16-fold. Phenol oxidase (AAEL010919), which was not changed upon DENV infection, was depleted by 2-fold at 2 dpi but enriched by 8-to 9-fold at 7 and 14 dpi in ZIKV-infected mosquitoes. Components of the JAK/STAT pathway, such as Dome and Hop, were not induced in ZIKV-infected mosquitoes. Interestingly, induction of the JAK/STAT pathway specifically in the fat body of A. aegypti mosquitoes by overexpressing Dome or Hop did not lead to increased resistance to ZIKV infection (25). This result and the lack of induction of the pathway in our study suggest that the JAK/STAT pathway may not be involved in ZIKV-mosquito interaction. Further, major genes involved in the RNA interference (RNAi) pathway, such as Dicer-1, Dicer-2, or any of the Argonaut genes, also did not change upon ZIKV infection in this study.
Gene ontology. All the differentially expressed host genes were submitted to Blast2GO for gene ontology (GO) analysis. This analysis identified 126, 68, and 33 GO terms in biological process, molecular function, and cellular components, respectively (Table S5). GO analysis of enriched genes at different times postinfection showed that they were mostly related to proteolysis, zinc ion/protein binding, and integral components of membranes (Fig. 5). Among the depleted genes, the highest categories were more variable, with day 2 having chitin metabolic process, odorant binding, and integral components of membranes; day 7 having oxidation-reduction process, DNA binding, and nucleosome; and day 14 having oxidation-reduction process, protein binding, and nucleus (Fig. 5). In support of our earlier observation (Fig. S1), odorant binding transcripts were depleted at day 2 but enriched at day 14 (Fig. 5). In A. aegypti, differentially expressed genes upon infection with DENV, WNV, and YFV belonged to various cellular processes, such as metabolic processes, ion binding, peptidase activity, and transport (13), which are also among the GO terms identified in differentially abundant transcripts in the ZIKV-infected mosquitoes (Fig. 5). The genes commonly altered upon ZIKV and DENV infections that are listed in Table S4 were mostly in proteolysis, oxidation-reduction process, and transmembrane transport from biological process; serine-type endopeptidase activity and protein binding from molecular function; and integral component of membrane, nucleus, and extracellular region from cellular component (Table S4). miRNA target genes. Recently, we identified 17 A. aegypti microRNAs (miRNAs) altered upon ZIKV infection at the same time points at which RNA-Seq was conducted (2, 7, and 14 dpi) (16). Comparative analysis of the altered mRNAs and the 17 miRNAs with opposite trends in abundance revealed that 53 of the differentially expressed mRNAs could potentially be regulated by 11 out of the 17 differentially abundant miRNAs (Table S6). However, there is growing evidence that miRNAs could also positively regulate their target genes (26,27), which are not listed in the table. Further, the analysis showed that some miRNAs have multiple potential target genes as expected (e.g., miR-309a has 19 target genes and miR-981-5p has 12 target genes). Gene ontology analysis of the target genes indicated that the majority of the genes are involved in oxidation-reduction process and integral component of membrane within the biological process and cellular component terms (Table S6).
lincRNAs change upon ZIKV infection. Long intergenic noncoding RNAs (lincRNAs) are transcripts that are longer than 200 nucleotides (nt) but do not code for any proteins; however, they are transcribed the same way as mRNAs (28), i.e., they have a poly(A) tail and therefore are enriched in transcriptomic data produced following mRNA isolation and sequencing. Similar to small noncoding RNAs, the main function of lincRNAs is regulation of gene expression, involved in various processes such as genomic imprinting and cell differentiation (29), epigenetically and non-epigenetically based gene regulation (30), activation and differentiation of immune cells (31), and, relevantly, virus-host interactions (32)(33)(34)(35)(36).
We recently reported 3,482 putative lincRNAs from A. aegypti (32). In this study, we found that, in total, 486 lincRNAs were differentially expressed in response to ZIKV infection in at least one time point postinfection (fold change of Ͼ2 and P value of Ͻ0.05). Similar to mRNAs (Fig. 3), the majority of altered lincRNAs were found at 7 dpi, and 56 out of these lincRNAs showed significant alteration at least in two time points ( Fig. 6; Table S7). The Euclidean distance was calculated for each time point based on its lincRNA fold changes. Differentially expressed lincRNAs at 7 dpi (116.83) and 14 dpi (75.30) showed more correlation than, or similar fold change patterns as, those of 2 dpi (180.86). Only lincRNAs 656, 1385, and 3105 were differentially expressed and showed the same fold change pattern among the three time points. In our previous study, we also found that the transcript levels of 421 A. aegypti lincRNAs were altered due to DENV-2 infection. Comparison of those with the ones identified in this study showed that about 80 of them were also differentially expressed in ZIKV-infected samples (Table S7), and these could be common lincRNAs involved in flavivirus-mosquito interactions.
Conclusions. Overall, our results showed large changes in the transcriptome of A. aegypti mosquitoes upon ZIKV infection, in both coding and long noncoding RNAs. The majority of transcriptional changes occurred at 7 dpi, with the genes mostly involved in metabolic process, cellular process, and proteolysis. We found some overlaps of transcriptional alterations in the case of other flavivirus infections in A. aegypti, but unlike those, immune genes were not altered to the same extent. In regard to lincRNAs, out of 486 lincRNAs changed in ZIKV-infected mosquitoes, 80 of them overlapped those of DENV-infected mosquitoes, indicating possible conserved functions of the lincRNAs in flavivirus-mosquito interactions. A drawback of this study is that whole mosquitoes were used, which means that changes at the tissue levels could have been overlooked due to the dilution factor of mixing all tissues; however, the outcomes provide a global overview of transcriptional response of A. aegypti to ZIKV infection and can be utilized in determining potential proviral and antiviral host factors.

MATERIALS AND METHODS
Ethics statement. ZIKV, which was originally isolated from an A. aegypti mosquito (Chiapas State, Mexico), was obtained from the World Reference Center for Emerging Viruses and Arboviruses at the University of Texas Medical Branch (Galveston, TX). Experimental work with the virus was approved by the University of Texas Medical Branch Institutional Biosafety Committee (reference number 2016055).
Mosquito infections with Zika virus. We used excess RNA from samples generated recently to investigate miRNA profiles in ZIKV-infected A. aegypti mosquitoes (16). Briefly, 4-to 6-day-old female A. aegypti mosquitoes (Galveston strain) were orally infected with ZIKV (Mex 1-7 strain) at 2 ϫ 10 5 focus-forming units (FFU)/ml in a sheep blood meal (Colorado Serum Company). Infected mosquitoes were collected at 2, 7, and 14 days postinfection (dpi), and RNA was extracted from them using the mirVana RNA extraction kit (Life Technologies), applying the protocol for extraction of total RNA. Viral infection in mosquitoes was confirmed by TaqMan quantitative PCR (qPCR) on an ABI StepOnePlus machine (Applied Biosystems) (16). For all time points, three independent pools were used to create libraries for infected and uninfected samples. Uninfected mosquitoes were fed with ZIKV-free blood, collected at the same time points, and processed as described above. The dynamics of infection in mosquitoes was shown in Fig. S1 in the work of Saldaña et al. (16).
Library preparations and sequencing. All samples were quantified using a Qubit fluorescence assay (Thermo Scientific). Total RNA quality was assessed using an RNA 6000 chip on an Agilent 2100 Bioanalyzer (Agilent Technologies).
Total RNA (1.0 g) was poly(A) ϩ selected and fragmented using divalent cations and heat (94°C, 8 min). The NEBNext Ultra II RNA library kit (New England Biolabs) was used for RNA-Seq library construction. Fragmented poly(A) ϩ RNA samples were converted to cDNA by random primed synthesis using ProtoScript II reverse transcriptase (New England Biolabs). After second-strand synthesis, the double-stranded DNAs were treated with T4 DNA polymerase and 5= phosphorylated, and then an adenine residue was added to the 3= ends of the DNA. Adapters were then ligated to the ends of these target template DNAs. After ligation, the template DNAs were amplified (5 to 9 cycles) using primers specific to each of the noncomplementary sequences in the adapters. This created a library of DNA templates that have nonhomologous 5= and 3= ends. A qPCR analysis was performed to determine the template concentration of each library. Reference standards cloned from a HeLa S3 RNA-Seq library were used in the qPCR analysis. Cluster formation was performed using 15.5 to 17 billion templates per lane using the Illumina cBot v3 system. Sequencing by synthesis, with paired-end 50-base reads, was performed on an Illumina HiSeq 1500 sequencer using a protocol recommended by the manufacturer.
RNA-Seq data analysis. The CLC Genomics Workbench version 10.1.1 was used for bioinformatics analyses in this study. RNA-Seq analysis was done by mapping next-generation sequencing reads and distributing and counting the reads across genes and transcripts. The latest assembly of the A. aegypti genome (GCF_000004015.4) was used as a reference. All libraries were trimmed from sequencing primers and adapter sequences. Low-quality reads (quality score below 0.05) and reads with more than 2 ambiguous nucleotides were discarded. Clean reads were subjected to an RNA-Seq analysis toolbox for mapping reads to the reference genome with mismatch, insertion, and deletion costs of 2, 3, and 3, respectively. Mapping was performed with stringent criteria and allowed a length fraction of 0.8 in mapping parameter, in which at least 80% of nucleotides in a read must be aligned to the reference genome. The minimum similarity between the aligned region of the read and the reference sequence was set at 80%.
Principal-component analysis (PCA) graphs were produced for each time point after ZIKV infection between control and infected samples to identify any outlying samples for quality control. The expression levels used as input were normalized log count per million (cpm) values.
The relative expression levels were produced as RPKM (reads per kilobase of exon model per million mapped reads) values, which take into account the relative size of the transcripts and use the mappedread data sets only to determine relative transcript abundance. To explore genes with differential expression profiles between two samples, CLC Genomic Workbench uses multifactorial statistics based on a negative binomial generalized linear model (GLM). Each gene is modeled by a separate GLM, and this approach allows us to fit curves to expression values without assuming that the error on the values is normally distributed. The TMM (trimmed mean of M values) normalization method was applied on all data sets to calculate effective library sizes, which were then used as part of the per-sample normalization (37).
The Wald test was also used to compare each sample with its control group to test whether a given coefficient is nonzero. We considered genes with more than a 2-fold change and a false discovery rate (FDR) of less than 0.05 as statistically significantly modulated genes.
We previously reported 3,482 putative long intergenic noncoding RNAs (lincRNAs) from A. aegypti using a stringent filtering pipeline to remove transcripts that may potentially encode proteins (32). The expression profile of lincRNAs was also generated for each sample similar to the approach described above.
To identify the host transcriptomic response to two different flaviviruses, we compared altered gene profiles in previously published DENV-infected A. aegypti libraries (11) with our ZIKV-infected samples. The relevant RNA-Seq data (SRA058076) were downloaded from the NCBI Sequence Read Archive. The libraries were treated in the same way as described above to identify differentially expressed A. aegypti gene profiles in response to DENV. GO analysis. All differentially expressed genes were uploaded to the Blast2GO server for functional annotation and GO analysis. We used Blast and InterProScan algorithms to reveal the GO terms of differentially expressed sequences. More abundant terms were computed for each category of molecular function, biological process, and cellular components. Blast2GO has integrated the FatiGO package for statistical assessment, and this package uses Fisher's exact test.
Identification of miRNA target genes. We screened all differentially expressed mRNAs to identify potential miRNA targets among them. If selected mRNAs did not have complete annotations such as clear 5= untranslated region (UTR), open reading frame (ORF), and 3= UTR, the region before the ORF start codon (300 bp) and after the stop codon (500 bp) for each mRNA was considered 5= UTR and 3= UTR, respectively. We used three different algorithms, including RNA22 (38), miRanda (39), and RNAhybrid (40), to predict potential miRNA binding sites on genes altered by ZIKV. We previously described this approach and parameters for setting each tool, but to increase the level of confidence, we selected those binding sites which were predicted by all three algorithms for further analysis (41).
RT-qPCR analysis of mRNAs. qPCR validations were done using the same RNA that was used for RNA-Seq. RNA from ZIKV-positive samples was pooled (n ϭ 5) for time points 2, 7, and 14 dpi and treated with amplification-grade DNase I (Invitrogen). Total RNA was reverse transcribed using the amfiRivert cDNA synthesis Platinum master mix (GenDepot, Barker, TX, USA) containing a mixture of oligo(dT) 18 and random hexamers. Real-time quantification was performed in a StepOnePlus instrument (Applied Biosystems, Foster City, CA) in a 10-l reaction mixture containing 1:10-diluted cDNA template, 1ϫ PowerUp SYBR green master mix (Applied Biosystems), and 1 M (each) primer. The analysis was performed using the threshold cycle (ΔΔC T ) (Livak) method (42). Three independent biological replicates were conducted, and all PCRs were performed in duplicate. The ribosomal protein S7 gene (43) was used for normalization of cDNA templates. Primer sequences are listed in Table S8 in the supplemental material.
Accession number(s). The accession number for the raw and trimmed sequencing data reported here is GEO GSE102939.