Acinetobacter baumannii NCIMB8209: A rare environmental strain displaying extensive insertion sequence-mediated genome remodeling resulting in the loss of exposed cell structures and defensive mechanisms

Acinetobacter baumannii is nowadays an important nosocomial pathogen with poorly defined reservoirs outside the clinical setting. Here we conducted whole-genome sequencing analysis of the Acinetobacter sp. NCIMB8209 collection strain, isolated in 1943 from the aerobic degradation (retting) of desert guayule shrubs. NCIMB8209 contained a 3.75 Mb chromosome and a plasmid of 134 kb. Phylogenetic analysis based on core genes indicated NCIMB8209 affiliation to A. baumannii, a result supported by the identification of a chromosomal blaOXA-51-like gene. Seven genomic islands lacking antimicrobial resistance determinants, 5 regions encompassing phage-related genes and, notably, 93 insertion sequences (IS) were found in this genome. NCIMB8209 harbors most genes linked to persistence and virulence described in contemporary A. baumannii clinical strains, but many of those encoding components of surface structures are interrupted by IS. Moreover, defense genetic islands against biological aggressors, such as type 6 secretion systems or CRISPR/Cas, are absent from this genome. These findings correlate with a low capacity of NCIMB8209 to form biofilm and pellicle, low motility on semisolid medium, and low virulence towards Galleria mellonella and Caenorhabditis elegans. Searching for catabolic genes and concomitant metabolic assays revealed the ability of NCIMB8209 to grow on a wide range of substances produced by plants, including aromatic acids and defense compounds against external aggressors. All the above features strongly suggest that NCIMB8209 has evolved specific adaptive features to a particular environmental niche. Moreover, they also reveal that the remarkable genetic plasticity identified in contemporary A. baumannii clinical strains represents an intrinsic characteristic of the species.

IMPORTANCE Acinetobacter baumannii (Ab) is an ESKAPE opportunistic pathogen with poorly defined natural habitats/reservoirs outside the clinical setting. Ab arose from the Acb complex as the result of a population bottleneck, followed by a recent population expansion from a few clinically relevant clones endowed with an arsenal of resistance and virulence genes. Still, the identification of virulence traits and of the evolutionary paths leading to a pathogenic lifestyle has remained elusive, and thus the study of non-clinical ("environmental") Ab isolates is necessary. We conducted here comparative genomic and virulence studies on Ab NCIMB8209, isolated in 1943 from the microbiota responsible for the decomposition of guayule, and therefore well differentiated both temporally and epidemiologically from the currently predominant multidrug-resistant strains. Our work provides insights into the adaptive strategies used by Ab to escape from host defenses, and may help the adoption of measures aimed at limiting its further dissemination.

were obtained by centrifugation at 4,000 x g followed by filtration using 0.22 μm filters. Secreted proteins present in the supernatants were then concentrated 50-fold using Amicon Ultracel 3K centrifuge filters following the instructions of the manufacturer, and subjected to a wash step with 40 mM Tris pH 8, 200 mM NaCl, 5% glycerol before being analyzed by 18% SDS-PAGE.

Biofilm assays
Biofilm formation was qualitatively determined by measuring the adhesion of bacteria to the surface of glass tubes. A. baumannii strains were statically grown overnight in 2 mL of L-Broth medium. The spent liquid medium was discarded, and the tubes were rinsed twice with water before adding a solution of 1% Crystal Violet. After 15 min the dye solution was discarded and the tubes were rinsed twice with water before inspection. All assays were done at least three times using fresh samples each time.

The virulence of the different A. baumannii strains tested here was evaluated using two different model systems: G. mellonella moth larvae (45) and the nematode C. elegans (46). G. mellonella larvae were purchased from Knutson's Live Bait (Brooklyn, MI) and were used the day after arrival. Groups of twenty randomly picked larvae were used for each assay condition. The different A. baumannii strains tested were grown overnight in LB and then diluted with PBS to obtain the CFU titers indicated in the corresponding figure legends, which were verified by colony counts on LBA for all inocula. A Hamilton microliter syringe was used to inject 10 μl of the bacterial suspensions into the hemolymph of each larva via the second-to-last left proleg. As a control, one group of G. mellonella larvae was injected with 10 μl of PBS. After injection, the larvae were incubated in plastic plates at 37°C and the numbers of dead individuals were scored regularly.

For C. elegans survival experiments the N2 Bristol (wild-type) strain was used. Gravid hermaphrodites were bleached and the resulting eggs washed using standard protocols (47). Nematodes arrested in stage 1 (L1) were then transferred onto NGM plates seeded with fresh E. coli OP50 and grown at 20°C until nematodes reached larval stage 4 (L4). For virulence assays, A. baumannii and E. coli OP50 (control) were grown overnight at 37°C in L-Broth medium. The stationary culture was then diluted to 10^7 total CFU and 50 µl portions were seeded onto 60-mm nematode growth medium (NGM) plates. Groups of L4 worms were then transferred to the plates seeded with A. baumannii and deposited on the center of the bacterial spot. Assays were performed in duplicate. The nematodes were maintained at 20°C, transferred to fresh plates every 48 hours and monitored daily. Worms that did not respond to stimulation by touch were scored as dead.

Survival curves were plotted using PRISM software, and comparisons in survival were calculated using the log-rank Mantel-Cox test and the Gehan-Breslow-Wilcoxon test.
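The log-rank (Mantel-Cox) comparison named above can be illustrated with a minimal Python sketch. This is not the PRISM implementation used in the study; the function name `logrank_test` and the example survival times are hypothetical and serve only to show how observed and expected deaths are accumulated over the event times of two groups.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank (Mantel-Cox) test.

    times_*  : time of death or of last observation for each individual
    events_* : 1 if the individual died at that time, 0 if censored
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected deaths in group A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical data: days until death for ten larvae per group.
# Group B survivors are censored (event = 0) at day 10.
virulent_times = [1, 1, 2, 2, 3, 3, 1, 2, 3, 2]
virulent_events = [1] * 10
mild_times = [10] * 10
mild_events = [0] * 10

chi2, p = logrank_test(virulent_times, virulent_events, mild_times, mild_events)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

The Gehan-Breslow-Wilcoxon test differs from this sketch by weighting each event time by the number of individuals at risk, which gives more weight to early deaths.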

RESULTS AND DISCUSSION
Phylogenetic analysis assigned strain NCIMB8209 to A. baumannii, albeit to a separate clonal lineage as compared to its companion strain DSM30011

NCIMB8209 origins
As described in our previous work (11), strain NCIMB8209 and its companion DSM30011 were isolated prior to 1944 from the natural microbiota enriched during the aerobic decomposition of guayule, an industrial procedure designated as retting and used to reduce the resinous content of the processed shrub material for the subsequent production of natural latex (14, 15). Our ML phylogenetic analyses based on core gene sequence comparisons derived from the WGS data (see below) allowed us to confidently assign this strain to A. baumannii as a species. In concordance with this assignment, this strain was capable of growing at 44 ºC (11), a typical phenotype associated with A. baumannii (1, 48). Still, random amplification PCR (11), phylogenetic and comparative genome analyses, and metabolic studies (see below) indicated significant differences between this strain and its companion A. baumannii strain DSM30011. This evidence suggests that, even though these two A. baumannii strains might share a similar environmental niche, they still belong to separate clonal lineages. NCIMB8209 thus provided us with a different A. baumannii strain isolated from a non-clinical source (14) before the massive introduction of antibiotics to treat infections (2).

NCIMB8209 genomic features
Genome sequencing indicated that the NCIMB8209 chromosome is 3,751,581 bp in length with a G+C content of 39.1% (Table 1). These values match the average values reported for the genomes of the species composing the Acinetobacter genus (3,870 kb and 39.6%, respectively) (6). Notably, 223 of the CDSs (i.e., around 6% of the total) predicted in the NCIMB8209 genome were pseudogenes. It is worth noting that a similarly high number of non-functional genes (272; 9% of the total genes) has been reported for the A. baumannii strain SDF, which was isolated from a human body louse and whose genome is riddled with numerous prophages and ISs (49).

Our analysis also showed the presence of a large plasmid of 133,709 bp with a G+C content of 40.1%, hereafter designated pAbNCIMB8209_134 (Table 1). Comparison of pAbNCIMB8209_134 with other plasmids deposited in databases indicated extensive sequence identity with a group of A. baumannii plasmids larger than 100 kb, including pABTJ2 (50). All of these plasmids share a previously undescribed Rep-3 superfamily (pfam0151) replication initiation protein gene (C4X49_18465). Both partition (C4X49_18550) and toxin/antitoxin genes (C4X49_18715-C4X49_18720) related to plasmid stability were detected in pAbNCIMB8209_134. In contrast, genes involved in mobilization, conjugation, antimicrobial resistance or virulence functions could not be identified in pAbNCIMB8209_134. Still, this plasmid encodes functions which may provide some adaptive advantages to its Acinetobacter hosts, such as a putative glutathione-dependent pathway of formaldehyde detoxification (C4X49_18640-18650).
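As a small worked example of the figures quoted above, the following Python sketch shows how a G+C percentage is computed from a sequence and how the per-replicon values reported in the text (chromosome: 3,751,581 bp, 39.1%; plasmid: 133,709 bp, 40.1%) combine into a length-weighted genome-wide value. The function names are hypothetical and the toy sequence is not taken from the NCIMB8209 genome.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def weighted_gc(replicons):
    """Length-weighted G+C percentage over (length_bp, gc_percent) pairs."""
    total_length = sum(length for length, _ in replicons)
    return sum(length * gc for length, gc in replicons) / total_length


# Toy sequence, for illustration only.
print(gc_content("ATGCGCTTAA"))

# Replicon sizes and G+C values as reported in the text (Table 1).
genome_gc = weighted_gc([(3_751_581, 39.1), (133_709, 40.1)])
print(round(genome_gc, 2))
```

Because the plasmid contributes only about 3% of the total sequence, the genome-wide value stays very close to that of the chromosome.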

Comparisons of the chromosomal architectures of the A. baumannii environmental strains NCIMB8209 and DSM30011.
Comparison of the overall chromosome structures of the A. baumannii NCIMB8209 and DSM30011 strains showed that the former is 198 kb smaller than that of DSM30011 (Fig. 1). Furthermore, a number of GIs and prophages distinguished these two chromosomes, as will be described in greater detail below. Despite these differences, a general shared synteny was observed between the two chromosomes, with the notable exception of an inversion of a 53.1 kb region located between the first and sixth rRNA operons (Fig. 1). A similar situation was found in the community-acquired strain A. baumannii D1279779 (51), in which the same region was inverted as compared to DSM30011 and to most other clinical strains including the type strain ATCC17978 (Fig. S1). This rearrangement was probably mediated by homologous recombination between two oppositely oriented rRNA operons bordering this region (51). Still, although this inversion reverses the orientation of a number of critical housekeeping genes as well as the origin of chromosomal replication (oriC), it has no substantial effects on the growth rate of NCIMB8209 as compared to DSM30011 in either rich or minimal medium (data not shown).

Phylogenomic and MLST analyses
A phylogenetic study based on the comparisons of the concatenated sequences of 383 core genes of the environmental strains NCIMB8209 and DSM30011 and a number of Acinetobacter genomes that encompassed 99 other A. baumannii as well as 26 non-A. baumannii representatives (26 strains; Table S1) reinforced the affiliation of NCIMB8209 to A. baumannii as a species (Fig. 2 and Fig. S2). Different authors have noted the lack of a defined phylogenetic structure for the general A. baumannii population on phylogenetic trees based on core gene comparisons, with the exception of different terminal clusters each corresponding to an epidemic CC (2, 3, 11, 52). The incorporation of NCIMB8209 core genome sequences into this phylogenetic study did not change this general picture (Fig. 2), but some observations derived from this analysis are worth remarking. First, strains NCIMB8209 and DSM30011, which were isolated more than 70 years ago, neither emerged close to the root nor formed a separate "environmental" cluster in the A. baumannii subtree. On the contrary, they appeared intermixed between more contemporary clinical strains (Fig. 2) shared only 4 of the 7 alleles with its closest matching isolate (11), NCIMB8209 shared 6 of the 7 alleles with its closest matching isolates, with all differences being restricted to the rplB allele (Table S1). Of note, the gdhB gene used in the A. baumannii MLST Oxford classification scheme (42) is missing from the NCIMB8209 genome (Table S1) inclusion of more environmental strains in these calculations will certainly contribute to obtaining a more accurate value for the core gene repertoire of the species.
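The MLST comparison mentioned above (NCIMB8209 sharing 6 of 7 alleles with its closest matching isolates, differing only at rplB) amounts to counting the loci at which two allelic profiles carry the same allele number. A minimal Python sketch follows. It assumes that the seven loci are those of the A. baumannii Pasteur MLST scheme (cpn60, fusA, gltA, pyrG, recA, rplB, rpoB); all allele numbers below are hypothetical placeholders, not the values in Table S1.

```python
def shared_alleles(profile_a, profile_b):
    """Return the sorted list of loci at which both profiles carry the same allele."""
    return sorted(
        locus
        for locus, allele in profile_a.items()
        if profile_b.get(locus) == allele
    )


# Hypothetical allelic profiles (locus -> allele number), for illustration only.
strain = {"cpn60": 1, "fusA": 2, "gltA": 3, "pyrG": 4, "recA": 5, "rplB": 6, "rpoB": 7}
closest = {"cpn60": 1, "fusA": 2, "gltA": 3, "pyrG": 4, "recA": 5, "rplB": 9, "rpoB": 7}

common = shared_alleles(strain, closest)
differing = sorted(set(strain) - set(common))
print(f"{len(common)} of {len(strain)} alleles shared; differing loci: {differing}")
```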

NCIMB8209 antimicrobial resistance
Conventional antimicrobial susceptibility assays indicated that NCIMB8209 showed susceptibility to most clinically employed antimicrobials tested except nitrofurantoin and, among β-lactams, ampicillin, at MIC values just above the CLSI-recommended breakpoints (Table S2). In full concordance (Table 2 and Table S2), this strain lacks AbaR resistance islands. The marginal ampicillin resistance of this strain (see above) most likely reflects the presence of a number of β-lactamase genes (Table S2) (Table S2). This indicates both the potential to evolve such resistances under selective pressure and an environmental reservoir of these resistance genes.

NCIMB8209 shares with DSM30011 (11) susceptibility to folate pathway inhibitors such as sulfamethoxazole/trimethoprim (Table S2). These susceptibilities correlate with the absence of sul or dfrA resistance genes in the genome (Table S2) involved in clinical Acinetobacter strains in the extrusion of toxic compounds including some antimicrobials (7) were also found in the NCIMB8209 genome (Table S2)

A. baumannii NCIMB8209 shows reduced virulence towards Galleria mellonella and Caenorhabditis elegans
G. mellonella moth larvae and the nematode C. elegans provide reliable models to study the virulence of numerous human pathogens, among them Acinetobacter genus species (45, 46). We thus decided to use these two models to evaluate the virulence of strain NCIMB8209 as compared to those of strain DSM30011 and the soil organism Acinetobacter baylyi ADP1 (24). In the G. mellonella model, DSM30011 showed a high virulence (44, 66), whereas a low virulence capacity has been demonstrated for ADP1 (67). In contrast, virulence in the C. elegans model has not been assessed before for this group of strains. As observed in Fig. 3, NCIMB8209 was less virulent than DSM30011 in both of these models, the latter environmental strain showing in particular a much higher capacity to kill C. elegans. On the contrary, NCIMB8209 killing capacity was close to that observed for A. baylyi ADP1 in both assays (Fig. 3). The observed differences in virulence between NCIMB8209 and DSM30011 indicate relevant phenotypic differences between them (see also following sections), regardless of their isolation as companion strains from a similar environmental origin following similar enrichment and culture protocols (14).

IS and prophages have extensively modified the NCIMB8209 genome
The lower virulence displayed by strain NCIMB8209 as compared to that of DSM30011 (see above) led us to analyze their genomes in more detail, aiming to find genetic differences that might explain these distinct phenotypes. When comparing the accessory genomes of DSM30011 and NCIMB8209, some noteworthy differences were observed in the latter (Fig. 1), such as: 1) the presence of 7 GIs (Table 2), which will be described throughout the text; 2) the presence of 5 regions harboring putative prophages (Tables 1 and S3), which will also be described in detail below; 3) the notable absence of a CRISPR-cas gene cluster common to DSM30011 and other A. baumannii strains (11, 68); 4) the lack of interbacterial competition islands (ICI), such as those encoding the type 6 secretion system components and/or its associated toxins (44); 5) significant differences regarding IS number and composition between these two strains (see below).

(Table 3). According to PHASTER integrity predictions, 2 "questionable" (Ph1-N and Ph2-N) and 3 "incomplete" prophages (Ph3-5-N; Table S3) were identified among them. The integration sites for Ph4-N and Ph5-N in the NCIMB8209 genome were the same hot spots found in other A. baumannii genomes (10), whereas those of Ph1-N to Ph3-N represent novel integration sites (Table 3) Table S3 followed by manual examination corroborated the presence of 12 different ISs (totaling 79 IS copies) in the NCIMB8209 chromosome and 9 different ISs (totaling 14 IS copies) in the plasmid that this strain harbors (Table 4). These values are significantly higher than the estimated average of 33 IS copies per A. baumannii genome determined by Adams and collaborators (8). Moreover, the diversity of IS families (16 in total) found in the NCIMB8209 genome is also remarkable, considering that only 7 out of 976 A. baumannii genomes were found to carry 10 or more different IS elements (8). Although most of the IS elements identified in NCIMB8209 were previously reported in different species of the Acinetobacter genus, this strain contains a very particular IS profile for an A. baumannii strain, with 2 of the most frequent IS elements (ISAha2 and ISAha3) originally detected in A. haemolyticus (Table 4). Furthermore, as also seen in this

(Table 4). This particular mobile element was also found in the chromosome of 3 other A. baumannii strains (ABNIH28, B8300 and B8342) and in A. johnsonii XBB1 (75% identity, query coverage=100%). No homologs to this IS were found in other organisms outside the Acinetobacter genus. ISAba45 (3 copies) was also detected in four other A. baumannii strains (PR07, ABNIH28, A1296, and IOMTU433, the former two strains closely linked to NCIMB8209, Fig. 2) and in A. soli GFJ2 (87% identity, query coverage=99%). Moreover, ISAba45 displays significant nucleotide identity with similar ISs present in other Acinetobacter strains (76%) and in Moraxella osloensis plasmids (70%). The only ISAba46 copy is carried by plasmid pAbNCIMB8209_134, and was also found in the chromosomes and plasmids of numerous strains of A. pittii, A. baumannii, A. lwoffii, A. junii, A. johnsonii and A. haemolyticus (>86% identity, query coverage=100%). Concerning the locations of the above IS elements, we found that 17 of them interrupt an equivalent number of CDSs, very likely precluding their expression (Table S3). Seven of these 17 CDSs are interrupted by ISAba13 copies (Table 4). Remarkably, a similar situation was reported previously for strain D1279779, which harbors 18 ISAba13 copies (51). We also detected that some IS elements were positioned on one of the borders of several GIs, including GI1, GI2, GI4, and GI6 (Table 2).
It is tempting to speculate that these insertions interfere with the excision mechanism of the corresponding GIs, and were thus selected due to the subsequent retention of these GIs in the chromosome (74).

In summary, the plethora of IS elements found in NCIMB8209 has significantly helped to remodel the genome of this strain.

A genome devoid of interbacterial competition islands.
As mentioned above, the genome of NCIMB8209 carries neither a T6SS main gene cluster nor any T6SS-associated locus encoding VgrG-like proteins or their cognate toxins (44, 75). Searching for other competition mechanisms, such as the two-partner systems (Tps) already identified in strain DSM30011 (Fig. 2) and related to contact-dependent inhibition (CDI) (76), also yielded negative results. These observations led us to hypothesize that the ability of NCIMB8209 to outcompete other bacteria was severely compromised. To test this prediction, bacterial competition assays were performed essentially following previously described procedures (44). Results shown in Fig. S3 demonstrate that NCIMB8209 was not capable of outcompeting E. coli when the former strain was used as the attacker and E. coli DH5α was the prey. On the contrary, the DSM30011 strain was clearly capable of outcompeting E. coli DH5α in a similar assay, an effect that specifically depended on the T6SS, as shown by the lack of effect observed in a DSM30011 ΔtssM mutant (Fig. S3A). In correlation with these results, NCIMB8209, similarly to DSM30011 ΔtssM, did not secrete detectable amounts of Hcp (a marker of a functional T6SS) into the growth medium (Fig. S3B). We additionally tested the capacity of NCIMB8209 and DSM30011 to outcompete each other. While DSM30011 completely eliminated NCIMB8209 (Fig. S3C) when co-incubated in a 10:1 attacker:prey ratio, the latter strain was not capable of outcompeting the former under similar experimental conditions (Fig. S3D).

Persistence and virulence
To analyze the presence of genes potentially involved in persistence and virulence in strain NCIMB8209, we performed a BlastN homology search using as query a list of potential candidates described in Acinetobacter strains (11). This list included genes coding for the synthesis of the capsule and other exopolysaccharides, appendages, OM proteins and the T2SS; genes coding for phospholipases and proteases; and genes involved in traits such as motility and iron scavenging. Our search indicated that 131 out of the 146 genes (sequence identity ≥76%) encoding potential virulence factors analyzed were present in NCIMB8209 (Table S2). Among the 15 putative virulence factors included in the search and not detected in the NCIMB8209 genome, most encode or are involved in the synthesis of surface-exposed molecules (Table S2). These included the Prp pilus (77), the MFS transporter Pmt (probably involved in DNA transport necessary for biofilm formation; (78)) and a surface motility-associated molecule (79). Moreover, the genes encoding the Pilus 3-fimbrial adhesion precursor (C4X49_08175) and the Pilus 3-fimbriae anchoring protein (C4X49_08180) are both incomplete, and a gene coding for a polymorphic toxin of the RTX family (80) was interrupted by an ISAba44 copy (C4X49_17180-C4X49_17185). We also noticed that several genes whose products are directly or indirectly involved in functions related to adhesion or biofilm formation in clinical strains were absent or interrupted by ISs. For instance, the gene encoding the Bap protein (81) could not be detected (Table S2) proteins, which was interrupted by an ISAba13 copy in NCIMB8209 (83, 84). We then hypothesized that the lack of this group of genes could have impacted traits such as biofilm and pellicle formation as well as motility.
To evaluate this hypothesis, the capacity of NCIMB8209 to form pellicle and biofilm when grown overnight in rich medium was compared to that of DSM30011, which was found to represent a high biofilm/biopellicle producer (44). This assay indicated that NCIMB8209 was capable neither of forming pellicle (Fig. S4A) nor of attaching to glass surfaces (Fig. S4B), supporting the above prediction that the ability to form biofilm/biopellicles is severely compromised in this strain. Furthermore, motility assays on semisolid medium showed that the ability of NCIMB8209 to perform swarming was highly reduced (Fig. S4B), again in sharp contrast to DSM30011 (66).

Another gene absent in NCIMB8209 is cpaA (Table S2), which encodes the surface-exposed metallopeptidase CpaA, endowed with the ability to cleave fibrinogen and coagulation factor XII and thus proposed to deregulate blood coagulation (85, 86). CpaA represents a substrate of the Type II secretion system (87), and has been proposed as a bona fide A. baumannii virulence factor (88). Therefore, its conspicuous absence in NCIMB8209 (Table S2) (Table S4). Of the 59 unique chromosomal genes, 38 were located within different prophage and GI regions and showed no significant hits in databases. A Blast search against the NCBI Protein database using as query the amino acid sequences of the remaining 21 chromosomal CDSs revealed that 20 of them had significant best hits with proteins found in the database, sixteen of them (i.e., 80%) located in species of the Acinetobacter genus (12 of them in A. baumannii) with potential roles in transport, motility, transcriptional regulation, and lipopolysaccharide synthesis (see Table S4 for details).
Notably, 4 CDSs related to capsule synthesis (see below) were best affiliated to homologs located in species of different orders within the class Gammaproteobacteria to which Acinetobacter belongs, including the Alteromonadales (Alteromonas sp.) and the Enterobacteriales (Yersinia sp.), in a different class (Azoarcus sp., Betaproteobacteria), or even in a different phylum (Vitellibacter sp., Bacteroidetes/Chlorobi) (Table S4). This indicated a remarkable ability of A. baumannii NCIMB8209 to co-opt genes from both phylogenetically related and distant species as the result of horizontal gene transfer.
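The presence/absence call described at the start of this section (query genes counted as present when a BlastN hit reaches at least 76% sequence identity) can be sketched in Python as follows. The sketch assumes BLAST tabular output (outfmt 6), in which the first column is the query identifier and the third column is the percent identity; the function name `present_queries`, the gene names used as query identifiers, the subject names and all hit values are hypothetical illustrations, not data from Table S2.

```python
def present_queries(blast_rows, min_identity=76.0):
    """Return the set of query IDs with at least one hit at or above min_identity.

    blast_rows: iterable of tab-separated lines in BLAST outfmt 6
    (qseqid, sseqid, pident, ...).
    """
    present = set()
    for line in blast_rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        query_id, percent_identity = fields[0], float(fields[2])
        if percent_identity >= min_identity:
            present.add(query_id)
    return present


# Hypothetical query list and hits, for illustration only.
queries = ["geneA", "geneB", "geneC", "geneD"]
rows = [
    "geneA\tcontig_1\t98.5\t1200",
    "geneB\tcontig_1\t75.9\t800",   # below the 76% threshold
    "geneB\tcontig_2\t81.0\t640",   # a second hit rescues geneB
    "geneC\tcontig_3\t60.2\t300",   # only a weak hit: counted as absent
]

present = present_queries(rows)
absent = [q for q in queries if q not in present]
print(f"{len(present)} of {len(queries)} queries present; absent: {absent}")
```

A query with no hit at all (geneD here) and a query with only sub-threshold hits (geneC) are treated identically as absent, which mirrors the way the 15 undetected candidates are reported in the text.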

NCIMB8209 carries a novel K-locus.
Among the idiosyncratic features of NCIMB8209 worth remarking, we found differences in content and organization of the genes linked to the production of the K capsule. The K locus identified in the NCIMB8209 genome (Table S2) (Table S2). Pyruvyl-capped N-acetyl-D-galactosamine (D-GalpNAcA) branches constitute rare structures described so far only in A. baumannii D78, a strain assigned to CC1 (90). The K-locus arrangement described above for strain NCIMB8209 was not found in other Acinetobacter strains by database searching, therefore revealing a previously unreported PSgc locus in A. baumannii. Of note, the weeK gene coding for a UDP-N-acetyl-glucosamine 4,6-dehydratase (involved in the biosynthesis of UDP-linked sugar precursors used for capsule synthesis) is also interrupted by an ISAba44 copy in NCIMB8209 (C4X49_00335-C4X49_00350, Table S2).

NCIMB8209 shares with DSM30011 a similar gene locus involved in the synthesis of outer core polysaccharides (OC) of the lipid A core moiety, with the exception of an additional gene encoding a glycosyl transferase (C4X49_15205). However, this gene is annotated as a pseudogene and is probably non-functional (Table S2). As previously described for strain DSM30011 (11), this cluster includes the rmlBDAC genes (Table S2)

Mechanisms of resistance to toxic compounds
NCIMB8209 also contains many gene clusters encoding systems involved in resistance to toxic compounds. Some of these clusters are scattered throughout the genome, while others are concentrated in three regions. One of these regions is constituted by GI3 (Table 2), integrated next to the dusA gene (C4X49_02945). The dusA locus has been found to represent a common integration site for this kind of genetic island in A. baumannii (93, 94). Although GI3 (19.1 kb long) is shorter than homologous GIs carried by other A. baumannii strains (for instance, in DSM30011 this GI is 33 kb long), this island still includes genes for putative arsenate and heavy metal ion detoxification systems (ars and czc genes) and others involved in Fe ion transport (feoAB). Another case is GI6, which carries a cluster of genes encoding a putative copper ion detoxification system (C4X49_13800-C4X49_13840, Table 2). More interestingly, GI6 also harbors a mobA gene (C4X49_13855) encoding a protein with a relaxase domain which might be responsible for its mobilization after excision from the genome (Table 2). This may suggest a plasmid origin for this GI, and also opens the possibility that this gene could even mediate its mobilization by horizontal gene transfer after excision from the genome. The third cluster (C4X49_16225-C4X49_16400) contains a merR-merTPCAD gene cluster (C4X49_16305-C4X49_16300 to C4X49_16280) coding for a complete Hg ion detoxification system (95). Since there is no gene coding for an integrase nearby, this region was not considered a GI. Remarkably, however, it is flanked by several IS copies which might have contributed to its mobilization and integration in this locus.

Inspection of the NCIMB8209 genome also evidenced the presence of 3 putative catalase genes (C4X49_07525, C4X49_02120 and C4X49_17825). Catalases represent one of the main strategies evolved by cells to cope with the accumulation of reactive oxygen species (26). It is then noteworthy that the number of putative catalase proteins encoded by this strain is higher than that of the environmental strains DSM30011 (2, PNH13446.1 and PNH14300.1) and A. baylyi ADP1 (1, CAG67388.1), and similar to that found in the A. baumannii clinical strain ATCC 17978 (3, ABO11814.2, ABO10867.2 and ABO13771.2). Moreover, a comparative analysis of the tolerance of the above strains to strong oxidants, as judged by their survival when exposed to H2O2 (26), indicated a strong correlation between their catalase gene content and oxidative stress resistance (Fig. S6).

Catabolic abilities
Previous WGS analysis of the environmental A. baumannii DSM30011 strain predicted the presence in its genome of 28 gene clusters encoding many metabolic pathways involved in the utilization of a large variety of plant substances (11). The presence and organization of similar catabolic genes was also investigated in NCIMB8209 and, despite some differences in the organization of catabolic loci between these two strains, 27 out of the 28 catabolic loci found in DSM30011 (11) were also present in this strain. The only exception was the salicylate/gentisate (sal2/gen) cluster, which was totally missing in NCIMB8209. In addition, the betABI locus present in both strains (previously thought to be a catabolic gene cluster (11, 24)) has been shown in A. baylyi to be involved in the synthesis rather than in the degradation of glycine betaine (96). In agreement, none of the A. baumannii strains tested, including DSM30011, NCIMB8209, and ATCC 17978, nor A. baylyi, were capable of utilizing glycine betaine as the only carbon source for growth (Table S5).

Of the 27 predicted catabolic loci shared between NCIMB8209 and DSM30011 mentioned above, 17 equivalent clusters are also found in the genome of A. baylyi and are involved in the degradation of plant substances and the recycling of plant material (24). These include loci such as pca, qui, pob, hca, van, and ben, involved in the degradation of aromatic acids and hydroxylated aromatic acids such as hydroxycinnamic acids, which constitute the building blocks of plant protective heteropolymers such as suberin (24, 97-99). These aromatic compounds are ultimately catabolized through the beta-ketoadipate pathway, yielding Krebs cycle intermediates and therefore allowing bacterial growth when used as substrates (24). Our analysis of the ability of A. baumannii strains NCIMB8209 and DSM30011 to employ different compounds as substrates for growth (Table S5) indicated that these two strains share with A. baylyi the ability to utilize many aromatic acids found in plants, including benzoate, 4-hydroxy-benzoate, 4-hydroxy-cinnamate, and shikimate, as sole carbon sources. Moreover, the activity of the mdc pathway involved in the catabolism of dicarboxylic malonic acid, another plant-synthesized compound (24), was inferred from the growth observed for all A. baumannii strains tested on malonate as the only carbon source (Table S5). These observations are compatible with the isolation of A. baumannii strains NCIMB8209 and DSM30011 from an enriched consortium specialized in the recycling of resinous plant material (14). Remarkably still, all of the above-mentioned catabolic capabilities are also shared by A. baumannii clinical strains such as ATCC 17978 (Table S5 and data not shown).

Besides the above-described similarities with A. baylyi, NCIMB8209 and DSM30011 are endowed with some idiosyncratic catabolic clusters also related to the degradation of particular plant compounds. Among them we could mention the paa (phenylacetic acid, PAA) and liu (leucine/isovalerate) clusters (11). The presence of a paa cluster in both NCIMB8209 and DSM30011, but not in A. baylyi, correlates with the capability of these A. baumannii strains to grow on PAA as the sole carbon source (Table S5). PAA is a plant auxin derived from the catabolism of phenylalanine (100, 101) endowed with substantial antimicrobial activity (102). PAA degradation by A. baumannii clinical strains has already been noted (100) (see also Table S5) and found to play an important role during A. baumannii infection by reducing the levels of this powerful phagocyte chemoattractant (101).
Concerning the liu (leucine/isovalerate) catabolic cluster, evidence of its activity was obtained from the growth observed for NCIMB8209 and DSM30011 on L-leucine or isovalerate as only carbon sources (Table S5). In P. aeruginosa the liu pathway complements the atu pathway responsible for the degradation of acyclic terpenes produced by plants in response to phytopathogens (103), and a similar situation may also occur in NCIMB8209 and DSM30011 (11). It follows that these A. baumannii strains not only have the capacity to participate in the degradation of many plant aromatic compounds, but are also endowed with the additional ability to degrade compounds produced by plants in response to stress situations including the attack of phytopathogens and phytophagous insects (97-99).

Besides the above-described similarities at the level of both genome and metabolic capabilities between DSM30011 and NCIMB8209, a differential capacity for the utilization of the basic amino acids arginine and ornithine was observed between these two strains (Table S5). This may be explained by the differential presence in DSM30011 of a 10.5 kbp fragment containing a gdh/ascC/astA/astD/astB/astE gene cluster encoding a complete arginine succinyltransferase (AST) pathway responsible for the catabolism of these basic amino acids (104), which was missing in NCIMB8209 (Fig. 1). Instead, a region of 27.6 kb bordered by IS1008/ISOur remnants and encompassing putative mercury ion detoxification genes was found in NCIMB8209, which has in turn been heavily impacted by several ISs of different types (Table S3). It is worth noting in the above context that the chromosomal region adjacent to astA in A.
baumannii CC2 strains is also an integration site for different mobile elements such as AbGRI2-type resistance islands among others, which have provoked different rearrangements in their vicinity including various deletions (10, 73). This has resulted in some CC2 strains in which an AbGRI2-type element is found adjacent to a complete ast gene cluster, and other strains in which the ast genes have been completely deleted (10, 73) (and data not shown). It has been shown in P. aeruginosa that the N-succinyl transferase AstA, the first enzyme of the AST pathway, is able to use both L-arginine and L-ornithine as substrates (105), and a similar substrate specificity may also occur for A. baumannii AstA. Furthermore, genes coding for a putrescine importer (puuP) and a gamma-aminobutyraldehyde dehydrogenase (patD), the latter part of the transaminase pathway of putrescine degradation (104), were identified in DSM30011 (51% and 48% identity with the corresponding proteins from Escherichia coli str. K-12 substr. MG1655; accession numbers AAC74378.2 and AAC74526.1, respectively) but not in NCIMB8209. These latter observations might also explain the inability of NCIMB8209 to grow on putrescine, as compared to the other A. baumannii strains tested (Table S12). In any case, the observed phenotypic profiles suggest a narrowing of NCIMB8209 substrate utilization capabilities when basic amino acids and polyamines in particular are considered, which suggests an adaptation of this strain to a specific niche.

CONCLUSIONS
A. baumannii NCIMB8209 represents to our knowledge the second reported environmental A. baumannii strain, isolated from a desert plant source at the onset of (or even before) the massive introduction of antimicrobials to treat infections (11). In concordance, and similarly to its companion strain DSM30011, NCIMB8209 showed general susceptibility to most clinically employed antimicrobials including folate pathway inhibitors (this work and 11). As expected from their common plant source and retting enrichment before isolation in media containing guayule resins as substrates for growth (14), both A. baumannii strains share the ability to degrade a number of substances produced by plants, including many hydroxylated aromatic acids constituting the building blocks of plant protective hydrophobic heteropolymers and also repellents against predators (Table S5). However, WGS and subsequent phylogenetic and comparative genome analysis, as well as different biochemical studies conducted in this work, indicated significant differences between these two environmental strains. First, as compared to DSM30011, NCIMB8209 has undergone a significant genome reduction and lacks many genetic clusters encoding components involved in defense mechanisms against other biological competitors such as the CRISPR-cas complex, T6SS, and two-partner systems. Moreover, and although NCIMB8209 contains most genes associated with persistence and virulence in A. baumannii clinical strains, many encoding components of surface structures are interrupted by IS elements, whose relatively high number and variability impacted heavily on the NCIMB8209 genome. Among IS-interrupted genes we found those encoding pili components, the O-antigen ligase TfpO, a biosynthetic route for surfactant compounds, Bap and Ata adhesins, an RTX toxin, etc. (Table S3).
Comparative biofilm and pellicle production, as well as motility assays, evidenced in fact that NCIMB8209 is severely compromised in these pathogenicity-associated traits (Fig. S4). Altogether, these observations can explain the low relative virulence potential observed for NCIMB8209 on the G. mellonella and C. elegans infection models (Fig. S3).

Loss of T6SS genes has been observed in a number of A. baumannii clinical strains causing infections, suggesting that this system is not required once A. baumannii invades its host (73, 106, 107). Moreover, T6SS absence has been linked with higher chances of evasion of A. baumannii from the host immune system (108, 109). The situation described above for strain NCIMB8209 therefore resembles that reported for A. baumannii SDF isolated from a human body louse, which has undergone extensive genome reductions and rearrangements mediated by ISs (49). Albeit more moderately, loss of T6SS and IS-mediated inactivation of genes encoding surface structures were also observed in A. baumannii D1279779, a community-acquired strain isolated from the bacteraemic infection of an indigenous Australian (51).

It is tempting to speculate that the changes observed in A. baumannii NCIMB8209 are also related to its adaptation to a particular niche. In this context, several reports have shown the profound associations existing between different bacterial groups, including many members of the Acinetobacter genus, and a number of insects feeding on plants (16-22). Many of these bacterial species are located in the gut of their insect hosts, conducting mutualistic or symbiotic associations based on nutrition/protection relationships (16, 18-20, 110, 111).
The bacterial counterpart thus degrades compounds of the plant diet that are toxic for the insect, and the insect host in return provides a stable environment, a supply of resources, and a vector for the rapid spreading and inoculation into fresh plant tissues. For those bacterial species displaying low pathogenic potential, selective pressure eventually favours more stable relationships with concomitant structural and metabolic changes (111). A general trend thus observed for Gram-negative species with these characteristics is genome reduction with the loss or modification of surface-exposed molecules, thus reducing interactions with the innate immune system of their hosts which may trigger defensive responses and elimination (110, 111). This situation can certainly be applied to NCIMB8209, as extensively detailed above. In this context, we note the relatively high tolerance of this strain to pro-oxidants such as H2O2 (Fig. S6), which correlates with the identification of 3 catalase genes in its genome (112). The innate immune system of insects closely resembles that of vertebrates at both the molecular and cellular levels (110). Thus, while the cellular immune response in vertebrates is mediated by professional phagocytes, in insects this function is conducted by phagocytic cells known as haemocytes (110). Since phagocytic cells use oxidative burst as a common strategy to counteract pathogens (110, 113), it is then possible that the selection of a higher anti-oxidant ability in NCIMB8209 led to an increased tolerance to oxidative stress and therefore increased survival in an insect niche. Of note, recent evidence indicates that A. baumannii strains with high catalase production are more resistant against intracellular killing by macrophages (114).

Our observations also provide further clues on the high genomic plasticity of A.
baumannii as a species (6)

Table S1: Acinetobacter strains used for phylogenetic and comparative studies and MLST classification data.
Table S2: Genes putatively contributing to antimicrobial resistance and virulence in A. baumannii NCIMB8209.
Table S3: Prophage and IS-related genes in A. baumannii NCIMB8209.