Schmidtea polychroa (Flatworm, GOE00227_SP189) (ASM4489252v1)

Schmidtea polychroa (Flatworm, GOE00227_SP189) Assembly and Gene Annotation

About Schmidtea polychroa

The genus Schmidtea comprises four species of freshwater planarians, predominantly found in Europe, with some populations also present in North Africa and Asia and introduced populations in parts of America [2, 10]. Schmidtea polychroa is distributed across Europe, from the Iberian Peninsula [1, 14] and Italy to southern Sweden in the north [11]. It also occurs in North Africa and has been introduced into North America [2].

The life cycle of S. polychroa features two distinct reproductive strategies, associated with different biotypes or chromosomal races [3, 4, 5], which are distributed in a geographic pattern:

  • Diploid (2n) sexual strains are mostly restricted to southern Europe, including northern and central Italy, Sardinia, some areas of Spain and France, the Czech Republic and Hungary [12, 13]. Isolated sexual populations may also occur in southern Sweden [11].

  • Polyploid (3n and 4n) parthenogens are more broadly distributed across the range of the species, and can coexist with sexual populations [12].

Sexual and parthenogenetic individuals are not genetically isolated and can interbreed, producing fertile egg capsules [3]. Remarkably, individuals from purely parthenogenetic populations are capable of occasional sexual reproduction [6], a trait that can contribute to the success of polyploid strains in colonising central and northern Europe [7]. The species is not fissiparous.

The S. polychroa strain (internal ID: GOE00227) was collected at 43.71249, 16.72605 (latitude, longitude) near the village of Gala, Croatia.

Picture credit (Creative Commons BY 4.0): Miquel Vila-Farré, Dept. of Tissue Dynamics and Regeneration, Max Planck Institute for Multidisciplinary Sciences

Taxonomy ID 50054

Assembly

The assembly presented here has been imported from INSDC and is linked to the assembly accession [GCA_044892525.1].

PacBio circular consensus sequencing (CCS) reads were called using pbccs (v6.0.0), and reads with quality > 0.99 (Q20) were taken forward as “HiFi” reads. Initial contigs were assembled with hifiasm (v0.14.2) from 30x PacBio HiFi reads (SRA: SRR27325392) and Hi-C data (SRA: SRR27325344), with purging parameter -l 2. Alternative haplotigs were then removed using purge-dups (v1.2.3) with default parameters and cutoffs, as the program estimated the cutoffs correctly.

The contigs were first scaffolded with SALSA v2 (v2.2) after mapping Hi-C reads to the contigs. The VGP Arima mapping pipeline was followed (https://github.com/VGP/vgp-assembly/tree/master/pipeline/salsa) using bwa-mem (v0.7.17), samtools (v0.10, v1.11) and Picard (v2.22.6). False joins in the scaffolds were then broken and missed joins merged manually, following the processing of Hi-C reads with pairtools (v0.3.0) and visualization matrices created with cooler (v0.8.11).

Following scaffolding, the original PacBio subreads were mapped to the chromosomes using pbmm2 (v1.3.0, https://github.com/PacificBiosciences/pbmm2) with the arguments --preset SUBREAD -N 1, and regions +/− 2 kb around each gap were polished using gcpp’s arrow algorithm (v1.9.0). Regions in which gaps were closed and polished with all capital nucleotides (gcpp’s internal high-confidence threshold) were then inserted into the assemblies as closed gaps.

Lastly, the PacBio HiFi reads (CCS reads with a read quality exceeding 0.99) were aligned to the genomes using pbmm2 (v1.3.0) with the arguments --preset CCS -N 1. DeepVariant (v1.2.0) was used to detect variants in the alignments to the assembled sequence. Only homozygous variants (GT = 1/1) that passed DeepVariant’s internal filter (FILTER = PASS) were retained, using bcftools view (v1.12) and htslib (v1.11). The genome was then polished by creating a consensus sequence based on this filtered VCF file, as detailed in the VGP assembly pipeline (https://github.com/VGP/vgp-assembly/tree/master/pipeline/freebayes-polish).
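The variant filter used for the final polishing step can be illustrated with a minimal Python sketch. The pipeline itself used bcftools view, and its exact command line is not given here. The function name keep_homozygous_pass and the example records are hypothetical, and treating a phased 1|1 call like an unphased 1/1 call is an assumption of this sketch.

```python
def keep_homozygous_pass(vcf_lines):
    """Yield header lines plus records with FILTER == PASS and GT == 1/1.

    Illustrative only: single-sample VCF text is assumed, and the
    function name is hypothetical (not part of the published pipeline).
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Keep meta-information and column header lines unchanged.
            yield line
            continue
        fields = line.split("\t")
        # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
        if len(fields) < 10 or fields[6] != "PASS":
            continue
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        if "GT" not in format_keys:
            continue
        genotype = sample_values[format_keys.index("GT")]
        # Assumption: a phased call (1|1) is treated like an unphased one (1/1).
        if genotype.replace("|", "/") == "1/1":
            yield line


# Invented example records, not taken from the real call set.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:GQ\t1/1:40",      # kept
    "chr1\t200\t.\tC\tT\t45\tPASS\t.\tGT:GQ\t0/1:35",      # heterozygous, dropped
    "chr1\t300\t.\tG\tGA\t3\tRefCall\t.\tGT:GQ\t1/1:2",    # not PASS, dropped
]

kept = [rec for rec in keep_homozygous_pass(EXAMPLE_VCF) if not rec.startswith("#")]
print(len(kept))
```

Restricting the polish to homozygous PASS calls means that only positions where the reads consistently disagree with the assembly are changed, while heterozygous sites, which may reflect real allelic variation, are left untouched.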

The assembly was produced by "Max Planck Institute for Multidisciplinary Sciences" and reported in [9].

The total length of the assembly is 781270316 bp contained within 1012 scaffolds. The scaffold N50 value is 189691935 bp and the scaffold L50 value is 2. The GC content of the assembly is 28.0%.
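For readers unfamiliar with these summary statistics, the sketch below shows one common way to compute N50, L50 and GC content. The scaffold lengths and the sequence are toy values, not the real assembly; the function names are hypothetical, and ignoring non-ACGT characters (such as N in gaps) when computing GC content is an assumption about convention rather than a documented detail of this assembly report.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50 is the length of the scaffold at which the running total of
    lengths, sorted from longest to shortest, first reaches half of the
    assembly size. L50 is the number of scaffolds needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no scaffold lengths given")


def gc_percent(sequence):
    """Return GC content in percent, counting only A, C, G and T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Toy values for illustration only.
toy_lengths = [400, 300, 150, 100, 50]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
print(gc_percent("ATGCATAT"))
```

By this definition, an L50 of 2 means that the two longest scaffolds together account for at least half of the total assembly length.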

Annotation

Genomic annotation was provided by "Max Planck Institute for Multidisciplinary Sciences".

The genome was annotated using a hybrid genome-guided transcriptome approach. As input RNA-seq data, we combined Nanopore direct RNA-seq of pooled whole animals at various feeding and regeneration stages (SRR27325394), Nanopore cDNA RNA-seq of whole animals (SRR27325396) and a regeneration series (SRR27325395). Total RNA was extracted from snap-frozen planarian tissue using the protocol described in [8] and [9].

After read quality trimming, deduplication, filtering, and mapping (using HISAT2 and minimap2 for short and long reads, respectively), a draft transcriptome was generated using StringTie2. It was then further refined using FLAIR and a collection of custom scripts to retain high-confidence isoforms. For details of the procedure and a step-by-step guide to the genome annotation analysis, see the Supporting Information of [9].

Small RNA features, protein features, BLAST hits and cross-references have been computed by metazoa.

References

  1. Baguñà J, Saló E, Romero R. 1981. Biogeografía de las planarias de aguas dulces (Platelminthes; Turbellaria; Tricladida; Paludicola) en España. Datos preliminares. Actas del Primer Congreso de la Sociedad Española de Limnología, 265–280.

  2. Ball IR. 1969. Dugesia lugubris (Tricladida: Paludicola) a European immigrant into North American fresh waters. Journal of the Fisheries Research Board of Canada 26, 221–228.

  3. Benazzi M. 1958. Cariologia di Dugesia lugubris (Schmidt) (Tricladida Paludicola). Caryologia 10, 276-303.

  4. Benazzi M. 1963. Genetics of reproductive mechanisms and chromosome behaviour in some freshwater triclads. In: Daugherty EC, Brown ZN (eds). The lower Metazoa. University of California Press, Berkeley, pp. 405–422.

  5. Casu S, Pala M, Vacca RA. 1982. Distribuzione geografica in Sardegna di planarie d’acqua dolce appartenenti alla specie “Dugesia (S.) polychroa“ e “Dugesia (S.) mediterranea“. Bollettino della Società Sarda di Scienze Naturali 21, 177-184.

  6. D’Souza TG, Storhas M, Schulenburg H, Beukeboom LW, Michiels NK. 2004. Occasional sex in an 'asexual' polyploid hermaphrodite. Proceedings of the Royal Society B 271, 1001–1007.

  7. D’Souza TG, Michiels NK. 2009. Sex in parthenogenetic planarians: phylogenetic relic or evolutionary resurrection? In: Schön I, Martens K, Dijk P (eds). Lost Sex. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-2770-2_18

  8. Grohme MA, Schloissnig S, Rozanski A, Pippel M, Young GR, Winkler S, Brandl H, Henry I, Dahl A, Powell S, Hiller M, Myers E, Rink JC. 2018. The genome of Schmidtea mediterranea and the evolution of core cellular mechanisms. Nature 554, 56–61.

  9. Ivanković M, Brand JN, Pandolfini L, Brown T, Pippel M, Rozanski A, Schubert T, Grohme MA, Winkler S, Robledillo L, Zhang M, Codino A, Gustincich S, Vila-Farré M, Zhang S, Papantonis A, Marques A, Rink JC. 2024. A comparative analysis of planarian genomes reveals regulatory conservation in the face of rapid structural divergence. Nature Communications 15, 8215.

  10. Leria L, et al. 2018. Diversification and biogeographic history of the Western Palearctic freshwater flatworm genus Schmidtea (Tricladida: Dugesiidae), with a redescription of Schmidtea nova. Journal of Zoological Systematics and Evolutionary Research 56(3), 335–351.

  11. Melander Y. 1963. Cytogenetic aspects of embryogenesis in Paludicola, Tricladida. Hereditas 49, 119–166.

  12. Pongratz N, Storhas M, Carranza S, Michiels NK. 2003. Phylogeography of competing sexual and parthenogenetic forms of a freshwater flatworm: patterns and explanations. BMC Evolutionary Biology 3, 23.

  13. Ribas M. 1990. Cariologia, sistemàtica i biogeografia de les planàries d'aigües dolces al Països Catalans. PhD Thesis, University of Barcelona, Barcelona.

  14. Vila-Farré M, Sluys R, Almagro Í, Handberg-Thorsager M, Romero R. 2011. Freshwater planarians (Platyhelminthes, Tricladida) from the Iberian Peninsula and Greece: diversity and notes on ecology. Zootaxa 2779, 1–38.

Statistics

Summary

Assembly: ASM4489252v1, INSDC Assembly GCA_044892525.1
Database version: 115.1
Golden Path Length: 781,270,316
Genebuild by: Max Planck Institute for Multidisciplinary Sciences
Genebuild method: Import
Data source: Max Planck Institute for Multidisciplinary Sciences

Gene counts

Coding genes: 18,998
Non coding genes: 22,905
Small non coding genes: 22,905
Gene transcripts: 60,875