Taxonomy

REF

Kingdom Plantae
Phylum Tracheophytes
Class Angiosperms
Order Nymphaeales
Family Nymphaeaceae
Genus Nymphaea
Species N. nouchali
NCBI Taxonomy ID 210225

Introduction

REF

Sequencing, Assembly, and Annotation

Assembly
To assemble the 49.8 Gb of data, comprising 5.5 million reads, we first filtered the reads to remove organellar DNA, reads of poor quality or short length, and chimaeras. Contig-level assembly was performed on the full PacBio long reads using Canu v.1.3 [22] for both self-correction and assembly. We then polished the draft assembly using Arrow (https://github.com/PacificBiosciences/GenomicConsensus). To increase the accuracy of the assembly, Illumina short reads were used for further polishing with Pilon (https://github.com/broadinstitute/pilon). Assembly quality was assessed using BUSCO (Benchmarking Universal Single-Copy Orthologues) v.3.0 [23]. Paired-end Hi-C reads were uniquely mapped onto the draft assembly contigs, which were then grouped into chromosomes and scaffolded using LACHESIS (https://github.com/shendurelab/LACHESIS).

Genscan (http://genes.mit.edu/GENSCAN.html) and Augustus [24] were used for de novo gene prediction, with gene model parameters trained from Arabidopsis thaliana. Gene models were also predicted de novo using MAKER [25]. We then evaluated the genes by comparing the MAKER results with the corresponding transcript evidence, and selected the gene models that were most consistent with that evidence on the basis of the annotation edit distance (AED) metric.
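
The AED-based selection described above can be illustrated with a small sketch. It assumes the commonly cited definition of AED as one minus the mean of base-level sensitivity and specificity between a predicted model and its supporting evidence. It is not MAKER's own implementation, and the function names, model names, and exon coordinates are hypothetical.

```python
def covered_bases(exons):
    """Return the set of base positions covered by half-open (start, end) exons."""
    bases = set()
    for start, end in exons:
        bases.update(range(start, end))
    return bases


def aed(predicted_exons, evidence_exons):
    """Base-level AED: 0.0 means perfect agreement, 1.0 means no overlap."""
    pred = covered_bases(predicted_exons)
    evid = covered_bases(evidence_exons)
    if not pred or not evid:
        return 1.0  # nothing to compare, treated here as no support
    overlap = len(pred & evid)
    sensitivity = overlap / len(evid)   # fraction of evidence bases recovered
    specificity = overlap / len(pred)   # fraction of predicted bases supported
    return 1.0 - (sensitivity + specificity) / 2.0


def best_model(candidates, evidence_exons):
    """Pick the candidate model name with the lowest AED against the evidence."""
    return min(candidates, key=lambda name: aed(candidates[name], evidence_exons))


# Hypothetical transcript evidence and two competing gene models for one locus.
evidence = [(100, 200), (300, 400)]
candidates = {
    "model_a": [(100, 200), (300, 400)],  # matches the evidence exactly
    "model_b": [(100, 250)],              # misses the second exon
}

for name, exons in candidates.items():
    print(name, round(aed(exons, evidence), 3))
print("selected:", best_model(candidates, evidence))
```

A set of positions keeps the sketch short. For genome-scale annotation an interval-based overlap calculation would be more memory efficient.
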
References

Zhang L, Chen F, Zhang X, Li Z, Zhao Y, Lohaus R, Chang X, Dong W, Ho SYW, Liu X, Song A, Chen J, Guo W, Wang Z, Zhuang Y, Wang H, Chen X, Hu J, Liu Y, Qin Y, Wang K, Dong S, Liu Y, Zhang S, Yu X, Wu Q, Wang L, Yan X, Jiao Y, Kong H, Zhou X, Yu C, Chen Y, Li F, Wang J, Chen W, Chen X, Jia Q, Zhang C, Jiang Y, Zhang W, Liu G, Fu J, Chen F, Ma H, Van de Peer Y, Tang H. The water lily genome and the early evolution of flowering plants. Nature. 2020 Jan;577(7788):79-84. doi: 10.1038/s41586-019-1852-5. Epub 2019 Dec 18. PMID: 31853069; PMCID: PMC7015852.

Genome Statistics

REF UNRST
Metric         Value            n
Genome Size    409,930,971      806
Average        508,599.22
Largest        44,612,865
N50            25,516,871       7
N60            25,082,955       9
N70            24,868,339       10
N80            23,035,089       12
N90            12,577,093       14
N100           3,011            806
N_count        62,600
Gaps           669
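
For readers unfamiliar with the Nx values listed above, the following sketch shows how such statistics are conventionally derived from a list of sequence lengths: sort the lengths in descending order and report the length at which the running total first reaches x% of the assembly size, together with the number of sequences needed to get there. The function name and the toy lengths are illustrative assumptions, not data from this assembly.

```python
def nx_stats(lengths, percents=(50, 90, 100)):
    """Return {percent: (Nx length, number of sequences)} for each percent."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    results = {}
    for percent in percents:
        running = 0
        for count, length in enumerate(ordered, start=1):
            running += length
            # Integer comparison avoids floating-point rounding at the threshold.
            if running * 100 >= total * percent:
                results[percent] = (length, count)
                break
    return results


# Toy example: five sequences totalling 30 bases.
toy_lengths = [10, 8, 6, 4, 2]
stats = nx_stats(toy_lengths)
average = sum(toy_lengths) / len(toy_lengths)

for percent, (length, count) in stats.items():
    print(f"N{percent} {length} n={count}")
print("Average", average)
```

By this definition N100 is the length of the shortest sequence and its n equals the total sequence count, which matches the N100 row above (3,011, n=806).
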

Contacts

Liangsheng Zhang (Zhejiang University) (email: zls83@zju.edu.cn)

Fei Chen (Hainan University) (email: feichen@hainanu.edu.cn)