Taxonomy
Kingdom | Plantae
Phylum | Tracheophytes
Class | Angiosperms
Order | Nymphaeales
Family | Nymphaeaceae
Genus | Victoria
Species | V. cruziana
NCBI Taxonomy ID | 85961
Introduction
- Victoria amazonica is a species of flowering plant, the second largest in the water lily family Nymphaeaceae.
- Victoria amazonica has very large leaves (laminae), commonly called "pads" or "lily pads", up to 3 m (10 ft) in diameter,
that float on the water's surface on a submerged stalk (petiole)
7–8 m (23–26 ft) in length, rivaling the length of the green anaconda.
- The flowers can grow up to 40 cm (16 in) in diameter and 3.5 pounds (1.6 kilograms) in weight, exceeded in mass only by members of the genus Rafflesia.
- The stem and the underside of the leaves are coated with many small spines that defend the plant from fish and other underwater herbivores.
The spines can also play an offensive role, crushing rival plants in the vicinity as the leaf unfolds. The lily aggressively seeks and hogs sunlight,
depriving plants directly beneath its leaves of this vital resource and significantly darkening the water below.
Genome Statistics
Genome size (bp) | 3,259,806,839 (n=1,550)
Chromosome Number | 12
Contig Number | 1,538
GC content | 29.61%
Number of protein-coding genes | 186,889
Mean transcript length (bp) | 3630.46
Mean exon length (bp) | 243.372
Mean exons per mRNA | 3.32578
Average sequence length (bp) | 2,103,101.19
Largest sequence (bp) | 344,633,227
N50 | 279,747,614 (n=6)
N60 | 257,272,190 (n=7)
N70 | 248,851,342 (n=8)
N80 | 220,831,560 (n=9)
N90 | 203,266,737 (n=11)
N100 | 2,000 (n=1,550)
N_count | 1,480,519
Gaps | 2,980
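The Nx rows in the statistics above can be read as follows: sort all sequences from longest to shortest and add up their lengths until x% of the total assembly size is reached; Nx is the length of the last sequence added. The `n=` value is assumed here to be the number of sequences needed to reach that point (often called Lx), which is consistent with the N100 row listing every sequence. A minimal Python sketch with made-up toy lengths (the function name `nx_stats` is hypothetical and is not taken from the database or from any tool cited on this page):

```python
def nx_stats(lengths, x):
    """Return (Nx, Lx) for a list of sequence lengths and a percentage x.

    Nx: length of the sequence at which the running sum of lengths,
        taken from longest to shortest, first reaches x% of the total.
    Lx: number of sequences needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, count
    raise ValueError("lengths must contain at least one sequence")


def mean_length(lengths):
    """Average sequence length: total assembly size divided by sequence count."""
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    toy_lengths = [80, 70, 50, 40, 30, 20, 10]  # illustrative values only
    for x in (50, 90, 100):
        nx, lx = nx_stats(toy_lengths, x)
        print(f"N{x} = {nx} (n={lx})")
    print(f"Average = {mean_length(toy_lengths):.2f}")
```

Under this definition N100 is always the shortest sequence, and the average is the total size divided by the sequence count, which matches how the genome size and average rows relate to each other.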
Contacts
Victoria cruziana: Liangsheng Zhang (Zhejiang University) (email: zls83@zju.edu.cn)
Sequencing, Assembly, and Annotation
Assembly
An initial genome sequence assembly (VB01) was generated with all ONT reads using Shasta
v0.10.0 with the config set to ‘Nanopore-May2022’. A second assembly
(VB02) was generated with NextDenovo2 v2.5.2 (Hu et al., 2024) using the config file that is
presented in Additional File 1. Whitespace in contig names was removed with “sed 's, ,_,g'
asm.fasta > asm.rm-wht-spc.fasta”. General assembly statistics such as assembly size, number of
contigs, and N50 were calculated with contig_stats3.py (Meckoni et al., 2023). A completeness
assessment was conducted with BUSCO v5.7.1 (Simão et al., 2015; Manni et al., 2021) using
the genome or transcriptome mode according to the input data type and the reference dataset
embryophyta_odb10.
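The whitespace cleanup described above can also be sketched in Python. This is an illustrative equivalent, not the authors' method: the function names are hypothetical, and unlike the `sed` command, which replaces spaces on every line, this version only touches header lines (those starting with `>`). For a well-formed FASTA file the result should be the same, since sequence lines contain no spaces.

```python
def clean_fasta_headers(lines):
    """Yield FASTA lines with spaces in header lines replaced by underscores."""
    for line in lines:
        if line.startswith(">"):
            # The trailing newline is not a space, so it is preserved.
            yield line.replace(" ", "_")
        else:
            yield line


def clean_fasta_file(src_path, dst_path):
    """Stream src_path to dst_path, cleaning header lines along the way."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(clean_fasta_headers(src))


if __name__ == "__main__":
    example = [">contig 1 len=8\n", "ACGTACGT\n", ">contig 2 len=4\n", "GGCC\n"]
    print("".join(clean_fasta_headers(example)), end="")
```

For the file names used in the text, the call would be `clean_fasta_file("asm.fasta", "asm.rm-wht-spc.fasta")`.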
Structural annotation
The prediction of gene models was conducted with AUGUSTUS v3.3 (Stanke et al., 2006; Keller
et al., 2011) for the assembly VB01 with parameters previously optimized for the detection of
non-canonical splice sites (Pucker et al., 2017). For the assembly VB02, BRAKER3 v3.0.8
(Gabriel et al., 2023) was applied with protein hints derived from Viridiplantae.fa (Kuznetsov et
al., 2023) and RNA-seq hints. The RNA-seq hints were derived by mapping paired-end
RNA-seq data against the assembly VB02 with HISAT2 v2.2.1 (Kim et al., 2019) using default
parameters. Samtools v1.10 (using htslib 1.10.2-3ubuntu0.1) (Li et al., 2009) was
applied to generate a sorted BAM file which was then passed to BRAKER3. The final structural
annotation was largely based on the BRAKER3 results produced for VB02, complemented only
with the coding sequence of an anthocyanidin synthase (ANS) gene model lifted from the VB01
annotation. The derived polypeptide sequence of ANS was compared against the VB02
assembly via tBLASTn v2.15.0+ (Gertz et al., 2006) to locate this gene.
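The order of the RNA-seq hint steps described above (index and map with HISAT2, sort with Samtools, pass the sorted BAM file to BRAKER3) can be sketched as a list of command lines. The exact invocations are not given in the text, so everything below is an assumption for illustration: the file names, the `prefix` argument, and the function name are hypothetical, and only basic, commonly documented options of each tool are used. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def rnaseq_hint_commands(genome, reads_1, reads_2, proteins, prefix="vb02"):
    """Build (but do not run) command lines for the RNA-seq hint workflow.

    All paths are placeholders supplied by the caller.
    """
    index = f"{prefix}_index"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    return [
        # Build a HISAT2 index of the assembly.
        ["hisat2-build", genome, index],
        # Map paired-end reads with default parameters, writing SAM output.
        ["hisat2", "-x", index, "-1", reads_1, "-2", reads_2, "-S", sam],
        # Convert to a coordinate-sorted BAM file.
        ["samtools", "sort", "-o", bam, sam],
        # Run BRAKER3 with protein and RNA-seq (BAM) evidence.
        ["braker.pl", f"--genome={genome}", f"--prot_seq={proteins}", f"--bam={bam}"],
    ]


if __name__ == "__main__":
    commands = rnaseq_hint_commands(
        genome="asm.rm-wht-spc.fasta",   # placeholder assembly file
        reads_1="rnaseq_R1.fastq.gz",    # placeholder read files
        reads_2="rnaseq_R2.fastq.gz",
        proteins="Viridiplantae.fa",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to hand each one to `subprocess.run(cmd, check=True)` once the placeholder paths are replaced with real files.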
Functional annotation and candidate gene identification
Genes associated with anthocyanin metabolism were identified with three dedicated tools.
To identify the structural genes of the flavonoid biosynthesis pathway, an analysis with KIPEs
v3.2.4 (Rempel et al., 2023) and the flavonoid baits data set v.3.1.7 was conducted (Additional
File 2). Flavonoid biosynthesis controlling MYB transcription factors were annotated using the
MYB_annotator v1.0.1 (Pucker, 2022) with parameters described in Additional File 3. The bHLH
transcription factors were identified with the bHLH_annotator v1.04 (Thoben & Pucker, 2023)
with parameters described in Additional File 4. A general annotation was produced using
construct_anno.py and the functional annotation available for A. thaliana (Pucker & Iorizzo,
2023).
References
Genome sequence and RNA-seq analysis reveal genetic basis of flower coloration in the giant water lily Victoria cruziana
Genome Statistics
Genome size (bp) | 3,544,819,870 (n=837)
Chromosome Number | 12
Contig Number | 837
GC content | 29.48%
Number of protein-coding genes | 0
Mean transcript length (bp) | 948.234
Mean exon length (bp) | 262.396
Mean exons per mRNA |
Average sequence length (bp) | 4,235,149.19
Largest sequence (bp) | 59,784,015
N50 | 14,287,621 (n=70)
N60 | 11,690,671 (n=97)
N70 | 8,565,609 (n=133)
N80 | 6,228,323 (n=182)
N90 | 3,707,677 (n=252)
N100 | 13,124 (n=837)
N_count | 0
Gaps | 0
Contacts
Victoria cruziana: Melina Sophie Nowak (Technische Universität Braunschweig) (team: https://www.tu-braunschweig.de/en/ifp/pbb/team/melina-sophie-nowak)