TropGB: Tropical Crops Genome Database

Sugarcane /甘蔗

Taxonomy: Magnoliopsida / Liliidae / Commelinanae / Poales / Poaceae / Saccharum / Saccharum spontaneum

Introduction

1. Sugarcane is a tall, solid-stemmed perennial herb. The rhizome is stout and well developed. The stalk height is 3-5 (-6) meters. It is widely cultivated in the tropics and, within China, in Taiwan, Fujian, Guangdong, Hainan, Guangxi, Sichuan, Yunnan, and other regions.

2. Sugarcane is a crop of temperate and tropical regions. It is the raw material for cane sugar and can also be refined into ethanol as an alternative energy source. More than 100 countries around the world produce sugarcane, and the largest producers are Brazil, India, and China.

3. Sugarcane is an annual or perennial herb of tropical and subtropical regions, and it is a C4 crop.

4. Sugarcane is a tall, solid-stemmed perennial herb with a stout, well-developed rhizome.

5. Land preparation provides the deep, loose, and fertile soil that sugarcane needs for root growth, so that the root system can absorb water and nutrients effectively. Land preparation also reduces diseases, insect pests, and weeds in sugarcane fields.

Genomic Version Information

Saccharum spontaneum Np-X

Genome Overview

Saccharum spontaneum is a founding Saccharum species and exhibits wide variation in ploidy levels. We have assembled a high-quality autopolyploid genome of S. spontaneum Np-X (2n=4x=40) into 40 pseudochromosomes across 10 homologous groups, which better elucidates the recent chromosome reduction and polyploidization that occurred circa 1.5 million years ago (Mya). One paleo-duplicated chromosomal pair in Saccharum, NpChr5 and NpChr8, underwent fission followed by fusion, accompanied by a centromeric split, around 0.80 Mya. We inferred that Np-X, with x=10, most likely represents the ancestral karyotype, from which x=9 and x=8 evolved. Resequencing of 102 S. spontaneum accessions revealed that S. spontaneum originated in northern India from an x=10 ancestor, which then radiated into four major groups across the Indian subcontinent, China, and Southeast Asia. Our study suggests new directions for accelerating sugarcane improvement and expands our knowledge of the evolution of autopolyploids.

Genome Information

Genome size: 2.8 Gb
Total ungapped length: 2.8 Gb
Number of chromosomes: 40
Number of scaffolds: 1,033
Scaffold N50: 68.6 Mb
Scaffold L50: 17
Number of contigs: 15,510
Contig N50: 381.9 kb
Contig L50: 2,133
GC percent: 44.5
Genome coverage: 18.0x
Assembly level: Chromosome
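
The N50 and L50 values above summarize assembly contiguity. As a minimal sketch of the standard definitions (N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly, and L50 is the number of sequences in that set), the helper below computes both from a list of lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not data from this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy lengths (hypothetical, arbitrary units); total = 300, half = 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # prints "70 2"
```

Read this way, a scaffold L50 of 17 means that the 17 longest scaffolds together cover at least half of the assembly, and the shortest of those 17 is about 68.6 Mb long.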

Sequencing, Assembly, and Annotation

Fresh, young leaves were collected from individual S. spontaneum Np-X plants cultivated in a greenhouse kept at 25–30°C with 16 h of light per day. Genomic DNA was isolated from these young leaves using a cetyltrimethylammonium bromide (CTAB) method. The extracted DNA was then packed on dry ice and sent to Biomarker Technologies Corporation for the construction of CCS libraries and Illumina short-read libraries, which were subsequently sequenced on the PacBio Sequel and Illumina HiSeq platforms, respectively. A total of ~52 Gb of PacBio HiFi reads and ~417 Gb of Illumina reads were generated for de novo assembly of the S. spontaneum Np-X genome.
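
As a rough check on these data volumes, average sequencing depth can be estimated as total sequenced bases divided by genome size. The sketch below applies that formula to the read totals given above, using the 2.8 Gb genome size from the Genome Information section. The function name `sequencing_depth` is an illustrative assumption, and because the inputs are rounded the results are only approximate. The source does not state how its reported genome coverage of 18.0x was derived.

```python
def sequencing_depth(total_bases_gb, genome_size_gb):
    """Estimate average depth as sequenced bases divided by genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome_size_gb must be positive")
    return total_bases_gb / genome_size_gb


GENOME_SIZE_GB = 2.8  # genome size reported for Np-X

hifi_depth = sequencing_depth(52, GENOME_SIZE_GB)       # ~52 Gb of PacBio HiFi reads
illumina_depth = sequencing_depth(417, GENOME_SIZE_GB)  # ~417 Gb of Illumina reads

print(f"PacBio HiFi: ~{hifi_depth:.1f}x")   # prints "PacBio HiFi: ~18.6x"
print(f"Illumina: ~{illumina_depth:.1f}x")  # prints "Illumina: ~148.9x"
```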

Total RNA was extracted from mature stems and leaves of S. spontaneum Np-X using TRIzol (Invitrogen). These two RNA samples were sent to Novogene for the construction of transcriptome libraries and were sequenced on an Illumina platform.

Fresh, tender leaves were collected from S. spontaneum Np-X plants and fixed with formaldehyde to crosslink chromatin to DNA, and the crosslinked DNA was then digested with HindIII. The resulting sticky ends were biotinylated and proximity-ligated to form chimeric junctions. The processed DNA was further enriched and physically sheared into fragments of 300–500 base pairs (bp). All of the prepared DNA fragments were then processed into paired-end sequencing libraries. Finally, a total of ~290 Gb (~105×) of 150-bp paired-end Hi-C reads were generated on the Illumina platform and used for genome scaffolding.
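
To illustrate the digestion step, the sketch below performs an in silico HindIII digest of a short sequence. HindIII recognizes AAGCTT and cuts after the first A (A^AGCTT), leaving AGCT overhangs. The function name `hindiii_fragments` and the toy sequence are illustrative assumptions. This is a single-strand simplification for intuition only, not part of the published Hi-C workflow.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence
CUT_OFFSET = 1           # the enzyme cuts after the first A: A^AGCTT


def hindiii_fragments(sequence):
    """Split a DNA sequence at every HindIII cut position (top strand only)."""
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(HINDIII_SITE)
    while start != -1:
        cut_positions.append(start + CUT_OFFSET)
        start = seq.find(HINDIII_SITE, start + 1)

    fragments = []
    previous = 0
    for cut in cut_positions:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])  # remainder after the last cut
    return fragments


# Toy sequence (hypothetical) containing two HindIII sites.
toy = "GGGAAGCTTCCCCAAGCTTGG"
print(hindiii_fragments(toy))  # prints "['GGGA', 'AGCTTCCCCA', 'AGCTTGG']"
```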

Transcriptomic data were obtained from RNA isolated from stem and leaf tissues of S. spontaneum Np-X. Trimmomatic was used to filter the RNA sequencing (RNA-seq) data, and then HISAT2 (v2.1.0), which blocks duplicates, was used to align reads to the reference genome sequence. After the reads were aligned, different coverage thresholds were set according to the sequencing depth of each aligned region to obtain reliable intron and optimal transcript information. Then, we used TransDecoder (v5.5.0) (https://github.com/TransDecoder/TransDecoder/wiki) to predict the open reading frames (ORFs) of optimal transcripts and define gene models. The optimal gene models were then screened and used to train AUGUSTUS (v3.3.2). We chose protein sequences from closely related species (maize, sorghum, rice and sugarcane) and used them as input for Genewise (wise2-4-1) software (https://github.com/brewsci/homebrew-bio/blob/master/Formula/genewise.rb) for gene prediction in the S. spontaneum Np-X genome. Next, we collected exon information for homologous proteins and transcripts, and collected intron information by comparing reads. AUGUSTUS was used to combine the above intron and exon information for gene prediction. The results of the above three methods were integrated, and then the Pfam database was used for screening to obtain final gene prediction results. Finally, we used a Perl script to analyze the final assembled genome for eukaryotic genes and obtained a total of 123,128 high-confidence gene models.
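
To give a sense of what ORF prediction on a transcript involves, the sketch below scans the three forward reading frames of a sequence for stretches that begin with ATG and end at the first in-frame stop codon. This is a deliberately simplified illustration, not TransDecoder's algorithm, which applies additional scoring and filtering. The function names, the `min_codons` parameter, and the toy transcript are all assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, orf_sequence) for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon.
    min_codons counts all codons in the ORF, including the stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        orf_start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if orf_start is None and codon == "ATG":
                orf_start = pos
            elif orf_start is not None and codon in STOP_CODONS:
                end = pos + 3
                if (end - orf_start) // 3 >= min_codons:
                    orfs.append((orf_start, end, seq[orf_start:end]))
                orf_start = None
    return orfs


def longest_orf(sequence, min_codons=3):
    """Return the longest ORF found, or None if there is none."""
    orfs = find_orfs(sequence, min_codons)
    return max(orfs, key=lambda orf: orf[1] - orf[0]) if orfs else None


# Toy transcript (hypothetical) with one ORF in the third reading frame.
toy_transcript = "CCATGAAATTTGGGTAACC"
print(longest_orf(toy_transcript))  # prints "(2, 17, 'ATGAAATTTGGGTAA')"
```

A real pipeline would also scan the reverse complement and handle ORFs that run off the end of a partial transcript, both of which this sketch omits.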

Reference Publication(s)

1. Zhang Q et al., "Genomic insights into the recent chromosome reduction of autopolyploid sugarcane Saccharum spontaneum," Nat Genet, 2022 Jun;54(6):885-896.