1. Hevea brasiliensis, the Pará rubber tree, sharinga tree, seringueira, or most commonly, rubber tree or rubber plant, is a flowering plant in the spurge family Euphorbiaceae. Originally native to the Amazon basin, it is now pantropical in distribution due to introductions.
2. Hevea brasiliensis is a tall deciduous tree growing to a height of up to 43 m (141 ft) in the wild. Cultivated trees are usually much smaller because drawing off the latex restricts their growth.
3. The white or yellow latex occurs in latex vessels in the bark, mostly outside the phloem.
4. The rubber tree takes between seven and ten years to deliver the first harvest.
5. The South American rubber tree originally grew only in the Amazon rainforest. Increasing demand and the discovery of the vulcanization procedure in 1839 led to the rubber boom in that region, enriching the cities of Belém, Santarém, and Manaus in Brazil and Iquitos in Peru from 1840 to 1913. In Brazil, the plant was initially called 'pará rubber tree', after the province of Grão-Pará, before the name was changed to 'Seringueira'. In Peru, the tree was called 'árbol del caucho', and the latex extracted from it was called 'caucho'.
The rubber tree, Hevea brasiliensis GT1, produces natural rubber that serves as an essential industrial raw material. Here, we present a high-quality reference genome for a rubber tree cultivar GT1 using single-molecule real-time sequencing (SMRT) and Hi-C technologies to anchor the ∼1.47-Gb genome assembly into 18 pseudochromosomes. The chromosome-based genome analysis enabled us to establish a model of spurge chromosome evolution, since the common paleopolyploid event occurred before the split of Hevea and Manihot. We show recent and rapid bursts of the three Hevea-specific LTR-retrotransposon families during the last 10 million years, leading to the massive expansion by ∼65.88% (∼970 Mbp) of the whole rubber tree genome since the divergence from Manihot. We identify large-scale expansion of genes associated with whole rubber biosynthesis processes, such as basal metabolic processes, ethylene biosynthesis, and the activation of polysaccharide and glycoprotein lectin, which are important properties for latex production. A map of genomic variation between the cultivated and wild rubber trees was obtained, which contains ∼15.7 million high-quality single-nucleotide polymorphisms. We identified hundreds of candidate domestication genes with drastically lowered genomic diversity in the cultivated but not wild rubber trees despite a relatively short domestication history of rubber tree, some of which are involved in rubber biosynthesis. This genome assembly represents key resources for future rubber tree research and breeding, providing novel targets for improving plant biotic and abiotic tolerance and rubber production.
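The comparison of genomic diversity between cultivated and wild trees can be illustrated with a small sketch. It assumes a windowed nucleotide-diversity (pi) ratio, which is a common way to screen for domestication candidates but is not confirmed by the text above as the authors' method. The function names, the sample size, and the allele counts are hypothetical.

```python
from typing import Sequence


def nucleotide_diversity(alt_counts: Sequence[int], n_chrom: int, window_bp: int) -> float:
    """Per-site nucleotide diversity (pi) for one genomic window.

    alt_counts: alternate-allele count at each biallelic SNP in the window
    n_chrom:    number of sampled chromosomes (2 x number of diploid trees)
    window_bp:  window length in base pairs
    """
    if n_chrom < 2 or window_bp <= 0:
        raise ValueError("need at least 2 chromosomes and a positive window length")
    total = 0.0
    for k in alt_counts:
        # Expected pairwise differences at one site: 2k(n-k) / (n(n-1))
        total += 2.0 * k * (n_chrom - k) / (n_chrom * (n_chrom - 1))
    return total / window_bp


def diversity_ratio(pi_wild: float, pi_cultivated: float, floor: float = 1e-9) -> float:
    """Ratio pi_wild / pi_cultivated; large values flag reduced diversity in cultivars."""
    return pi_wild / max(pi_cultivated, floor)


# Hypothetical 1 kb window sampled in 5 wild and 5 cultivated trees (10 chromosomes each).
pi_wild = nucleotide_diversity([5, 4, 6], n_chrom=10, window_bp=1000)
pi_cult = nucleotide_diversity([1], n_chrom=10, window_bp=1000)
ratio = diversity_ratio(pi_wild, pi_cult)
```

Windows whose ratio falls in the upper tail of the genome-wide distribution would be the candidates under this assumed scheme. The cutoff is a design choice that the abstract does not specify.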
Assembly feature | Value |
Illumina/PacBio sequencing depth (×) | 261.38/103.69 |
Estimated genome size (Mb) | 1561 |
Chromosome number (2n = 2x = ) | 36 |
Total length of assembly (Mb) | 1472 |
No. of contigs | 16 023 |
Contig N50 (kb) | 152.7 |
No. of scaffolds | 600 |
Contig number on chromosomes | 15 324 |
Chromosome length (Mb) | 1442 |
GC content of the genome (%) | 33.87 |
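The statistics above can be related to one another with a few lines of arithmetic. The sketch below derives the fraction of the estimated genome that was assembled and the fraction anchored to chromosomes, and shows how a contig N50 is conventionally computed. The variable names are illustrative, and the short contig-length list is a toy example, not data from the assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]


# Values copied from the table above (Mb and contig counts).
estimated_genome_mb = 1561
assembly_mb = 1472
chromosome_mb = 1442
contigs_total = 16023
contigs_on_chromosomes = 15324

assembled_fraction = assembly_mb / estimated_genome_mb        # share of estimated genome assembled
anchored_fraction = chromosome_mb / assembly_mb               # share of assembly placed on chromosomes
contig_anchor_fraction = contigs_on_chromosomes / contigs_total

# Toy contig lengths, only to exercise the N50 function.
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

By this arithmetic the assembly covers roughly 94% of the estimated genome size, and about 98% of the assembled length is anchored to the pseudochromosomes.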
DNA was extracted from a single GT1 individual for sequencing on the PacBio RSII and Illumina HiSeq platforms. One library with a 20 kb insert size was constructed and sequenced on 100 SMRT cells on the PacBio RSII. The PacBio data were assembled with FALCON (version 0.3.0). We generated ∼61 Gb of Illumina data from a 500 bp insert library on the HiSeq 2500 platform and used them to polish the assembled genome sequences with Pilon. For Hi-C sequencing, chromosome structure was fixed by formaldehyde crosslinking, and the DNA was then digested with the restriction enzyme MboI. A Hi-C library with 200-600 bp insert size was constructed and sequenced on the HiSeq 2000 platform. The Hi-C reads were quality-filtered with HiC-Pro, and the validly mapped reads were used to cluster and order the assembled genome sequences with ALLHiC v0.8.12.
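The role of MboI in the Hi-C step can be made concrete with an in-silico digest. MboI recognizes the palindromic site GATC and cuts immediately before the G, so fragment boundaries fall at the start of each site. The sketch below is only an illustration of that cutting rule. The function name and the toy sequence are hypothetical, and this is not part of HiC-Pro or ALLHiC.

```python
from typing import List

MBOI_SITE = "GATC"  # MboI cuts 5' of the G: ^GATC


def mboi_fragments(seq: str) -> List[str]:
    """Split a sequence at every MboI site, cutting just before each GATC."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(MBOI_SITE)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(MBOI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip the empty leading fragment that appears when the sequence starts with a site.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


fragments = mboi_fragments("AAGATCTTGATCC")
```

Because GATC is its own reverse complement, scanning one strand is enough to find every site.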
Repetitive elements were identified by combining homology-based detection with de novo searches. For the homology-based strategy, the whole-genome sequences were aligned against RepBase 21.01 using RepeatMasker. LTR_FINDER 1.0.6 (Xu and Wang, 2007) was used to identify LTR retrotransposons and build a de novo repeat library, and the genomic locations of these elements were likewise detected with RepeatMasker.
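One simple way to summarize the outcome of repeat masking is the fraction of bases that were masked. The sketch below assumes the genome was soft-masked, that is, repeats written in lowercase as RepeatMasker does with its `-xsmall` option. The text above does not say which masking mode was used, and the helper names and toy records are hypothetical.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA records from an iterable of lines into {name: sequence}."""
    records: Dict[str, list] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def masked_fraction(sequences: Dict[str, str]) -> float:
    """Fraction of all bases written in lowercase (soft-masked repeats)."""
    total = sum(len(seq) for seq in sequences.values())
    masked = sum(1 for seq in sequences.values() for base in seq if base.islower())
    return masked / total if total else 0.0


# Toy soft-masked records; a real run would pass an open file handle instead.
toy = parse_fasta([">contig1", "ACGTacgt", ">contig2", "aaTT"])
repeat_share = masked_fraction(toy)
```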
Gene models were predicted using protein evidence from five plant species closely related to the rubber tree (M. esculenta, R. communis, J. curcas, L. usitatissimum, and P. trichocarpa) together with RNA-seq data. Amino acid sequences from these genomes were aligned to the rubber tree genome assembly using BLAT (Kent, 2002) to search for candidate protein-coding sequences, and GeneWise was then run to predict gene models. RNA-seq reads from bark, stem, seed, root, leaf, and inflorescence tissues were mapped to the assembled GT1 genome sequences with TopHat, and transcripts were assembled with Cufflinks; these lines of evidence were subsequently integrated by GLEAN to generate conserved gene models. The integrated genes were compared with the Cufflinks and GeneWise results based on cultivar Reyan7-33-97, and additional transcripts or functional genes were then added to the GLEAN gene set. For functional annotation, amino acid sequences were searched against the UniProt and KEGG databases. InterProScan was run to annotate protein domains.
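The idea of combining homology-based gene models with transcript evidence can be sketched as an interval-overlap check. The code below scores how much of a predicted gene's exon length is covered by RNA-seq transcripts. It is a simplified illustration under assumed half-open coordinates, not GLEAN's actual integration algorithm, and every name and coordinate in it is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one contig


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def supported_fraction(exons: List[Interval], transcripts: List[Interval]) -> float:
    """Fraction of exon bases that overlap at least one transcript interval."""
    exon_set = merge(exons)
    exon_bp = sum(end - start for start, end in exon_set)
    if exon_bp == 0:
        return 0.0
    covered = 0
    for start, end in exon_set:
        for t_start, t_end in merge(transcripts):
            covered += max(0, min(end, t_end) - max(start, t_start))
    return covered / exon_bp


# Toy gene with two exons and one transcript spanning half of each exon.
support = supported_fraction([(100, 200), (300, 400)], [(150, 350)])
```

A model with high support from both protein alignments and transcripts would be a natural candidate to keep. The threshold for "high" is a choice that the methods text leaves unstated.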
Liu J, Shi C, Shi CC, et al. The Chromosome-Based Rubber Tree Genome Provides New Insights into Spurge Genome Evolution and Rubber Biosynthesis. Mol Plant. 2020;13(2):336-350. DOI:10.1016/j.molp.2019.10.017