TropGB: Tropical Crops Genome Database

Kenaf (Chinese name: 红麻)

Taxonomy: Angiosperms / Eudicots / Rosids / Malvales / Malvaceae / Hibiscus / H. cannabinus

Introduction

1. The main uses of kenaf fibre have been rope, twine, coarse cloth (similar to that made from jute), and paper.

2. Uses of kenaf fibre include engineered wood, insulation, clothing-grade cloth, soil-less potting mixes, animal bedding, packing material, and material that absorbs oil and liquids.

3. As part of an overall effort to make vehicles more sustainable, Ford and BMW are making automobile body material in part from kenaf.

4. Kenaf seeds yield an edible vegetable oil.

5. Kenaf paper is most commonly made by soda pulping, after which the pulp is processed in a paper machine.

Genome Overview

Kenaf (Hibiscus cannabinus, 2n = 36), a diploid plant in the Malvaceae family, is one of the most important species after cotton and jute for natural fibre production (Zhang et al., 2015a). Polyploidy is recognized as an influence on plant genome evolution, and signs of whole-genome duplication (WGD) are well established in many sequenced genomes, such as those of Gossypium species including G. raimondii (DD, D-genome) (Paterson et al., 2012), G. arboreum (AA, A-genome) (Li et al., 2014), G. hirsutum (AtDt) (Zhang et al., 2015b) and G. australe (GG, G-genome) (Cai et al., 2019). Ploidy in Hibiscus ranges from 2x to 16x, including H. phoeniceus (2n = 2x = 22), H. pedunculatus (2n = 2x = 30), H. syriacus (2n = 4x = 80), H. aspera (2n = 8x = 72) and H. rosa-sinensis (2n = 16x = 144). Recently, a draft genome of H. syriacus was assembled with a genome size of 1.75 Gb (Kim et al., 2017). In contrast to the seed fibre of cotton, bast (phloem) fibre is derived from the stem bark of plants such as kenaf, jute (Corchorus L.), hemp (Cannabis sativa), ramie (Boehmeria nivea) and flax (Linum usitatissimum). Although the genomes of the seed fibre species G. arboreum (Li et al., 2014), G. raimondii (Paterson et al., 2012) and G. hirsutum (Zhang et al., 2015b) have been sequenced, genomic information on bast fibre species is limited and molecular biology research has progressed slowly. Sequencing the kenaf genome will enhance understanding of the genetic mechanisms of bast fibre development, as it has for jute (Islam et al., 2017). Kenaf was presumably domesticated in Africa and exhibits a wide range of adaptation to different climates and soils (Zhang et al., 2015a). Kenaf has gained much attention worldwide because its high biomass yields can be used to produce paper, rope, building materials, livestock feed, absorbents and other products. The annual global production of jute, kenaf and allied fibres generates a farm value of US$2.3 billion (http://www.fao.org/faostat/en/#data/QC).

Leaves are the primary source of photoassimilate in crops. Kenaf shows remarkable phenotypic variation in leaf shape, with two types: round (entire) and lobed. Leaf shape in kenaf is an important trait that affects canopy architecture, yield and other plant attributes. A typical lobed-leaf kenaf cultivar produces a lower canopy of round leaves before transitioning to an upper canopy of tri-, penta- and septi-lobed leaves, a growth stage that is associated with bast fibre development. Leaf shape in kenaf is unique, and breeders have used a single locus to purposefully alter leaf shape among cultivars, especially hybrids. Bast fibre makes up 35–40% of kenaf stem weight and can be processed into high-quality industrial materials because of its low content of woody impurities and pectin (Xiong, 2008). A precise understanding of the genetic architecture underlying leaf morphology and bast fibre is critical for improving the fibre yield and quality of climate-resilient kenaf varieties.

Genome Information

Total size of assembly (Mb): 1078
Number of chromosomes: 18
Number of contigs: 1990
Longest length (Mb): 79
N50 (Mb): 56
GC content (%): 37.6
Transposable elements (%): 67.83
Gene density: 0.61
miRNAs: 131

Sequencing, assembly and annotation

Hibiscus cannabinus var. ‘Fuhong 952’ was chosen for genome sequencing. The genome size of H. cannabinus was estimated at 1000 Mbp using flow cytometry with the Arabidopsis thaliana genome as a reference (Figure S1). A high-quality H. cannabinus genome was obtained by combining single-molecule real-time (SMRT) long reads, Illumina short reads, chromatin conformation capture technology (Hi-C) and a high-density genetic map. Approximately 77 Gb (~80× coverage) of raw SMRT data were generated using the PacBio Sequel System. The contig-level assembly was performed on PacBio long reads using the CANU package (Koren et al., 2017) (Table S1). The resulting assembly contains 1078 Mbp of sequence, similar to the genome size estimated by flow cytometry, with a contig N50 of 2.73 Mbp and a longest contig of 18.2 Mbp (Table 1). Hi-C libraries yielded 212 million 150-bp paired-end Illumina reads (Table S2). Karyotype analysis revealed 18 pairs of chromosomes in H. cannabinus (Figure S2). Based on the number of chromosomes, these paired-end Hi-C reads were uniquely mapped onto the assembly contigs and grouped into 18 pseudo-chromosomes (Burton et al., 2013) (Figure 1a, Figure S3, Table S3).
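The contig N50 and coverage figures quoted above follow standard definitions, which can be sketched as below. This is an illustrative sketch only, not the pipeline used for the kenaf assembly: the function names and the toy contig lengths are hypothetical, and only the 77 Gb data volume and the 1000 Mbp flow cytometry estimate come from the text.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def coverage(total_read_bases, genome_size):
    """Average sequencing depth: read bases divided by genome size."""
    return total_read_bases / genome_size


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70

# Figures from the text: about 77 Gb of SMRT data, 1000 Mbp estimated genome.
print(coverage(77e9, 1000e6))
```

Against the 1000 Mbp estimate, 77 Gb of reads corresponds to roughly 77-fold depth, which is in the same range as the approximate coverage quoted in the text.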

To increase the accuracy of the assembly, Illumina HiSeq short reads (Table S4) were used to polish it with the Pilon program (Walker et al., 2014). A total of 121.75 million (99.3%) reads were mapped to the assembly (Table S5). The quality of the assembly was further assessed by mapping RNA-Seq reads to the genome. A total of 441 970 of 485 096 (91.11%) transcripts could be aligned to at least one contig with 90% sequence identity. Of the transcripts with minimum lengths of 500, 1000 and 2000 bp, 97.38%, 99.21% and 99.80%, respectively, could be aligned to our kenaf genome assembly (Table S6). Moreover, 234 (94.4%) of the 248 ultra-conserved core eukaryotic genes (CEGs) from CEGMA analysis (Parra et al., 2007) (Table S7) and 1375 (95.5%) of the 1440 conserved genes from BUSCO analysis (Simao et al., 2015) (Table S8) were completely recovered in our assembly. These results indicate a high-quality assembly with a high level of completeness. A high-resolution genetic map based on 3828 evenly distributed single-nucleotide polymorphism (SNP) markers, derived from a ‘Zanyin No. 1’ × ‘Fuhong 952’ F2 population of 390 individuals, showed that 99.44% (1072 out of 1078 Mbp) of the assembled genome was anchored and oriented to 18 pseudo-chromosomes (Table S9; Figure S4).
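The completeness percentages in this paragraph are simple ratios of the reported counts. The sketch below recomputes them from the numbers given in the text; the helper name `percent` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def percent(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# (recovered, total) pairs taken from the text above.
checks = {
    "transcripts aligned": (441970, 485096),
    "CEGMA core genes": (234, 248),
    "BUSCO conserved genes": (1375, 1440),
    "anchored sequence (Mbp)": (1072, 1078),
}

for label, (part, whole) in checks.items():
    print(f"{label}: {percent(part, whole)}%")
```

Rounded to one decimal, the CEGMA and BUSCO ratios give the 94.4% and 95.5% quoted in the text.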

Based on this reference genome of H. cannabinus, 66 004 genes were annotated by combining ab initio gene prediction, homologous protein searches and assembly of RNA-Seq reads. The average gene length in H. cannabinus is 3225.7 bp and the average number of exons per gene is 5.78 (Table S10); these values were compared with those of the G. raimondii genome (Paterson et al., 2012). In most H. cannabinus chromosomes, genes were enriched in the sub-telomeric regions, while transposable elements were distributed mainly in gene-poor regions (Figure 1a). To identify the putative functions of genes, the annotated kenaf genes were compared against protein sequences from various species available in public databases, with an E-value threshold of 10⁻⁵. Of these 66 004 kenaf genes, 53 686 (81.20%) were present in at least one published genome, including T. cacao (Argout et al., 2011), G. hirsutum (Li et al., 2015), G. raimondii (Paterson et al., 2012), A. thaliana (Arabidopsis Genome Initiative, 2000; Riechmann et al., 2000) or O. sativa (Goff et al., 2002). This indicates the high accuracy of the H. cannabinus gene predictions (Table S11). Among these kenaf genes, 46 823 (70.82%) and 45 607 (68.98%) displayed high similarity to known proteins in T. cacao and G. raimondii, respectively, which also belong to the Malvaceae. However, the number of mapped genes in H. cannabinus (46 822) is about twice that in T. cacao (18 627) and G. raimondii (24 935) (Table S11), which suggests a possible mechanism for the dramatic increase in the number of genes in H. cannabinus. A total of 131 microRNAs (miRNAs) were also identified by searching public miRNA databases (Table 1; Table S12). Further, 39 telomere fragments (Table S13) and 3572 centromere fragments (Table S14) were identified. Transposable elements (TEs) were predicted to make up 67.83% of the H. cannabinus genome (Table 2) and were divided into two main classes, I and II, comprising 58.41% retro-elements and 8.7% DNA transposons, respectively.
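Counting genes that have at least one database match under an E-value threshold can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not name the search tool, so the input is assumed to be 12-column tab-separated output in the style of BLAST tabular format 6 (query ID in column 1, E-value in column 11), hits at exactly the threshold are assumed to pass, and all gene and protein identifiers are hypothetical.

```python
E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def genes_with_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return the set of query gene IDs with at least one hit whose
    E-value is at or below `cutoff` (assumed 12-column tabular input)."""
    matched = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue <= cutoff:
            matched.add(query_id)
    return matched


# Hypothetical search results, for illustration only.
toy_hits = [
    "geneA\tprot1\t92.1\t300\t20\t1\t1\t300\t5\t304\t2e-30\t410",
    "geneB\tprot2\t35.0\t80\t50\t2\t10\t89\t40\t119\t0.003\t38.5",
    "geneC\tprot3\t71.4\t210\t55\t3\t1\t210\t1\t208\t1e-12\t150",
    "geneA\tprot4\t60.2\t250\t90\t4\t5\t254\t9\t255\t4e-8\t120",
]

matched = genes_with_hits(toy_hits)
total_genes = 3  # geneA, geneB and geneC in this toy set
print(sorted(matched), f"{100 * len(matched) / total_genes:.2f}% of genes matched")
```

Using a set means a gene with several qualifying hits is counted once, which is the quantity a "genes present in at least one published genome" figure needs.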

Reference Publication(s)

Liwu Zhang, et al. The genome of kenaf (Hibiscus cannabinus L.) provides insights into bast fibre and leaf shape biogenesis. Plant Biotechnology Journal. 2020. https://pubmed.ncbi.nlm.nih.gov/31975524/