TropGB: Tropical Crops Genome Database

Olive

Taxonomy:    Angiosperms / Eudicots / Asterids / Lamiales / Oleaceae / Olea

Introduction

1. The olive tree, Olea europaea, has been cultivated for olive oil, fine wood, olive leaf, ornamental reasons, and the olive fruit. About 90% of all harvested olives are turned into oil, while about 10% are used as table olives.

2. Olive trees show a marked preference for calcareous soils, flourishing best on limestone slopes and crags and in coastal climates.

3. Olives are one of the most extensively cultivated fruit crops in the world.

4. One hundred grams of cured green olives provide 146 calories, are a rich source of vitamin E, and contain a large amount of sodium; other nutrients are insignificant.

5. Olive tree pollen is extremely allergenic, with an OPALS allergy scale rating of 10 out of 10.

Genomic Version Information

Olea europaea v1.0

Genome Overview

Olea europaea, a major source of edible oil, is a key ingredient of the healthy Mediterranean diet. This plant is also a strategic crop from the socioeconomic point of view. Although the exact ancestor(s) of the olive is unknown [1], cultivated varieties are thought to have stemmed from the wild olive, called oleaster, in Asia Minor and subsequently spread to the Mediterranean [2]. The economic importance of this crop includes its cultivation, the harvesting of olives and the production and marketing of the olives and olive oil [3]. The olive tree produces bioactive micronutrients, including antioxidant phenolic compounds (oleuropein, hydroxytyrosol, alpha-tocopherol or vitamin E, carotenes, etc.), with key relevance in pharmacy, medicine and cosmetics, among other applications. Indeed, its fruit (olives) and its juice (olive oil) are key elements of the healthy Mediterranean diet, and olive-tree leaves are being used to produce extracts, isolate commercial antioxidants, and make wound-healing products [4]. Olive oil consumption is growing outside the Mediterranean basin's traditional olive-grove areas, including America (United States, Mexico, Brazil, Argentina, Peru), Asia (China and India) and Australasia (Australia), as reported by FAO in 2015. This expansion is mainly due to recognition of the dietetic properties of olive oil as a source of healthy fatty acids and micronutrients (antioxidants such as phenolic compounds, vitamin E, carotenes, etc.).

Genome Information

Assembly Source: OGC
Assembly Version: v1.0
Annotation Source: OGC
Annotation Version: v1.0
Total Scaffold Length (bp): 1,142,316,613
Min. Number of Scaffolds containing half of assembly (L50): 23
Shortest Scaffold from L50 set (N50): 12,567,911
Total Contig Length (bp): 1,031,504,502
Number of Contigs: 88,889
Min. Number of Contigs containing half of assembly (L50): 6,365
Shortest Contig from L50 set (N50): 43,584
Number of Protein-coding Transcripts: 50,684
Number of Protein-coding Genes: 50,684
Percentage of Eukaryote BUSCO Genes: 90.1
Percentage of Embryophyte BUSCO Genes: 79.9
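
The L50 and N50 entries in the list above follow the usual definitions: sort the sequences from longest to shortest, accumulate lengths until at least half of the total is covered, and report how many sequences that took (L50) and the length of the last one added (N50). The following is a minimal sketch of that calculation, not the pipeline used for this release. The function name `n50_l50` and the toy lengths are illustrative assumptions and are not taken from the olive assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    N50: length of the shortest sequence in the smallest set of longest
         sequences that together cover at least half of the total length.
    L50: number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2

    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy data (hypothetical, for illustration only): total length is 300,
# so half is 150, which is reached after the two longest sequences.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")  # N50 = 70, L50 = 2
```

Applied to the full list of scaffold or contig lengths of an assembly, the same function would give values comparable to the scaffold and contig entries reported here.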

Assembly

This release represents the Olea europaea var. sylvestris genome, including ~1.14 Gbp of wild olive sequence. SOAPdenovo [5] was first used to assemble the sequence reads, which resulted in a draft genome assembly of 1.48 Gbp with a scaffold N50 of 228 kbp. The assembly was then improved by using 42,843 scaffolds larger than 1 kbp (N50 = 364.6 kbp). Using a genetic map, ~50% of sequences longer than 1 kbp (~572 Mbp) were anchored into 23 linkage groups.

Gene Prediction and Locus Naming

Gene prediction for the oleaster genome combined three approaches: (i) ab initio prediction, (ii) homology-based prediction and (iii) transcriptome mapping. GLEAN [6] was used to consolidate the results. For the homology-based step, protein sequences of Arabidopsis thaliana, Sesamum indicum, Solanum tuberosum and Vitis vinifera were aligned to the genome with TBLASTN and genBLASTA [8], and GeneWise [8] was run on the matching genomic sequences to obtain accurate spliced alignments. Next, the de novo gene-prediction methods GlimmerHMM [9] and Augustus [10] were used to predict protein-coding genes, with parameters trained for O. europaea var. sylvestris, A. thaliana, S. indicum, S. tuberosum and V. vinifera. A total of 50,684 protein-coding genes were predicted on the current assembly, of which 47,124 (93%) were confirmed by transcriptome data. Additionally, 31,245 genes were located on the anchored chromosomes.

Both homology-based and de novo approaches were also used to find repeats and transposable elements (TEs) in the oleaster genome. The homology-based approach applied commonly used databases of known repetitive sequences together with programs such as RepeatProteinMask and RepeatMasker [11]. Tandem repeats were identified with Tandem Repeats Finder [12]. In total, 51% of the genome assembly is composed of repetitive DNA, with TEs and interspersed repeats occupying ~43% of the genome.
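
The gene-set proportions quoted in this section can be re-derived from the counts given on this page. The short sketch below does only that arithmetic. The constant and function names are illustrative assumptions, and the anchored-gene percentage is a derived figure that the page itself does not state.

```python
# Counts copied from this page.
TOTAL_GENES = 50_684           # predicted protein-coding genes
TRANSCRIPT_SUPPORTED = 47_124  # genes confirmed by transcriptome data
ANCHORED_GENES = 31_245        # genes located on the anchored chromosomes


def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


supported_pct = percent(TRANSCRIPT_SUPPORTED, TOTAL_GENES)
anchored_pct = percent(ANCHORED_GENES, TOTAL_GENES)

print(f"Transcriptome-supported genes: {supported_pct:.1f}%")  # 93.0%
print(f"Genes on anchored chromosomes: {anchored_pct:.1f}%")   # 61.6%
```

The first value matches the 93% reported in the text. The second indicates that roughly three in five predicted genes fall on anchored sequence.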

Reference Publication(s)

Unver, T., Wu, Z., Sterck, L., Turktas, M., Lohaus, R., Li, Z., … Van de Peer, Y. (2017). Genome of wild olive and the evolution of oil biosynthesis. Proceedings of the National Academy of Sciences, 114(44). https://doi.org/10.1073/pnas.1708621114