Genome mining of novel rubiginones from Streptomyces sp. CB02414 and characterization of the post-PKS modification steps in rubiginone biosynthesis

Background Rubiginones belong to the angucycline family of aromatic polyketides, and they have been shown to potentiate the vincristine (VCR)-induced cytotoxicity against VCR-resistant cancer cell lines. However, the biosynthetic gene clusters (BGCs) and biosynthetic pathways for rubiginones have not yet been reported. Results In this study, based on bioinformatics analysis of the genome of Streptomyces sp. CB02414, we predicted the functions of its two type II polyketide synthase (PKS) BGCs. The rub gene cluster was predicted to encode metabolites of the angucycline family. Scale-up fermentation of the CB02414 wild-type strain led to the discovery of eight rubiginones, including five new ones (rubiginones J, K, L, M, and N). Rubiginone J was proposed to be the final product of the rub gene cluster, which features extensive oxidation on the A-ring of the angucycline skeleton. Based on the production profiles of the CB02414 wild-type and the mutant strains, we proposed a biosynthetic pathway for the rubiginones in CB02414. Conclusions A genome mining strategy enabled the efficient discovery of new rubiginones from Streptomyces sp. CB02414. Based on the isolated biosynthetic intermediates, a plausible biosynthetic pathway for the rubiginones was proposed. Our research lays the foundation for further studies on the mechanism of the cytochrome P450-catalyzed oxidation of angucyclines and for the generation of novel angucyclines using combinatorial biosynthesis strategies. Supplementary Information The online version contains supplementary material available at 10.1186/s12934-021-01681-5.


Background
Angucyclines are aromatic polyketides with an angular tetracyclic benz[a]anthracene skeleton [1]. By virtue of their structural and functional diversity, angucyclines have long held the attention of chemists and biologists. Since the isolation of tetrangomycin from Streptomyces rimosus in 1965 [2], the number of angucyclines has increased steadily, with more than 300 compounds discovered to date. The benz[a]anthracene scaffold of the angucyclines is formed by folding a decaketide chain, which is biosynthesized by type II polyketide synthases (PKSs), to generate the UWM6 intermediate [3]. This common intermediate is then decorated by various post-PKS modifications, such as oxidations [4,5], ring rearrangement/contraction [6], and glycosylations [7,8], to form numerous angucyclines. The angucyclines have

Bioinformatics analysis of the Streptomyces sp. CB02414 genome
Streptomyces sp. CB02414 was isolated from a soil sample collected on the beach of Dubai, United Arab Emirates. It was initially screened as a potential enediyne producer [17] and the presence of an enediyne BGC in the CB02414 genome (Accession number LIPF00000000.1) was confirmed by the antiSMASH analysis [18]. According to the antiSMASH result, CB02414 contains 27 BGCs, including three BGCs encoding nonribosomal peptides, six BGCs for terpenoids, three BGCs for siderophores and six BGCs for polyketides (Additional file 1: Table S1). Among the six polyketide BGCs, the two type II PKS gene clusters (cluster 5 and cluster 20) attracted our attention and were subjected to further bioinformatic analysis.
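As a quick check on the tally above, the four classes named in the text account for only part of the 27 predicted BGCs. The short Python sketch below works through that arithmetic. The dictionary keys are illustrative labels chosen here, not antiSMASH output fields, and the counts are copied from the text.

```python
# Counts copied from the text; the class labels are illustrative only
# and are not antiSMASH field names.
TOTAL_BGCS = 27
named_classes = {
    "nonribosomal peptide": 3,
    "terpenoid": 6,
    "siderophore": 3,
    "polyketide": 6,
}

# BGCs covered by the four classes that the text itemizes.
named_total = sum(named_classes.values())
# BGCs that belong to classes the text does not itemize.
unlisted = TOTAL_BGCS - named_total

print(f"BGCs in the four named classes: {named_total}")
print(f"BGCs in classes not itemized here: {unlisted}")
```

The remainder corresponds to BGC classes that are listed only in Additional file 1: Table S1.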

Bioinformatics analysis of cluster 5 reveals a possible spore pigment BGC
Analysis of cluster 5 revealed eight ORFs that encode proteins with high sequence similarities (identity > 66%) to proteins from spore pigment BGCs, such as the whiE gene cluster from Streptomyces coelicolor A3(2) [19], the whiEa gene cluster from Streptomyces avermitilis [20], the cur gene cluster from Streptomyces curacoi [21], the sah gene cluster from Streptomyces sahachiroi ATCC 33158 [22], the mec gene cluster from Streptomyces bottropensis [23], the whiESa gene cluster from Streptomyces aureofaciens CCM3239 [24], and the pksA gene cluster from Streptomyces collinus DSM2012 (Accession number AF293354.1, unpublished data) (Fig. 1A). In addition to the high sequence similarity, cluster 5 displays identical gene organization to the whiE gene cluster. These results suggested that cluster 5 might be responsible for the biosynthesis of spore pigment in CB02414.

Cluster 20 displays high similarity with the hrb gene cluster
Annotation of cluster 20 (the rub cluster) revealed a 44-kbp DNA fragment consisting of 38 putative ORFs, and most of the encoded proteins showed high similarity with proteins from the hrb gene cluster, which is responsible for the biosynthesis of the angucyclines hatomarubigins A, B, C, and D in Streptomyces sp. 2238-SVT4 [16] (Table 1). Therefore, we proposed that the rub gene cluster is responsible for the biosynthesis of natural products of the angucycline family.
The minimal PKS gene cassette (rubF1, rubF2, and rubF3) is located near the left boundary of the rub gene cluster. Interestingly, there is an 8.8-kbp DNA region consisting of nine ORFs (rubM2 to rubP4) with identical organization to the fragment encoding proteins DWB77_01891 to DWB77_01899 from Streptomyces hundungensis (Fig. 2). The protein similarities between the nine ORFs from CB02414 and their respective homologues from S. hundungensis range from 45 to 80%, and the DNA sequence identity between these two fragments is 78%. AntiSMASH analysis of the S. hundungensis genome (Accession number CP032698.1) showed that the fragment containing proteins DWB77_01891 to DWB77_01899 is not located in a recognizable BGC. Moreover, the nine ORFs in the rub gene cluster do not exhibit similarities to proteins from the other known BGCs for angucyclines or aromatic polyketides. Therefore, the 8.8-kbp DNA region in the rub gene cluster might have arisen from an insertion event caused by horizontal gene transfer or by transposition.

Inactivation of the minimal PKSs in the two type II PKS gene clusters leading to different phenotypes
To investigate the functions of the two type II PKS BGCs in CB02414, we inactivated orf3/orf4 (encoding the ketosynthase and chain length factor, respectively) in cluster 5 to generate the mutant strain Z0001, and rubF1/rubF2 in the rub cluster to generate the mutant strain Z0002. To avoid possible cross-complementation between the two minimal PKS gene cassettes of cluster 5 and the rub cluster, we also constructed a double-deletion mutant, Z0003, in which the genes encoding the two ketosynthases and the two chain length factors in cluster 5 and the rub cluster were inactivated together (Additional file 1: Fig. S1).
The CB02414 wild-type and the Z0002 mutant strains produced grey pigmentation on the GYM agar plate, while the spores of Z0001 and Z0003 were pale yellow on the same plate (Fig. 1B). Comparison of the HPLC profiles of the CB02414 wild-type and the Z0001 mutant did not reveal any noticeable difference, while most of the peaks at 254 nm (compounds 1-8) disappeared in Z0002 and Z0003 (Fig. 3). These results clearly demonstrated that cluster 5 is involved in the formation of spore pigment in CB02414, and that the rub gene cluster is responsible for the biosynthesis of compounds 1-8.
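The logic of this assignment can be written out as a small consistency check. The Python sketch below encodes each strain by which minimal PKS cassette is intact, predicts the phenotype under the hypothesis that cluster 5 makes the grey spore pigment and the rub cluster makes compounds 1-8, and compares the prediction with the observations reported above. The data layout and the names `strains`, `observed`, and `predict` are assumptions made for illustration, and "most peaks disappeared" is simplified to a yes/no value.

```python
# Hypothetical encoding of the knockout results described in the text.
# True means the minimal PKS cassette of that cluster is intact.
strains = {
    "wild-type": {"cluster5_pks": True, "rub_pks": True},
    "Z0001": {"cluster5_pks": False, "rub_pks": True},
    "Z0002": {"cluster5_pks": True, "rub_pks": False},
    "Z0003": {"cluster5_pks": False, "rub_pks": False},
}

# Observations from the text (plate color, HPLC peaks for compounds 1-8).
observed = {
    "wild-type": {"spore_color": "grey", "compounds_1_8": True},
    "Z0001": {"spore_color": "pale yellow", "compounds_1_8": True},
    "Z0002": {"spore_color": "grey", "compounds_1_8": False},
    "Z0003": {"spore_color": "pale yellow", "compounds_1_8": False},
}


def predict(genotype):
    """Phenotype expected if cluster 5 makes the pigment and rub makes 1-8."""
    return {
        "spore_color": "grey" if genotype["cluster5_pks"] else "pale yellow",
        "compounds_1_8": genotype["rub_pks"],
    }


# The hypothesis is supported only if every strain matches its prediction.
consistent = all(predict(g) == observed[name] for name, g in strains.items())
print("Hypothesis consistent with all four strains:", consistent)
```

The double mutant Z0003 matters here because it rules out cross-complementation between the two cassettes: it shows both defects at once.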

Large-scale fermentation and structural elucidation of the rubiginones from the CB02414 wild-type strain
The CB02414 wild-type strain was cultivated in three different media (media B, C, and F), and the abundances of compounds 1-8 in these media were analyzed. Medium C was selected for large-scale fermentation because it gave the highest titers of compounds 1-8 (Additional file 1: Fig. S2). Eight rubiginone analogues were isolated and characterized from an 8-L fermentation of the CB02414 wild-type strain, including five new compounds (compounds 1, 2, 5, 7, and 8) whose structures were established by extensive spectral analysis (Fig. 4). The five new compounds were named rubiginones J (1), K (2), L (5), M (7), and N (8), respectively (Fig. 5). Rubiginones K and L differ from other known rubiginones in that their C-1 and C-12 positions bear hydroxyl groups instead of the carbonyl groups found in other rubiginones. The identities of rubiginone B2 (3) [10], rubiginone A2 (4) [10], and ochromycinone (6) [25] were determined by comparing their 1H and 13C NMR data with the NMR data from the literature (Additional file 1: Figs. S3-S8). The C-3 methyl group in compounds 3, 4, and 6 has a β-configuration. Because this configuration arises from the early cyclization step of angucycline skeleton biosynthesis, it is reasonable to speculate that the C-3 methyl group of all the rubiginones from CB02414 adopts a β-configuration. The peak marked with an asterisk (Fig. 3) has a different UV spectrum from compound 1, but it was converted quickly to compound 1 (< 1 h) during the isolation and purification steps, and we were not able to determine its structure (Additional file 1: Fig. S9).
Rubiginone J (1) was obtained as light-yellow needles. Its molecular formula C 20 Fig. S10). Rubiginone J has the same molecular formula as rubiginone D2, but the NMR data and NOESY correlations indicated that the C-4 hydroxyl group of the former compound has an α-configuration, while that of the latter compound has a β-configuration. The relative structure of compound 1 was determined by the 1H-1H COSY, HSQC and HMBC data (Additional file 1: Figs. S11-S17; Table S2). The absolute configuration of compound 1 was further supported by circular dichroism (CD) and electronic circular dichroism (ECD) analysis (Fig. 5c; Additional file 1: Figs. S18-S21). Rubiginone K (2) Fig. S10). The UV absorption of rubiginone K was significantly different from that of rubiginone J (1) (Additional file 1: Fig. S22). The 13C-NMR data showed that compound 2 possesses two hydroxyl groups at C-7 and C-12, which is different from the two carbonyl groups at C-7 and C-12 in compound 1. The absolute structure of rubiginone K was determined by 1H-1H COSY, HSQC, HMBC and NOESY analysis, as well as the X-ray Crystal   Fig. S30). Rubiginone N has the same molecular formula as the known compound 4-O-acetyl-rubiginone D2 [14]. According to the NOESY spectrum (Additional file 1: Fig. S52), the hydrogen atom on C-3 of rubiginone N correlates with the hydrogen atom at C-4, indicating that the hydroxyl group at C-4 of rubiginone N has an α-configuration, while 4-O-acetyl-rubiginone D2 possesses a β-configuration at C-4. The absolute structure of rubiginone N was further determined by 1H-1H COSY, HSQC and HMBC analysis (Additional file 1: Figs. S46-S51). Rubiginone N is an acetylated derivative of rubiginone J.
The Z0004 mutant produced compounds 3, 4, 5, and 6, and the production of compounds 1, 2, 7, and 8 was abolished. All the compounds produced by Z0004 lack the hydroxyl group at C-2, but compounds 4 and 5 still possess the C-4 hydroxyl group. When rubN1 was introduced into Z0004, production of all eight rubiginones was restored in the complementation strain Z0006. These results suggested that RubN1 is responsible for the introduction of the β-hydroxyl group at C-2. Similarly, the Z0005 mutant only produced compounds 3, 6, and 7, which lack the C-4 hydroxyl group, and the production of the other five compounds (rubiginones J, K, A2, L, and N) that have the C-4 hydroxyl group was abolished. The rubN2 complementation strain Z0007 was able to produce all eight rubiginones. Therefore, RubN2 catalyzes the C-4 oxidation step to form a hydroxyl group with an α-configuration (Fig. 6).
The rubM4-deletion mutant Z0008 only produced compound 6, which contains a C-8 hydroxyl group instead of the C-8 methoxy group found in the other seven compounds. In the rubM4 complementation strain Z0009, production of all eight rubiginones was restored. From these results, it is clear that RubM4 is responsible for the methylation of the C-8 hydroxyl group in compound 6. Since compound 6 is the only rubiginone product of Z0008, the C-8 methylation reaction should occur at an early stage of the post-PKS tailoring steps in rubiginone biosynthesis (Fig. 6).
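The reasoning behind these three assignments can be sketched as a set comparison: if each compound is annotated with the tailoring enzymes its structure requires, deleting one enzyme should remove exactly the compounds that depend on it. In the Python sketch below, the `REQUIRES` table is an assumed encoding inferred from the structural features described in the text (C-2 hydroxyl, C-4 hydroxyl, C-8 methoxy); it is not taken from the supplementary data, and the function name is hypothetical.

```python
# Tailoring enzymes each compound is assumed to need, inferred from the
# structural features described in the text (compound numbers as in the paper).
REQUIRES = {
    1: {"RubM4", "RubN1", "RubN2"},  # rubiginone J
    2: {"RubM4", "RubN1", "RubN2"},  # rubiginone K
    3: {"RubM4"},                    # rubiginone B2
    4: {"RubM4", "RubN2"},           # rubiginone A2
    5: {"RubM4", "RubN2"},           # rubiginone L
    6: set(),                        # ochromycinone
    7: {"RubM4", "RubN1"},           # rubiginone M
    8: {"RubM4", "RubN1", "RubN2"},  # rubiginone N
}

# Deleted enzyme and compounds actually detected in each mutant (from the text).
OBSERVED = {
    "Z0004": ("RubN1", {3, 4, 5, 6}),
    "Z0005": ("RubN2", {3, 6, 7}),
    "Z0008": ("RubM4", {6}),
}


def predicted_products(deleted_enzyme):
    """Compounds that can still form when one tailoring enzyme is missing."""
    return {c for c, enzymes in REQUIRES.items() if deleted_enzyme not in enzymes}


for strain, (enzyme, seen) in OBSERVED.items():
    match = predicted_products(enzyme) == seen
    print(f"{strain} (lacks {enzyme}): prediction matches observation = {match}")
```

Under this encoding all three mutants match their predictions, which is the same argument the text makes in prose.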

Proposed biosynthetic pathway for the rubiginones in CB02414
Thanks to the early biosynthetic studies on typical angucyclines, exemplified by urdamycin, landomycin, simocyclinone, and jadomycin, biosynthesis of the benz[a]anthracene skeleton has been well elucidated [9]. The post-PKS tailoring reactions are the major factors that generate the structural diversity of angucyclines. In the rub gene cluster of CB02414, the presence of two cytochrome P450 hydroxylases and one O-methyltransferase was key to understanding the biosynthesis of the rubiginones. Based on the structures of the eight rubiginones isolated from the CB02414 wild-type and mutants, we proposed a biosynthetic pathway for these compounds (Fig. 7).
[Figure caption fragment: X-ray crystallographic study of compounds 2 and 5 (30% probability displacement ellipsoids). c CD spectra of compounds 1 and 7.]
The four proteins RubF1, RubF2, RubF3, and RubG, which represent ketoacyl synthase, chain length factor, acyl carrier protein, and a PKS-associated ketoreductase, respectively, are common in all aromatic polyketide BGCs. The aromatase RubH and cyclase RubE are homologous to their counterparts in other type II PKS gene clusters [9]. It was proposed that the six proteins mentioned above are sufficient to produce UWM6, the common biosynthetic intermediate for angucyclines [9]. After a few modification steps, UWM6 is converted to ochromycinone (6), the earliest intermediate isolated in this study. The methyltransferase RubM4 catalyzes the methylation of the C-8 hydroxyl group to form rubiginone B2 (3), which is used as a substrate by the cytochrome P450 hydroxylases RubN1 and RubN2 to generate rubiginone J. There are two possible pathways for the two P450-catalyzed conversions from rubiginone B2 to rubiginone J: (i) in path a, RubN1 catalyzes the C-2 hydroxylation to form rubiginone M, which is used as a substrate by RubN2 to introduce the C-4 α-hydroxyl group to produce rubiginone J; (ii) in path b, rubiginone B2 is oxidized by RubN2 to generate compound 4, followed by the RubN1-catalyzed transformation of compound 4 to rubiginone J. From the production profiles of the CB02414 wild-type and mutant strains, it seems that both path a and path b are applicable in CB02414 (Fig. 7). Compound 8 is an acetyl derivative of compound 1, and it is a minor product of the CB02414 wild-type strain. It is not clear whether the acetylation of compound 1 is a spontaneous or an enzymatic reaction.
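The late-stage steps described above form a small branched network, and the two proposed routes can be enumerated mechanically. The Python sketch below represents each step as a substrate mapped to (enzyme, product) pairs and lists every route from ochromycinone to rubiginone J. The graph encoding and the names `STEPS` and `routes` are assumptions for illustration. Only the steps named in the text are included, and the unresolved acetylation to compound 8 is left out.

```python
# Proposed late-stage steps: substrate -> list of (enzyme, product).
# Edges follow the pathway described in the text; the layout is an assumption.
STEPS = {
    "ochromycinone (6)": [("RubM4", "rubiginone B2 (3)")],
    "rubiginone B2 (3)": [
        ("RubN1", "rubiginone M (7)"),   # path a: C-2 hydroxylation first
        ("RubN2", "rubiginone A2 (4)"),  # path b: C-4 hydroxylation first
    ],
    "rubiginone M (7)": [("RubN2", "rubiginone J (1)")],
    "rubiginone A2 (4)": [("RubN1", "rubiginone J (1)")],
}


def routes(start, goal, path=()):
    """Yield every enzyme/product sequence leading from start to goal."""
    if start == goal:
        yield path
        return
    for enzyme, product in STEPS.get(start, []):
        yield from routes(product, goal, path + ((enzyme, product),))


all_routes = list(routes("ochromycinone (6)", "rubiginone J (1)"))
for route in all_routes:
    print("ochromycinone (6) -> " + " -> ".join(f"[{e}] {p}" for e, p in route))
```

Both routes use the same three enzymes and differ only in the order of the two P450 steps, which is why the isolation of both rubiginone M (7) and rubiginone A2 (4) is compatible with either path.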

Discussion
Actinobacteria are a rich resource of bioactive natural products, many of which have been extensively used in the clinical setting. However, most of the antibiotics used today were discovered 50 years ago, and the rapid emergence of antibiotic resistance requires the discovery of new natural products with novel modes of action. Recent advances in high-throughput strain prioritization [17], next generation genome sequencing, and bioinformatics analysis [18,27-30] have disclosed that actinobacteria possess a huge potential for producing novel secondary metabolites. Most of the BGCs in actinobacteria are silent under routine laboratory fermentation conditions. Therefore, linking BGCs to their encoded natural products (also known as the forward genome mining approach) is an important step in the discovery of novel microbial natural products. Many natural products have been identified by using the forward genome mining strategy [31].
Angucyclines have broad biological activities, including antitumor, antimicrobial, and enzyme-inhibitory activities. Therefore, discovering new angucyclines and understanding their structure-activity relationships have been a focus of natural product chemists. Although hundreds of members have been characterized in the past 60 years, the number of angucyclines is still growing quickly. According to our recent survey, more than 200 novel angucycline natural products have been reported in the last decade, most of which were isolated from Streptomyces spp. Biosynthetic studies of angucyclines are always intriguing, because of the diverse modifications brought by the post-PKS tailoring enzymes, such as oxidoreductases and glycosyltransferases [32,33]. Compared with the complex oxidative rearrangement reactions and decorations, biosynthesis of the angucycline backbones is proposed to occur via two different biosynthetic routes, using the benz[a]anthracene intermediate and the anthracyclinone intermediate, respectively. The benz[a]anthracene intermediate is involved in the formation of landomycin A, chrysomycin A, ravidomycin V, and kinamycin D [9]. The anthracyclinone intermediate was observed in the biosynthesis of angucyclines PD116198 and BE7585A [9].
More than ten rubiginones have been isolated from S. griseorubiginosus Q144-2, Saccharopolyspora sp. BCC 21906, Streptomyces sp. Gö N1/5, Streptomyces sp. SNA-8073, and Streptomyces sp. KMC004. Besides the well-known biological activities of angucyclines, such as cytotoxicity, antibacterial activity, and platelet aggregation inhibition, the rubiginones are able to potentiate the cytotoxicity of vincristine (VCR) against VCR-resistant cancer cell lines. Although many rubiginones have been isolated to date, their BGCs and biosynthetic pathways have not been reported. The hrb gene cluster was cloned and characterized in 2010 and the biosynthetic pathway for hatomarubigins was proposed. However, the post-PKS modifications of the hatomarubigins occur mainly on the D-ring. In this study, we analyzed the genome of Streptomyces sp. CB02414 and characterized the rub gene cluster that is responsible for the biosynthesis of the eight rubiginones isolated from CB02414. The rubiginones isolated from CB02414 feature different oxidative modifications on the A-ring. Based on the rubiginones produced by the CB02414 wild-type and mutant strains, we were able to propose a plausible biosynthetic pathway for the rubiginones. In this pathway, the two cytochrome P450 hydroxylases RubN1 and RubN2 introduce the β-hydroxyl group at C-2 and the α-hydroxyl group at C-4, respectively. Our attempts to overexpress rubN1 and rubN2 in Escherichia coli BL21 (DE3) failed to produce soluble proteins, thus hindering kinetic studies of these two enzymes. We also introduced rubN1 and rubN2 into different Streptomyces hosts, including Streptomyces lividans TK24 and Streptomyces albus J1074, and tried the biotransformation of compounds 3, 4, and 7 in the resulting strains, but no conversion was observed (data not shown). It is possible that the fed rubiginones were not able to penetrate the cell membrane of the heterologous hosts, and the biotransformation could not occur without the substrates.
It is interesting that we were able to obtain crystals of rubiginones K and L, which helped us to establish the configurations of the hydroxyl groups or methyl group at C-2, C-3, and C-4 of the isolated rubiginones. We also attempted to crystallize the other isolated rubiginones under different conditions, but no crystals were obtained. Considering that rubiginones K and L both possess hydroxyl groups at C-1 and C-12, this structural feature may facilitate the crystallization process. Rubiginones K and L are not stable, and their C-1 and C-12 hydroxyl groups are oxidized into ketones during the purification procedure to generate rubiginones J and A2, respectively. The photo-induced oxidation of the C-1 hydroxyl group in rubiginones has been reported before [34], and we believe that the oxidation of the C-12 hydroxyl group may follow a mechanism similar to that of the C-1 hydroxyl group. Rubiginone K was produced as a major metabolite in the CB02414 wild-type and rubiginone L was a major product of the Z0004 mutant, but it remains unclear how these two rubiginones are biosynthesized and whether they are shunt products of the rubiginone biosynthetic pathway.

Conclusions
In this study, we first analyzed the two type II PKS gene clusters in Streptomyces sp. CB02414 and identified their functions through gene-inactivation experiments. We isolated eight rubiginones, including five new ones, from the CB02414 wild-type strain. Their structures were determined by a combination of HR-ESI-MS, 1D and 2D NMR, X-ray crystal diffraction, CD spectroscopy, and ECD calculations. We investigated the functions of two cytochrome P450 hydroxylases (RubN1 and RubN2) in the rub cluster of CB02414 and confirmed that they are responsible for the introduction of the hydroxyl groups at C-2 and C-4 of rubiginones, respectively. Based on the production profiles of the CB02414 wild-type and the gene-deletion mutant strains, we proposed a biosynthetic pathway for the rubiginones. Our study enlarges the rubiginone family of natural products and lays the foundation for the generation of novel rubiginones using combinatorial biosynthesis strategies. Moreover, this study exemplifies the power of the genome mining strategy in the targeted discovery of novel microbial natural products.

General experimental procedures
HRMS spectra were acquired on an LTQ-ORBITRAP-ETD instrument (Thermo Scientific, MA, USA). NMR spectra were recorded on Bruker spectrometers (400/500/600 MHz) (Bruker, Ettlingen, Germany). CD spectra were measured on a JASCO instrument (J-185, Tokyo, Japan) at room temperature (25 °C). Crystal data were acquired using Cu Kα radiation on a Rigaku APEX-II XtaLAB PRO MM007HF diffractometer at 100 K. Optical rotatory dispersion (ORD) spectra were used to determine the optical activity of the compounds (Rudolph Research Analytical, Autopol IV, USA). For purification of compounds, column chromatography (CC) was carried out using silica gel or Sephadex LH-20. Reversed-phase high performance liquid chromatography (RP-HPLC) was performed using a Waters 1525 Binary HPLC Pump equipped with a Welch Ultimate AQ-C18 column (250 × 10 mm, 5 μm, Welch Materials Inc., Shanghai, China) and a Waters 2489 UV/Visible Detector (Waters, Milford, MA, USA).

Bacterial strains and fermentation
Streptomyces sp. CB02414 was grown on GYM agar plates (containing per liter: 4 g yeast extract, 4 g glucose, 10 g malt extract, 2 g CaCO3, 20 g agar, pH 7.2) and incubated at 30 °C to obtain spores after 3-5 days. Escherichia coli DH5α and S17-1 were grown in liquid Luria-Bertani medium supplemented with antibiotic (apramycin or kanamycin) and incubated at 30 °C for 12 h. The final concentration of antibiotic was 50 μg/mL.
Streptomyces sp. CB02414 was inoculated into 250-mL Erlenmeyer flasks containing 50 mL tryptic soy broth (TSB) medium and some glass beads, then cultivated at 30 °C on a rotary shaker at 220 rpm for 36 h. Then 10% (v/v) seed culture was inoculated into 50 mL production medium (B medium, g/L: 40 dextrin, 7.5 tomato paste, 2.5 NZ-Amine, 5 yeast extract, pH 7.2 ± 0.2; C medium, g/L: 25 glucose, 25 corn flour, 5 yeast extract, pH 7.2 ± 0.2; F medium, g/L: 100 sucrose, 10 glucose, 5 yeast extract, 0.1 casamino acids, 21 MOPS, trace elements 1 mL, 0.25 K2SO4, 1 MgCl2·6H2O, pH 7.2 ± 0.2), and then cultured for 7 days at 30 °C on a rotary shaker at 220 rpm. After fermentation for 6 days, 2% (m/v) XAD-16 resin was added into each flask and then incubated overnight on a rotary shaker at 220 rpm. For large-scale fermentation (8 L), 50 mL of seed culture was inoculated into a 2-L Erlenmeyer flask containing 500 mL of production medium, and 16 flasks were used for the fermentation. After the fermentation, the crude extract was exposed to sunlight in air for 2 h, following the same method used previously, to simplify the purification steps of the photosensitive compounds [14].
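The scale-up quantities above can be checked with simple arithmetic. The Python sketch below computes the total culture volume, the inoculum ratio, and the resin mass for the 16-flask run. It assumes that the 2% (m/v) XAD-16 loading stated for the small-scale cultures was also used at large scale, which the text does not state explicitly. All variable names are illustrative.

```python
# Values taken from the text.
FLASKS = 16
MEDIUM_PER_FLASK_ML = 500
SEED_PER_FLASK_ML = 50
# Assumption: the 2% (m/v) resin loading of the 50-mL cultures also applies
# to the 500-mL flasks. 2% (m/v) means 2 g of resin per 100 mL of culture.
RESIN_PERCENT_M_V = 2

total_medium_l = FLASKS * MEDIUM_PER_FLASK_ML / 1000
inoculum_percent = 100 * SEED_PER_FLASK_ML / MEDIUM_PER_FLASK_ML
total_seed_ml = FLASKS * SEED_PER_FLASK_ML
resin_per_flask_g = RESIN_PERCENT_M_V * MEDIUM_PER_FLASK_ML / 100
total_resin_g = resin_per_flask_g * FLASKS

print(f"Total production medium: {total_medium_l} L")
print(f"Inoculum: {inoculum_percent}% (v/v)")
print(f"Seed culture needed: {total_seed_ml} mL")
print(f"XAD-16 resin: {resin_per_flask_g} g per flask, {total_resin_g} g in total")
```

The 16 flasks of 500 mL reproduce the stated 8-L scale, and 50 mL of seed culture per 500 mL of medium matches the 10% (v/v) inoculum used at small scale.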

Gene inactivations in Streptomyces sp. CB02414
A pOJ260-based plasmid, Y0002, was constructed to generate the ΔrubF1/rubF2 replacement mutant in CB02414 via double-crossover homologous recombination. To inactivate rubF1/rubF2, a partial fragment spanning these two genes was replaced with the kanamycin-resistance gene using the Seamless Cloning and Assembly kit (TSINGKE, China), and the mutated rubF1/rubF2 was cloned into pOJ260 between the XbaI and HindIII restriction sites. This plasmid was introduced into CB02414 (wild-type) and Z0001 (i.e., CB02414Δorf3/orf4) by intergeneric conjugation, and exconjugants with a kanamycin-resistant and apramycin-sensitive phenotype were selected to obtain the desired double-crossover mutants Z0002 (i.e., CB02414ΔrubF1/rubF2) and Z0003 (i.e., Z0001ΔrubF1/rubF2), respectively. Deletion of orf3/orf4, rubN1, rubN2 and rubM4 in CB02414 was conducted via in-frame deletion, to obtain mutants Z0001, Z0004, Z0005 and Z0008, respectively. The mutants were confirmed by PCR analysis and DNA sequencing. The strains and plasmids were