  • In case where an extreme loop of up to

    2022-06-22

    When an extreme loop (of up to 45 nt) is allowed, as in G3 + E3 + XX, however, a G3+GQ loop maximum of 3 was adequate to reach a high J-statistic, which continued to increase only up to a loop maximum of 6 (Fig. 1A). Hence we suggest using a G3+GQ loop maximum between 3 and 6 where an extreme loop is allowed. Keeping it at the lower boundary may be beneficial in low dilution conditions.

    Extreme loop maximum: Choosing an optimal extreme loop maximum has proved to be trickier. Since the optimal value may depend on the G3+GQ loop maximum, we decided to compare G3 + E3 + XX models with two G3+GQ loop maxima, 3 and 7 (Fig. 2). Interestingly, the choice of G3+GQ loop maximum did not result in any significant difference: in both cases, increasing the extreme loop maximum up to 9 nt improved the J-statistic considerably. From 9 to 30 nt, or in some cases 31, the benefit was gradual and small. Although an extreme loop maximum between 9 and 30 seems reasonable, it should be remembered that, as previously mentioned, lower maxima may benefit prediction in low stability conditions.

    G2GQs are usually less stable, and thus stringent loop rules for these GQs may be preferred. Comparison of G2GQ loop maxima vs. J-statistic indicated a peak J-statistic at a G2GQ loop maximum of 3 nt in most G2 + E-XX and G2 + E3 + XX models (Fig. 3).

    According to these observations, a G3+GQ loop maximum of 7 when the extreme loop is disallowed or 3 when it is allowed, and an extreme loop maximum between 9 and 30, are recommended as discrete parameters for genome-wide scanning.
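To make the loop parameters discussed above concrete, here is a minimal sketch of how such rules could be turned into a regular expression. It is not G4C's actual implementation: the function name `build_gq_pattern`, its arguments, and the Quadparser-style pattern shape are assumptions for illustration, as is the choice to let exactly one of the three loops (in any position) be the extreme loop. Bulges and mismatches are not modelled.

```python
import re


def build_gq_pattern(tract_min=3, loop_max=7, extreme_loop_max=None):
    """Compile a regex for four G-tracts joined by three loops.

    Hypothetical helper, not part of G4C. Assumptions:
    - each G-tract has at least `tract_min` guanines;
    - each regular loop is 1..loop_max nt long;
    - if `extreme_loop_max` is given, exactly one loop (any position)
      may be up to that many nt long.
    """
    tract = f"G{{{tract_min},}}"
    loop = f"[ACGT]{{1,{loop_max}}}"

    if extreme_loop_max is None:
        # Plain pattern: tract, loop, tract, loop, tract, loop, tract.
        return re.compile(tract + (loop + tract) * 3)

    extreme = f"[ACGT]{{1,{extreme_loop_max}}}"
    alternatives = []
    for position in range(3):
        # Put the extreme loop at one position, regular loops elsewhere.
        loops = [loop, loop, loop]
        loops[position] = extreme
        body = tract + "".join(lp + tract for lp in loops)
        alternatives.append(f"(?:{body})")
    return re.compile("|".join(alternatives))


if __name__ == "__main__":
    # Parameter values taken from the recommendations in the text.
    no_extreme = build_gq_pattern(loop_max=7)
    with_extreme = build_gq_pattern(loop_max=3, extreme_loop_max=30)

    sequence = "GGGTGGGTGGG" + "A" * 12 + "GGG"
    print(bool(no_extreme.search(sequence)))    # 12-nt loop exceeds 7
    print(bool(with_extreme.search(sequence)))  # 12-nt loop fits as extreme loop
```

Under these assumptions, the recommended settings correspond to `build_gq_pattern(loop_max=7)` when no extreme loop is allowed and `build_gq_pattern(loop_max=3, extreme_loop_max=N)` with `N` between 9 and 30 when one is allowed.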
    Discussion

    This study aimed to improve G-quadruplex prediction capability and provide a novel tool for the discovery of putative G-quadruplex-forming sequences. G4C accepts several parameters and creates patterns for the discovery of putative G-quadruplexes. These patterns may be designed to allow a number of atypical GQ features, enabling the capture of GQs that were missed by QP (QP G2+, QP G3+) and G4H (window size = 20 or 25). The patterns were tested alongside the existing methods by comparison against a reference dataset with known structural characteristics, which revealed that atypical features indeed improved both specificity and sensitivity. The consideration of all types of atypical features and the ability to apply separate rules for shorter GQs are its novelties: no other algorithms, including those developed recently, consider extreme loops or apply separate sets of rules for shorter GQs (Varizhuk et al., 2014, Hon et al., 2017). Together with its availability in both Python and PHP, these features distinguish G4C from the alternative tools (Table 3).

    The inclusion of the aforementioned atypical features improves the prediction accuracy cumulatively. A separate set of rules for G2GQs improved the accuracy when extreme loops were permitted (G2 + E3 + XX vs. G2 + E2 + XX). The separate rules also benefit prediction because the optimum G2GQ loop maxima were found to differ from the G3+GQ loop maxima for all models. As for the extreme loop feature, its inclusion improved predictions significantly (G3 + E3 + XX vs. G3 + E-XX). Bulges should also be considered for GQ prediction. While the level of improvement varies, it is apparent that bulges alone improved the prediction accuracy in all cases (I1B and I2B). In comparison, the inclusion of mismatches often decreased the prediction quality.

    Unfortunately, no single model is suitable for every application. This could be due to the large variety of GQ structures available, making prediction harder. However, with the currently available models, for applications requiring high sensitivity, such as PCR primer selection, we suggest G2 + E3 + I2BM, since this model showed the highest TPR (0.99). On the other hand, for applications where high specificity is equally important, such as structural analysis and prediction along a genome, G3 + E3 + I1B and G3 + E3 + I2B demonstrate the best scores for accuracy.
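The trade-off between sensitivity-oriented and accuracy-oriented models can be illustrated with a short sketch of the underlying scores. It assumes that the J-statistic used here is Youden's J (sensitivity + specificity - 1), which the text does not state explicitly; the function name and the confusion-matrix counts below are hypothetical and are not taken from the study.

```python
def classification_scores(tp, fp, tn, fn):
    """Return TPR, TNR, accuracy and J for one model.

    J is computed as Youden's J = TPR + TNR - 1, an assumption about
    the J-statistic referred to in the text.
    """
    tpr = tp / (tp + fn)                        # sensitivity (true positive rate)
    tnr = tn / (tn + fp)                        # specificity (true negative rate)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"TPR": tpr, "TNR": tnr, "accuracy": accuracy, "J": tpr + tnr - 1}


if __name__ == "__main__":
    # Hypothetical counts for two imaginary models on a balanced dataset.
    models = {
        "sensitive": classification_scores(tp=99, fp=30, tn=70, fn=1),
        "balanced": classification_scores(tp=90, fp=8, tn=92, fn=10),
    }
    # Pick a model by the score that matters for the application at hand.
    best_by_tpr = max(models, key=lambda name: models[name]["TPR"])
    best_by_accuracy = max(models, key=lambda name: models[name]["accuracy"])
    print(best_by_tpr, best_by_accuracy)
```

With counts like these, the model with the highest TPR is not the one with the highest accuracy, which is why the choice of model depends on the application.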