SigED Web Interface


SigED (Significant scan of Energy Difference) is a program for discovering well-ordered folding segments by scanning successive segments along a nucleotide sequence. It is often used in conjunction with the faster EDscan. SigED identifies local structural features by statistical evaluation against a random sample, computing three z-scores, SigZscre(Si), SigSteme(Si), and SigLoope(Si), for each local segment Si of the sequence.

From EDscan we have Ediff(Si) = Ef(Si) - E(Si), where E(Si) is the lowest free energy of the optimal structure folded by the segment Si, and Ef(Si) is the optimal free energy obtained when all base pairs formed in the original optimal structure are prohibited. The greater the Ediff(Si) of a segment, the more well-ordered the folded RNA structure is expected to be. The measure Ediff therefore depends on the structural features of the local segment.

What is the typical behavior of Ediff in a random sample related to the local segment? We adopt Monte Carlo simulation as the means of estimating the uncertainty of Ediff in such a sample. We generate a large number of randomly shuffled sequences (RSi,1, ..., RSi,m) for the local segment Si, where the number m is determined by the length of the segment Si. We then compute Ediff(RSi,1), ..., Ediff(RSi,m) for the random sequences and calculate their sample mean Ediff(RSi) and sample standard deviation std(RSi).

The standardized z-score SigZscre is defined as

SigZscre(Si) = (Ediff(Si) - Ediff(RSi))/std(RSi).
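The Monte Carlo procedure and the z-score formula can be sketched in Python as follows. This is a minimal illustration, not the SigED implementation: the `ediff` argument is a hypothetical callable standing in for the folding-based energy difference, `toy_ediff` is a placeholder score with no physical meaning, the shuffle is assumed to preserve only the base composition of the segment, and the sample size `m` is passed in by hand, whereas SigED derives it from the segment length.

```python
import random
import statistics
from typing import Callable


def shuffle_sequence(segment: str, rng: random.Random) -> str:
    """Return a random permutation of the segment (same base composition)."""
    bases = list(segment)
    rng.shuffle(bases)
    return "".join(bases)


def sig_zscore(
    segment: str,
    ediff: Callable[[str], float],
    m: int = 100,
    seed: int = 0,
) -> float:
    """Z-score of ediff(segment) against m shuffled copies of the segment.

    `ediff` is a hypothetical stand-in for Ediff; a real one needs an RNA
    folding routine. `m` is an assumed parameter here.
    """
    rng = random.Random(seed)
    observed = ediff(segment)
    # Ediff(RSi,1), ..., Ediff(RSi,m) for the shuffled sequences
    random_values = [ediff(shuffle_sequence(segment, rng)) for _ in range(m)]
    mean_random = statistics.mean(random_values)
    std_random = statistics.stdev(random_values)  # sample standard deviation
    if std_random == 0:
        raise ValueError("shuffled sample has zero variance; z-score undefined")
    return (observed - mean_random) / std_random


def toy_ediff(segment: str) -> float:
    """Placeholder score (count of 'GC' dinucleotides); NOT a folding energy."""
    return float(segment.count("GC"))


if __name__ == "__main__":
    print(sig_zscore("GCGCGCGCAAAAUUUU", toy_ediff, m=200, seed=1))
```

Passing the scoring function as an argument keeps the statistics separate from the folding step, so the same routine could in principle be reused for stem-only or loop-only energy differences.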

We can also divide Ediff into two parts: Estemdiff, the energy difference contributed by base-pair stacking only, and Eloopdiff, the energy difference contributed by loops only. Thus, we can also define two further z-scores, SigSteme(Si) and SigLoope(Si). These two measures can help us to evaluate a significantly unstable folding region that, for example, may be a potential target for hybridization of antisense agents.

In the random sample, the random variables SigZscre(RSi,j), SigSteme(RSi,j), and SigLoope(RSi,j) are each expected to follow a Normal distribution. The statistical significance of the measures Ediff(Si), Estemdiff(Si), and Eloopdiff(Si) of the segment Si can therefore be estimated by means of the classical Normal distribution.
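Under the Normal assumption, a z-score can be converted into a tail probability. The sketch below shows one way to do this with the Python standard library; the function names are hypothetical, and whether a one-sided or two-sided probability is appropriate depends on the question being asked, which the text above does not specify.

```python
import math


def normal_upper_tail(z: float) -> float:
    """P(Z >= z) for a standard Normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def normal_two_sided(z: float) -> float:
    """P(|Z| >= |z|) for a standard Normal variable Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    for z in (1.0, 2.0, 3.0):
        print(z, normal_upper_tail(z), normal_two_sided(z))
```

A large positive z-score then corresponds to a small upper-tail probability, meaning that few shuffled sequences would be expected to reach the value observed for the natural segment.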

The program is based on a dynamic programming algorithm and is implemented in Fortran 90, running on Unix. SigED may require considerable computation time for a long sequence.