DipCheck is a validation tool for protein backbone geometry, developed by Joana Pereira and Victor Lamzin, EMBL Hamburg.

The tool works in DipSpace, a Euclidean 3D space spanned by orthogonal descriptors
of the geometry of a 5-atom dipeptide unit:

CA(i-1)-O(i-1)-CA(i)-O(i)-CA(i+1).

The DipSpace database contains 1,024,000 data points derived from a selected set of 1,300,000 dipeptide fragments taken from well-refined structures deposited in the PDB.

DipCheck classifies the **geometry of the middle atom, CA(i),** into four categories:

| Region | Cumulative share of the set | DipScore |
|---|---|---|
| Favoured region | 98.00% | above 0.243 |
| Allowed region | 99.80% | between 0.033 and 0.243 |
| Generously allowed region | 99.95% | between 0.010 and 0.033 |
| Disallowed region | the remaining 0.05% | below 0.010 |
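The DipScore thresholds listed above can be turned into a small lookup. The following is a minimal sketch, not DipCheck's own code: the function name `classify_dipscore`, the example scores, and the treatment of values lying exactly on a boundary (assigned to the lower-quality side of "above", and kept out of "below") are assumptions made for illustration.

```python
from collections import Counter

# Thresholds as quoted in this document.
FAVOURED_ABOVE = 0.243
ALLOWED_FROM = 0.033
GENEROUS_FROM = 0.010


def classify_dipscore(score: float) -> str:
    """Map a per-residue DipScore to one of the four regions.

    Boundary handling is an assumption: the text only says
    "above", "between" and "below".
    """
    if score > FAVOURED_ABOVE:
        return "favoured"
    if score >= ALLOWED_FROM:
        return "allowed"
    if score >= GENEROUS_FROM:
        return "generously allowed"
    return "disallowed"


# Hypothetical scores, used only to show how a per-region summary
# could be tallied.
example_scores = [0.81, 0.30, 0.12, 0.02, 0.004]
summary = Counter(classify_dipscore(s) for s in example_scores)
```

Here `summary` holds the number of example residues per region, similar in spirit to the summary table that the tool reports.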

DipCheck also classifies the overall **geometry of a protein model**, according to its DipScore distribution, into four categories:

| Model category | Cumulative share of a random set of protein models | Chi-score |
|---|---|---|
| Favoured model | 98.00% | above -2.15 |
| Allowed model | 99.80% | between -2.97 and -2.15 |
| Generously allowed model | 99.95% | between -3.38 and -2.97 |
| Outlier model | the remaining 0.05% | below -3.38 |
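The model-level categories can be sketched in the same way from the Chi-score cut-offs given above. This is an illustrative sketch only: `classify_chi_score` is a hypothetical name, and the assignment of values exactly equal to a cut-off is an assumption, since the text specifies only "above", "between" and "below".

```python
# Chi-score cut-offs as quoted in this document, ordered from best to worst.
CHI_CUTOFFS = [
    (-2.15, "favoured"),
    (-2.97, "allowed"),
    (-3.38, "generously allowed"),
]


def classify_chi_score(chi: float) -> str:
    """Map an overall Chi-score to a model category.

    "Favoured" requires a value strictly above -2.15; the other
    boundaries are treated as inclusive (an assumption).
    """
    best_cutoff, best_label = CHI_CUTOFFS[0]
    if chi > best_cutoff:
        return best_label
    for cutoff, label in CHI_CUTOFFS[1:]:
        if chi >= cutoff:
            return label
    return "outlier"
```

For example, `classify_chi_score(-3.5)` returns `"outlier"`, which matches the rule that a Chi-score below -3.38 marks a model worth inspecting.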

The output of the dipcheck version as of 06.05.2016 provides the following:

- The value of the DipScore for each residue and its annotation to the region.
- The number of CA atoms contained in the input file and the number of CA atoms evaluated. Dipeptide units containing atoms with partial occupancies are ignored.
- Summary table of the number of residues in each of the four regions.
- The first four central moments of the DipScore distribution with Z-scores.
- The overall Chi-score. This is the most important 'single-number' result.

- The Chi-score is similar to a conventional Z-score, but its distribution is not Gaussian; it resembles a Rayleigh distribution. In addition, the Chi-score is signed: a negative value indicates that the structure is worse than average, while a positive value indicates that it is better than average.
- For good structures, an overall Chi-score below -3.38 is statistically expected once in 2000 structures (0.05%).
- Therefore, a structure with the overall Chi-score below -3.38 can be deemed an outlier and is worth inspecting.
- A percentile is also printed for the whole structure. If this value is higher than 2.0, the model is regarded as favoured. If it is lower, but higher than 0.2, the model is 'allowed'. If it is lower still, but higher than 0.05, the model is 'generously allowed'. Otherwise the model is 'disallowed', that is, an outlier.
- The overall Chi-score is an average indicator of the whole model. With all other conditions being equal, it will be the same for a 100-residue model with 1 outlying CA and a 1000-residue model with 10 outliers.
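The percentile rule and the two numerical remarks above can be checked with a few lines of arithmetic. The sketch below is an assumption-laden illustration: `classify_percentile` is a hypothetical helper, not part of DipCheck, and the fraction comparison only shows that the two example models have the same proportion of outlying CA atoms; it does not reproduce how the Chi-score is actually computed.

```python
from fractions import Fraction


def classify_percentile(percentile: float) -> str:
    """Map the whole-structure percentile to a model category,
    following the "higher than" wording of the text."""
    if percentile > 2.0:
        return "favoured"
    if percentile > 0.2:
        return "allowed"
    if percentile > 0.05:
        return "generously allowed"
    return "outlier"


# 0.05% of 2000 structures is exactly one structure.
expected_outliers_in_2000 = Fraction("0.05") / 100 * 2000

# 1 outlier in 100 residues and 10 outliers in 1000 residues
# give the same outlier fraction.
small_model = Fraction(1, 100)
large_model = Fraction(10, 1000)
same_fraction = small_model == large_model
```

Exact fractions are used instead of floats so that the comparisons are not affected by rounding.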