Ten years of probabilistic estimates of biocrystal solvent content: new insights via nonparametric kernel density estimate
The probabilistic estimate of the solvent content (Matthews probability) was first introduced in 2003. Given that the Matthews probability is based on prior information, revisiting the empirical foundation of this widely used solvent-content estimate is appropriate. The parameter set for the original Matthews probability distribution function employed in MATTPROB has been updated after ten years of rapid PDB growth. A new nonparametric kernel density estimator has been implemented to calculate the Matthews probabilities directly from empirical solvent-content data, thus avoiding the need to revise the multiple parameters of the original binned empirical fit function. The influence and dependency of other possible parameters determining the solvent content of protein crystals have been examined. Detailed analysis showed that resolution is the primary and dominating model parameter correlated with solvent content. Modifications of protein specific density for low molecular weight have no practical effect, and there is no correlation with oligomerization state. A weak, and in practice irrelevant, dependency on symmetry and molecular weight is present, but cannot be satisfactorily explained by simple linear or categorical models. The Bayesian argument that the observed resolution represents only a lower limit for the true diffraction potential of the crystal is maintained. The new kernel density estimator is implemented as the primary option in the MATTPROB web application at http://www.ruppweb.org/mattprob/.
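The idea of estimating Matthews probabilities directly from empirical data with a kernel density estimator can be illustrated with a minimal sketch. This is not the MATTPROB implementation: the function names, the Gaussian kernel, the fixed bandwidth, and the synthetic reference sample are all assumptions made for illustration, and the resolution dependence described above is left out (in practice the reference sample would be chosen according to the observed resolution). The sketch uses the standard Matthews relations, V_M = V_cell / (Z · n · MW) and solvent fraction = 1 − 1.23 / V_M, and normalizes the estimated density over the candidate numbers of molecules per asymmetric unit.

```python
import numpy as np

def matthews_coefficient(cell_volume, n_sym_ops, n_molecules, mol_weight):
    """V_M in A^3/Da: unit-cell volume per dalton of protein."""
    return cell_volume / (n_sym_ops * n_molecules * mol_weight)

def solvent_fraction(vm, protein_factor=1.23):
    """Matthews' approximation for the solvent fraction, 1 - 1.23 / V_M."""
    return 1.0 - protein_factor / vm

def kde_gaussian(samples, x, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated at points `x`."""
    samples = np.asarray(samples, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm

def copy_number_probabilities(reference_vm, cell_volume, n_sym_ops,
                              mol_weight, max_copies=8, bandwidth=0.1):
    """Relative probability of each candidate copy number n = 1..max_copies.

    The density of the reference V_M sample is evaluated at the V_M implied
    by each n and then normalized so that the probabilities sum to 1.
    """
    copies = np.arange(1, max_copies + 1)
    vm = matthews_coefficient(cell_volume, n_sym_ops, copies, mol_weight)
    density = kde_gaussian(reference_vm, vm, bandwidth)
    total = density.sum()
    probs = density / total if total > 0 else np.zeros_like(density)
    return copies, vm, probs

# Synthetic placeholder sample, NOT real PDB data: it only stands in for an
# empirical set of V_M values so that the sketch runs end to end.
rng = np.random.default_rng(0)
reference_vm = rng.normal(loc=2.5, scale=0.4, size=5000)

# Hypothetical crystal: 200000 A^3 cell, 4 symmetry operators, 20 kDa protein.
copies, vm, probs = copy_number_probabilities(
    reference_vm, cell_volume=200_000.0, n_sym_ops=4, mol_weight=20_000.0
)
for n, v, p in zip(copies, vm, probs):
    print(f"n={n}  V_M={v:.2f}  solvent={solvent_fraction(v):+.2f}  P={p:.3f}")
```

Because the density is estimated from the data themselves, updating the estimate after further database growth only requires replacing the reference sample, rather than refitting the parameters of a binned fit function. Candidate copy numbers that imply a negative solvent fraction receive negligible probability as long as the reference sample contains no such V_M values.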