Details on hydrophobicity plots

From BanghamLab
Revision as of 19:48, 14 June 2014 by AndrewBangham

Abstract from ...

Hydrophobicity plots provide clues to the tertiary structure of proteins (J. Kyte and R. F. Doolittle, 1982, J. Mol. Biol. 157, 105; C. Chothia, 1984, Annu. Rev. Biochem. 53, 537; T. P. Hopp and K. R. Woods, 1982, Proc. Natl. Acad. Sci. USA 78, 3824). To render domains more visible, the raw data are usually smoothed using a running mean of between 5 and 19 amino acids. This type of smoothing still incorporates two disadvantages. First, peculiar residues that do not share the properties of most of the amino acids in the domain may prevent its identification. Second, as a low-pass frequency filter the running mean smoothes sudden transitions from one domain, or phase, to another. Data-sieving is described here as an alternative method for identifying domains within amino acid sequences. The data-sieve is based on a running median and is characterized by a single parameter, the mesh size, which controls its resolution. It is a technique that could be applied to other series data and, in multidimensions, to images in the same way as a median filter.
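The two disadvantages of the running mean can be seen on a toy series. The following is a minimal sketch, not the method from the paper: the numbers are made up (they are not real hydropathy values), and the helper name `running`, the window of 5, and the truncated-window edge handling are all assumptions chosen for illustration.

```python
from statistics import mean, median

def running(values, window, reducer):
    """Apply `reducer` over a centred window of odd length `window`.
    Near the ends the window is simply truncated (an assumption)."""
    half = window // 2
    return [reducer(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

# Made-up series: a low "domain" with one peculiar residue (index 3),
# followed by an abrupt transition to a high "domain" at index 8.
signal = [-2.0] * 3 + [4.0] + [-2.0] * 4 + [3.0] * 8

smoothed_mean = running(signal, 5, mean)      # outlier leaks in, edge is blurred
smoothed_median = running(signal, 5, median)  # outlier rejected, edge stays sharp

for i, (m, md) in enumerate(zip(smoothed_mean, smoothed_median)):
    print(f"{i:2d}  mean={m:5.2f}  median={md:5.2f}")
```

In this example the running mean is pulled upward around the single peculiar value and turns the step at index 8 into a ramp, while the running median ignores the outlier and keeps the transition at a single position.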

Actually, for the original data-sieve I used a cascade of medians, which was a very inefficient algorithm. Subsequently, I switched to a cascade of 1D recursive medians, which also preserves scale space but is hugely quicker and is idempotent.
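As a rough illustration of what a cascade of 1D recursive medians might look like, here is a minimal sketch. It is not the author's implementation: the function names `recursive_median` and `sieve`, the window of `2*scale + 1` samples at each stage, and the padding of the ends by repeating the first and last values are all assumptions. It is also written for clarity rather than speed.

```python
from statistics import median

def recursive_median(values, scale):
    """One recursive-median pass with a window of 2*scale + 1 samples.
    Samples to the left of the centre are taken from the already-filtered
    output; the ends are padded by repeating the end values (an assumption)."""
    n = len(values)
    if n == 0:
        return []
    padded = [values[0]] * scale + list(values) + [values[-1]] * scale
    for i in range(scale, scale + n):
        # padded[i - scale:i] has already been overwritten with outputs
        padded[i] = median(padded[i - scale: i + scale + 1])
    return padded[scale: scale + n]

def sieve(values, mesh):
    """Cascade recursive medians of increasing scale, 1 up to `mesh`."""
    out = list(values)
    for scale in range(1, mesh + 1):
        out = recursive_median(out, scale)
    return out

# Made-up series with flat runs of length 1, 2 and 3 on a zero background.
signal = [0, 0, 5, 0, 0, 3, 3, 0, 0, 4, 4, 4, 0, 0]

for mesh in (1, 2, 3):
    print(mesh, sieve(signal, mesh))
```

On this example each increase in mesh size removes the next-longer run while leaving the positions of the remaining edges untouched, and sieving an already-sieved series at the same mesh leaves it unchanged.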