Monday, 29 June 2009

Clustered Alignments of Gene-Expression Time Series Data

Adam Smith ***
The aim is to align time series so that treatments can be compared and causative effects found. Ultimately it would be good to search a database for similar effects, to find genes that are operating together.

The series are warped in time relative to each other, so you need to align equivalent points. Splines are used to create continuous series from the discrete data.
Shorting alignments trim the series so that both cover the same features (maxima and minima).
SCOW - efficient method for aligning time series.
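
To make the "continuous series from discrete data" step concrete, here is a minimal sketch of resampling a series at arbitrary time points. It uses piecewise-linear interpolation as a simple stand-in for the splines mentioned in the talk, and the function name resample_linear is my own, not from the talk.

```python
from bisect import bisect_right

def resample_linear(times, values, new_times):
    """Linearly interpolate the series (times, values) at each point in new_times.

    Assumes times is strictly increasing with at least two points, and that
    every requested time lies within [times[0], times[-1]].
    This is a linear stand-in for spline interpolation, not the talk's method.
    """
    out = []
    for t in new_times:
        if not times[0] <= t <= times[-1]:
            raise ValueError(f"time {t} is outside the sampled range")
        # j is the right-hand end of the interval containing t
        j = min(bisect_right(times, t), len(times) - 1)
        i = j - 1
        frac = (t - times[i]) / (times[j] - times[i])
        out.append(values[i] + frac * (values[j] - values[i]))
    return out

# Toy values only: expression measured at 6, 12 and 24 hours,
# resampled onto a finer grid.
print(resample_linear([6, 12, 24], [0.0, 6.0, 0.0], [6, 9, 12, 18, 24]))
```

Once two series can be evaluated at any time point, an alignment can compare them at warped positions rather than only at the original sampling times.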

  • Most extensive fitting is dynamic programming to minimise Euclidean distances between the two time series being aligned (dynamic time warping, Sakoe and Chiba 1978)
  • Parametric time warping (Eilers 2005) approximates the warping with a linear or parabolic warp.
  • Segment based warping (Smith et al 2008) - alignment score for different segments
  • COW (correlation optimised warping) - points of discontinuity are called knots - where there is a break and a warp.
  • SCOW (shorting correlation optimised warping).
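
As an illustration of the first method in the list, here is a minimal sketch of the classic dynamic programming alignment (dynamic time warping). It is the textbook recurrence without the window constraint from the original paper, and it is not the SCOW method from the talk. The function name dtw_distance is my own.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the smallest summed squared difference over all
    alignments of the first i points of a with the first j points of b.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of: advance in a only, advance in b only,
            # or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])

# The second series is the first with one point held for longer,
# so the warped distance is zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The table has one cell per pair of points, so the cost grows with the product of the two series lengths, which is part of why the cheaper parametric and segment based approaches exist.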
Evaluated on the EDGE toxicology database: 216 observations, 1600 genes, times from 6 to 96 hours.
The task is to match a query against the database to find the most similar treatment profiles.
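
The query step can be sketched as a ranking of database entries by distance to the query. This is only an illustration: the names rank_treatments and euclidean are hypothetical, each treatment is reduced to a single series, and plain Euclidean distance is used where the talk would use an alignment score.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_treatments(query, database, distance=euclidean):
    """Return (name, distance) pairs sorted from most to least similar.

    database maps a treatment name to its series. The distance argument
    can be swapped for an alignment-based score.
    """
    scored = [(name, distance(query, series)) for name, series in database.items()]
    return sorted(scored, key=lambda pair: pair[1])

# Toy values only, not taken from the EDGE database.
toy_db = {"treatment_a": [1.0, 2.0, 3.0], "treatment_b": [3.0, 2.0, 1.0]}
print(rank_treatments([1.0, 2.0, 2.5], toy_db))
```

Passing the distance as a parameter keeps the search separate from the choice of alignment method, so the same ranking works with any of the warping scores listed above.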

For clustering, pick an average time series alignment and then the extremes above and below it, then continue adding the other time series to the clusters according to their distances to each cluster. Using clusters improves the alignments.
