Optimal Partitions of Astronomical Data



Sponsor: RIACS/NASA Ames Research Center
Liaisons: Dr. Jeffrey Scargle
Supervisors: Dr. Brad Jackson
group posed photo
Spring 2002 Optimal Analysis of Astronomical Data
Team Leaders Sundararajan Arabhi and David Barnes
Student Team Alina Alt, Peter Gioumousis, Elyus Gwin, Paungkaew Sangtrakulcharoen, Linda Tan, Joe Tsai

This project studied a combinatorial optimization problem that arises in the analysis of astronomical data. In the 1-dimensional version of this problem, our might have a detector that records the times that photons are emitted from an unknown source over a period of a time. Our goal was to find the best partition of the observation period into smaller intervals such that the intensity of the source is roughly constant in each interval. Similarly, in dimension 2, we might have data points in the plane representing positions of galaxies, and we want to partition the plane into regions that are roughly uniform in density. In previous work, Dr. Scargle developed a precise definition of what an optimal partition is, and devised a greedy algorithm that obtains near-optimal partitions for 32,000 data points in about 5 minutes. In dimension 1 we found that dynamic programming gives a practical and efficient algorithm for finding a provably optimal partition of the data. For comparison, our algorithm finds an optimal solution for 32,000 data points in under 30 minutes. We also tested several other algorithms for finding near-optimal solutions, all of which can be implemented in both dimensions 1 and 2.

San Jose, CA. Thursday, May 16, 2002.
June, 2002 ??
Fall 2002 Analysis of Clusters in Higher-dimensional Astronomical Data
Team Leaders Peter Gioumousis and Dennis Kanygin
Student Team Tzu-wang Chuang, Russell Sarmiento, Sowmya Subramaniam

In astronomy, there are many 1-dimensional data analysis problems; for example, we might want to identify bursts in the intensity of radiation from an unknown source. There are also higher-dimensional data analysis problems in astronomy; for example, we might want to take a picture of stars in space and cluster them into regions that are roughly uniform in density. Last semester, we discovered that dynamic programming gives an efficient algorithm for finding optimal partitions of 1-dimensional data. This semester, we discovered a branch and bound algorithm that (conjecturally) finds optimal partitions of data in higher dimensions. Since this algorithm does not seem to be efficient enough to handle large problems, we compared some other approaches (e.g., simulated annealing, genetic programming) that find near-optimal solutions.

San Jose, CA. Wednesday, December 11, 2002.
January, 2003 ??
Spring 2003 Optimal Partitions of Astronomical Data
Team Leader James Kittock
Student Team Chris Cusanza, Jordan Horiuchi, Kashif Naqvi, Viet Nguyen

Using Bayesian statistics, Dr. Jeffrey Scargle has devised an objective function that measures the likelihood that a set of data points comes from a distribution that is uniform in density over each of the blocks in some partition. We have been looking for algorithms that find the optimal partition when each of the blocks in the partition is a connected union of cells from an initial partition of the data. In dimension 1, this technique has applications in time-series analysis, trying to graph the intensity of radiation from an unknown source; in dimensions 2 and higher, this has applications in estimating the densities of galaxies in various regions of space.

This semester, we have built on the work of two previous semesters to find improved algorithms and new theoretical results. Our new results include very fast 1-dimensional algorithms that find near-optimal solutions, 1-dimensional algorithms that get optimal partitions of data on a circle, improved 2-dimensional algorithms for finding near-optimal solutions, and a proof that a previously developed branch and bound algorithm finds optimal solutions for 2-dimensional problems.

San Jose, CA. Wednesday, May 14, 2003.
June, 2003 ??
Related Activity
  • Cusanza, C., Horiuchi, J., Kittock, J., Naqvi, K., and Nguyen, V. "Optimal partitions of astronomical data." CAMCOS Reports Day, San Jose, CA, May 14, 2003. Slides.
  • Scargle, J. & Jackson, B. " An Algorithm for the Optimal Partitioning of Data on an Interval," IEEE Signal Processing Letters,

  • Scargle, J. & Jackson, B. "Optimal Partitions of Data in Higher Dimensions,"

  • Mahmound Quweider, Jeff Scargle and Brad Jackson. "Grey level reduction for segmentation, threshholding and binarisation of images based on optimal partitioning on an interval," IET Image Processing 2007, 1(2), pp. 103-111. A copy of the paper can be found at

  • Jackson, B., Scargle, J.D., Barnes, D., Arabhi, S., Alt, A., Gioumousis, P., Gwin, E., Sangtrakulcharoen, P. Tan, L., & Tun Tao Tsai. "An algorithm for optimal partitioning of data on an interval," Signal Processing Letters, IEEE, 12(2), Feb. 2005, p. 105-108.

  • Scargle, J. D., Norris, J., & Jackson, B., (2007). "Studies in Astronomical Time Series Analysis. VI. Optimum Partition of the Interval: Bayesian Blocks, Histograms, and Triggers," (in progress).

  • Jackson, B., & Scargle, J. (2007). "Optimal Partitioning in Higher Dimensions," (in progress).

Jackson team posed photo

Center for Applied Mathematics, Computation and Statistics
 Department of Mathematics • San Jose State University • One Washington Square • San Jose, CA 95192-0103
fax (408) 942-5080