
Spring 2002

Optimal Analysis of Astronomical Data
Team Leaders

Sundararajan Arabhi and David
Barnes

Student Team

Alina Alt, Peter Gioumousis, Elyus Gwin, Paungkaew
Sangtrakulcharoen, Linda Tan, Joe Tsai 
Abstract

This project studied a combinatorial optimization
problem that arises in the analysis of astronomical data. In the
1dimensional version of this problem, our might have a detector that
records the times that photons are emitted from an unknown source over
a period of a time. Our goal was to find the best partition of the
observation period into smaller intervals such that the intensity of
the source is roughly constant in each interval. Similarly, in
dimension 2, we might have data points in the plane representing
positions of galaxies, and we want to partition the plane into regions
that are roughly uniform in density. In previous work, Dr. Scargle
developed a precise definition of what an optimal partition is, and
devised a greedy algorithm that obtains nearoptimal partitions for
32,000 data points in about 5 minutes. In dimension 1 we found that
dynamic programming gives a practical and efficient algorithm for
finding a provably optimal partition of the data. For comparison, our
algorithm finds an optimal solution for 32,000 data points in under 30
minutes. We also tested several other algorithms for finding
nearoptimal solutions, all of which can be implemented in both
dimensions 1 and 2.

CAMCOS Presentation

San Jose, CA. Thursday, May 16, 2002.

CAMCOS Report

June, 2002 ??




Fall 2002

Analysis of Clusters in Higherdimensional Astronomical
Data
Team Leaders

Peter Gioumousis and
Dennis Kanygin

Student Team

Tzuwang Chuang, Russell Sarmiento, Sowmya Subramaniam

Abstract

In astronomy, there are many 1dimensional data
analysis problems; for example, we might want to identify bursts in
the intensity of radiation from an unknown source. There are also
higherdimensional data analysis problems in astronomy; for example,
we might want to take a picture of stars in space and cluster them
into regions that are roughly uniform in density. Last semester, we
discovered that dynamic programming gives an efficient algorithm for
finding optimal partitions of 1dimensional data. This semester, we
discovered a branch and bound algorithm that (conjecturally) finds
optimal partitions of data in higher dimensions. Since this algorithm
does not seem to be efficient enough to handle large problems, we
compared some other approaches (e.g., simulated annealing, genetic
programming) that find nearoptimal solutions.

CAMCOS Presentation

San Jose, CA. Wednesday, December 11, 2002.

CAMCOS Report

January, 2003 ??




Spring 2003

Optimal Partitions of Astronomical Data
Team Leader

James Kittock

Student Team 
Chris Cusanza, Jordan Horiuchi,
Kashif Naqvi, Viet Nguyen

Abstract 
Using Bayesian statistics, Dr. Jeffrey Scargle has
devised an objective function that measures the likelihood that a set
of data points comes from a distribution that is uniform in density
over each of the blocks in some partition. We have been looking for
algorithms that find the optimal partition when each of the blocks in
the partition is a connected union of cells from an initial partition
of the data. In dimension 1, this technique has applications in
timeseries analysis, trying to graph the intensity of radiation from
an unknown source; in dimensions 2 and higher, this has applications
in estimating the densities of galaxies in various regions of space.
This semester, we have built on the work of two previous semesters to
find improved algorithms and new theoretical results. Our new results
include very fast 1dimensional algorithms that find nearoptimal
solutions, 1dimensional algorithms that get optimal partitions of
data on a circle, improved 2dimensional algorithms for finding
nearoptimal solutions, and a proof that a previously developed branch
and bound algorithm finds optimal solutions for 2dimensional
problems.

CAMCOS Presentation

San Jose, CA. Wednesday, May 14, 2003.

CAMCOS Report

June, 2003 ??




Related Activity 

Presentations 
 Cusanza, C., Horiuchi, J., Kittock, J., Naqvi, K., and Nguyen, V.
"Optimal partitions of astronomical data." CAMCOS Reports
Day, San Jose, CA, May 14, 2003.
Slides.

Publications 
 Scargle, J. & Jackson, B. " An Algorithm for the Optimal Partitioning
of Data on an Interval,"
IEEE Signal Processing Letters,
http://trotsky.arc.nasa.gov/~jeffrey/paper_1d.pdf.

Scargle, J. & Jackson, B. "Optimal Partitions of Data in Higher Dimensions,"
http://astrophysics.arc.nasa.gov/AISRPCodeLibraryServer/Uploads/Public/000021/preprints/preprints.pdf.
 Mahmound Quweider, Jeff Scargle and Brad Jackson. "Grey level reduction for segmentation,
threshholding and binarisation of images based on optimal partitioning on an interval,"
IET Image Processing 2007, 1(2), pp. 103111. A copy of the paper can be found at
http://blue.utb.edu/mkquweider/Nasa2004/IET2007.pdf.
 Jackson, B., Scargle, J.D., Barnes, D., Arabhi, S., Alt, A., Gioumousis, P., Gwin, E.,
Sangtrakulcharoen, P. Tan, L., & Tun Tao Tsai.
"An algorithm for optimal partitioning of data on an interval,"
Signal Processing Letters, IEEE, 12(2), Feb. 2005, p. 105108.
 Scargle, J. D., Norris, J., & Jackson, B., (2007). "Studies in Astronomical Time Series
Analysis. VI. Optimum Partition of the Interval: Bayesian Blocks, Histograms,
and Triggers," (in progress).
 Jackson, B., & Scargle, J. (2007). "Optimal Partitioning in Higher
Dimensions," (in progress).




