Overview of GPU Projects
This is a list of GPU projects that students can choose from. Some of
the projects have also been contributed by other professors in CCIS.
Right now, not all GPU projects are fully defined. If there is interest
in a particular project, we will refine its description. In particular,
I also asked professors for interesting general algorithms. We can then
examine those general algorithms to identify interesting computational
micro-kernels that we will implement.
As an example, recall the breadth-first search algorithm. The kernel
requires a hash array to detect if we have previously seen a new state.
Hash arrays can be implemented to use streaming access by using
delayed duplicate detection. In this case, we wait until we have
many hash requests. We then apply the hash function to find hash indices.
We sort based on hash index. We then stream the hash requests and the
hash array, both in hash index order, and compare the two to detect
which hash requests refer to previously seen states.
Where a faculty member provides only the higher-level algorithm, it will
be our job to identify the micro-kernel, and reformulate the micro-kernel
in a form suitable for streaming access.
GPU Projects
-
Multi-Dimensional Range Search (contributed by Prof. Donghui Zhang)
-
Traditionally, multi-dimensional range search are supported through
multi-dimensional index structures such as the R-tree. Such indices
typically suffer from the "curse of dimensionality". For instance, the
R-tree performs poorer than linear scan for range search when the
dimensionality is more than 12. Furthermore, such indices use random
access. There are two ways to support multi-dimensional range search for
high dimensions, both utilizing streaming access.
(a) X-tree. The X-tree combines the R-tree with array. Every node can
either be the root of a sub-tree or be an array of all elements in the
sub-tree. As the dimensionality increases, the X-tree nodes are more
likely to be arrays.
(b) Space filling curves. Using space filling curves such as the Hilbert
curve and the Z-ordering, one can organize a multi-dimensional dataset
into an one-dim array (or sequential file), and a search region can be
mapped to one or multiple ranges of curve values. To retrieve large
chunks of records from this array using the ranges can utilize streaming
access.
-
Nearest Neighbor Search (contributed by Prof. Donghui Zhang)
-
Traditionally, the R-tree are used to support NN search given an arbitrary
query point. The idea is to perform best-first search: keep expanding
the index entries that are the closest to the query point, till an object
stored in some leaf node is revealed. It is a random-access method.
It will be interesting to implement the NN search algorithm when the
objects are indexed using space filling curves.
-
Update the Space Filling Curve File (contributed by Prof. Donghui Zhang)
-
When the objects are organized into an array or file in increasing order
of space filling curves, it will be interesting to implement objects
updates. It will be really inefficient to reconstruct the whole file for
each new object insertion. One idea is to buffer the updates, e.g.
whenever there are 100 new updates, integrate them into the original file
via reconstruction. Another idea of avoiding reconstruction is to leave
some space in the file (so that a few insertions in the middle
can be applied in-place and does notrewrite the whole file). Yet another
idea is to break the file into multiple pieces. It will be interesting to
design a systematic way of organizing the records which, probably by
incorporating all these ideas and possibly some other ideas, efficiently
supports updates.
-
Kernels for Face and Fingerprint Recognition (contributed by Merav Algon)
-
"Complete discriminant evaluation and feature extraction
in kernel space for face recognition"
by Xudong Jian, Bappaditya Mandal and Alex Kot.
(Also, see: Manhua Liu, Xudong Jiang, Alex ChiChung Kot: Efficient fingerprint search based on database clustering. Pattern Recognition 40(6): 1793-1803 (2007)). We have an internal copy of the first paper. For copyright reasons,
I can't post it publicly. You'll find it in the course directory
(if you have a CCIS account), and you can always get it from a TA with
a CCIS account. I recommend asking Kapil Arya (username same as first name,
but lower case) at ccis.neu.edu.