High School Student Research Conference 2004 Abstracts AT&T Labs Improving Table Compression with Combinatorial Optimization Tables are collections of fixedlength records, which contain fixedwidth
fields. They form an important subset of data stored in warehouses.
For example, records of telephone calls and customer care interactions
are stored as tables at AT&T. These tables can be massive, growing by
terabytes each year. Effective compression is thus critical to storing
and processing tables.
I will describe recent work on table compression, embodied in the pin/pzip
system developed at AT&T. Pzip compresses a table, given a partition
of the columns into contiguous intervals: each interval is compressed
separately, by applying a standard compressor to the string formed by
scanning the interval in rowmajor order. The partitions are generated
by pin (partition inducer). The original pin has three phases: first,
lowfrequency columns, those with relatively long runs of identical data,
are factored out and compressed by a differential encoder; second,
the remaining columns are reordered heuristically to try to group
columns that are functionally dependent; third, an optimal partition
is generated for the reordered set of columns by dynamic programming.
This scheme applies pin offline to a small sample of the table. Pzip is
then applied online to the table (and future instances of the table
generated by the same source).
The second and third phases of pin generalize into a uniform problem
of computing an optimal partition of the columns, without respect to a
fixed order. By further abstracting this general problem, we devise
three new partitioning algorithms. Two replace dynamic programming
to find good (although not optimal) partitions for a fixed order; they
are fast enough, however, to allow pin and pzip to be applied together
online, to individual, tabular files, not just tables from continuously
operating sources. The third is a reordering algorithm based on the
asymmetric traveling salesman problem. I will present experimental
results demonstrating their effectiveness.
This is joint work with Glenn Fowler (AT&T Labs) and Raffaele Giancarlo
(U. Palermo).
MKA, Wayne, NJ Attempting Original Research in a Highschool Classroom Supervised by Brother Pat Carney, DCI Lead Teacher In this paper I consider the following problem: If graph G embeds on a
torus, will it always be the case that G has three vertices whose removal
makes the graph four colorable? I first explored K4's that are the largest
planar complete graphs on the usual plane. From this it follows that a
K4 is the largest complete graph that can fit within any area bounded by
the edges of a K4 and thus infinitely many can be so constructed. I then
explored graphs that have a chromatic number of 7. I conjectured that any
graph on a torus with a chromatic number of 7 must have a K7 in it. Finally
I explored replication of a graph with chromatic number of 5 or higher on a
torus at least 4 times. I was not able to do this; however my ideas for
future research are along this idea.
Cresskill High School, Cresskill, NJ Search Numbers Supervised by Kear Halstater We researched how to find a stationary or mobile object or
person within a given graph. We found the minimum number
of searchers needed to search any variation of paths,
circuits, stars, wheels, bipartite graphs, grids.
Additionally, we found a upper bound for complete graphs.
Cresskill High School, Cresskill, NJ The Game of Nim Supervised by Kear Halstater We explored the game of Nim with the goal of player
domination. While Nim is player 2 dominated, we researched
several variations and found the player domination of these
combinations.
Cresskill High School, Cresskill, NJ Art Gallery Supervised by Kear Halstater We determined the minimum number of guards it would take to
secure a polygon room with a set number of edges and
verticies. We also attempted to determine how obstacles in
the center of the room effect the number of guards.
School of the Future, New York, NY Eulerian Graphs Supervised by Halle Kananack, DCI 2003 Many logic problems challenge you to draw a picture
without lifting your pencil or retracing any lines. We
decided to use graph theory to find a pattern in these
problems. Our essential question is: How can you tell
just by looking at a graph if you can travel every
edge once and only once in a single continuous walk?
We categorized all graphs as Eulerian, semiEulerian
or nonEulerian. By studying properties of the
vertices, namely the degree of each vertex and whether
or not it is even or odd, we discovered for ourselves
several patterns that aid us in answering our
"essential question."
Grayson High School, Loganville, GA Voice of the People? An investigation of a hypothetical student council election Supervised by Nicole Davis, DCI 2003 Representative democracy is usually chosen over other forms of democratic
government because it simplifies the governmental process. However, the
best method of implementing a representative democracy has long been under
debate. The goal of our project is to determine the smallest group of
people that can efficiently represent a larger test group. In order to
accomplish this task, we surveyed the test population and selected the
individuals who were most recommended by their peers as a representative of
his/her own views. Each person is represented by a vertex, and each of
their four votes by an arc. Each person will receive a popularity ranking
based on the number of votes he/she received plus half of the votes the
people who voted for him/her received. Using a computer program written for
this project, subgraphs which are the connected components of the overall
digraph will be created, each of which will have its own representative.
The representative will be determined by finding the root of the maximum
weight spanning tree of the subgraph. Cycles will be eliminated based on
the person in the cycle with the greatest popularity. With this data, a
small, influential council, or one supreme individual can be selected that
can sufficiently represent the attitude of the entire test group with
his/her opinions. Alternative parameters for determining the popularity
rankings, such as adding a fraction of the votes that are “distance two”
from the person, will also be explored.
Seton Catholic High School, Pittston, PA Supervised by James Kupetz Let n be the number of vertices and e be the number of edges in a
graph G. G can be represented on the computer in a twodimensional format
or a onedimensional format. We compare the efficiency of these two
formats as n and e increase. We also formulate different algorithms for
searching for both K3s and K4s as subsets of G, and compare the speed of
these algorithms as n and e increase.
Charter School of Wilmington, Wilmington, DE Optimizing Location Using Topics from Graph Theory Supervised by Chuck Biehl, DCI Lead Teacher Due to an overwhelming number of patients coming to the Christiana Care
Sleep Lab and Imaging Center, the company wanted to find the best
location to build an additional center to alleviate pressure on the
existing hospital. we used various graph methods, population data, and
personal correspondence with personnel, to locate a new center in the
Wilmington area.
Charter School of Wilmington, Wilmington, DE On the Optimization of the Snowplow Routes for the Town of Elsmere, Delaware Supervised by Chuck Biehl, DCI Lead Teacher Working closely with the Public Works Department, and map of Elsmere,
Delaware was converted to a graph of more than 200 vertices. The graph
was then eulerized, yielding a total of 435 street traversals, including
116 deadheaded edges. Three Euler circuits of approximately 145 edges
were found for the current three plows available in the town. It was
determined that the current routes are grossly inefficient and that the
new routes found will save money and time for the Town of Elsmere during
snow removal procedures.
Brooklyn Technical High School, Brooklyn, NY A Solution to a Problem of Gargano, Quintas and Rubens [1] We propose a solution to a problem of Gargano, Quintas, and Rubens [1]: Find necessary and sufficient conditions for the existence of a nontrivial complement domination partition in a digraph. We show that a digraph has such a partition if and only if its complement is not strongly connected.
[1] M. L. Gargano, L. V. Quintas, and J. Rubens,
Complement domination,
in: Proceedings of the Thirtyfirst Southeastern International
Conference on Combinatorics, Graph Theory and Computing
(Boca Raton, FL, 2000), Congr. Numer. 145 (2000), 109120
