Andrew D. Ferguson

I am a third-year Ph.D. student in the Computer Science department of Brown University, advised by Rodrigo Fonseca. I'm broadly interested in operating systems and computer networks, which has led to my current focus on distributed systems such as the Internet and the Hadoop implementation of MapReduce. My research currently explores network filesystem issues in Hadoop and the use of ISP backbone networks.

From 2011 to 2014, I will be supported by a National Defense Science and Engineering Graduate Fellowship (NDSEG).

I spent the summer of 2011 at Microsoft Research (Redmond) working with Peter Bodik and Srikanth Kandula on resource allocation in Cosmos/Dryad on Bing's clusters.

In June 2008, I graduated from Princeton University. My interest in probability and statistics led me to major in Operations Research and Financial Engineering, with a healthy serving of Computer Science courses on the side.

You may also be interested in my Curriculum Vitae, or the list of courses I have taken. As a curious computer scientist, I occasionally use Twitter.

Contact Info
E-mail: adf@cs.brown.edu
Office: 303, CIT building

Current Research

Utility-based Cluster Scheduling
Working with colleagues at Microsoft Research, I am developing scheduling mechanisms for MapReduce-style clusters that maximize the global utility of compute jobs and are capable of meeting service-level agreements (SLAs) with high probability.

Our results on meeting deadlines for single jobs have been accepted for publication in ACM EuroSys 2012. The paper's final version will be available here when ready.

Analysis of ISP Networks
By combining passive network measurements with a detailed analysis of publicly available data, I am seeking to give researchers and smaller ISPs greater visibility into the construction and utilization of Tier 1 ISP backbones. This is joint work with my advisor, Prof. Rodrigo Fonseca.

As a first part of this project, I have been exploring the use of the IP Timestamp option to probe router and link properties. The results of these initial experiments were presented at the ACM CoNEXT 2010 Student Workshop. [Paper] [Poster]

Task Scheduling and Block Placement in Hadoop
I am currently working with Prof. Rodrigo Fonseca to understand cluster workload characteristics and improve task scheduling in the Hadoop platform for MapReduce. I am also working to optimize data layout in the Hadoop distributed filesystem by considering automatic re-balancing approaches informed by research from the database community.

I presented a poster describing some preliminary results from this work at the USENIX Annual Technical Conference 2010. The poster explains the source of some of the imbalances which arise naturally in HDFS, and explores the potential performance improvements offered by a more balanced filesystem. [Poster] [Poster Proposal]

Past Research Projects

Princeton Election Consortium
I assisted Prof. Samuel S.-H. Wang with creating and refining statistical models for the most accurate meta-analysis prediction of the 2008 election. I also developed the automated website infrastructure and prepared visualizations.

Exploring the Yeast Genome with Generalized Singular Value Decomposition
For my senior independent work, I applied a technique from matrix analysis to compare data previously clustered using hierarchical methods, yielding a substantial processing speed-up while reproducing and extending previous results.

Professional Service

Brown University - Information Technology Advisory Board
I serve as an elected representative to the committee charged with studying the use of technology on campus and advising the university leadership on technology decisions. With the growth of outsourced cloud-computing options, I think it is an important time for students of distributed systems to be involved in the IT-investment conversation.

Brown Computer Science - Research Exchange Seminars with Tea (REST)
I am a co-organizer of a weekly talk series by computer science graduate students designed to foster collaboration and disseminate new techniques across subfields.

Conference Reports for USENIX ;login:
I wrote summary reports of talks given at INM/WREN 2010, NSDI 2010, and NSDI 2011 for the USENIX ;login: magazine.

Open Source Contributions
I previously maintained Rdiff-Backup, a Python-based, cross-platform backup solution that uses the rsync algorithm and is in use worldwide. I helped prepare the Windows port and improved the handling of extended file attributes on Mac OS X. I'm a former developer for the AfterStep Window Manager, where I focused on the configuration GUI and the parsing of the options files. My patches have been included in Python and Samba.

Just For Fun

Vint Cerf wants YOU to use IPv6
I think Vint Cerf makes a great Uncle Sam, so I created this propaganda to help raise awareness of the transition to IPv6.

Office Toys
To honor the memory of my now-graduated office mate, Alp Küpçü, I assembled the 6' tall "K'nex HyperSpace Training Tower" where he used to sit. I made a stop-motion video of the assembly process, and a short video showing the tower in action.

The Chazelle Show
As an undergraduate, I compiled the best clips of Bernard Chazelle lecturing in the Integrated Science course during the Fall of 2004. You've never seen the inverse Ackermann function explained quite like this before...

A (Narrow) History of Programming Languages
In 2000, when I was in eighth grade, I wrote a report about the history of programming languages. For several years after I wrote it, the text was available online, and it became a reference for other articles, Wikipedia entries, and even college courses. I have placed the report here for posterity and amusement (how could I have possibly left out Python??).

Other Interests
I developed a love for power tools during my years at Princeton, where I spent all of my spare time involved with Theatre Intime, the campus's only student-run theater. I'm happy to say that I successfully hooked my sister on its productions before extricating myself. :-)


In a nod to my friend Rob: Validation is like typechecking your webpage! XHTML, CSS