Compression Contest: Final Results
Last update: Wed Mar 16 00:00:01 EST 2011
Methodology
We tested your compression and decompression programs on both the non-positional and the positional indices. First, we measured the time to compress the two files (Compress column) and logged the size of the corresponding compressed files (Size column). Then, for each index, we fed in 100k postings-list IDs and measured the time your query program needed to retrieve the corresponding postings lists from the compressed index (Query column). Finally, we checked that the retrieved postings lists were identical to the original ones.
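The steps above can be sketched as a small harness. This is only an illustration, not the script actually used for the contest: the function names (`evaluate`, `zlib_compress`, `zlib_lookup`), the in-memory index, and the use of `zlib` as a stand-in codec are all assumptions made for the example. The real tests ran each team's own programs against index files on disk.

```python
import time
import zlib


def zlib_compress(postings):
    """Stand-in codec (assumption): zlib over comma-separated document IDs."""
    return {
        list_id: zlib.compress(",".join(map(str, doc_ids)).encode("ascii"))
        for list_id, doc_ids in postings.items()
    }


def zlib_lookup(index, list_id):
    """Decode one postings list from the hypothetical compressed index."""
    text = zlib.decompress(index[list_id]).decode("ascii")
    return [int(token) for token in text.split(",")] if text else []


def evaluate(postings, compress, lookup, query_ids):
    """Time compression, record the size, time the queries, verify the results."""
    start = time.perf_counter()
    index = compress(postings)
    compress_seconds = time.perf_counter() - start  # Compress column

    size_bytes = sum(len(blob) for blob in index.values())  # Size column

    start = time.perf_counter()
    retrieved = [lookup(index, list_id) for list_id in query_ids]
    query_seconds = time.perf_counter() - start  # Query column

    # Final check: every retrieved list must equal the original one.
    correct = all(
        got == postings[list_id] for got, list_id in zip(retrieved, query_ids)
    )
    return {
        "size_bytes": size_bytes,
        "compress_seconds": compress_seconds,
        "query_seconds": query_seconds,
        "correct": correct,
    }


# Toy data, invented for the example: postings-list ID -> sorted document IDs.
toy_postings = {0: [1, 5, 9], 1: [2, 3], 2: []}
result = evaluate(toy_postings, zlib_compress, zlib_lookup, [2, 0, 1, 0])
print(result)
```

Passing the codec in as two callables keeps the measurement code independent of any particular compression scheme, so every entry is timed and verified the same way.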
Results for non-positional index
| # | Team | Size (bytes) | Language | Compress (s) | Query (s) |
| 1 | ZM5 | 11,330,543 | Python | 71 | 7 |
| 2 | EvilBits | 13,277,460 | Python | 214 | 24 |
| | Cobb (7z) | 13,568,351 | | | |
| 3 | Justabitmore | 14,708,956 | Racket | 107 | 4,156* |
| 4 | Datamongers | 14,951,142 | Python | 64 | 29 |
| 5 | HammerheadHousefly | 14,962,010 | Python | 75 | 2,242 |
| 6 | BrinRank | 15,326,072 | Python | 263 | 47 |
| 7 | JWA | 15,498,505 | Python | 36 | 28 |
| 8 | StreetSharks | 15,498,505 | Python | 31 | 2,597 |
| 9 | GeorgeM | 15,509,869 | Python | 20 | 14 |
| 10 | arden | 15,509,869 | Python | 41 | 5 |
| 11 | emmy | 15,509,869 | Python | 37 | 10 |
| | Ariadne (bzip2) | 17,122,370 | | | |
| 12 | PugnaciousParsers | 17,938,805 | Python | 58 | 12 |
| 13 | Ding | 17,975,946 | Python | 57 | 1,543 |
| 14 | jim-bo | 25,087,469 | Java | 17 | 2 |
| 15 | CMonster | 32,302,092 | Python | 288 | 1,579* |
| 16 | TeamAwesome | 32,379,323 | Python | 239 | 116 |
| | Arthur (uncompressed) | 69,726,288 | | | |
Results for positional index
| # | Team | Size (bytes) | Language | Compress (s) | Query (s) |
| 1 | BrinRank | 58,946,406 | Python | 660 | 165 |
| 2 | HammerheadHousefly | 60,211,666 | Python | 305 | 2,464 |
| 3 | Datamongers | 63,570,919 | Python | 269 | 135 |
| 4 | EvilBits | 66,300,020 | Python | 954 | 97 |
| | Cobb (7z) | 69,788,951 | | | |
| 5 | PugnaciousParsers | 69,911,298 | Python | 203 | 37 |
| 6 | JWA | 71,483,005 | Python | 181 | 259 |
| 7 | emmy | 71,512,899 | Python | 174 | 51 |
| 8 | arden | 71,513,429 | Python | 216 | 1,416 |
| 9 | StreetSharks | 71,534,265 | Python | 135 | 2,533 |
| 10 | ZM5 | 71,555,096 | Python | 166 | 20 |
| 11 | Ding | 73,964,407 | Python | 226 | 1,607 |
| | Ariadne (bzip2) | 74,423,056 | | | |
| 12 | GeorgeM | 88,250,299 | Python | 157 | 63 |
| 13 | CMonster | 92,665,669 | Python | 866 | 1,557* |
| 14 | TeamAwesome | 95,217,284 | Python | 703 | 987 |
| 15 | Justabitmore | 130,214,216 | Racket | 115 | DNQ1 |
| | Arthur (uncompressed) | 196,048,431 | | | |
| 16 | jim-bo | XXX,XXX,XXX | Java | DNQ2 | DNQ2 |
Notes
- * = tested on 25k queries only (the provided code was too slow to be tested on the entire 100k queries)
- DNQ1 = timeout (2h) reached while querying the compressed index with 25k queries
- DNQ2 = failed to create the compressed index
Who, When, Where
- Professor: Eli Upfal
- HTA: Alberto Pettarin
- GTA: Olya Ohrimenko
- Spring 2011
- Time: WF 1-2:30
- Place: CIT 368