ACM Computing Surveys 28A(4), December 1996, http://www.acm.org/surveys/1996/Formatting/. Copyright © 1996 by the Association for Computing Machinery, Inc. See the permissions statement below.
The area of database technology has been one of the unqualified success stories of the past two decades in Computer Science. The field of relational databases has proved to be extremely successful in the marketplace, and a rich body of theory has emerged to provide a firm foundation. Many of the results in the research community have had a direct and significant impact on commercial products. While this heady success of the past is gratifying, it also sets a high level of expectation for the future. What are the important directions for the field to pursue? Can the research community continue to take a leadership role? I think that the key is to identify those application areas that are not adequately served by current technology and that will dominate in the coming decade. Without question, there is an ongoing explosion of information, and database technology can play a crucial role in storing, sifting and accessing this information. The danger is that the next generation of DBMS technology will fail to address the changing needs of information-intensive applications, and will be supplanted by other technologies or under-utilized. Here are some trends, as I see them:
- WWW: The Web is here to stay, and it represents an enormous repository of information. The database field has been slow to react, and in many ways has been bypassed by this revolution. It is important to evaluate how DBMS technology can be used to leverage the connectivity provided by the Web.
- Integrated Access to Data: Database integration will become increasingly important as more and more data comes on-line and, especially, into database management systems. Loosely-structured and heterogeneous sources, legacy database systems, applications like spreadsheets, etc. must be conveniently accessible for several applications, and the integration technology to address this needs to be developed further. The WWW and standards like CORBA and MS/OLE are likely to be very significant in this area.
- Office Information Systems: The paper-less office has seemingly been imminent forever. But with the increased power of inexpensive "personal" computers, the greater availability of cheap storage and technology for OCR and digital document handling, not to mention the ever-increasing benefits of having digitized data (such as exploiting the Web to exchange this data), I think the paper-less office is turning, at last, into reality. What role can DBMS technology play here?
- Sequence Data: Sequence data is being collected from a wide variety of sources (satellite observations, experimental traces, financial data in the stock market, temporal data of numerous kinds, medical histories, videos, and gene sequences, to name a few) and in rapidly increasing volumes. Clearly, this is an opportunity for DBMS technology, if we can provide the necessary flexibility in querying and accessing such data.
- Image and Video Data: Image data is being accumulated in vast quantities for several reasons. With the increasing emphasis on multimedia applications, imagery is ubiquitous, and the tools for digitizing and electronically disseminating images have become commonplace. However, querying image collections is in its infancy. NASA proposes to generate about 15 petabytes of data over about a 10-year period in its EOS project! Clearly, good analysis tools are essential if this data is to be useful.
- Data Exploration: Organizations have come to recognize that their data constitutes a valuable asset, if only they can extract all the information that it contains! In contrast to the well-structured queries that characterized the past two decades, there is a growing interest in the ability to explore or browse large datasets to find useful patterns. While the term "data mining" has become fashionable, similar ideas have been explored for many years in AI (machine learning and knowledge discovery) and Statistics (exploratory data analysis). The new twist that makes this a compelling database problem is the growing volume of data.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.