Client-Side Profile Language and Interpreter
(Final project for cs227, together with N. Tatbul, J. Cohen, Y. Xing)






    In this project, our objective was to develop a client-side tool that gives the user continuous, timely information from the web about what he is interested in, minimizing the time he spends browsing web pages. The data source should be a well-structured site whose content changes rapidly (we chose CNN.com). Although the language itself is generic, we need to manually build a site tree for each web site we want to apply it to, and to specify functions on the leaves of these trees to allow finer-grained information retrieval. Such "half-automatic" extraction of structure and information from HTML pages is of course not very elegant, but it seems that the only way to do it automatically would be for data on the web to be stored in XML.
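A hand-built site tree with functions attached to its leaves might look roughly like the sketch below. It is only an illustration under assumptions, not the project's implementation: the `Node` class, the `resolve` helper, the team names, and the behaviour of the `city` leaf function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One node of a manually built site tree (hypothetical structure)."""
    name: str
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Leaf functions expose fine-grained data extracted from the page.
    functions: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children[child.name] = child
        return child


def resolve(root: Node, path: str) -> List[Node]:
    """Return all nodes matching a path such as /cnn/sports/*.

    A `*` segment matches every child at that level.
    """
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] != root.name:
        return []
    current = [root]
    for seg in segments[1:]:
        nxt: List[Node] = []
        for node in current:
            if seg == "*":
                nxt.extend(node.children.values())
            elif seg in node.children:
                nxt.append(node.children[seg])
        current = nxt
    return current


# Made-up sample data standing in for content extracted from the site.
cnn = Node("cnn")
schedules = cnn.add(Node("sports")).add(Node("baseball")).add(Node("schedules"))
for team, home_city in [("redsox", "Boston"), ("yankees", "New York")]:
    leaf = schedules.add(Node(team))
    # Hypothetical leaf function: city(side, date) -> city name.
    leaf.functions["city"] = lambda side, date, c=home_city: c

matches = resolve(cnn, "/cnn/sports/baseball/schedules/*")
print([n.name for n in matches])
```

Keeping the extraction logic in per-leaf functions means the path language stays generic, while everything site-specific lives in the tree that is built by hand.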
The user specifies his interests through :

He is then presented with a web page containing the information, possibly collated, with links to the original data source. In this project, profiles are treated as persistent queries that are continuously re-evaluated as the data in the source changes.
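One way to picture a persistent query is as a function that is re-run on a schedule and reports a result only when that result has changed. The sketch below assumes this polling design. The `PersistentQuery` class, the `fingerprint` helper, and the sample rows are hypothetical and are not taken from the project.

```python
import hashlib
from typing import Callable, List, Optional


def fingerprint(rows: List[str]) -> str:
    """Order-insensitive digest of a query result, used to detect changes."""
    joined = "\n".join(sorted(rows))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class PersistentQuery:
    """Re-evaluates a profile and reports a result only when it changes."""

    def __init__(self, evaluate: Callable[[], List[str]]):
        self.evaluate = evaluate  # hypothetical: runs the profile once
        self.last_digest: Optional[str] = None

    def poll(self) -> Optional[List[str]]:
        rows = self.evaluate()
        digest = fingerprint(rows)
        if digest == self.last_digest:
            return None  # result unchanged: nothing new to show the user
        self.last_digest = digest
        return rows


# Simulated source: three snapshots of a changing page (made-up data).
snapshots = iter([["Boston 52F"], ["Boston 52F"], ["Boston 47F"]])
query = PersistentQuery(lambda: next(snapshots))

updates = [query.poll() for _ in range(3)]
print(updates)
```

In a real client, `poll` would be driven by a timer, and `evaluate` would fetch and parse the live pages rather than read from a fixed list.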
Here's an example query:

FROM  /cnn/sports/baseball/schedules/*  A
      /weather/forecast/Boston          B
WHERE A.city("home", "May 11") = "Boston"  AND
      B.temperature("day2", "high", "F") > "50"
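The query above can be read as a filter over every pair of nodes bound to A and B. The following sketch evaluates it that way. Everything in it is an assumption made for illustration: the `Game` and `Forecast` classes, the sample data, and the exact meaning of the `city` and `temperature` leaf functions. It also compares the temperature numerically, although the query writes the threshold as a quoted string.

```python
from itertools import product
from typing import Dict, List, Tuple


class Game:
    """Stand-in for one leaf under /cnn/sports/baseball/schedules/."""

    def __init__(self, name: str, home_city: str, date: str):
        self.name, self.home_city, self.date = name, home_city, date

    def city(self, side: str, date: str) -> str:
        # Assumed meaning: the home city if this game is played on `date`.
        return self.home_city if side == "home" and date == self.date else ""


class Forecast:
    """Stand-in for the leaf /weather/forecast/Boston."""

    def __init__(self, highs_f: Dict[str, float]):
        self.highs_f = highs_f

    def temperature(self, day: str, kind: str, unit: str) -> float:
        # Only "high" readings in Fahrenheit are modelled in this sketch.
        if kind != "high" or unit != "F":
            raise ValueError("unsupported reading")
        return self.highs_f[day]


def run_query(a_nodes: List[Game],
              b_nodes: List[Forecast]) -> List[Tuple[Game, Forecast]]:
    """Evaluate the WHERE clause over every (A, B) pair of bound nodes."""
    return [
        (a, b)
        for a, b in product(a_nodes, b_nodes)
        if a.city("home", "May 11") == "Boston"
        and b.temperature("day2", "high", "F") > 50
    ]


# Made-up data: two scheduled games and one forecast.
games = [Game("redsox-vs-orioles", "Boston", "May 11"),
         Game("yankees-vs-royals", "New York", "May 11")]
forecasts = [Forecast({"day1": 48.0, "day2": 55.0})]

for game, forecast in run_query(games, forecasts):
    print(game.name, forecast.temperature("day2", "high", "F"))
```

With this sample data only the Boston home game passes both conditions, so the query reports a game only when the forecast high is above the threshold.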

Our system should be easy to extend to query and integrate data from multiple sites. This is a potential advantage of a client-side system over a server-side service.



Final Report
