FIELD:
A Friendly Integrated Environment for Learning and Development

Steven P. Reiss

Brown University

Providence, RI

Contents

Figures

Tables

Preface

Acknowledgments

1 Integrated Programming Environments

1.1 WHAT IS A PROGRAMMING ENVIRONMENT?

1.2 CLASSIFICATION OF ENVIRONMENTS

1.3 OBJECTIVES IN BUILDING FIELD

1.4 INTEGRATION STRATEGIES

1.4.1 Integration Requirements

1.4.2 Data Integration

1.4.3 Control Integration

1.5 OVERVIEW OF THE FIELD ENVIRONMENT

2 The FIELD Integration Mechanism

2.1 THE MESSAGE SYSTEM

2.1.1 Evolution of Message Passing

2.1.2 Message Architecture

2.1.3 Message Conventions

2.2 PATTERN MATCHING

2.3 MESSAGE TYPES

2.3.1 Asynchronous Messages

2.3.2 Synchronous Messages

2.3.3 Priority Messages

2.3.4 Default Messages

2.4 MESSAGE GROUPS

2.5 OTHER MESSAGE FACILITIES

2.5.1 Service Management

2.5.2 Environment Management

2.6 THE MSG PROGRAM INTERFACE

2.6.1 Connecting to the Message Server

2.6.2 Registering for Messages

2.6.3 Sending Messages

2.6.4 Replying to a Message

2.7 COMPARISON TO OTHER IMPLEMENTATIONS

2.7.1 Softbench

2.7.2 DEC/FUSE

2.7.3 Tooltalk

3 The FIELD Policy Service

3.1 THE POLICY CONCEPT

3.2 POLICY LANGUAGE CONCEPTS

3.2.1 Augmented Transition Network

3.2.2 Policy Levels

3.2.3 Tool Specifications

3.2.4 State Variables

3.2.5 Patterns

3.2.6 Policy Rule Specifications

3.2.7 Actions

3.2.8 Selecting Actions

3.2.9 Policy Programs

3.3 SAMPLE POLICY PROGRAMS

3.3.1 Automatic Compilation

3.3.2 Starting the Cross-Reference Service

3.3.3 Automatically Starting an Editor

4 The FIELD Debugger

4.1 OVERALL DEBUGGER ORGANIZATION

4.1.1 State Management

4.1.2 Expression Management

4.1.3 Event Management

4.1.4 Stack Management

4.2 THE MESSAGE INTERFACE

4.2.1 Processing Messages

4.2.2 Message Command Language

4.2.3 The Programming Interface

4.3 MESSAGES GENERATED BY THE DEBUGGER

4.3.1 System-Oriented Messages

4.3.2 Location and Trace Messages

4.3.3 Messages Describing the Stack

4.3.4 Messages Describing Events

4.3.5 Information Messages

4.4 THE TEXTUAL COMMAND LANGUAGE

5 Cross-Referencing in FIELD

5.1 THE OVERALL APPROACH

5.2 THE CROSS-REFERENCE DATABASE SYSTEM

5.2.1 Relations and Fields

5.2.2 The Query Language

5.2.3 Query Processing

5.2.4 System Commands

5.2.5 Scanning Strategies

5.2.6 Maintaining the Database

5.2.7 User Options

5.3 THE CROSS-REFERENCE SCANNERS

5.3.1 Scanner Output Format

5.3.2 The Pascal Scanner

5.3.3 The C Scanner

5.3.4 The C++ Scanner

5.3.5 Compiler-Generated Scans

5.4 THE CROSS-REFERENCE SERVER

6 FIELD Services

6.1 CONFIGURATION AND VERSION CONTROL

6.1.1 The Internal Representation

6.1.2 The Configuration Management Interface

6.1.3 The Version Control Interface

6.1.4 The Formserver Message Interface

6.2 PROGRAM PROFILING

6.2.1 The Internal Representation

6.2.2 Profiling Back Ends

6.2.3 The Message Interface

6.3 EXECUTION MONITORING

6.3.1 Monitoring With a Server

6.3.2 The Monserver Message Interface

7 The Brown Workstation Environment

7.1 HISTORY OF BWE

7.2 BASIC BWE COMPONENTS

7.2.1 Basic Input and Output

7.2.2 Geometry Management

7.2.3 Menuing

7.2.4 Text Editing

7.2.5 Window Management

7.2.6 Help Facilities

7.3 STRUCTURED GRAPHICS DISPLAY

7.3.1 GELO

7.3.2 Layout Heuristics

7.4 RESOURCE MANAGEMENT

7.4.1 X11 Resource Management

7.4.2 AUXD Resource Management

8 The Annotation Editor

8.1 ANNOTATIONS

8.2 INTEGRATING ANNOTATIONS AND MESSAGES

8.3 PERMANENT ANNOTATIONS

8.4 ANNOTATION EDITOR INTERFACE

8.4.1 The Annotation Panel

8.4.2 Manipulating Annotations

8.4.3 Miscellaneous Annotation Commands

8.4.4 Editor Commands

8.4.5 Defining Annotation Editor Tools

9 The Debugger Interface

9.1 OVERVIEW

9.2 DBG

9.2.1 User-Definable Buttons

9.2.2 Integration With Messages

9.3 VIEWERS OF DEBUGGER INFORMATION

9.3.1 Overall Structure

9.3.2 The Event Viewer

9.3.3 The Stack Viewer

9.3.4 The Trace Viewer

9.4 THE USER INPUT-OUTPUT VIEWER

10 The Interface for Cross-Referencing

10.1 DEFINING STANDARD QUERIES

10.2 QUERY PROCESSING

10.2.1 Generating the Query

10.2.2 Outputting the Query Result

10.3 INTERACTING WITH OTHER TOOLS

11 The Call Graph Browser

11.1 ORGANIZING THE DATA

11.1.1 The Function-File-Directory Hierarchy

11.1.2 User-Defined Groupings

11.1.3 Deciding What Nodes To Display

11.1.4 Dynamic Calls

11.2 BROWSING OPTIONS

11.3 INFORMATION WINDOW

11.4 ANIMATING THE CALL GRAPH

11.5 INTERACTING WITH THE ENVIRONMENT

12 The Class Hierarchy Browser

12.1 WHAT TO DISPLAY

12.2 DISPLAYING LARGE HIERARCHIES

12.3 CLASS AND MEMBER INFORMATION

12.3.1 The Class Display

12.3.2 Arcs in the Display

12.3.3 Highlighting

12.4 INTERACTING WITH THE CLASS BROWSER

13 The Interface to UNIX Profiling Tools

13.1 DISPLAYING THE PERFORMANCE DATA

13.2 INTERACTING WITH XPROF

14 Configuration and Version Management

14.1 OBTAINING THE INFORMATION

14.2 DISPLAYING THE DEPENDENCY GRAPH

14.3 BROWSING OPTIONS AND COMMANDS

14.3.1 Configuration Management Commands

14.3.2 Version Control Commands

14.3.3 The Transcript Window

14.4 INTERACTING WITH OTHER TOOLS

15 Data Structure Display

15.1 GETTING THE INFORMATION

15.2 DEFAULT DISPLAY DEFINITIONS

15.3 USER-DEFINED DISPLAY DEFINITIONS

15.3.1 The APPLE Editor

15.3.2 The APPLE User Interface

15.4 EXAMPLES OF MAPPING DEFINITIONS

15.4.1 A Tiled Example

15.4.2 A List Example

15.5 EDITING DATA STRUCTURES GRAPHICALLY

16 Monitoring Program Execution

16.1 HEAP VISUALIZATION

16.2 INPUT/OUTPUT VISUALIZATION

16.3 PERFORMANCE VISUALIZATION

17 The Control Panel

17.1 DEFINING THE CONTROL PANEL

17.2 WINDOW MANAGEMENT

17.3 COMMON UTILITIES

17.4 STANDARD BUTTON COMMANDS

18 Retrospective

18.1 MESSAGING

18.1.1 Message Conventions

18.1.2 Messaging Problems

18.1.3 The Policy Tool

18.2 GENERAL STRUCTURE

18.2.1 Tool Decomposition

18.2.2 Tool Wrappers

18.3 GRAPHICAL INTERFACES

18.3.1 The Effectiveness of Graphical Displays

18.3.2 The BWE Toolkit

18.4 EDITING

18.5 DEBUGGING

18.5.1 The Debugger Monitor

18.5.2 Debugger Information Viewers

18.5.3 Data Structure Display

18.6 PROGRAM DATABASE

18.7 EXPERIENCE WITH THE ENVIRONMENT

18.8 CONCLUSION

Bibliography

Index

Figures

FIGURE 1-1: Overall FIELD architecture

FIGURE 1-2: Sample FIELD screen

FIGURE 2-1: FIELD messaging architecture

FIGURE 3-1: The policy tool in the FIELD architecture

FIGURE 3-2: Processing a message in the policy service

FIGURE 3-3: Rule matching algorithms

FIGURE 3-4: Policy program for automatic compilation

FIGURE 3-5: Policy program to start the cross-reference service

FIGURE 3-6: Policy program to invoke the editor upon selection

FIGURE 4-1: The debugger interface in the FIELD architecture

FIGURE 4-2: Organization of the FIELD debugger wrapper ddt_mon

FIGURE 5-1: Cross-Referencing in the FIELD architecture

FIGURE 5-2: Query syntax for the cross-reference database

FIGURE 6-1: Services in the FIELD architecture

FIGURE 6-2: Messages for monitoring file operations

FIGURE 7-1: Development timeline for BWE

FIGURE 7-2: The architecture of BWE

FIGURE 7-3: Example of a GELO tiled-flavored object

FIGURE 7-4: AUXD resource file syntax

FIGURE 7-5: Sample resource file

FIGURE 8-1: The annotation editor in the FIELD architecture

FIGURE 8-2: Dialog for creating an annotation

FIGURE 8-3: Sample annotation editor

FIGURE 8-4: Information display dialog box for an annotation

FIGURE 8-5: Annotation query output window

FIGURE 9-1: Debugger tools in the FIELD architecture

FIGURE 9-2: The dbg debugger interface

FIGURE 9-3: Query dialog box for debugger button

FIGURE 9-4: Dialog box for editing a debugger button

FIGURE 9-5: Event viewer display

FIGURE 9-6: Stack display viewer

FIGURE 9-7: Variable trace viewer

FIGURE 9-8: User input-output viewer

FIGURE 10-1: The xref interface in the FIELD architecture

FIGURE 10-2: Reference query display output

FIGURE 10-3: Cross-reference query output

FIGURE 10-4: Cross-reference query dialog box

FIGURE 11-1: Flowview in the FIELD architecture

FIGURE 11-2: Sample directory-file-function hierarchy

FIGURE 11-3: FIELD call graph

FIGURE 11-4: Initial display of the FIELD call graph

FIGURE 11-5: Class groupings used in a call graph display

FIGURE 11-6: Local call graph display

FIGURE 11-7: Call graph information window

FIGURE 11-8: Call graph animation display

FIGURE 12-1: The class browser in the FIELD architecture

FIGURE 12-2: Complete class hierarchy for moderate-sized system

FIGURE 12-3: Display for a single class

FIGURE 12-4: Class display with collapsed hierarchy

FIGURE 12-5: Class hierarchy without member information

FIGURE 12-6: Class browser display showing visual encodings

FIGURE 12-7: Class browser showing member details

FIGURE 12-8: Class browser information window

FIGURE 13-1: The profiling interface in the FIELD architecture

FIGURE 13-2: Sample profile histogram display

FIGURE 13-3: Line display for profiling

FIGURE 13-4: Full profiling display

FIGURE 13-5: Information dialog for profiling

FIGURE 14-1: Formview in the FIELD architecture

FIGURE 14-2: Formview example

FIGURE 14-3: Formview display restricted to a file

FIGURE 14-4: Formview dialog box showing file information

FIGURE 14-5: Checkin command dialog box

FIGURE 14-6: Formview transcript window

FIGURE 15-1: Data structure display in the FIELD architecture

FIGURE 15-2: Default data structure display for a tree

FIGURE 15-3: List data structure display

FIGURE 15-4: Default data structure display for an array

FIGURE 15-5: Editor for defining type-based display mappings

FIGURE 15-6: Editor for defining arc objects

FIGURE 15-7: Display mapping for an empty tree

FIGURE 15-8: Data structure display using user-defined mappings

FIGURE 15-9: Structures used for linked list display

FIGURE 15-10: List mapping and the resultant display

FIGURE 15-11: ListElement mapping and resultant display

FIGURE 15-12: Final list structure display

FIGURE 15-13: Inset window used in tree data structure display

FIGURE 16-1: Monitoring tools in the FIELD architecture

FIGURE 16-2: Graphical display of heap memory

FIGURE 16-3: Heap display showing various optional windows

FIGURE 16-4: Display of file input and output

FIGURE 16-5: Input/output viewer showing auxiliary windows

FIGURE 16-6: Performance visualization display

FIGURE 17-1: Default FIELD control panel

FIGURE 18-1: Tool composition methods

Tables

TABLE 1-1: Summary of environment types

TABLE 2-1: Parameter type options

TABLE 4-1: Event types

TABLE 4-2: Message-based command language

TABLE 4-3: STEP actions

TABLE 4-4: SYMINFO command options

TABLE 4-5: Programming interface commands

TABLE 4-6: Messages sent by the debugger

TABLE 4-7: Debugger commands for controlling execution

TABLE 4-8: Debugger commands for events

TABLE 4-9: Debugger commands for expressions

TABLE 4-10: Debugger commands for manipulating files

TABLE 4-11: Debugger commands for programming

TABLE 4-12: Debugger commands for csh-like interface

TABLE 4-13: Debugger commands for the gdb interface

TABLE 5-1: The file relation

TABLE 5-2: The reference relation

TABLE 5-3: The scope relation

TABLE 5-4: The declaration relation

TABLE 5-5: Declaration classes supported by xrefdb

TABLE 5-6: The call relation

TABLE 5-7: The function relation

TABLE 5-8: The hierarchy relation

TABLE 5-9: The member relation

TABLE 5-10: The member definition relation

TABLE 5-11: The client-server relation

TABLE 5-12: Cross-reference database commands

TABLE 5-13: Cross-reference database resource file commands

TABLE 5-14: Scanner output formats

TABLE 6-1: Configuration command options

TABLE 8-1: Annotation types

TABLE 8-2: Annotation pattern codes

TABLE 9-1: Escape sequences for buttons

TABLE 11-1: Mouse actions in the call graph browser

TABLE 12-1: Mouse commands for class browser

TABLE 13-1: Mouse actions in the performance visualizer

TABLE 14-1: Mouse actions in formview

TABLE 18-1: Code size for FIELD components

TABLE 18-2: Code size for BWE toolkit components

Preface

FIELD, the Friendly Integrated Environment for Learning and Development, is the research project that demonstrated that practical integrated graphical programming environments are possible. It did this by providing user-friendly graphical interfaces to a variety of programming tools and integrating these separate tools into a unified whole.

This book describes the FIELD environment. It discusses the history and evolution of the environment, concentrating on the development of ideas that both worked and didn't work. It discusses the inner workings of the environment, showing how each of the programming tools works and how the various tools interact with each other. It discusses the user interfaces provided by the various tools, how they are used, why they were chosen, and their strengths and weaknesses.

FIELD has been a remarkably successful research project. The ideas first exhibited in the environment now form the basis for most of the current generation of programming environments, including Hewlett-Packard's Softbench, DEC's FUSE, Sun's SPARCworks, Lucid's Energize, and SGI's CodeVision. FIELD pioneered the notion of broadcast messaging as a basis for tool integration. Moreover, many of the other tool concepts we introduced in FIELD have found their way into these environments. Thus, in discussing the FIELD environment, this book actually explains the inner workings of today's programming environments.

The concepts presented here -- the message passing framework, the various graphical user interfaces, and the integration of a wide variety of tools to form a single application -- are applicable to domains other than programming environments. Many of the lessons learned from FIELD can be applied to general distributed object systems as well as to a variety of new applications that are better structured as loosely coupled processes than as a single massive entity. The work on program visualization can be applied to visual database query interfaces and to visual browsing of the ever-expanding information highway.

The primary audience for this book is those interested in the development of programming tools and environments. The book will also be valuable to serious users of programming environments. It should also be of interest to anyone undertaking a large software project, both by introducing the software tools needed to work on such a project and by demonstrating the concepts of message-based integration that can be applied to a variety of domains.

This book can be divided into three parts. The first part, Chapters 1 through 3, details the message-based integration mechanism at the core of the environment. The second part, Chapters 4 through 6, describes the underlying services provided by FIELD through wrappers around traditional programming tools. The third part, Chapters 7 through 17, describes the tools and user interfaces FIELD offers to the user.

Chapter 1 provides an overview of integrated programming environments, providing context for the use of message-based integration by briefly reviewing the history of such environments and the alternative technologies that were proposed and considered.

Chapters 2 and 3 detail the message-based integration mechanism, describing the concepts behind it, how it is implemented, and the simple interface it offers the various tools. Chapter 2 provides the basics, while Chapter 3 describes extensions to this mechanism that make it more flexible.

Chapter 4 describes the debugger monitor provided by FIELD. This service consists of a complex wrapper around the system debugger, either dbx or gdb, interacting with the rest of the environment through the message server.

Chapter 5 details the cross-reference database service, a new tool developed to support the environment. It is used extensively for visualization and as a service available to other tools and the user.

Chapter 6 describes the remaining services provided by the environment. Two of these are wrappers around existing programming tools, one for configuration management and version control and one for profiling, while the third is an interface for execution monitoring.

Chapter 7 describes the user interface tools developed to support FIELD and other programming projects, emphasizing the use of these tools in the environment.

Chapter 8 details the internals and the user interface of the annotation editor. Source code must be a central focus of any programming environment, since it is the programmer's concrete input and the program representation. The annotation editor provides FIELD with a clean interface for tying the source to the rest of the environment.

Chapters 9 through 17 then describe the user interfaces provided by the other FIELD tools, showing both how the tools are used and the tradeoffs made to use graphics effectively. Especially important here are the techniques used in the various tools for managing complexity, allowing graphical interfaces to be used on relatively large programs.

The final chapter provides an overview of the lessons we learned from the system, what we felt were its successes and failures, and some sense of future research directions for programming environments.

This book was written using FrameMaker. All the FIELD images in the book are screen dumps taken from the current working version of the system. Because we have been developing almost exclusively on color workstations over the last five years, the user interface has become color-oriented and the various images are taken from a color display. Since they are printed in black and white, the actual colors are shown using gray scale or dithered images. Where it is appropriate throughout the text, we describe the use of color in the images and leave the rest to the reader's imagination.

The FIELD environment is available in source form without charge via ftp on the internet or on various media at a nominal cost, although its use is restricted to non-commercial purposes. Documentation is available in the form of man pages and a new tutorial and reference manual. Persons interested should contact the Brown University Computer Science Department Software Librarian through email at brusd@cs.brown.edu or at:

Software Librarian
Department of Computer Science
Box 1910
Brown University
Providence, RI 02912

In addition, we maintain a Mosaic page on the FIELD environment providing a variety of information. This can be accessed at http://www.cs.brown.edu/software/field.

The current version is known to run on Sun workstations with the current operating system. It has been ported to ULTRIX on DECstations, HPUX, IBM's AIX, and other systems, but we do not test these ports on a regular basis. For more information on the availability and current status of the environment, or questions regarding the book or any of the tools, contact the author at spr@cs.brown.edu. Please send any corrections or comments on the system or the book to the same address.

Acknowledgments

The FIELD environment is a large system that would not have been possible without the help of many people. Although I wrote about 95% of the current code in BWE and FIELD, I relied on others to provide feedback, write documentation, implement preliminary versions of the user-interface toolkit, and provide packages that I couldn't get around to writing myself. Because of the time span involved, I'm sure that I am not able to cite everyone who should be mentioned here, but I will try.

The early workstation toolkit was a joint effort between myself and Marc Brown, aided by students such as Mark Vickers and staff such as Joe Pato and John Bazik. Later development efforts were assisted by John Stasko and Kevin Brophy who wrote the original BWE editor and Stefan Tucker who wrote the initial implementation of RIP and MPSI.

The assistance I valued the most on FIELD itself was feedback on the environment in the form of bug reports, requests for new features, or suggestions for improving it. While this feedback came from many sources, several stand out: the teaching assistants for CS11 and more recently CS15, who have put up with the environment for five years, have made and continue to make numerous suggestions; David Bristor of Sun Microsystems was an early outside FIELD user who provided significant feedback as well as an epoch interface; Scott Meyers was a driving force in the design of the class browser; Moises Lejter, in addition to developing an emacs interface, provided several suggestions on supporting C++; and Yi-Jing Lin provided several suggestions while working on FIELD at IBM.

Others have helped by implementing tools that are now part of the student environment. These include David Fedor who wrote the autocommenting package, and David Simons, Thomas Donovan, and Boris Putanec, all of whom helped in developing the top level student interfaces.

Others helped by writing documentation. Carolyn Duby wrote an early tutorial on FIELD. More recently, Fausto Monacelli has written a user's manual and an accompanying tutorial. Tutorials for the student version of the system have been written and updated by the CS11 and CS15 teaching assistants each year. In addition, Marc Brown, Carolyn Duby, Moises Lejter, Scott Meyers, Joe Pato, and John Stasko have all contributed to the various research publications on FIELD and BWE.

Finally, I need to thank both Trina Avery and Scott Meyers for providing me with valuable feedback on this book itself, Scott from a technical perspective and Trina for acting as my editor.

In addition to thanking all the people who helped with these efforts, I want to acknowledge the many sources of outside funding that made this research possible. This includes support from the Defense Advanced Research Projects Agency, the National Science Foundation, Digital Equipment Corporation, IBM, Sun Microsystems, and NYNEX.

1 Integrated Programming Environments

FIELD, the Friendly Integrated Environment for Learning and Development, is an integrated programming environment: a collection of tools that communicate and coordinate with each other in order to let the user create, edit, compile, debug, test, and maintain a programming system.

In order to understand FIELD, its tools, and the decisions made in building it, one must first understand the context in which it was undertaken. We start by giving background information about programming environments. Then we describe our objectives in designing and building a new environment. Next we describe the different integration strategies used in programming environments, illustrating how FIELD differs from previous approaches. We conclude with an overview of the FIELD environment.

1.1 WHAT IS A PROGRAMMING ENVIRONMENT?

Programming is a complex process that involves the coordination of people, ideas and code. Environments are sets of tools that assist in this coordination and automate the process. It is hoped that better and more powerful environments will simplify programming.

Programming tools can be used for a wide variety of different programming applications. These applications differ in their intended project size, the number of programmers involved, and the hardware required to support the environment. Programs today range from a few lines to tens of millions of lines of code. The issues that arise when one person writes a small program are vastly different from those that arise when a large team of people works together to write a large system with a long lifetime. The sets of tools and hence the environments that are appropriate to these problems also differ.

Many current programming tools are useful for programming-in-the-small: problems tackled by a single programmer or a small programming team and programs ranging in size from a few lines up to hundreds of thousands of lines of code. These tools are geared toward simplifying programming itself and improving the productivity of the individual programmer. Such tools -- compilers, loaders, editors and debuggers -- have been around for a long time, and many are mature.

These tools contrast with those developed for programming-in-the-large. These deal with the process part of programming, attempting to automate or simplify the coordination necessary among people and over time in creating a large programming system. Included here are tools for coordinating code such as library and version control systems, tools for coordinating ideas such as interface checkers, CASE (Computer-Aided Software Engineering) tools for specifying design, and tools for coordinating people. Most tools of this nature are fairly recent, and are typically suitable for programs involving up to a hundred programmers and millions of lines of code. Very large systems today go beyond these limits. However, such programming-in-the-huge projects are few as yet, and researchers are only beginning to grapple with what tools might be appropriate for them.

Environments for programming-in-the-small differ substantially from those designed primarily for programming-in-the-large. To reflect this, the two types of environments have been given different names: environments for programming-in-the-large are often called software development environments, while those for programming-in-the-small are called program development environments. While this terminology is not adhered to rigorously, it does provide a useful distinction.

Today both program development environments and software development environments are being designed for workstations. A workstation is a scaled-up personal computer. Today's workstations provide compute power to the individual programmer that far exceeds the overall capability of yesterday's mainframes. For around $5,000 today, programmers can have a 100-MIPS machine with 32 megabytes of main memory, 200 megabytes of disk, and a high-resolution graphical display. Two-hundred-MIPS machines with 64 megabytes of physical memory, a gigabyte of local disk, and hardware-assisted high-resolution 3D graphics displays are available and will be commonplace in three to five years. This continuing revolution in computing allows old tools to be made more powerful and opens vistas for new ones.

Workstations emphasize two dimensions of this hardware revolution through their considerable compute power and their graphical interfaces. The workstation's compute power lets a programming environment contain tools that would otherwise be too compute-intensive. One such tool is an interpreter for a high-level procedural language such as C or C++, as in the Centerline environment [Kauf88a]. Another is the memory checker offered by Purify [Hast92a]. The advantages of graphics are less obvious. A high-resolution display lets programmers look at more than twenty-four lines of code at once. It lets them have multiple windows viewing different contexts of a larger program. It lets them view the code, the output, and the errors and still interact with the debugger all at the same time. Moreover, the presence of graphics opens the door to the use of visualization technology for understanding programs as dynamic entities.

A programming environment is more than just a set of programming or process-oriented tools. It is an attempt to provide a unifying framework for these tools. It gives programmers a consistent interface so that a set of independent tools appears as a single entity. This is achieved by integration; how tools are packaged to achieve integration is an important part of the environment.

1.2 CLASSIFICATION OF ENVIRONMENTS

There are three basic methods of packaging tools to provide an integrated environment. The two simpler schemes are to build the environment either as a single system, as has been done with the various Lisp environments, or as a set of independent tools, as has been done with UNIX®. The third method is to have a set of related tools and a way for those tools to communicate. The integration mechanism thus provided can allow a high degree of sharing and coordination among the tools.
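The third packaging method can be illustrated with a minimal sketch of tools communicating through a central broadcaster. This is only an illustration of the general idea under stated assumptions, not FIELD's message server or its programming interface: the class name `MessageServer`, the methods `register` and `send`, the use of regular expressions as patterns, and the message strings are all hypothetical.

```python
import re


class MessageServer:
    """Hypothetical central server that rebroadcasts messages to interested tools."""

    def __init__(self):
        # Each registration pairs a compiled pattern with a tool's callback.
        self._registrations = []

    def register(self, pattern, callback):
        """A tool declares interest in every message matching `pattern`."""
        self._registrations.append((re.compile(pattern), callback))

    def send(self, message):
        """Deliver `message` to every matching registration.

        The sender does not know which tools are listening. The captured
        groups of the pattern are passed to the callback as arguments.
        Returns the number of deliveries made.
        """
        delivered = 0
        for pattern, callback in self._registrations:
            match = pattern.fullmatch(message)
            if match:
                callback(*match.groups())
                delivered += 1
        return delivered
```

In this sketch an editor and a viewer could both register the (invented) pattern `r"DEBUG AT (\S+) (\d+)"`, and a debugger would call `send("DEBUG AT main.c 42")` without knowing that either tool exists. New tools can be added by registering patterns, with no change to the tools already present.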

Single-system environments, which date back to the 1960s, have typically been developed to support a single programming language by providing a set of integrated facilities. For example, Lisp environments, from the early versions of Interlisp [Teit74a] through those developed for Common Lisp, provide editors, compilers, a host of debugging facilities, as well as other programming tools. Environments for procedural languages were developed along with the early time-sharing systems; examples include the various BASIC environments and interactive Fortran environments such as Quiktran.

Such environments were ushered into the modern age in the 1980s with systems like Gandalf from Carnegie Mellon University [Notk85a], POE from the University of Wisconsin at Madison [Fisc84a], PECAN from Brown University [Reis85a], the Cornell Program Synthesizer [Teit81a], Magpie from Tektronix [Deli84a], and Mentor from INRIA [Donz84a]. In each of these, new compiler technology was used to offer syntax-directed editing and incremental compilation for immediate programmer feedback. Many of these environments were language-independent in the sense that they could be generated from specifications for a variety of different languages. These systems also introduced workstation-based programming tools such as graphical program views. Today's successors to these environments are the programming systems available on personal computers, such as Symantec's Think C and Think Pascal for the Macintosh, and Borland's Turbo C, C++, and Pascal for IBM PCs.

Single-system programming environments can easily offer a high degree of integration since the tools share the same control and data structures. At the same time they have several disadvantages. The primary disadvantage is that they are closed systems. It is difficult to add new tools or capabilities to a single-system environment, especially tools designed outside of the environment. Even with relatively extensible environments such as the various Lisp systems, incorporating a tool written for one environment into another is quite difficult. A second disadvantage is that the resultant systems become large and hence difficult to maintain and understand. A third disadvantage, especially in those systems that deal with procedural languages, is that the systems have not scaled to handle large programs.

The original alternative to providing a single-system programming environment was to provide an independent set of tools that operate on files. Early time-sharing environments such as Multics or the Dartmouth Time-Sharing System had separate editors, compilers and debuggers. The culmination of a loose-collection-of-tools type of environment was and continues to be the UNIX environment. UNIX has evolved through the efforts of numerous people both inside and outside its birthplace, Bell Laboratories. Intended as a programmer's environment, it has slowly evolved a large set of powerful tools that cover many aspects of the programming process. Moreover, it has become a mainstay in university and industrial research environments, and is a fertile ground for developing and experimenting with new programming tools.

The UNIX programming environment is built around the C language. A number of programming tools exist to support C. The main one is the C compiler, a portable version that lets C programs run on a variety of machines with minor modifications. A linking loader supports libraries, and a profiling facility lets programmers track down and fix performance problems. UNIX also contains a large library of subroutines. It currently offers a choice of symbolic debuggers, from adb at the assembler level to dbx and sdb at the source level. There are also generators based on C, including lex for FSA-based coding and yacc for context-free parsing. Moreover, UNIX offers some of the best text editors currently available for creating the programs in the first place, including editors that "know" C and provide such facilities as automatic indentation and primitive syntax checking during type-in.

UNIX has also acquired tools for managing the process of programming by automating the programmer's day-to-day activities and controlling the components of large systems. One such tool is the make facility [Feld79a]. This is a configuration manager that provides a command language where the programmer can describe how the system is to be put together. Make then does intelligent recompilation and binding as needed. Two different version-control systems, sccs [Roch75a] and rcs [Tich82a], are also available for controlling files in large systems over their lifetime.

UNIX has demonstrated that it is relatively easy to add new tools into a loosely coupled environment. This openness has led to the variety of tools that are currently available and thus has enriched the environment. The approach of using independent tools, however, has two primary disadvantages. The first is potentially poor performance. Because each of the tools is independent and compartmentalized, there is considerable duplication of effort and excess file input and output. For example, most compilers under UNIX are not particularly fast: they first run a macro preprocessor, then the compiler itself to generate assembler code, then the assembler to generate an object file, and finally a linking loader to generate the executable. The turnaround time for a single-line change in a 100,000-line system can be a matter of several minutes even on today's fastest workstations.

Another disadvantage of loosely coupled environments is that they do not give the programmer a consistent, integrated framework. Each tool typically offers its own interface and its own command language. Moreover, there is little if any communication among the tools, forcing the programmer to be the integration mechanism. For example, it is the programmer who must correlate line numbers in compiler error messages with the corresponding location in the source program.

The third class of programming environment is an integrated programming environment consisting of a set of tools along with an integration mechanism that ties the tools together. This type of environment offers many of the advantages of both the single-system and the loose-collection-of-tools environments. It is an open environment in that new tools can be developed independently and incorporated later through the integration mechanism. Moreover, by providing a powerful integration mechanism, this approach can offer a high degree of coupling among the tools, and thus appear to the programmer as a single environment.

OBJECTIVES IN BUILDING FIELD

FIELD is an integrated programming environment. In order to understand it and its design, we first consider the objectives and motivations that led to its development.

We developed FIELD in the mid-1980s, a time when we and many others were experimenting with workstation programming environments. These environments were typically single-system, closed environments geared toward showing tools that used incremental techniques and visualization rather than handling large-scale programs. At the time we felt it should be possible to develop such an environment for our own programming, i.e. one that dealt with real, moderately sized, procedural programs and fit into the UNIX framework. In addition, while working on the PECAN environment, we were frustrated by UNIX's lack of visual tools and integrated facilities. FIELD was developed, then, both to give real programmers good programming tools and to show that the work we and others had done in programming environments could have a practical application.

Our primary objective in developing FIELD was to produce a usable, scalable environment for UNIX programming. It had to be able to deal with existing UNIX programming languages, notably C and Pascal. It had to be able to handle programs of the size that could reasonably be developed with the current UNIX tools, about a hundred thousand lines. Moreover, the environment had to be easy to use and to offer additional capabilities beyond the existing toolkit so that we and others would want to use it.

A second objective in developing this environment was to preserve the openness of the UNIX framework. We wanted to be able to use our existing work and that of others rather than having to build a whole new environment from scratch. This required that the environment both use all existing UNIX tools and be adaptable so that future tools could be incorporated easily.

In addition to producing a friendly, open environment that would actually be used, we wanted to provide a showcase for programming environment research. We were especially interested in research related to the use of workstation graphics. Previous work had demonstrated that multiple views and program visualization could be valuable tools for understanding both the static structure and the dynamic behavior of programs. We wanted to develop an environment that would make such tools easy to produce, and one in which such tools could be used for existing programs.

Another objective of this new environment was to provide good programming facilities for students at Brown University. Brown has been using workstations in undergraduate and graduate computer science education since the early 1980s, but the programming environment, even for introductory students, was limited to the tools available under UNIX. At the time, student environments such as Think Pascal were offering more convenient, more interactive, and generally more appealing environments than were available for workstations. We wanted to remedy this situation and to demonstrate the potential of workstations for programming.

The final criterion in developing the environment was simplicity. We were engaged at the time in other research projects and had neither the manpower nor the time for substantial effort on a new environment. We felt that it was both possible and practical to build a new environment that used existing UNIX tools with minimal effort. Moreover, we felt that an environment based on simplicity and on existing tools would provide a strong foundation for future extensions.

INTEGRATION STRATEGIES

The key to designing the FIELD environment to meet these objectives was to find an integration mechanism that was simple and inexpensive, scalable, allowed easy incorporation of new tools, and would let us use the existing UNIX tools with little effort.

Integration Requirements

We established four criteria for integration based on an analysis of the desired interactions among the various tools of an integrated programming environment:

  • Tools must be able to interact with each other directly;
  • Dynamic information must be shared among the tools;
  • All source access must be through a common editor; and
  • Static, specialized information must be available to all tools.

The first requirement is that tools be able to interact with one another directly. If the user wants to set a breakpoint in the editor, the editor must be able to issue the corresponding debugger command. If the user wants to force a recompilation from the editor, the editor must inform the make interface. If the compiler detects errors, then the current editor focus should be changed to the erroneous context. If the user wants to find all occurrences of a variable in the program, a request must be made of a cross-referencing utility. If a variable display needs information about the type or contents of the value it is to display, it must be able to query the debugger.

The second requirement for an integrated environment is that it allow dynamic information to be shared among the tools. Different components of the environment need to know the current execution context. For example, the editor might want to highlight the current line of execution or the line last selected in cross-referencing. Different components also need to know something about the state of the other components. For example, the editor needs to know when the debugger sets breakpoints so it can inform the user; the make interface needs to know when the editor saves a file so it can initiate an automatic recompilation; the values of variables being traced need to be broadcast to appropriate displays whenever they change; error messages generated by the compiler need to be associated with the corresponding source code.

The third requirement for an integrated environment is that it provide consistent access to the program's source. Programmers access the source for many reasons. They edit it either to create it initially or to make changes. They view it to correlate error messages generated by the compiler, to see where they are during execution, and to see what portions of the program have been identified as hot spots by the profiler. They set breakpoints at source statements, trace variables and expressions defined in the source, and designate source components to cross-reference. A fully integrated environment should provide a single tool for accessing the source that can accommodate all these needs and any others that arise.

The fourth requirement for an integrated environment is that static, specialized information be available to the tools. This information includes the rules needed to build the system, cross-reference information, profiling data, and information about the program and the execution environment. Program information includes data about the types of variables and descriptions of these types. Execution information includes the current set of breakpoints and other run-time events. All this information must be available to various components of the system on demand and must be actively managed so that requests are satisfied with up-to-date data.

Beyond these general requirements, the objectives described in the previous section for developing the FIELD environment imposed additional constraints regarding openness, easy extensibility and cost. Because we wanted to use existing programming tools, it was important that the integration mechanism be easily incorporated into these tools. Because we wanted the environment to support ongoing research in the area of programming environments, we needed to be able to incorporate new tools, both ours and those developed by others. Finally, because we wanted to develop the environment with limited resources, the integration mechanism had to be relatively inexpensive to build and maintain.

Data Integration

The integration mechanisms for environments discussed in Section 1.2, CLASSIFICATION OF ENVIRONMENTS, are based on data sharing. All tools in single-system environments have access to data structures representing the program, program analysis, and execution information, and each of the tools can explicitly invoke other tools as needed. In PECAN, for example, the editor, after receiving text input, invoked the parser to make a change in the underlying syntax tree and then invoked the incremental compiler to have this change be reflected in the symbol table and other semantic structures. Any compiler errors were placed as annotations on the syntax trees, which could then be presented to the user through the editor.

In environments based on independent tools, data sharing is done through the file system at much coarser granularity. Here the editor writes out the corrected source file. The compiler reads this file and generates an object file. The loader combines one or more object files to produce an executable file with symbol-table information appended. The debugger reads the symbol-table information associated with the executable to offer symbolic debugging and to correlate the run-time code with the original source. In these environments, programmers invoke tools explicitly. Recently, however, this invocation capability has been integrated into some of the tools. Thus, the emacs editor can explicitly invoke the compiler, make, or the debugger, and various debuggers can invoke an editor or make.

A natural extension of data sharing for an integrated environment is to use a program database. Here a database system is used to store the relevant system information for all tools. A program database extends the low-level data-structure sharing used by single-system environments by letting independent tools access a specific set of common data structures in a controlled way. In effect, the shared data structures of the single-system environments are placed under the control of a database system that provides consistency and integrity.

There are two approaches to implementing a programming environment based on a program database. In the first, all the tools use the database directly. That is, the tools are designed with the database in mind and use representations that either are stored in the database or can easily be derived from the database. This has the advantage of efficiency and consistency, and is the approach being used to develop Ada programming support environments [Munc89a] where an attributed abstract syntax representation is stored in a common database. The compiler, debugger, loader, and other tools all access the program as an abstract syntax tree by going through the common database system. The principal disadvantage of this approach is that existing tools must be rewritten to use the database. A secondary disadvantage is that the database representation must be determined before the tools are implemented, so that adding tools not initially anticipated can cause problems.

The second approach to using a program database is to treat it as a "software backplane". Here the tools can use whatever representation is most appropriate: preexisting tools can use their current representations and new tools can be written to use whatever representation is most efficient for their application. The database system stores a single extensible representation of the data that it maps to the forms needed for each particular application when that application is run. This approach has the advantage of allowing the use of existing tools and of making it easier to write or incorporate new tools in the future. It has the disadvantage that the mappings from the database representation to the application representation can be complex and are not necessarily one-to-one.

The use of a program database has disadvantages. The additional system needed to maintain the database complicates the programming environment. Database systems are large, complex programs, and a program database that deals with multiple clients and maintains consistent information is no exception. Moreover, this strategy requires that the representation of the program be well understood before most of the tools are written. Adding new tools that do not fit well with the original definition can be difficult. Finally, program database schemas are generally designed with a particular language in mind. It is difficult to adapt them to a different language or to accommodate multiple languages simultaneously.

Control Integration

While shared data structures have successfully been used to achieve an integrated environment, we felt that the disadvantages of using a single system or a program database were too great for their use in FIELD. We decided that data sharing at the file level could be used effectively if augmented by a communication mechanism between the various tools.

Most programming tools are compartmentalized. The compiler needs access to the source but not to dependency information or to cross-references. The configuration manager needs information about dependencies but doesn't care about the actual contents of the source files. The debugger generally needs only to have the symbol table and the executable and to be able to display, not understand, the source. This compartmentalization is closely reflected in the files used in UNIX and similar environments and is one of the reasons for their success.

The integration required among the tools is mainly that one tool needs information known to another tool, or that one tool needs the services provided by another tool. Thus the editor might need to know what line is currently executing, where errors occurred during compilation, or where the definition of a given function is. Similarly, the editor might need to request that the debugger set a breakpoint at a given line, or the debugger might need to tell the editor to display the current function.

For this sort of integration, using shared data structures, either directly or through a database system, is overkill. A much simpler mechanism is possible. This involves limited communication among the tools so that one tool can request action or information directly from another. This is the basis for an integration strategy based on control rather than data.

Control integration can be achieved by providing message passing among the various tools. Each tool must be adapted to both send and receive messages. This can be done by modifying the tool or by providing a wrapper around the tool. Each tool must offer whatever functionality is required of it by the other tools through messages.

Control integration provides many of the benefits we were looking for. The resultant environment is still a set of basically independent tools, yielding a degree of openness that is not found in environments using data integration. Control integration is also a relatively inexpensive mechanism: both the amount of code needed to support messaging and the number of modifications needed to existing tools are relatively small.

There are, however, several potential disadvantages to control integration. The first is that it gives up one of the primary advantages of data integration: work can be shared among the tools. For example, one tool can parse the source and store the result in shared data structures. The result of the parse is then used by both the compiler and the tools for cross-referencing and other syntactic and semantic analysis.

It turns out that the lack of this feature is not a serious problem. Modern tools are so compartmentalized that large amounts of information rarely need to be shared and the amount of work being duplicated is not significant. In the few cases where such sharing is helpful, existing tools can easily be modified to provide the additional information. For example, we modified the GNU g++ compiler to produce cross-reference information, a change that required only about 1000 lines of code. Similarly, Sun modified its compilers to generate output files for their source browser.

A related drawback, shared by environments based on loose collections of tools and by those based on control integration, is the time spent waiting for compilation and loading to finish. Data integration mechanisms have the potential to facilitate incremental compilation and incremental loading and hence to provide immediate feedback to the programmer. Performance can be addressed in part in a control-based environment using new tools that speed development. Incremental loaders are available and will become standard. Intelligent editors are being developed to offer immediate feedback on both syntactic and semantic errors. New configuration management tools offer more selective recompilation.

A third disadvantage of control-based integration is that it does not guarantee consistency among the tools. Since each tool has its own data structures, modifications made in one tool may not be reflected correctly in another. This difficulty is minimized, however, because of the compartmentalization of current tools. Moreover, message-passing mechanisms provide a good framework for maintaining consistency among multiple views. The various types of environments are summarized in "Summary of environment types."

OVERVIEW OF THE FIELD ENVIRONMENT

Because of the disadvantages of data integration and our belief that control integration with existing programming tools would be a practical alternative, we developed a message-based integration mechanism.

Further analysis of the communications requirements in a control-based environment showed that two types of messages need to be sent. The first are command messages: explicit command requests sent from one tool to another to achieve some particular action or to retrieve a particular piece of information. The second message class is informational. These contain data known to one tool that might be of potential interest to other tools. For example, the file and line where execution stopped in the debugger is of potential interest to a variety of tools. Both of these message classes can be supported using a broadcasting mechanism with a central message server. Tools register with the server when they start. Then, as they execute, the tools send messages to the message server, which then broadcasts these messages to the other tools.

To make this practical, the broadcasting is selective. Each tool, when it starts, notifies the message server of the messages it is interested in receiving, specifying any command requests other tools can make of this tool and any information messages this tool will want to act on. Then the message server, when it receives a message from a tool, needs to broadcast it only to those tools that have previously expressed interest.
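The selective broadcasting just described can be modeled in a few lines. The following is a minimal in-process sketch, not the FIELD implementation: the class name `MessageServer`, the callback interface, and the use of regular expressions as patterns are all assumptions made for illustration, and the `XREF QUERY` message is hypothetical. The real server is a separate process with its own string-based pattern language.

```python
import re


class MessageServer:
    """Toy in-process model of a selective-broadcast message server."""

    def __init__(self):
        # Each entry pairs a compiled pattern with the callback of the
        # tool that registered it.
        self._registrations = []

    def register(self, pattern, callback):
        """Record that a tool wants every message matching `pattern`."""
        self._registrations.append((re.compile(pattern), callback))

    def send(self, message):
        """Forward `message` to each interested tool.

        Returns the number of deliveries made.
        """
        delivered = 0
        for pattern, callback in self._registrations:
            if pattern.match(message):
                callback(message)
                delivered += 1
        return delivered


server = MessageServer()
editor_inbox, debugger_inbox = [], []

# The editor wants information messages sent by the debugger.
server.register(r"DEBUG ", editor_inbox.append)
# The debugger accepts command messages addressed to it.
server.register(r"DDT ", debugger_inbox.append)

server.send("DDT EVAL tree (*root)")
server.send("DEBUG VALUE tree /pro/field/test/tree.c 33 j 654")
server.send("XREF QUERY tree")  # hypothetical message; no tool registered for it
```

After these calls each inbox holds exactly one message, and the third message is dropped because no tool expressed interest in it. The sender never names a recipient, which is what lets new tools be added without changing existing ones.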

The center of the FIELD environment is a message server that supports this type of selective broadcasting. Messages handled by the server are simple strings, and string-based pattern matching is used to determine which clients receive the rebroadcast. Both asynchronous messages and synchronous messages with replies are supported.

This can be seen in the overall architectural diagram of FIELD shown in the figure "Overall FIELD architecture." Here the message server, MSG, sits in the middle, serving as the communications and integration mechanism for a large variety of tools. The tools that are part of FIELD are shown as rectangular boxes. Other components of the environment that are not message based, such as the cross-reference database xrefdb, are shown as rounded rectangles. Tools from the underlying UNIX environment are shown as ellipses. Solid arcs represent message-based communication. Dotted arcs represent subprocesses that are run using a pipe or a pseudo-tty.

Communicating with the message server are the various tools provided by the FIELD environment. These tools are of two basic types, services and viewers. Services exist to offer facilities to other tools in the environment. These include the back end of the debugger, ddt_mon, to control the execution environment; the formserver interface to make and rcs for configuration management and version control; a cross-referencer, xrefserver, that maintains databases on programs; and monserver, a monitoring service for sampling program execution.

Viewing tools provide the user interface to the programming environment. A sample screen from FIELD is shown in the figure "Sample FIELD screen." The window at the upper right is the control panel showing the available tools. The principal viewing tool for the source is an annotation editor, annotedit, shown in the lower right. This is a wrapper around a full-function editor that provides annotations on the source. The annotations are tied to the message system. They let the user initiate commands for other tools from the editor and give other tools a consistent means for displaying information relevant to the source.

Other viewers are implemented as graphical front ends for standard UNIX tools or for the various service tools provided. The dbg debugging tool, shown in the middle left of the "Sample FIELD screen" figure, has both a textual and a visual front end, as well as a data structure displayer, display, shown in the upper right, and displays of the current debugger state such as viewstack, shown in the center. The configuration management service has a visual front end, formview, shown in the lower left. A textual and two graphical front ends display information stored in the cross-reference databases. The textual view, xref, is shown in the middle right. There are also a variety of visualizers for program monitoring as well as a graphical front end for the various UNIX profiling tools.

  2. The FIELD Integration Mechanism

FIELD integrates tools by providing a simple message-passing framework that supports selective broadcast messaging. The center of the framework is the message server that the various tools communicate with. When a tool starts, it finds the message server and sends it a description of the messages it is interested in. As it runs, the tool sends messages to the server. The message server matches each message against the descriptions that were registered by all the tools, and forwards the message to those tools that have expressed interest in it.

THE MESSAGE SYSTEM

The basic concepts of a central message server and broadcast messages have been used in a number of different systems, ranging from windowing systems to artificial intelligence systems to previous programming environments.

Evolution of Message Passing

The X windowing system offers a window-based event mechanism in which the X server acts as the message coordinator [Sche86a]. Each client specifies, for each of its windows, what events it is interested in on the basis of the event type. Events not handled by the immediately affected window are passed up the window hierarchy until a suitable window is found. Events are usually generated by the server, but can also be generated explicitly by clients.

Sun's NeWS windowing system generalized this mechanism by allowing the selection of events based on patterns [Micr87a]. Events are record structures and the clients define an event pattern using a sample message structure. An incoming message matches the pattern if each specified field in the pattern structure matches the corresponding field in the incoming message. While the X mechanism works across multiple processes, in NeWS each client downloads PostScript code to the server, where the event handling is done using lightweight processes. This type of event handling is also characteristic of the model-view-controller input mechanism offered by Smalltalk [Gold83a].

The concept of a blackboard system is well known in artificial intelligence applications. Such systems are composed of multiple daemons, each responsible for one aspect of the application. The system itself and each of the daemons can post messages by effectively writing them on a common "blackboard", i.e. the central message server. Each daemon then reads and processes the messages that it is interested in. In these systems, the daemons are typically controlled by messages, i.e. they run only when appropriate messages are posted.

The use of generalized message passing has been less prevalent in programming environments. The ALOE system, developed as part of Gandalf, used action routines on abstract syntax trees [Kais85a]. When the syntax tree was changed or when the user requested an action, the nodes of the syntax tree were informed by predefined callback routines. This can be thought of as a simple message-passing scheme in which the syntax tree manager sends messages to the nodes by invoking the appropriate callbacks.

The PECAN program development environment [Reis85a] generalized such message-passing systems. PECAN provided a central event manager. Tools could define events by specifying the event name and the argument types, and could send events by specifying the event name and a list of arguments matching the predefined types. Tools could also register for events by specifying the event name and a callback routine to be called whenever an event of the given type occurred. This mechanism was essential to maintaining the consistency of multiple views and became the central means of organizing the system. The mechanism was much simpler than that used in FIELD: it did not work across processes, messages had to be defined explicitly and all parties had to be aware of all parts of the definition, and selectivity of message reception was based solely on the message type. Nevertheless, this mechanism was a primary motivation for using message-based integration in FIELD.

Message Architecture

The message facility uses both a client library that is linked into each FIELD component and a separate message server process, as shown in the figure "FIELD messaging architecture." Each tool and service starts by first initializing the client library. This causes the library to try to open a connection to the current message server. If no message server is found, a new one is started. After initializing the library, the tool registers patterns describing the messages it is interested in using the local client library. The local library then registers them in turn with the message server. As the tool runs, it sends messages through its client library directly to the message server. This is true even if the message is destined for itself. The message server determines which clients are interested in the message and forwards it to the corresponding client libraries. The client library finds each registered pattern that matches the incoming message, decodes the message according to the pattern, and calls the application with the decoded arguments.

TCP/IP domain stream sockets [Leff86a] are used for message communication, letting the message server and the tools run on different machines. The connection with the message server is based on a "known" file that defaults to

/usr/tmp/msg.<HOST>.<USER>.addr

where <HOST> contains the hostname of the server and <USER> contains the user's login id. An alternative file can be specified by using the -msgfile <file> option on the command line when starting the tool. The message server uses this file to record its host name and the message port it monitors when it starts. It also uses the UNIX file-locking mechanism to lock the file. When a client needs to connect to the message server, it checks that this file exists and is locked and then attempts to connect to the message server using the host and port number in the file.
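The rendezvous through the connection file can be sketched as follows. This is an illustration under stated assumptions rather than FIELD's code: the text says only that the file records a host name and a port and is locked with the UNIX file-locking mechanism, so the one-line "host port" file format, the choice of `flock`, and the function names `address_file`, `publish_address`, and `find_server` are all hypothetical. The demonstration writes to a temporary directory rather than /usr/tmp.

```python
import fcntl
import os
import tempfile


def address_file(host, user, directory="/usr/tmp"):
    """Build the default connection-file name described in the text."""
    return os.path.join(directory, f"msg.{host}.{user}.addr")


def publish_address(path, host, port):
    """Server side: lock the file, then record host and port in it.

    Returns the open descriptor; the lock is held until it is closed.
    Raises BlockingIOError if another server already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    os.ftruncate(fd, 0)
    os.write(fd, f"{host} {port}\n".encode("ascii"))  # assumed file format
    return fd


def find_server(path):
    """Client side: return (host, port) if a server holds the lock, else None."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except FileNotFoundError:
        return None
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone else holds the lock, so a server is running.
            host, port = os.read(fd, 256).decode("ascii").split()
            return host, int(port)
        # We obtained the lock ourselves, so the file is stale.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return None
    finally:
        os.close(fd)


demo_dir = tempfile.mkdtemp()
demo_path = address_file("myhost", "alice", demo_dir)
before = find_server(demo_path)           # no file yet
server_fd = publish_address(demo_path, "myhost", 4321)
during = find_server(demo_path)           # server is holding the lock
os.close(server_fd)                       # server exits, lock is released
after = find_server(demo_path)            # file exists but is stale
```

Using the lock rather than the mere existence of the file as the liveness test means that a server that crashes without cleaning up does not block the next session: the stale file is simply overwritten.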

The connection file makes it easy to ensure that only one message server is active for a given session. It also gives the message facility a degree of protection or security based on the UNIX file system. The message server lets the initial client specify the group and owner as well as the file permissions for this file. Since new clients can attach to the message server only if they can both lock and read this file, access to the message server can be restricted to a given user or a given user group.

Messages are passed between the clients and the message server as ASCII strings. This simplifies the FIELD architecture in several ways: we need not worry about byte ordering or floating-point representation when passing messages between machines of different types; debugging the messaging system is simpler, since it is easy to monitor the message traffic; and obvious mechanisms are available for defining patterns to indicate the messages of interest for the various clients.

Message Conventions

In order to ensure consistency between tools and to make it easier to add new tools, we adopted a set of conventions that define the form of messages. These conventions were based on our previous work with PECAN and our early experiences with FIELD.

Each message starts with a tool identification field. Command messages use the name of the tool group that handles the command. For example, all debugger command messages start with DDT and all cross-reference database command messages start with XREF. The identification string for information messages that are not directed toward a specific tool names the sender. All information messages sent by the debugger, for example, are prefixed with DEBUG. Complex packages like the debugger, with many command and information messages, use different identification strings for the two kinds of messages to avoid conflicts. Simpler tools, with a limited set of messages, use the same identification string for both.

Following the identification string is the name of the message. For a command message, this is the command; for an information message, this identifies what information is being sent. For example, the command message

DDT EVAL tree (*root)

is a command message to evaluate the expression (*root), while the message

DEBUG VALUE tree /pro/field/test/tree.c 33 j 654

is an information message noting that the variable j has the value 654 at line 33 of file tree.c. Following the message name are additional arguments separated by spaces. The first argument is generally the name of the binary system (tree in the above examples) that the message refers to.
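The field layout just described can be illustrated with a small decoder. This is a sketch under stated assumptions: the Message type and parse_message function are hypothetical names, fields are split on blanks only (quoted strings are ignored here), and a missing system name is filled with an asterisk.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    tool: str        # identification string, e.g. "DDT" or "DEBUG"
    name: str        # command or information name, e.g. "EVAL" or "VALUE"
    system: str      # name of the binary system the message refers to
    args: List[str]  # remaining space-separated arguments


def parse_message(text):
    """Split a message into identification, name, system, and arguments."""
    fields = text.split()
    if len(fields) < 2:
        raise ValueError("a message needs an identification string and a name")
    # Assumption: when no system name is present, use "*" for the unknown value.
    system = fields[2] if len(fields) > 2 else "*"
    return Message(fields[0], fields[1], system, fields[3:])


msg = parse_message("DEBUG VALUE tree /pro/field/test/tree.c 33 j 654")
print(msg.tool, msg.name, msg.system, msg.args)
```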

Other conventions are used in messages. All strings that might include an embedded blank are sent in quoted form. The message pattern matcher automatically recognizes the character `\37' as the start of a literal string that ends at the next such character. All file names are sent using full pathnames to avoid problems when tools have different current working directories. Locations are identified by providing a file name and a line number and, when possible, a function name. Any text field representing an unknown value is replaced with an asterisk. Any unknown numeric field is sent as a zero.
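The quoting convention can be sketched with a small tokenizer. The code below reads `\37' as the octal escape for the ASCII character with value 31 (hexadecimal 1F); the helper names quote and tokenize are hypothetical and are not part of FIELD.

```python
QUOTE = "\x1f"  # the character written `\37' (octal) in the text


def quote(field):
    """Wrap a field in quote characters if it contains an embedded blank."""
    return QUOTE + field + QUOTE if " " in field else field


def tokenize(text):
    """Split a message on blanks, keeping quoted literal strings intact."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if text[i] == " ":
            i += 1                      # skip separators
        elif text[i] == QUOTE:
            end = text.find(QUOTE, i + 1)
            if end < 0:
                raise ValueError("unterminated literal string")
            tokens.append(text[i + 1:end])
            i = end + 1
        else:
            end = i
            while end < n and text[end] not in (" ", QUOTE):
                end += 1
            tokens.append(text[i:end])
            i = end
    return tokens


message = " ".join(["DDT", "EVAL", "tree", quote("a + b")])
print(tokenize(message))
```

An expression such as a + b then travels as a single argument even though it contains blanks.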

Finally, messages and message patterns are typically defined as open ended, allowing additional parameters or identifying information to be added to the end of the message without affecting clients. Message patterns are defined so that any unexpected arguments at the end of the message are ignored. This has allowed us to augment messages to provide additional functionality without having to change existing tools.
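The effect of open-ended patterns can be shown with a simplified matcher. This sketch does not use FIELD's pattern syntax: a pattern here is assumed to be a list of literal fields with None standing for "capture this argument", and the match_open_ended function is hypothetical.

```python
def match_open_ended(pattern, fields):
    """Match fields against a pattern prefix, ignoring any trailing extras.

    Returns the list of captured arguments, or None when the message
    does not match.
    """
    if len(fields) < len(pattern):
        return None
    captured = []
    for want, got in zip(pattern, fields):  # zip stops at the pattern's end
        if want is None:
            captured.append(got)
        elif want != got:
            return None
    return captured


# A client written against the original five-argument VALUE message.
pattern = ["DEBUG", "VALUE", None, None, None, None, None]

old_style = "DEBUG VALUE tree /pro/field/test/tree.c 33 j 654".split()
# Hypothetical later version of the message with one extra trailing field.
new_style = old_style + ["int"]

print(match_open_ended(pattern, old_style))
print(match_open_ended(pattern, new_style))
```

Both calls yield the same five captured arguments, which is why a message can grow at the end without breaking tools that were written against its shorter form.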

PATTERN MATCHING

Patterns are used by the FIELD message server to determine which tools should receive the rebroadcast of a message. They are used by the client library within each tool to determine what routines should be called with the incoming message and to decode the message by extracting the arguments to be passed to these routines. Our goal in designing FIELD's pattern-matching strategy was to make message specification as simple as possible while at the same time making it easy to decode arbitrary messages.

One method for defining message patterns is to use the powerful patter