NoTamper, WAPTEC, Plato: Web Form Analysis and Synthesis (2008-Present)

On the World Wide Web, it is common for websites to request information using web forms; e.g., Amazon asks for a name, address, city, and ZIP code. These web forms provide visual feedback as a user enters information, pointing out mistakes as they are made. For example, only certain combinations of cities and ZIP codes are compatible; neither is sufficient to compute the other, but there are constraints on the combination. When someone enters an incompatible city and ZIP code, the form highlights those widgets, pointing to the error.

To complicate matters, for the sake of security, any error checking that occurs on the web form must be replicated on the web server that processes the form; otherwise, an attacker who submits data to the server without going through the web form could force the server to perform actions that violate the spirit of the application. For example, a banking application might provide a web form enabling money to be transferred between two accounts owned by the user. If the web form forces the user to select two accounts she owns, but the server fails to check that both accounts are owned by the user submitting the request, a malicious user could transfer arbitrary amounts of money between arbitrary accounts by submitting a request directly to the server.
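
The banking example can be made concrete with a small sketch. The account store, handler name, and data model below are hypothetical illustrations, not code from any of the tools described here; the point is only that the server must re-check ownership even though the form already restricts the choices:

```python
# Minimal sketch of server-side revalidation for an account transfer.
# ACCOUNT_OWNERS and the transfer() signature are invented for illustration.

ACCOUNT_OWNERS = {
    "acct-100": "alice",
    "acct-200": "alice",
    "acct-300": "bob",
}

def transfer(session_user, src, dst, amount):
    # The web form offers only the user's own accounts, but an attacker
    # can POST arbitrary account IDs directly, so we must re-check here.
    if ACCOUNT_OWNERS.get(src) != session_user or ACCOUNT_OWNERS.get(dst) != session_user:
        return "rejected: account not owned by requester"
    if amount <= 0:
        return "rejected: invalid amount"
    return f"transferred {amount} from {src} to {dst}"
```

A server missing the first `if` exhibits exactly the flaw described above: a request crafted by hand, bypassing the form, moves money between accounts the requester does not own.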

To address these issues, we developed three tools. NoTamper and WAPTEC are tools that automatically detect constraints on user input that are enforced by the web form but not by the web server. NoTamper discovered the flaw in the banking application discussed above. NoTamper treats the web server as a black box, making it applicable to a wide range of web applications; WAPTEC treats the web server as a white box and, while more accurate and complete, is applicable only when the web server is written in PHP.

The third tool, Plato, takes a different approach. Instead of detecting the presence of mismatches between the web form and the web server, Plato automatically generates the web form and the error-checking code on the web server, ensuring that all of the error checks performed by the form are also performed by the server. Plato takes as input a logical representation of the intended semantics for the data accepted by the form and generates (i) a web form that both checks for errors and fills in values implied by user-provided data and (ii) web server code that when invoked identifies errors and/or computes implied values.
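
The core idea behind Plato can be sketched as follows. The spec format, constraint names, and toy city/ZIP table below are invented for illustration; the point is that a single declarative description of the valid inputs is the source from which both the form-side and server-side validators are derived, so the two can never disagree:

```python
# Sketch of Plato's approach: one declarative spec of the input
# constraints, compiled into both the client and the server validator.
# The spec representation and the toy compatibility table are hypothetical.

SPEC = {
    "zip_format": lambda d: d["zip"].isdigit() and len(d["zip"]) == 5,
    "city_nonempty": lambda d: d["city"] != "",
    "city_zip_compatible": lambda d: (d["city"], d["zip"]) in {
        ("Chicago", "60601"), ("Naperville", "60540")},  # toy table
}

def make_validator(spec):
    # Returns the names of all violated constraints for a submission.
    def validate(data):
        return [name for name, check in spec.items() if not check(data)]
    return validate

# The generated form code and the generated server code would each embed
# a validator derived from the same spec, guaranteeing identical checks.
client_validate = make_validator(SPEC)
server_validate = make_validator(SPEC)
```

Because both validators are generated from one spec, the client/server mismatches that NoTamper and WAPTEC hunt for cannot arise by construction.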


TBA: Tag-Based Authorization (2009-Present)

Two MITRE reports published in 2005 describe the problems the U.S. government has had in recent years with its access control system when engaged in coalitions with other countries. Coalitions arise whenever several countries collaborate to address issues of worldwide significance, such as the 2011 military action in Libya and the 2011 tsunami in Japan. Coalitions are problematic for access control because they require huge changes to each country's access control policy. Whenever a country joins the coalition, the U.S. must grant all of the new country's operatives rights to access pertinent information, and whenever a country leaves the coalition, the U.S. must revoke those rights.

TBA (Tag-Based Authorization) was designed as a replacement for the current U.S. scheme, which is based on Bell-LaPadula. In TBA, both users and documents are assigned tags representing their pertinent characteristics, and access control decisions are made based solely on the tags of the users and documents. Those decisions are expressed in a logical language that was designed for simplicity while allowing for modular policy authoring via delegation. TBA achieves the same scalability as Bell-LaPadula because tagging can be accomplished by relatively untrained users, yet achieves nearly the same flexibility as many other logic-based access control solutions in the literature because the majority of the access control policy is expressed in a logical language.
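
A toy illustration of the tag-based decision procedure (the tags, users, and the particular policy rule below are invented, and TBA's actual policies are written in its logical language rather than in code):

```python
# Sketch of tag-based authorization: users and documents carry tag sets,
# and the access decision is a rule over those tags alone.

user_tags = {"alice": {"navy", "coalition-x"}, "boris": {"coalition-x"}}
doc_tags = {"ops-plan": {"coalition-x"}, "personnel": {"navy"}}

def allowed(user, doc):
    # One possible policy rule: access is granted iff the user carries
    # every tag on the document. Joining or leaving a coalition then
    # means adding or removing one tag per user, not rewriting policy.
    return doc_tags[doc] <= user_tags[user]
```

This is where the scalability claim comes from: coalition churn touches only the tag assignments, which untrained users can maintain, while the policy rule itself stays fixed.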


FML: Declarative Network Management (2007-2009)

NOX is a recent network architecture that supports flow-level control of a network. Each flow (a communication using a specific protocol, initiated by a user logged onto a machine connected to one physical port and received by another user logged onto another machine connected to a different physical port) is constrained by a single network-wide policy. Every time a flow is initiated, a central controller consults the policy to determine which constraints to impose on that flow.

FML (Flow Management Language) is the declarative language that NOX uses for expressing network security policies. FML admits collaborative policy authorship, conflict identification and resolution, policy relaxation and tightening through the addition of statements, and extremely efficient processing.
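
The flavor of flow-level policy evaluation can be sketched as below. The rules, flow attributes, and the particular deny-overrides resolution scheme are invented for illustration; FML itself is a declarative language with its own conflict-resolution machinery, not Python:

```python
# Sketch of flow-level policy evaluation: each rule pairs a verdict with
# a predicate over a flow's attributes, and the controller consults the
# rules when a flow is initiated. Rule contents are hypothetical.

RULES = [
    ("deny",  lambda f: f["proto"] == "telnet"),       # blanket ban
    ("allow", lambda f: f["src_group"] == "faculty"),
    ("deny",  lambda f: f["dst"] == "server-room"),
]

def decide(flow):
    verdicts = {v for v, pred in RULES if pred(flow)}
    # One simple conflict-resolution scheme: deny overrides allow,
    # and flows matched by no rule are denied by default.
    if "deny" in verdicts:
        return "deny"
    return "allow" if "allow" in verdicts else "deny"
```

Because independently authored rule sets can be concatenated, conflicts (a flow matched by both an allow and a deny rule) are detected as a matter of course and resolved by the fixed scheme.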


Collaborative Programming (2008)

Collaborative Programming encompasses all settings in which groups of people issue instructions to computer systems. It differs from traditional programming settings in that instruction sets can be incomplete and conflicting. An incomplete instruction set may say only what to do some of the time, or only which actions the system is forbidden from performing. A conflicting instruction set may simultaneously instruct the system to perform some action and forbid the system from performing that same action.

Technology that supports Collaborative Programming must be able to combine independently authored instruction sets and be tolerant of incompleteness and conflicts. This project endeavors to understand fundamentally the role that contradictions play in computer science: should they remain relegated to information integration problems or should they play a more central role in the field, eventually being embraced by software engineers?
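A minimal sketch of what conflict tolerance means in practice (the instruction-set representation and the authors' rules are invented for illustration):

```python
# Toy example of combining independently authored instruction sets:
# an action that is both required and forbidden is flagged as a conflict
# rather than aborting the combination.

author1 = {"require": {"log_access"}, "forbid": {"delete_logs"}}
author2 = {"require": {"delete_logs"}, "forbid": set()}

def combine(*instruction_sets):
    require = set().union(*(s["require"] for s in instruction_sets))
    forbid = set().union(*(s["forbid"] for s in instruction_sets))
    conflicts = require & forbid
    # One conflict-tolerant semantics: suspend the conflicting actions
    # while still executing the unambiguous ones.
    return {"do": require - conflicts, "conflicts": conflicts}
```

Rather than treating the contradiction as an error, the combined program remains usable: the unambiguous instructions execute, and the contradictory ones are surfaced for resolution.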

Collaborative Programming: Applications of Logic and Automated Reasoning (IJCAR Workshop)

Herbrand Logic (2006-2008)

Herbrand logic is a simple derivative of first-order logic where every object has a name and two distinct names always refer to different objects. It is easier to teach than first-order logic, and it is better able to express many of the problems faced by computer scientists.

Herbrand logic builds in some of the central assumptions found in deductive databases, logic programming, constraint satisfaction, and (some) formal verification techniques. It turns out that Herbrand logic is more expressive than first-order logic. For example, arithmetic for the natural numbers appears to be finitely axiomatizable in Herbrand logic while it is not even recursively enumerable in first-order logic. The price for this greater expressiveness is lack of computability: neither entailment nor satisfiability is semi-decidable. Nevertheless, it allows computer scientists to express more of the problems they care about solving.
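
The expressiveness claim can be illustrated with addition. The sketch below (in the style of the tech report; the predicate name is illustrative) defines addition as a ternary relation over the numerals 0, s(0), s(s(0)), and so on. Under Herbrand semantics every domain element is such a numeral, so the recursion is well-founded and the biconditional has a unique solution, with no induction schema needed to rule out nonstandard elements:

```latex
\forall x, y, z.\; \mathit{plus}(x, y, z) \;\leftrightarrow\;
  \bigl( y = 0 \wedge z = x \bigr) \;\vee\;
  \exists u, v.\, \bigl( y = s(u) \wedge z = s(v) \wedge \mathit{plus}(x, u, v) \bigr)
```

In first-order logic the analogous axioms admit nonstandard models, which is why first-order arithmetic cannot be pinned down by any recursively enumerable axiomatization.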

Paper: Herbrand Logic (Tech Report)

Extensional Reasoning (2001-2008)

Relational databases have had great industrial success in computer science, but today's automated theorem provers and model builders do not routinely leverage that success. In Extensional Reasoning, when given a question about knowledge represented in classical logic, the system automatically translates it into a question about a database system. Answering the database version of the question can sometimes be several orders of magnitude more efficient because the knowledge captured by the logic has been organized into a special form that is amenable to processing. In a database system, every predicate corresponds to exactly one table, and the system differentiates between the explicit (extensional) predicates and the implicit (intensional) predicates. Extensionally defined tables can be processed very efficiently thanks to smart indexing; intensionally defined tables can be processed efficiently through negation as failure. Current work focuses on translating a decidable logic into Datalog, but an obvious enhancement would translate an undecidable logic into a logic programming language.
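
The extensional/intensional split can be illustrated with a classic Datalog example, here evaluated by a naive fixpoint loop in Python (the relation names and data are invented; a real system would use an indexed database engine):

```python
# parent is an extensional (stored) table; ancestor is intensional,
# defined by rules and computed on demand:
#   ancestor(x, z) <- parent(x, z)
#   ancestor(x, z) <- parent(x, y), ancestor(y, z)

PARENT = {("ann", "bob"), ("bob", "carl")}  # extensional: stored tuples

def ancestor():
    result = set(PARENT)
    changed = True
    while changed:  # naive bottom-up fixpoint evaluation
        new = {(x, z) for (x, y) in PARENT for (y2, z) in result if y == y2}
        changed = not new <= result
        result |= new
    return result
```

Because every domain element eventually appears in some stored tuple, the fixpoint is reached in finitely many rounds, which is the kind of guaranteed termination that makes the Datalog target attractive.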

It turns out that for complete logical theories, i.e. those satisfied by essentially a single model, the major hurdle in translating a theory into a database system is detecting that the theory is complete. The case of incomplete theories is theoretically more interesting because each of the models that satisfy the theory must be compiled out into a single model; thus, Extensional Reasoning in the case of incomplete theories amounts to a form of theory-completion. Once the theory has been completed appropriately, translating into a database is straightforward.


General Game Playing (2005-2007)

A General Game Playing (GGP) system is one that can accept a formal description of an unknown game in a known language at runtime and, without further human interaction, can play the game effectively.

General Game Playing systems are characterized by their use of general cognitive information processing technologies (such as knowledge representation, reasoning, learning, and rational behavior). Unlike specialized game playing systems (such as Deep Blue), they do not rely exclusively on algorithms designed in advance for specific games. Deep Blue may have beaten the world chess champion, but it cannot play checkers or even balance a checkbook.

General Game Playing is a recent initiative developed by the Stanford Logic Group. To foster research in this area, we developed an online testbed that allows players to compete in multi-player games; we ran a GGP competition at AAAI in each of the last two years; and we taught classes on GGP at Stanford and supported GGP classes at two universities in Germany.

General Game Playing: Game Description Language Specification (Tech Report)
Automatically Proving Playability Through Abstraction (Tech Report)

Transfer Learning (2005-2007)

Knowledge acquired in one setting can often be employed to aid agents in another setting. The similarity of the settings has great impact on what knowledge can be transferred and the extent to which that knowledge is valuable. DARPA/IPTO has sponsored a program to investigate the idea of Transfer Learning, where agents are given a set of source problems and a set of target problems. The agent's performance on the target after playing the source is compared to the agent's performance after playing just the target.

Our role in this project is to provide a formal foundation for this work and a testbed built on this foundation for evaluating an agent's ability to perform transfer. Our approach leverages the General Game Playing framework to formalize the notion of a problem and mappings between general games to formalize relationships between problems.

Paper: Transfer Learning Level Definitions (Tech Report)

HP-Utility: Utility computing (2004-2005)

Hewlett-Packard is interested in building a utility computing environment. They have a large number of resources at their disposal (computers, software, etc.) and want to rent the use of that equipment to clients who each want it configured in a certain way. For example, a small business might want an e-commerce site set up but not want to maintain the hardware; such a business might employ HP to construct and maintain the site.

Currently, designing, constructing, and maintaining those resources is done by hand. HP wants to automate more of that process by employing declarative descriptions of resources and their configurations so that results from constraint satisfaction and planning can be brought to bear on the problem.
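
The constraint-satisfaction view of configuration can be sketched as below. The resource catalog, variables, and constraints are invented for illustration (a real system would use a proper CSP solver rather than brute-force enumeration):

```python
# Sketch of configuration generation as constraint satisfaction: choose
# a machine count, database, and load-balancer option subject to
# declarative constraints over the combination.

from itertools import product

MACHINES = [1, 2, 3, 4]
DATABASES = ["mysql", "oracle"]
LOAD_BALANCERS = [True, False]

def valid(machines, db, lb):
    if lb and machines < 2:   # a load balancer needs >= 2 web servers
        return False
    if db == "oracle" and machines < 3:  # toy licensing constraint
        return False
    return True

def configurations():
    # Enumerate all assignments and keep those satisfying every constraint.
    return [(m, db, lb) for m, db, lb in
            product(MACHINES, DATABASES, LOAD_BALANCERS) if valid(m, db, lb)]
```

Declarative descriptions of this kind are what let off-the-shelf constraint-satisfaction and planning results be brought to bear, replacing hand-built configurations with solver output.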

Object-Oriented Constraint Satisfaction Problems (Tech Report)
Using Object-Oriented Constraint Satisfaction for Automated Configuration Generation (DSOM 2004)

FXAgents: Web service composition (2002-2003)

This project concerns the dynamic discovery and composition of web services. We leverage the flexibility of first-order logic to describe web services and have attempted to develop web service execution plans using standard deductive planning techniques. (Un)fortunately, we have found these techniques insufficient for our needs, and have begun investigating several planning topics in dynamic environments like the web.

Paper: Adding AI to Web Services (LNAI 2004)