Competitive Agents for Information Filtering

Paul E. Baclace

Communications of the ACM, December 1992/Vol.35, No.12, pp.50.

Programming an information filter with rules can be a daunting task. Automatically constructing and refining a user's interest profile based on user feedback about documents can simplify this task. In the experimental filtering system being explored at Autodesk [1], a user selects a discrete rating value (e.g., terrible, boring, somewhat interesting, no comment, very interesting) for each document read. A learning algorithm correlates these user ratings with document features such as author, subject, selected keywords, organizations and shared ratings from earlier readers to prioritize incoming information.

Ideas from Agoric Open Systems [2], an optimization technique based on an economic model in which agents compete via marketing trading, were used to create a database of agents that learn to prioritize incoming information. The system starts out by extracting features from a document (e.g., author John Doe or keyword modem) and wakes up dormant agents that are sensitive to the found features. When the agent database is empty, as it is when first used, no agents are found; so the filtering system rates information with no comment and does no prioritization. All features not found are used to create new agents for future use. Thus, the system bootstraps itself and avoids prioritizing information with which it has no experience.

Each agent is sensitive to individual features or conjunctions of features (e.g., keyword object and keyword oriented) and has a store of money and a rating prediction value in range [-1., +1.]. The discrete user ratings are mapped onto this real number range. Each time an agent is triggered, it must pay a fixed transaction cost.

After all features in a document are scanned and a set of agents are activated, the estimated ratings of these activated agents are summed to form the predicted rating (priority) of the document. This process is repeated until a batch of rated documents is prioritized and presented to the user.

After the user reads a document and provides a rating, the active agents for that document are paid off or charged in a competition according to how close each came to predicting the user's rating. Better-than-average agents share a reward of $1 put into the agent economy to buy the prediction.

Agents that are farther than average from the user's rating are charged a penalty which is redistributed to the better-than-average agents. This competition uses [a] crowding-factor which limits the number of [profitable] active agents to some maximum. If too many agents are triggered for a particular document, then many end up losing money because the transaction fee is larger than their share of the $1 reward. After this, the active agents learn individually: they adjust their predicted rating, moving it closer to the user's rating by some amount to prevent the size of the agent database from increasing without bound, agents pay rent weekly; those that run out of money are deleted from the database in a given database of agents there are three kinds of agents:

  1. Good agents that get regularly paid and can pay their rent.
  2. Spurious agents that never get triggered again (such as spelling errors) and eventually run out of money due to rent.
  3. Random agents that are of no predictive value that eventually run out of money due to competition.

Given this competition, it is hoped that for most kinds of documents the agent database will have sufficient agents of kind (1) to provide good filtering.

1. Baclace, P. Personal Information Intake Filtering, Bellcore 1991 workshop on High-performance Information Filtering.

2. Miller, M.S. and Drexler, E. Markets and Computation: Agoric Open Systems In The Ecology of Computation, B.A. Huberman, Ed., Elsevier, 1988, pp. 133-176. See The Ecology of Computation.

[Corrections to 1992 CACM publication indicated like this.]