							                                          UMBCTAC Strategy
                                                     Li DING, Yongmei Shi,
                                                    Zhongli Ding, Rong Pan
                                                    Department of CSEE, UMBC
                                                        ding.li@umbc.edu
ABSTRACT
In this paper, we describe the design and implementation of UMBCTAC.

Categories and Subject Descriptors
Algorithms.

General Terms
Algorithms, Design.

Keywords
TAC

1. INTRODUCTION
The Trading Agent Competition (TAC) is an international forum designed to promote and encourage high-quality research into optimization problems in dynamic and uncertain contexts. A description of TAC can be found on the TAC website.

The 2002 UMBCTAC is completely different from the 2001 UMBCTAC. It is basically a heuristic-based approach. The heuristics are learned from statistical observations of game history and from common sense.

In this paper, Section 2 discusses some preliminary observations based on the game history and rules. Section 3 introduces the architecture and design details of UMBCTAC. Section 4 discusses future work and concludes the paper.

2. Preliminary Observations
2.1 Learn from the rules
“The agent who can achieve best score will win in the game”. To achieve the best score, we need to look closely at the utility function, which dominates the final score.

Table 1 Utility function (Adapted from TAC description)
u = 1000 - travel_penalty + hotel_bonus + fun_bonus
where
travel_penalty = 100*(|AA - PA| + |AD - PD|)
hotel_bonus = TT? * HP
fun_bonus = AW? * AW + AP? * AP + MU? * MU

At first glance, all the factors seem to be random and unpredictable. However, that is an illusion.

Some factors are truly unpredictable, such as the order of hotel auction closure, the start price of airline tickets, and client preferences.

Some factors are partially predictable, such as the airline ticket price. Figure 1 shows how the airline ticket price changes over time. (We assume the start price to be $0.) The figure shows that the airline ticket price tends to vary more toward the end of the game.

[Figure 1 plot: "Airline Ticket price change over time"; curves: mean, min, max; y axis: price (start from 0); x axis: # of changes, price changes every 24~32 seconds, at most 12 minutes]
Figure 1 How airline ticket price changes over time (statistics based on 10000 controlled experiments)

2.2 Learn from the game history
We have learned quite a few heuristics from the 2001 TAC reports, such as “short trip will always have better performance” and “early bidder has good performance”. How to predict the hotel price is a hot topic. An accurate estimate of the hotel price is very helpful for the agent when making decisions, especially for agents using a linear programming (LP) approach.

Game history is a good source for estimating the hotel price, and there are several choices: (1) we can randomly draw a price from all historical prices of one auction, but this approach does not make much sense because the randomly picked price may be skewed; (2) the median or mean of the historical prices can be used as an estimate of the hotel close price. Our observations show that the median is slightly optimistic while the mean is slightly pessimistic.

Figures 2 and 3 show statistics over 1000 and 100 games, respectively, from the seeding round. There are 20 legal hotel combinations, 10 for each type of hotel. The 10 hotel combinations are: 1, 2, 3, 4, 12, 23, 34, 123, 234, 1234, where 1 means Monday, 2 means Tuesday, and so on. In the figures, we compute the estimated price for each hotel combination; positions 1-10 on the x axis are for the cheap hotel and 11-20 are for the good hotel. The two figures show that: (1) shorter hotel combinations cost less; (2) the cheaper hotel costs less; (3) the distribution of hotel combination prices is very similar for 100 games and 1000 games.
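
The mean and median estimators discussed in Section 2.2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the closing prices are made up, the function names are hypothetical, and pricing a multi-night combination as the sum of per-night estimates is an assumption, since the paper does not say how the combination prices were computed.

```python
from statistics import mean, median

# The ten legal night combinations for one hotel type (1 = Monday, 2 = Tuesday, ...).
COMBINATIONS = [(1,), (2,), (3,), (4,), (1, 2), (2, 3), (3, 4),
                (1, 2, 3), (2, 3, 4), (1, 2, 3, 4)]


def estimate_night_prices(history, estimator=mean):
    """Estimate the close price of each night from past games.

    history is a list of dicts mapping night -> observed close price,
    one dict per past game (made-up layout, for illustration only).
    """
    nights = sorted({night for game in history for night in game})
    return {night: estimator([game[night] for game in history if night in game])
            for night in nights}


def estimate_combination_costs(history, estimator=mean):
    """Assumption: a combination costs the sum of its per-night estimates."""
    per_night = estimate_night_prices(history, estimator)
    return {combo: sum(per_night[night] for night in combo)
            for combo in COMBINATIONS}


# Made-up close prices for one hotel type over three past games.
history = [
    {1: 40, 2: 120, 3: 110, 4: 30},
    {1: 60, 2: 90, 3: 100, 4: 50},
    {1: 50, 2: 300, 3: 150, 4: 40},
]

by_mean = estimate_combination_costs(history, mean)
by_median = estimate_combination_costs(history, median)

if __name__ == "__main__":
    for combo in COMBINATIONS:
        print(combo, by_mean[combo], by_median[combo])
```

In this made-up history, one expensive game pulls the mean for night 2 well above the median, which matches the observation that the mean is the more pessimistic of the two estimates.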
[Figure 2 plot: "seeding.txt 2166-3134"; x axis 0 to 20, y axis 0 to 600]
Figure 2 Hotel price based on 1000 games in the 2002 seeding round. The mean is denoted by the red solid line, and the median by blue circles.

[Figure 3 plot: "temp.txt (2928-3134)"; x axis 0 to 20, y axis 0 to 600]
Figure 3 Hotel price based on 100 games in the 2002 seeding round.

3. Design issues
3.1 System architecture
UMBCTAC is designed as a multi-agent system. TACAgent is a modification of the DummyAgent provided by SICS. We add several more agents to create an auction agent community.

    TAC Agent is in charge of communication with both the TAC Server (using XML) and human users (using a GUI).

    Dealer Agent is the head of all agents. It maintains the statistics of currently owned resources, globally adjusts the initial travel plans according to the gain-risk model, and handles the e-ticket auctions.

    Client Agent is the representative of a client. It holds a ranked list of favored travel plans and the initial user preferences. It can negotiate with the Dealer Agent to find a good travel plan for the client.

    Predictor Agent is a learner of history and human knowledge. It is used to estimate hotel cost, e-ticket cost, and hotel auction risk based on expert knowledge and game history.

    Memo Agent is added to the system in case of system failure. It records decision information, such as clients’ travel choices, to a log file, so that the system can start and restart at any time during the game.

Figure 4 UMBCTAC Multi-agent Architecture

3.2 Hotel/Airline auction strategy
UMBCTAC uses an early-bidder strategy in the hotel and airline auctions. It allocates all resources at the very beginning and keeps to that plan throughout the game. We chose the early-bidder strategy because: (1) a change of plan always costs a lot; (2) it is easy to implement and control; (3) a gain-risk model can be used to achieve good statistical results.

The gain-risk model is used to select a good combination of travel plans for all clients; that is, it picks a good travel plan for each client and then finds a combination of these plans that achieves low risk and good gain. Gain can be evaluated as the sum of the estimated scores of all clients. Risk is the probability of reaching a very high close price in a hotel auction, i.e. the probability that we will spend more money on hotel rooms.

The algorithm consists of three important parts: how to estimate gain, how to estimate risk, and how to search for a solution with both good gain and low risk.

3.2.1 Estimate gain
A client has 20 possible travel plans, each of which can be represented as a triple (in-day, out-day, hotel-type). The score of a travel plan can be calculated from the given utility function. At the beginning of the game, we know the air penalty, air cost, and hotel bonus, but we do not know the hotel cost or the e-Bonus. The Client Agent therefore consults the Predictor Agent for the estimated hotel cost and e-Bonus. An estimated score can then be computed for any possible travel plan.

3.2.2 Estimate risk
Many factors affect the risk. Since we estimate the hotel price based on statistical values from the game history, and the mean or
median will always be achieved when our bid quantity in each auction is around average, we use the difference between the true allocation of the combination of the 8 clients’ travel plans and the expected allocation an agent can have to evaluate risk. From this we derived some heuristics.

    Since there are 16 rooms in each hotel auction and there are 8 agents, an agent can have 2 rooms in an auction on average. Allocating more rooms in an auction causes higher risk.

    The penalty of high risk is much higher than the profit, i.e. we should not attempt high risk.

    High risk is caused either by a trip of long duration or by allocating too many rooms in one hotel auction. Days 2 and 3 have higher risk than days 1 and 4.

    In most cases, a travel plan that matches, or falls between, the preferred arrival day and departure day has lower risk than others.

Currently we use a threshold and a weight to quantify the risk, as in the formula below. For each hotel auction, we set a threshold on the maximum number of rooms we can bid for in that auction and a corresponding risk weight. When the allocation of rooms exceeds our threshold, there is some risk; otherwise there is no risk. The weight is higher for days 2 and 3. We then sum the risks over all hotel auctions to get the overall risk.

Risk = Sum ( # of rooms over threshold * weight)

Another consideration is travel plan selection. For each client, we know that some travel plans are good and have low risk, while the others do not. So we only need to consider a small part of the 20 travel plans for the client.

3.2.3 Search best gain-risk
Based on the above analysis, an algorithm is used to choose the best travel plan combination (Table 2).

Table 2 Balanced Gain-Risk Algorithm
1.   Clients select favored travel plans (FTP)
2.   Clients estimate the score for each FTP
3.   Clients submit their FTPs with scores to the Dealer
4.   The Dealer searches globally among the possible travel plan combinations to find the one with the lowest risk. If more than one combination has the same risk, we choose the one with the highest gain
5.   The Dealer submits bids aggressively

3.3 E-tickets auction strategy
We separate the entertainment ticket bidding component from the airline and hotel auctions; that is, we first settle the optimal travel plan combination with the algorithms in Section 3.2. Then, based on the travel plan, we bid dynamically in the entertainment ticket market to achieve an optimal e-ticket allocation.

3.3.1 Bid price change over time
The bid price includes the sell price (how much you ask for an e-ticket) and the buy price (how much you are willing to spend on an e-ticket).

There are several possible situations in the e-ticket market: (1) no one wants tickets, so the sell price will decrease; (2) no one sells tickets, so the buy price will increase; (3) someone wants to sell and someone wants to buy, but their prices do not match yet. In the third case, the price change can be modeled as several rounds, where each round starts from a large difference and ends with a match (see Figure 5). The third case is very common in real life. To buy a ticket at a low price, we need to determine when to buy. Our approach uses the following rules:

    Use a price function that changes over time.

    Use a random factor to increase the probability of achieving a match.

    Use a threshold to avoid paying too much or selling at too low a price.

Figure 5 Price change in e-ticket auction

3.3.2 Determine bid action and bid price
When we place a buy bid, it can be either higher than the current sell price, which means we want to buy the e-ticket now, or lower, which means we only want to spend that amount on an e-ticket and will wait for someone willing to sell at that price. Since the travel plans for all clients are already settled, the problem is to find an optimal allocation of e-tickets for all clients. This is a dynamic allocation problem because we can buy and sell e-tickets during the game. Instead of using linear programming, we try a probabilistic approach.

We do not allocate e-tickets to any client during the game, and each e-ticket auction is handled individually. In each auction, we simply estimate how many tickets are needed, estimate how much the clients will pay for them, and decide whether to sell or buy a ticket. We compute the lowest price that clients can offer, the sell price range, and the buy price range.

    The lowest price clients can offer is determined by how many clients will stay in Tampa on that day, how long they will stay in Tampa, and their bonus for the entertainment corresponding to this auction.

    The sell price range consists of a high price and a low price. If the current buy price in the auction is higher than the low price, we can sell the ticket; if the current buy price in the auction is lower than the low price, we do nothing; else we
    will put the high price on the auction and see if anyone accepts it.

    The buy price works the same way as the sell price.

Here we introduce the price range (low, high) in order to achieve better performance. For example, we can delay a sale to get a better price, because sometimes a buyer might offer a better price one minute later. We also cannot wait too long, because someone else might sell an e-ticket at that price before we do. This is still a kind of heuristic, but in real games it does work.

Another consideration is the tendency to buy or sell. From observations over many games, we found that in each auction:

    We never need to hold more than 4 tickets.

    In an e-ticket auction, the tendency to buy an e-ticket is related to the number of tickets we currently hold. When we have fewer than 2 e-tickets, we might need to purchase some tickets, and when we have more than 2 e-tickets, we might need to sell some tickets.

We run the bidding algorithm (see below) individually for each e-ticket auction. Let k be the number of e-tickets the Dealer Agent already has in hand. We use a probability function P(k) to determine whether to buy or sell. Let w be the highest price that can be offered by the 8 clients for that auction. Let t be the fraction of game time that has passed; it ranges from 0 to 1.

1.   # initialize
2.   compute k and P(k) based on the current allocation
3.   compute w according to the needs of the clients
4.   # buy
5.   compute the (low-buy, high-buy) price based on P(k), t and w
6.   with probability P(k), we send a buy bid: if the current ask price in the auction falls within our acceptable range, we buy instantly; otherwise we post the low price in the auction
7.   # sell
8.   compute the (low-sell, high-sell) price based on P(k), t and w
9.   with probability P(k), if the current bid price in the auction falls within our acceptable range, we sell the e-ticket instantly; otherwise we post a sell bid with the high price

Note that the buy price is always less than w, while the sell price is always larger than w. In our current approach, P(k) is computed by the following formula:

P(k) = 0.9^{3^k}

    Tickets owned (k)    Probability P(k)
    0                    0.9
    1                    0.729
    2                    0.38742048
    3                    0.058149736

4. Future work
UMBCTAC uses simple heuristics to achieve average behavior. As a result, it ranked 2nd in the qualifying round, 3rd in the seeding round, and 4th in the final. We believe this is the best a heuristic-coded trading agent can achieve.

During the games, we also made some interesting observations that can help us design future TAC agents.

    Our gain-risk model always outputs travel plan combinations that need two rooms in each auction, especially in the auctions on days 2 and 3. That is, every time we need a fixed number of rooms. Can we change the search policy to “find the plan combination with best gain while use and only use two rooms in each auction”?

    Another consideration is how to measure gain. When the current distribution of hotel prices becomes quite different from the one learned from history, the estimated score will be meaningless. This situation always happens at the beginning of each round. Can we use a margin value instead of the score to evaluate the gain of a travel plan? A margin value is the score without the estimated hotel cost, and it indicates the maximum profit we can get from that client.

    Another consideration concerns the risk evaluation in the gain-risk model. Summing up the risks is not theoretically sound. It would be better to estimate the risk for each auction, then compute the risk for each client by multiplying the values, and then sum them up.

    The early-bidder strategy works well in many cases, but it also has some vital problems: when the hotel price rises to a very high level, the agent cannot withdraw; when it is overbid, the travel package fails too. A good solution is to dynamically change the travel plan in the middle of the game. Two kinds of strategies are commonly used: (1) bid for more rooms in each auction, so as to get more rooms in early-closing auctions and to be overbid in late-closing auctions. This approach works well when not too many clients want to stay in Tampa, but it also boosts the average close price of all auctions; (2) delay some airline ticket purchases: when overbid in a hotel auction, change the travel plan to a shorter one. This approach works well when too many clients want to stay in Tampa, but it might introduce extra cost for the delayed airline ticket purchases. Can we combine the two different approaches to achieve better performance?

Our approach is highly heuristic, but its performance is not too bad. We still do not fully understand why and how the heuristics work, or the theoretical bounds of our approaches. Future work could aim at a theoretical explanation and at rules for dynamically finding a solution in an uncertain context.

5. ACKNOWLEDGMENTS
Our thanks to Dr. Finin, Dr. Peng, and Dr. Oates for help and support.
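
As a supplementary illustration, the per-auction e-ticket bidding procedure of Section 3.3.2 can be sketched in Python. This is a minimal sketch, not the UMBCTAC implementation. The function names, the action tuples, and passing the price ranges in as arguments are all assumptions, because the paper does not specify how the ranges depend on P(k), t and w. The paper also states probability P(k) for both the buy step and the sell step; the sketch assumes the sell step uses 1 - P(k), so that holding more tickets favors selling.

```python
import random


def buy_probability(k):
    """P(k) = 0.9 ** (3 ** k), which reproduces the probability table in Section 3.3.2."""
    return 0.9 ** (3 ** k)


def buy_action(ask_price, low_buy, high_buy):
    """Buy at once if the current ask lies in our range; otherwise post our low buy price."""
    if low_buy <= ask_price <= high_buy:
        return ("buy_now", ask_price)
    return ("post_buy_bid", low_buy)


def sell_action(bid_price, low_sell, high_sell):
    """Sell at once if the current bid lies in our range; otherwise post our high sell price."""
    if low_sell <= bid_price <= high_sell:
        return ("sell_now", bid_price)
    return ("post_sell_bid", high_sell)


def bidding_step(k, ask_price, bid_price, buy_range, sell_range, rng=random):
    """One pass of the per-auction algorithm for an agent holding k e-tickets.

    buy_range and sell_range are (low, high) pairs supplied by the caller;
    how they are derived from P(k), t and w is not specified in the paper.
    """
    p = buy_probability(k)
    actions = []
    if rng.random() < p:
        actions.append(buy_action(ask_price, *buy_range))
    # Assumption: sell with probability 1 - P(k), and only if we hold a ticket.
    if k > 0 and rng.random() < 1.0 - p:
        actions.append(sell_action(bid_price, *sell_range))
    return actions


if __name__ == "__main__":
    for k in range(4):
        print(k, round(buy_probability(k), 9))
    print(bidding_step(2, ask_price=70, bid_price=95,
                       buy_range=(40, 80), sell_range=(90, 120),
                       rng=random.Random(0)))
```

The example ranges (40, 80) and (90, 120) are made up; they only respect the stated constraint that buy prices stay below w and sell prices stay above it, for some w between 80 and 90.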

						