

UMBCTAC Strategy
Li DING, Yongmei Shi,
Zhongli Ding, Rong Pan
Department of CSEE, UMBC
ding.li@umbc.edu
ABSTRACT
In this paper, we describe the design and implementation of UMBCTAC.

Categories and Subject Descriptors
Algorithms.

General Terms
Algorithms, Design.

Keywords
TAC

1. INTRODUCTION
The Trading Agent Competition (TAC) is an international forum designed to promote and encourage high-quality research into optimization problems in dynamic and uncertain contexts. A description of TAC can be found on the TAC website.

The 2002 UMBCTAC is completely different from the 2001 UMBCTAC. It is basically a heuristic-based approach. The heuristics are learned from statistical observations of game history and from common sense.

In this paper, Section 2 discusses some preliminary observations based on the game history and rules. Section 3 introduces the architecture and design details of UMBCTAC. Section 4 discusses future work and concludes.

2. Preliminary Observations
2.1 Learn from the rules
“The agent who can achieve best score will win in the game”. To achieve the best score, we need to take a close look at the utility function, which dominates the final score.

Table 1 Utility function (Adapted from TAC description)
u = 1000 - travel_penalty + hotel_bonus + fun_bonus
where
travel_penalty = 100*(|AA - PA| + |AD - PD|)
hotel_bonus = TT? * HP
fun_bonus = AW? * AW + AP? * AP + MU? * MU

At first glance, all the factors seem to be random and unpredictable. However, that is an illusion.

Some factors are truly unpredictable, such as the order of hotel auction closure, the start price of airline tickets, and client preferences.

Some factors are partially predictable, such as the airline ticket price. Figure 1 shows how the airline ticket price changes over time. (We assume the start price to be $0.) The figure shows that the airline ticket price tends to have larger variation at the end of the game.

[Figure 1 plot: “Airline Ticket price change over time”; curves: mean, min, max; y axis: price (start from 0); x axis: # of changes, price changes every 24~32 seconds, at most 12 minutes]
Figure 1 How airline ticket price changes over time (statistics based on 10000 controlled experiments)

2.2 Learn from the game history
We have learned quite a few heuristics from the 2001 TAC reports, such as “short trip will always have better performance” and “early bidder has good performance”. How to predict the hotel price is a hot topic. An accurate estimate of the hotel price is really helpful for the agent's decision making, especially for agents using a linear programming (LP) approach.

Game history is a good source for estimating the hotel price, and there are several choices: (1) we can randomly draw a price from all historical prices of one auction. This approach doesn’t make much sense because the randomly picked price may be skewed; (2) the median or mean over historical prices can be used as an estimate of the hotel close price. Our observation shows that the median is a little bit optimistic while the mean is a little bit pessimistic.

Figures 2 and 3 show statistics for 1000 and 100 games in the seeding round. There are 20 legal hotel combinations, 10 for each type of hotel. The 10 hotel combinations are: 1, 2, 3, 4, 12, 23, 34, 123, 234, 1234, where 1 means Monday, 2 means Tuesday and so on. In the figures, we compute the estimated price for each hotel combination; 1-10 on the x axis are for the cheap hotel and 11-20 are for the good hotel. The two figures show that: (1) shorter hotel combinations cost less; (2) the cheaper hotel costs less; (3) the distribution of hotel combination prices is very similar for 100 games and 1000 games.
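As a rough illustration of choice (2) in Section 2.2, the sketch below computes the mean and median cost of each of the 10 stay combinations for one hotel type. It is only a sketch under stated assumptions: the cost of a combination is taken to be the sum of the closing prices of its nights, the history values are made-up numbers rather than TAC data, and the names `COMBOS` and `estimate_combo_prices` are hypothetical, not part of UMBCTAC.

```python
from statistics import mean, median

# The 10 contiguous stays for one hotel type, in the paper's order:
# 1, 2, 3, 4, 12, 23, 34, 123, 234, 1234 (1 = Monday, 2 = Tuesday, ...).
COMBOS = [tuple(range(start, start + length))
          for length in range(1, 5)
          for start in range(1, 6 - length)]


def estimate_combo_prices(history):
    """Return {combo: (mean_cost, median_cost)} over past games.

    history is a list of dicts mapping night (1-4) to that night's closing
    price for one hotel type. Assumption: a combination costs the sum of
    the closing prices of its nights.
    """
    estimates = {}
    for combo in COMBOS:
        totals = [sum(game[night] for night in combo) for game in history]
        estimates[combo] = (mean(totals), median(totals))
    return estimates


# Made-up closing prices for three games (illustrative only, not TAC data).
history = [
    {1: 10, 2: 50, 3: 60, 4: 20},
    {1: 20, 2: 70, 3: 90, 4: 10},
    {1: 30, 2: 300, 3: 120, 4: 30},
]

for combo, (avg, med) in estimate_combo_prices(history).items():
    print(combo, round(avg, 1), med)
```

In this toy history, the single expensive game on night 2 pulls the mean above the median, which is the kind of skew that would make the median the more optimistic of the two estimates.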
[Figure 2 plot: “seeding.txt 2166-3134”]
Figure 2 Hotel price based on 1000 games in 2002 seeding round. Mean is denoted by red solid line, and median is denoted by blue circles.

[Figure 3 plot: “temp.txt (2928-3134)”]
Figure 3 Hotel price based on 100 games in 2002 seeding round.

3. Design issues
3.1 System architecture
UMBCTAC is designed as a multi-agent system. TACAgent is a modification of the DummyAgent provided by SICS. We add some more agents to create an auction agent community.

- TAC Agent is in charge of communication with both the TAC Server (using XML) and human users (using a GUI).
- Dealer Agent is the head of all agents. It maintains the statistics of currently owned resources, globally adjusts initial travel plans according to the gain-risk model, and handles the e-Tickets auction.
- Client Agent is the representative of a client. It holds a ranked list of favored travel plans and the initial user preferences. It can negotiate with the Dealer Agent to find a good travel plan for the agent.
- Predictor agent is a learner of history and human knowledge. It is used to estimate hotel cost, e-ticket cost, and hotel auction risk based on expert knowledge and game history.
- Memo agent is added to the system in case of system failure. It records decision information, such as clients’ travel choices, to a log file, so that the system can start and restart at any time during the game.

Figure 4 UMBCTAC Multi-agent Architecture

3.2 Hotel/Airline auction strategy
UMBCTAC uses an early bidder strategy in the hotel and airline auctions. It allocates all resources at the very beginning and sticks to the plan throughout. We choose the early bidder strategy because: (1) a change of plan will always cost a lot; (2) it is easy to implement and control; (3) a gain-risk model can be used to achieve good statistical results.

The gain-risk model is used to select a good combination of travel plans for all clients, i.e. it picks a good travel plan for each client and then finds a combination of these plans which achieves low risk and good gain. Gain can be evaluated by the sum of the estimated scores of all clients. Risk is the probability of reaching a very high close price in a hotel auction, i.e. with high probability we will spend more money on hotel rooms.

The algorithm consists of three important parts: how to estimate gain, how to estimate risk, and how to search for a good solution with both good gain and low risk.

3.2.1 Estimate gain
We know that a client has 20 possible travel plans, each of which can be represented with a triple (in-day, out-day, hotel-type). The score of a travel plan can be calculated according to the given utility function. At the beginning of the game, we know the air penalty, air cost, and hotel bonus, but we don’t know the hotel cost and e-Bonus. So the client agent can consult the Predictor agent for the estimated hotel cost and e-Bonus. Then an estimated score can be computed for any possible travel plan.

3.2.2 Estimate risk
There are lots of factors that affect the risk. Since we estimate the hotel price based on statistical values of game history, and the mean or
median will always be achieved when our bid quantity in each auction is around average, we use the difference between the true allocation of the combination of 8 clients’ travel plans and the expected allocation an agent can have to evaluate risk. From this we derive some heuristics.

- Since there are 16 rooms in each hotel auction and there are 8 agents, an agent can have 2 rooms in an auction on average. Allocating more rooms in an auction will cause higher risk.
- The penalty of high risk is much higher than the profit, i.e. we shouldn’t attempt high risk.
- High risk is caused either by a trip with long duration or by too many rooms allocated in one hotel auction. Days 2 and 3 have higher risk than days 1 and 4.
- Most of the time, a travel plan which matches or falls between the preferred arrival day and departure day has lower risk than others.

Currently we use a threshold and a weight to quantify the risk, as in the formula below. For each hotel auction, we set up a threshold on the maximum number of rooms we can bid for in that auction and a corresponding risk weight. When the allocation of rooms is over our threshold, there will be some risk, otherwise no risk. The weight is higher for days 2 and 3. Then we sum up the risks of all hotel auctions to get the overall risk.

Risk = Sum ( # of rooms over threshold * weight)

Another consideration is the travel plan selection. For each client, we know that some travel plans are good and have low risk, while the others do not. So we only need to consider a small part of the 20 travel plans for the client.

3.2.3 Search best gain-risk
Based on the above analysis, an algorithm is used to choose the best travel plan combination (Table 2).

Table 2 Balanced Gain-Risk Algorithm
1. Clients select favored travel plans (FTP)
2. Clients estimate the score for each FTP
3. Clients submit their FTPs with scores to the Dealer
4. The Dealer globally searches among possible travel plan combinations to find the one with the lowest risk. If more than one combination has the same risk, we choose the one which has the highest gain
5. The Dealer submits bids aggressively

3.3 E-tickets auction strategy
We separate the entertainment ticket bidding component from the airline and hotel auctions, i.e. we first settle on the optimal travel plan combination with the algorithms in 3.2. Then, based on the travel plans, we dynamically bid in the entertainment ticket market to achieve an optimal e-ticket allocation.

3.3.1 Bid price change over time
The bid price includes the sell price (how much you ask for an e-ticket) and the buy price (how much you want to spend on an e-ticket).

There are several possible situations in the e-ticket market: (1) no one wants tickets, thus the sell price will decrease; (2) no one sells tickets, thus the buy price will increase; (3) someone wants to sell and someone wants to buy but their prices do not match yet. In the third case, the price change can be modeled as several rounds; each round starts from a large difference and ends up with a match (see Figure 5). The third case is very common in real life. To buy a ticket at a low price, we need to determine when to buy. Our approach uses the following rules:

- Use a function which can change over time.
- Use a random factor to increase the probability of achieving a match.
- Use a threshold to avoid paying too much or selling at too low a price.

Figure 5 Price change in e-ticket auction

3.3.2 Determine bid action and bid price
When we place a buy bid, it can be either higher than the current sell price, which means we want to buy the e-ticket now, or lower, which means we only want to spend that amount to buy an e-ticket and we will wait for someone willing to sell at that price.

Since the travel plans for all clients are already settled, the problem is to find an optimal allocation of e-tickets for all clients. This is a dynamic allocation problem because we can buy and sell e-tickets during the game. Instead of using linear programming, we try a probability-based approach.

We do not allocate e-tickets to any client during the game, and each e-ticket auction is handled individually. In each auction, we simply estimate how many tickets are needed, estimate how much the clients will pay for them, and decide whether to sell or buy a ticket. We compute the lowest price that clients can offer, the sell price range, and the buy price range.

- The lowest price clients can offer is determined by how many clients will stay in Tampa on that day, how long they will stay in Tampa, and their bonus for the entertainment corresponding to this auction.
- The sell price range consists of a high price and a low price. If the current buy price in the auction is higher than the low price, we can sell the ticket; if the current buy price in the auction is lower than the low price, we do nothing; else we
will put the high price on the auction and see if anyone can accept it.
- The buy price works the same as the sell price.

Here we introduce a price range (low, high) in order to achieve better performance. For example, we can delay a sale to get a better price because sometimes the buyer might offer a better price one minute later. We also can’t wait too long because someone else might sell the e-ticket at that price before we do. It is still a kind of heuristic, but in a real game it does work.

Another consideration is the tendency to buy or sell. From observation of many games, we found that in each auction:

- We never need to have more than 4 tickets.
- In an e-ticket auction, the tendency to buy an e-ticket is related to the number of tickets we currently have.
- When we have fewer than 2 e-tickets, we might need to purchase some tickets, and when we have more than 2 e-tickets, we might need to sell some tickets.

We run the bidding algorithm (see below) individually for each e-ticket auction. Let k be the number of e-tickets the dealer agent already has in hand. We use a probability function P(k) to determine whether to buy or sell. Let w be the highest price which can be offered by the 8 clients for that auction. Let t be the fraction of time which has passed; it ranges from 0 to 1.

1. # initialize
2. compute k, P(k) based on current allocation
3. compute w according to the need of clients
4. # buy
5. compute (low-buy, high-buy) price based on P(k), t and w
6. with probability P(k), we send a buy bid: if the current ask price in the auction falls within our acceptable range, we buy it instantly; otherwise we post the low price in the auction
7. # sell
8. compute (low-sell, high-sell) price based on P(k), t and w
9. with probability P(k), if the current bid price in the auction falls within our acceptable range, we sell the e-ticket instantly; otherwise we post a sell bid with the high price

Note that the buy price is always less than w, while the sell price is always larger than w. In our current approach, P(k) is computed by the following formula:

P(k) = 0.9^{3^k}

Tickets owned    Probability P(k)
0                0.9
1                0.729
2                0.38742048
3                0.058149736

4. Future work
UMBCTAC uses simple heuristics to achieve average behavior. As a result, it ranked 2nd in the qualifying round, 3rd in the seeding round and 4th in the final. We believe these are the best results a heuristic-coded trading agent can achieve.

During the game, we also made some interesting observations, which can help us design future TAC agents.

- Our gain-risk model always outputs travel plan combinations which need two rooms in each auction, especially in the auctions on days 2 and 3. That is, every time we need a fixed number of rooms. Can we change the search policy to “find the plan combination with best gain while use and only use two rooms in each auction”?
- Another consideration is how to measure gain. When the current distribution of hotel prices becomes quite different from the one learned from history, the estimated score will be meaningless. Such a situation always happens at the beginning of each round. Can we use a margin value instead of the score to evaluate the gain of a travel plan? A margin value is the score without the estimated hotel cost, and it indicates the maximum profit we can get from that client.
- Another consideration is the risk evaluation in the gain-risk model. Summing up the risks is not theoretically sound. We had better estimate the risk for each auction, then compute the risk for each client by multiplying the values, and then sum them up.
- An early bidder does work well in lots of cases. But it also has some vital problems: when the hotel price rises to a very high level, it can’t withdraw; when it is overbid, the travel package will fail too. A good solution is to dynamically change the travel plan in the middle of the game. There are two kinds of strategies commonly used: (1) bid for more rooms in each auction, so as to get more rooms in early-closed auctions and to be overbid in late-closed auctions. Such an approach works well when there are not too many clients who want to stay in Tampa, but it also boosts the average close price of all auctions; (2) delay some airline ticket purchases: when being overbid in a hotel auction, change the travel plan to a shorter one. Such an approach works well when there are too many clients who want to stay in Tampa, but it might introduce extra cost for the delayed airline ticket purchases. Can we combine the two different approaches to achieve better performance?

Our approach is highly heuristic-based, but its performance is not too bad. We still do not fully understand why and how the heuristics work, or the theoretical bounds of our approaches. Future work can aim at revealing a theoretical explanation and rules for dynamically finding a solution in uncertain contexts.

5. ACKNOWLEDGMENTS
Our thanks to Dr. Finin, Dr. Peng, and Dr. Oates for help and support.
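
As a supplementary illustration of the risk formula in Section 3.2.2, Risk = Sum(# of rooms over threshold * weight), the following sketch evaluates one combination of travel plans. It is a minimal sketch under stated assumptions, not the UMBCTAC code: the threshold of 2 rooms comes from the 16 rooms / 8 agents average, the weights 1.0 and 2.0 are invented for illustration (the paper only says days 2 and 3 weigh more), a stay from in-day to out-day is assumed to occupy the nights in-day through out-day minus 1, and the names `SETTINGS`, `rooms_needed` and `risk` are hypothetical.

```python
# Hypothetical per-auction settings: (hotel_type, day) -> (threshold, weight).
# Threshold 2 follows the 16 rooms / 8 agents average; the weights are
# illustrative, chosen only so that days 2 and 3 count more.
SETTINGS = {(hotel, day): (2, 2.0 if day in (2, 3) else 1.0)
            for hotel in ("cheap", "good") for day in (1, 2, 3, 4)}


def rooms_needed(plans):
    """Count rooms per hotel auction for plans given as (in_day, out_day, hotel)."""
    rooms = {key: 0 for key in SETTINGS}
    for in_day, out_day, hotel in plans:
        # Assumed: nights stayed are in_day .. out_day - 1.
        for day in range(in_day, out_day):
            rooms[(hotel, day)] += 1
    return rooms


def risk(plans):
    """Risk = Sum(# of rooms over threshold * weight) across hotel auctions."""
    rooms = rooms_needed(plans)
    return sum(max(0, rooms[key] - threshold) * weight
               for key, (threshold, weight) in SETTINGS.items())


# Eight made-up client plans (illustrative only).
plans = ([(1, 3, "cheap")] * 3
         + [(2, 4, "cheap")] * 2
         + [(1, 2, "good")] * 3)
print(risk(plans))
```

A search in the spirit of Table 2 could call a function like `risk` on each candidate combination and keep the one with the lowest value, breaking ties by estimated gain.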