Proceedings Template - WORD - PDF
Shared by: mimama
UMBCTAC Strategy Li DING, Yongmei Shi, Zhongli Ding, Rong Pan Department of CSEE, UMBC firstname.lastname@example.org ABSTRACT Some factors are partially predictable, such as airline ticket price. In this paper, we describe the design and implementation of Figure 1 shows how airline ticket price changes over time. (We UMBCTAC. assume the start price to be $0.) The figure shows that the airline ticket tends to have larger variation at the end of game. Categories and Subject Descriptors 600 Airline Ticket price change over time Algorithms. mean min 500 max General Terms Algorithms, Design. 400 price( start from 0) 300 Keywords TAC 200 1. INTRODUCTION 100 The Trading Agent Competition (TAC) is an international forum designed to promote and encourage high quality research into the 0 optimization problem in dynamic and uncertain context. The description of TAC can be found in TAC website. -100 0 5 10 15 20 25 30 # of changes, price changes every 24~32 seconds, at most 12 minutes 2002 UMBCTAC is completely different from 2001 UMBCTAC. It is basically a heuristic based approach. Heuristics are learned Figure 1 how airline ticket price changes over time (statistics from the statistical observations of game history and common based on 10000 controlled experiments) sense. In this paper, section 2 will discuss some preliminary observations 2.2 Learn from the game history There are quite a few heuristics we have learn from 2001 TAC based on the game history and rules. Section 3 will introduce the reports, such as “short trip will always have better performance”, architecture and design details of UMBCTAC. Section 4 will talk “early bidder has good performance” and etc. How to predict the some future work and conclude our work. hotel price is a hot topic. An accurate estimation of hotel price is really helpful for the agent to make decision, especially for those 2. Preliminary Observations agents using linear programming (LP) approach. 2.1 Learn from the rules Game history is a good source for estimating hotel price. And “The agent who can achieve best score will win in the game”. In there are quite a few choices: (1) we can randomly draw a price order to achieve the best score, an insight view should be taken on from all history prices on one auction. The approach doesn’t make the utility function, which dominates the final score. much sense because the randomly picked price may be skewed; Table 1 Utility function (Adapted from TAC description) (2) median or mean over historical price can be used as an estimation of hotel close price. Our observation shows that u = 1000 - travel_penalty + hotel_bonus + fun_bonus median is a little bit optimistic while mean is a little bit where pessimistic. travel_penalty = 100*(|AA - PA| + |AD - PD|) Figure 2 and 3 show 1000 and 100 games statistics in seeding round. There are 20 legal hotel combinations, 10 for each type of hotel_bonus = TT? * HP hotel. The 10 hotel combinations are: 1, 2, 3, 4, 12, 23, 34, 123, 234, 1234, where 1 means Monday, 2 means Tuesday and so on. fun_bonus = AW? * AW + AP? * AP + MU? * MU In the figures, we compute the estimated price for each hotel combination, the 1-10 on x axis are for cheap hotel and 11-20 are for good hotel. The two figures show that: (1) shorter hotel At the first glance, all the factors seem to be random and combinations cost less; (2) cheaper hotel cost less; (3) the unpredictable. However, that is an illusion. distribution of hotel combination price is very similar for 100 Some factors are truly unpredictable, such as the order of hotel games and 1000 games. auction closure, the start price of airline ticket and client preferences. seeding.txt 2166-3134 Client Agent is the representative of client. It holds a list of 600 favored travel plan with ranking, and initial user preference. It can negotiate with dealer agent to find a good travel plan for the agent. 500 Predictor agent is indeed a learner of history and human knowledge. It is used to estimate hotel cost, e-tickets cost, 400 and hotel auction risk based on expert knowledge and game history. 300 Memo agent is added to system in case of system failure. It records decision information, such as clients’ travel choices, to a log file. So that the system can start and restart at any 200 time during the game. 100 0 0 2 4 6 8 10 12 14 16 18 20 Figure 2 Hotel price based on 1000 games in 2002 seeding round. Mean is denoted by red solid line, and median is denoted by blue circles. temp.txt (2928-3134) 600 Figure 4 UMBCTAC Multi-agent Achitecture 500 3.2 Hotel/Airline auction strategy 400 UMBCTAC use early bidder strategy on hotel and airline auction. It allocates all resources at the very beginning and fix on the plan all the time. We choose early bidder strategy because: (1) change 300 of plan will always cost a lot; (2) easy to implement and control; (3) gain- risk model can be used to achieve good statistical results. Gain-Risk model is used to select a good combination of travel 200 plans for all clients. i.e. it will pick up a good travel plan for each client and then find a combination of these plans which achieve 100 low risk and good gain. Gain can be evaluated by the sum of estimated score for each client. Risk is the probability of reaching very high close price in hotel auction, i.e. with high probability 0 0 2 4 6 8 10 12 14 16 18 20 we will spend more money on hotel rooms. The algorithm consists of three important parts: how to estimate gain, how to estimate risk and how to search a good solution with Figure 3 Hotel price based on 100 games in 2002 seeding both good gain and low risk. round. 3.2.1 Estimate gain 3. Design issues We know that a client has 20 possible travel plans, each of which 3.1 System architecture can be represented with triple (in-day, out-day, hotel-type). The UMBCTAC is designed as a multi-agent system. TACAgent is a score of a travel plan can be calculated according to the given modification of DummyAgent provided by SICS. We add some utility function. At the beginning of the game, we know air more agents to create an auction agent community. penalty, air cost, hotel bonus and we don’t know hotel cost, e- Bonus. So the client agent can consult with Predictor agent for TAC Agent is in charge of communication, with both TAC estimated hotel cost and e-Bonus. Then an estimated scored can Server(use XML) and human users( use GUI) be computed for any possible travel plan. Dealer Agent is the head of all agents. It maintains the statistics of currently owned resources, globally adjusts 3.2.2 Estimate risk initial travel plans according to gain-risk model, and handles There are lots of factors affect the risk. Since we estimate hotel e-Tickets auction price based on statistical value of game history, and the mean or median will always achieved when our bids quantity in each There are several possible situations in the e-ticket market: (1) no auction is around average, we use the difference between the true one wants tickets, thus the sell price will decrease; (2) no one sells allocation of the combination of 8 clients’ travel plans and tickets, thus the buy price will increase; (3) someone wants to sell expected allocation an agent can have to evaluate risk. Therefore and someone wants to buy but their prices do not match yet. In we got some heuristics. the third case, the price change can be modeled as several rounds, Since there are 16 rooms in each hotel auction and there each round starts from large difference, and end up with a match are 8 agents, an agent can have 2 rooms in an auction in (see figure 5). The third case is very common in real life. To buy average. More room allocation in an auction will cause a ticket in a low price, we need to determine when to buy. Our higher risk. approach use following rules: The penalty of high risk is much higher than the profit. Use a function which can change over time. i.e. we shouldn’t attempt high risk. Use a random factor to increase the probability of High risk is caused either by a trip with long duration or achieving a match too much room allocation in a hotel auction. And day 2, Use a threshold to avoid pay too much or sell in a too 3 have higher risk than day 1, 4. low price In most time, a travel plan witch matches or between the preferred arrival day and departure day has lower risk than others. Currently we use threshold and weight to quantify the risk like the formula below. For each hotel auction, we setup a threshold on maximum rooms we can bid in that auction and a corresponding risk weight. When the allocation of rooms is over our threshold, there will be some risk, otherwise no risk. The weight is higher for day 2, 3. Then we sum up the risks for each hotel auction to get the overall risk. Risk = Sum ( # of rooms over threshold * weight) Another consideration is the travel plan selection. For each client, we know the some travel plans are good and with low risk, while the others not. So we only need to consider a small part of the 20 travels for the client. Figure 5 Price change in e-ticket auction 3.2.3 Search best gain-risk Based on the above analysis, an algorithm is used to choose the best travel plan combination (table 2). 3.3.2 Determine bid action and bid price Table 2 Balanced Gain-Risk Algorithm When we place a buy bid, it can be either higher than current sell price, which means we want to buy the e-ticket now, or lower, 1. Clients select favored travel plan (FTP) which means we only want to spend that amount to buy an e- 2. Clients estimate score for each FTP, ticket and we will wait for someone willing to sell at that price. 3. Clients submit their FTP with score to Dealer Since the travel plans for all clients are already settled, the 4. Dealer globally search among possible travel plan problem is to find an optimal allocation of e-tickets for all clients. combinations, to find the one with lowest risk. If there are This is a dynamical allocation problem because we can buy and more than one combinations which have the same risk, we sell e-tickets during the game. Instead of using liner choose the one which has highest gain programming, we try a probabilistic based approach. 5. Dealer submits bids aggressively We do not allocate e-ticket to any client during game, and each e- ticket auction is handled individually. In each auction, we simply 3.3 E-tickets auction strategy estimate how many tickets are needed, estimate how much the We separate the entertainment ticket bidding component from the clients will pay for that, and decide whether sell or buy a ticket. airline, hotel auction. i.e. we first settle down the optimal travel We compute the lowest price that clients can offer, the sell price plan combination with algorithms in 3.2. Then, based on the range, and the buy price range. travel plan, we will dynamically bid in entertainment ticket The lowest price clients can offer is determine by how market to achieve optimal e-ticket allocation. many clients will stay in Tampa on that day, how long they will stay in Tampa, and their bonus on the 3.3.1 Bid price change over time entertainment corresponding to this auction. Bid price includes: sell price—how much to you ask for an e- ticket and buy price—how much you want to spend for an e- The sell price range consists of a high price and a low ticket. price. If current buy price in auction is higher than low price, we can sell the ticket; if current buy price in auction is lower than low price, we do nothing; else we will put the high price on the auction and see if anyone 4. Future work can accept it. UMBCTAC use simple heuristics to achieve average behavior. The buy price works the same as the sell price. And the result is that it ranks 2nd in qualify round, 3rd in seeding round and 4th in final. We believe these are the best a heuristic coded trading agent can achieve. Here we introduce price range (low, high) in order to achieve better performance. For example, we can delay a sale to get better During the game, we also have some interesting observations, and price because some time the buyer might offer better price one they can help us design future TAC agents. minute later. We also can’t wait too long because someone else Our gain-risk model will always output such travel plan might sell the e-ticket at that price before we do. It is still a kind combinations which need two rooms in each auction, of heuristic, but in real game, it does work. especially in auctions on day2, 3. That is, every time Another consideration is the tendency of buy and sell. From the wee need fixed number of room. Can we change the observation on lots of game, we found that in each auction, search policy to “find the plan combination with best gain while use and only use two rooms in each We always don’t need to have more than 4 tickets. auction”? In an e-ticket auction, the tendency of buying a e-ticket Another consideration is how to measure gain. When is related to the number of tickets we currently have. current distribution of hotel price becomes quite When we have less than 2 e-tickets, we might need to different from the one learned from history, the purchase some tickets, on when we have more than 2 e- estimated score will be meaningless. Such situation tickets, we might need to sell some tickets always happens at the beginning of each round. Can we We run the bidding algorithm (see below) individually for each e- use margin value instead of score to evaluate the gain ticket auction. Let k be the number of e-tickets dealer agent of a travel plan? A margin value is the score without already has in hand. We use a probability function P(k) to the estimated hotel cost, and it indicates the maximum determine whether buying or selling. Let w be the highest price profit we can get from that client. which can be offer by the 8 clients for that auction. Let t be the Another consideration was discussed on the risk percentage of time which has passed, it range from 0 to 1. evaluation in gain-risk model. A sum up of the risk is 1. #initialize not theoretically sound. We’d better estimate the risk 2. compute k, P(k) based on current allocation for each auction, then compute the risk for each client buy multiply the values, and then sum them up. 3. compute w according to the need of clients An early bidder does work well in lots of cases. But it 4. # buy also have some vital problems: when the hotel price 5. compute (low-buy, high-buy) price based on P(k), t and w rises up to a very high level, it can’t withdraw; when it is overbid, the travel package will failed too. A good 6. with probability P(k), we send a buy bid – if current ask solution is to dynamically change travel plan in the price in auction falls between our acceptable range, we buy it middle of the game. There are two kinds of strategies instantly, or we post the low price in the auction commonly used: (1) bid more rooms in each auction, so 7. # sell as to get more rooms in early closed auction and to be 8. compute (low-buy, high-buy) price based on P(k), t and w overbid in late closed auction. Such approach works well when there are not too many clients want stay in 9. with probability P(k), if current bid price in auction falls Tampa, but it also boosts the average close price of all between our acceptable range, we sell the e-ticket instantly, auctions; (2) delay some airline ticket purchase – when otherwise we post a sell bid with high price being overbid in hotel auction, change travel plan to a Note that the buy price is always less than w, while the sell price shorter one. Such approach works well when there are is always larger than w. Our current approach, P(k) is computed too many clients want stay in Tampa, but it might by following formula introduce extra cost for delayed airline ticket purchase. k Can we combine the two difference approaches to P(k) =0.93 achieve better performance? Our approach is highly heuristic based, but its performance is not Ticket owned Probability P(k) too bad. We still not fully understand why and how the heuristics 0 0.9 work, and the theoretical bound of our approaches. Future work can be taken on reveal theoretical explanation and rules for 1 0.729 dynamically finding a solution in uncertain context. 2 0.38742048 3 0.058149736 5. ACKNOWLEDGMENTS Our thanks to Dr. Finin, Dr. Peng, and Dr. Oates for help and support.