Cigarette Prices in Bloomington, IN
Brenton Barczykowski Kelley School of Business, Indiana University firstname.lastname@example.org December 15, 2008
Abstract
This report examines cigarette pricing in Bloomington, IN. After background on cigarette consumption and cigarette composition, it identifies several factors that may influence cigarette prices at tobacco vendors: the price of milk, the price of a Snickers bar, weekly hours open, the number of cigarette posters on the property, whether gasoline, hot food, or alcohol is sold, whether public restrooms are offered, and the distance of the vendor from the nearest bar and from the Indiana University campus. Camel and Marlboro cigarette prices are then regressed on these variables, and a stepwise elimination method is applied to obtain the best equation, which includes two significant variables, “Weekly Hours Open” and “Number of Cigarette Posters”, that together explain 43% of the variation in cigarette prices.
Overview
This report investigates the factors that drive cigarette prices in Bloomington, IN. All data were collected in Bloomington from establishments that sell cigarettes, including gas stations, liquor stores, grocery stores, and others. Regression analysis is then used to determine which independent variables significantly influence cigarette prices. The independent variables observed were type of cigarette, price of a regular Snickers bar, price of one gallon of 2% milk, number of cigarette posters on the property, number of hours open per week, distance from the nearest bar, distance from the Indiana University campus, and whether the establishment had public restrooms, sold gasoline, sold alcohol, or sold hot food. Through this research and regression analysis, I develop the best regression equation for predicting cigarette prices.
Smoking
Many people have smoked a cigarette spontaneously at some point in their lives. Roughly 21% of adults in the United States habitually smoke, equating to more than 45 million Americans [1]. Figure 1 shows the aggregate and per capita consumption of cigarettes in countries with high tobacco use. While the U.S. is not the leading consumer in aggregate, its consumption is among the highest in the world. The share of U.S. adults who smoke decreased from 42% in 1965 to 21% in 2006 [2]. The large decrease is likely attributable to increased education about the carcinogenic effects of smoking, the ban on television tobacco advertising, and greater personal health awareness.
[1] Casey, Maxine. “Smoking Statistics in the USA”. 25 April 2007. 26 October 2008. <http://ezinearticles.com/?Smoking-Statistics-In-The-USA&id=540596>
[2] “Cigarette Smoking Among Adults”. Centers for Disease Control and Prevention. 7 November 2007. 26 October 2008. <http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5644a2.htm#fig>
Figure 1: Estimation of Cigarette Consumption by Adults in 2006

Country     Cigarettes consumed (millions)   Population (millions)   Cigarettes consumed per capita
China       1,643,000                        1,248                   1,320
USA         451,000                          270                     1,670
Japan       328,000                          126                     2,600
Russia      258,000                          146                     1,760
Indonesia   215,000                          200                     1,070

Source: “Cigarette Smoking Among Adults”. Centers for Disease Control and Prevention. 7 November 2007. 29 October 2008. <http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5644a2.htm#fig>
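The per capita column in Figure 1 can be checked by dividing cigarettes consumed by population (both in millions). The short Python sketch below does this with the figures from the table. The variable and function names are illustrative assumptions, and the check only expects agreement to within about ten cigarettes per person, since the table values appear to be rounded.

```python
# Figures copied from Figure 1: (cigarettes consumed, population), both in millions,
# followed by the per capita value reported in the table.
figure_1 = {
    "China": (1_643_000, 1_248, 1_320),
    "USA": (451_000, 270, 1_670),
    "Japan": (328_000, 126, 2_600),
    "Russia": (258_000, 146, 1_760),
    "Indonesia": (215_000, 200, 1_070),
}


def per_capita(cigarettes_millions, population_millions):
    """Cigarettes consumed per person; the 'millions' units cancel out."""
    return cigarettes_millions / population_millions


computed = {
    country: per_capita(cigarettes, population)
    for country, (cigarettes, population, _reported) in figure_1.items()
}

for country, (_, _, reported) in figure_1.items():
    print(f"{country}: computed {computed[country]:,.0f}, reported {reported:,}")
```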
Types of Cigarettes
Several varieties of cigarettes exist. Traditionally, cigarettes consist of a paper tube filled with processed tobacco. The end of the cigarette placed in the mouth also contains a filter roughly 1 inch long. The filter removes some, but not all, of the smoke and its chemical contents. It also gives the smoker a place to put his or her lips without moistening the tobacco so much that it becomes too wet to burn. The filter is not smoked as part of the cigarette.
“Light” cigarettes are cigarettes with a milder flavor. They use a perforated filter that allows a higher air-to-smoke ratio, reducing inhaled smoke and diluting the taste. Others have argued that the perforated filter is designed to “fool” the smoking machines that measure the level of chemicals in inhaled smoke [3]. Generally, light cigarettes have lower nicotine levels as a result of this perforated filter, though the amount of nicotine and carcinogens inhaled varies greatly among individual smokers of light cigarettes [4]. Regular cigarettes have the same structure, except that they have a shorter, less perforated filter and may use more flavorful tobacco blends.
Menthol cigarettes come in either the light or the full-flavor variety. Menthol compounds flavor cigarettes with a cool, mint taste, making them more palatable to smokers. Menthol also stimulates skin nerves, producing a cooling sensation without a physical drop in temperature [5]. Menthol cigarettes have received criticism because menthol causes oral and respiratory irritation that can result in bleeding over time [6].
Tobacco Companies
The U.S. tobacco market is led by two major competitors: Philip Morris (a wholly owned subsidiary of Altria Group) and RJ Reynolds. Philip Morris produces brands such as Marlboro, Parliament, Basic, and Virginia Slims [7]. RJ Reynolds markets Camel, Doral, Winston, Kool, and others [8]. For my analysis, I focused on the marquee brand of each competitor: Marlboro and Camel. For full-flavor varieties, I used price data for Marlboro Reds and Camel Filters. To identify possible price differences by type of cigarette, I also observed Marlboro Lights and Camel Lights. Finally, I took menthol varieties into account by observing prices of Marlboro Milds (full-flavor menthol) and Marlboro Menthol Light from Philip Morris, and Camel Menthol and Camel Menthol Light from RJ Reynolds.

[3] “The Truth about Light Cigarettes: Q & A”. National Cancer Institute. 17 August 2004. 8 December 2008. <http://www.cancer.gov/cancertopics/factsheet/Tobacco/light-cigarettes>
[4] Rigotti, Nancy and Tindle, Hilary. “The Fallacy of Light Cigarettes”. 13 March 2004. 8 December 2008. <http://www.bmj.com/cgi/content/full/328/7440/E278>
[5] Eccles, R. “Menthol and Related Cooling Compounds”. University of Wales College of Cardiff, UK. August 2004. 10 December 2008. <http://www.ncbi.nlm.nih.gov/pubmed/7529306>
[6] Ibid.
[7] “Phillip Morris List of Brands”. Tobacco.org. 10 December 2008. <http://www.tobacco.org/Resources/00pmbrands.html>
[8] “RJ Reynolds Tobacco Company Fact Book”. Rjrt.com. 10 December 2008. <http://www.rjrt.com/company/profileFactBook.asp>
Data Collection
Data was collected between October 19, 2008 and October 26, 2008. I traveled to each location and made observations on site. I did not have a preset list of destinations; I simply drove throughout campus, visiting locations I knew and stopping at ones I didn’t know or had forgotten about. In total, 30 observations were made in Bloomington. Figure 2 shows the location of each observation. To ensure accuracy, I counted and recorded each observation twice. If the two records were not identical, I repeated the process until they matched. Figure 3 shows the data collected.
Figure 2: Location of Observed Cigarette Vendors

Figure 3: Data Collected on Cigarette Prices and Independent Variables
Before collecting the data, I determined which factors might contribute to cigarette prices and created a list of independent variables that may shed some light on price variability. The following list describes the logic behind each independent variable, as well as the average, minimum, and maximum observed for each variable.
Snickers Bar Price
The price of a Snickers bar identifies whether a relationship exists between the price of cigarettes and the price of an impulse purchase commonly available at most cigarette vendors. The average price of a regular Snickers bar was $1.00, ranging from $0.58 to $1.49. Only 23 of 30 locations sold Snickers bars, which is taken into consideration during regression analysis.

Gallon 2% Milk Price
The price of a gallon of 2% milk determines whether cigarette prices are correlated with the price of a necessity good commonly available at cigarette retailers. The average price was $3.57, ranging from $2.79 to $4.59. Only 22 of 30 locations sold 1-gallon containers of 2% milk, which is accounted for in the regression analysis.
Weekly Hours Open
Cigarette prices may be correlated with operating costs. Therefore, the number of hours open per week tests whether there is a relationship between labor costs and cigarette prices. Cigarette vendors were open an average of 143.4 hours per week, some as few as 92 hours and some as many as 168 hours (24 hours a day, 7 days a week).

Number of Cigarette Posters
The amount of advertising at each location tests whether a vendor lowers prices when trying to attract more customers. For purposes of this report, I defined a poster as any display with a cigarette brand name listed. A poster is counted only once, even if it contains advertising for multiple tobacco products. The average number of posters was 7, ranging from 1 to 15.
Distance from Campus
Since campus is not a specific point, I measured distance from the Indiana University Bursar Office (located at 601 E Kirkwood, Bloomington, IN). I used the Google Maps “directions” function to determine distance, which provides the road distance traveled between locations to the nearest tenth of a mile. Due to several one-way streets near the Bursar Office, some locations are recorded as farther from the Bursar Office than they physically are. On average, cigarette vendors were 2.43 miles from the Bursar Office. The closest observation was 1.1 miles from the Bursar Office; the farthest was 5.3 miles.

Distance from Bars
Since many people view drinking and smoking as complements, the distance from the nearest bar was measured. Due to the large number of bars in Bloomington, IN, the variable was determined by measuring the distance between the vendor and the closest of three bars in Bloomington: Kilroy’s on Kirkwood (502 E. Kirkwood), Jakes Nightclub (419 N. Walnut), and the Trojan Horse (100 E Kirkwood). All measurements are road-traveled distances. On average, cigarette vendors were 1.82 miles from the nearest bar, ranging from 0.2 miles to 3.9 miles.
Public Bathrooms
Increased overhead costs associated with public restrooms may contribute to higher cigarette prices. Public restrooms are defined as being visible and available to store patrons. Roughly 53% of observations included a visible, public restroom. The observation was recorded with a “1” indicating the presence of public restrooms and a “0” indicating their absence. This method was used for all of the following variables.
Gasoline Sold
A gas station may sell more cigarettes by selling them at a low price and attracting more customers. Therefore, the type of store (gas station, grocery store, etc.) may impact pricing. Approximately 46% of observed vendors sold gasoline.

Alcohol Sold
This variable identifies whether a store that sells alcohol can charge a price premium on cigarettes or whether the store attempts to bundle alcohol and cigarettes to increase sales volume. Cigarettes and alcohol are almost always sold at the same location: 90% of observed cigarette vendors also sold alcohol.

Hot Food Sold
Hot food is food that does not have to be heated by the customer. Stores that provide cold food and microwaves are not considered establishments that sell hot food. This variable determines whether the costs of food preparation affect cigarette prices. Exactly 50% of cigarette vendors also sold hot food.
Quantitative Analysis: Statistical Definitions
The regression analysis for cigarette prices focuses on three main criteria: P-Values of T-Statistics, P-Values of F-Statistics, and R-Square statistics. P-Values are compared to the significance level to determine whether the coefficient of an independent variable is statistically different from zero. For this report, P-Values were especially helpful when derived from the T-Statistic of each variable. If the P-Value of the T-Statistic was larger than .05, I concluded that the variable’s coefficient was not statistically different from zero and removed the variable from the regression equation; variables with P-Values less than or equal to .05 were kept.
The T-Statistic for each variable reveals whether the variable has an impact on the regression (that is, whether its coefficient is statistically different from zero). Each variable was tested at the 95% confidence level, which corresponds to a critical T-Statistic of approximately 1.96 in large samples. So, all variables with T-Statistics greater than 1.96 in absolute value are considered to have an impact on the dependent variable. For the purposes of this report, only the P-Values of the T-Statistics are analyzed.
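As a rough illustration of where the 1.96 threshold comes from: it is the two-sided 5% critical value of the standard normal distribution, which the T distribution approaches in large samples. The sketch below uses only Python’s standard library. The function name `is_significant` is hypothetical, and with only 22 to 30 observations the exact Student t critical value would be somewhat larger (roughly 2.05 to 2.2, depending on the degrees of freedom).

```python
from statistics import NormalDist

# Two-sided 5% critical value of the standard normal distribution.
critical = NormalDist().inv_cdf(1 - 0.05 / 2)


def is_significant(t_statistic, threshold=critical):
    """A coefficient counts as significant when |t| exceeds the critical value."""
    return abs(t_statistic) > threshold


print(round(critical, 2))
print(is_significant(2.4), is_significant(-2.4), is_significant(1.1))
```

The sign of the T-Statistic does not matter for the decision; only its absolute size does.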
F-Statistics are used to determine the significance of a multiple regression equation. While T-Statistics examine the significance of individual independent variables, the F-Statistic examines the significance of all independent variables collectively. For the purposes of this report, the regression was considered to be significant if the P-Value of the F-statistic was less than .05.
R-Square provides the share of variability in the dependent variable that is explained by the independent variables. For example, an R-Square of .67 indicates that 67% of the variation in the dependent variable can be explained by the independent variables. R-Square, T-Statistics, and F-Statistics were used together to select the best possible equation.
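For reference, the three statistics can be written compactly. These are the standard textbook definitions for a least-squares regression with n observations and k independent variables, not formulas quoted from the software output used in this report. The symbols are notation introduced here: SSE is the sum of squared residuals, SST is the total sum of squares of the dependent variable, and b_j is the estimated coefficient of variable j with standard error se(b_j).

```latex
R^2 = 1 - \frac{SSE}{SST}, \qquad
t_j = \frac{b_j}{\mathrm{se}(b_j)}, \qquad
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
```

The last expression shows why a high R-Square does not guarantee a significant equation: with many variables (large k) and few observations (small n), the F-Statistic can be small even when R-Square looks large.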
Quantitative Analysis: Regression Formation
After collecting data on 30 observations, I used regression analysis to determine independent variable significance. I eliminated two qualitative independent variables before running any regression: Light and Menthol cigarettes. I found that prices for the light and menthol varieties of the same brand were identical.
Identical prices among different types of cigarettes may exist for several reasons. First, from a supplier standpoint, the difference in production costs for the different types of cigarettes is so minimal that it approaches zero, thus not justifying a price premium for different types of cigarettes. Secondly, from a demand side standpoint, it may be that tobacco consumers are indifferent between types of cigarettes. Therefore if the price of a pack of menthol cigarettes increased relative to light and filter cigarettes, consumers who usually smoke menthol cigarettes would buy the other, cheaper derivatives. Accordingly, manufacturers do not price types of cigarettes differently to minimize cannibalization.
After eliminating these two variables, I proceeded to run a regression, using all independent variables. Then, I eliminated the independent variable with the highest P-Value greater than .05, and ran the regression again using the remaining variables. I continued this process until all remaining variables had a P-Value of .05 or less.
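The elimination procedure just described is commonly called backward elimination. The sketch below is one way it could be written in plain Python; it is not the software used for this report. The data are synthetic and purely hypothetical (a small balanced design of 16 made-up stores with a deliberately irrelevant `restroom` variable), the helper names are invented for illustration, and the P-Values come from a simple numerical integration of the Student t density.

```python
import math


def invert(matrix):
    """Invert a small square matrix with Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def t_p_value(t, df, steps=2000):
    """Two-sided P-Value of a T-Statistic via Simpson's rule on the t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps
    area = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * area * h / 3.0)


def ols_p_values(rows, y, names):
    """Regress y on an intercept plus `names`; return each variable's P-Value."""
    X = [[1.0] + [float(row[name]) for name in names] for row in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    df = len(y) - k
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    sigma2 = sse / df
    return {name: t_p_value(beta[i + 1] / math.sqrt(sigma2 * inv[i + 1][i + 1]), df)
            for i, name in enumerate(names)}


def backward_eliminate(rows, y, names, alpha=0.05):
    """Drop the variable with the highest P-Value until all are <= alpha."""
    kept = list(names)
    while kept:
        p = ols_p_values(rows, y, kept)
        worst = max(kept, key=p.get)
        if p[worst] <= alpha:
            break
        kept.remove(worst)
    return kept


# Hypothetical data: a 2 x 2 x 2 design repeated twice (16 made-up stores).
rows, prices = [], []
for _ in range(2):
    for hours in (100, 168):
        for posters in (2, 12):
            for restroom in (0, 1):
                parity = (hours == 168) + (posters == 12) + restroom
                noise = 0.05 if parity % 2 == 0 else -0.05
                rows.append({"hours": hours, "posters": posters, "restroom": restroom})
                prices.append(3.29 + 0.007 * hours - 0.035 * posters + noise)

kept = backward_eliminate(rows, prices, ["hours", "posters", "restroom"])
print(kept)
```

In this made-up example the price depends only on hours and posters, so the procedure should discard `restroom` and keep the other two. A statistics package would normally supply the fitting and P-Value steps; they are written out here only to keep the sketch self-contained.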
Not all independent variables were recorded at each location. Several locations, mostly liquor stores, did not sell Snickers bars or gallons of 2% milk. So, for the first regression, I used the 22 of 30 observations for which all independent variables were present. The first regression had an R-Square value of .57, meaning that 57% of the variation in prices for Camel cigarettes was explained by the 10 independent variables. The P-Value of the F-Statistic was 0.27, meaning the regression equation, as a whole, is not significant at the 5% level. The resulting Coefficients and T-Statistic P-Values are shown in Figure 4.
Figure 4: Independent Variable Coefficients and T-Stat P-Values of 1st Regression

Independent Variable          Coefficient   P-Value of T-Stat
Price of Snickers Bar         -0.399        0.405
Price of Gallon 2% Milk        0.397        0.088
Weekly Hours Open              0.015        0.046
Number of Cigarette Posters   -0.069        0.036
Public Bathrooms               0.002        0.991
Gas Sold                       0.293        0.250
Alcohol Sold                  -0.300        0.434
Hot Food Sold                  0.207        0.447
Distance from Campus           0.216        0.117
Distance from Bar             -0.125        0.436
As Figure 4 shows, all variables except “Number of Cigarette Posters” and “Weekly Hours Open” have P-Values greater than .05, making them insignificant at the 5% level. According to my established methodology, I eliminated “Public Bathrooms”, since it had the highest P-Value (.991), and ran another regression. In all, I applied this step four times, eliminating four independent variables.
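A single elimination step can be illustrated with the P-Values reported in Figure 4. The sketch below simply applies the 5% rule to those published numbers; it does not refit any regression, and the variable names in the code are illustrative.

```python
# P-Values of the T-Statistics reported in Figure 4 (first regression).
figure_4_p_values = {
    "Price of Snickers Bar": 0.405,
    "Price of Gallon 2% Milk": 0.088,
    "Weekly Hours Open": 0.046,
    "Number of Cigarette Posters": 0.036,
    "Public Bathrooms": 0.991,
    "Gas Sold": 0.250,
    "Alcohol Sold": 0.434,
    "Hot Food Sold": 0.447,
    "Distance from Campus": 0.117,
    "Distance from Bar": 0.436,
}
ALPHA = 0.05

# Variables that already meet the 5% criterion.
significant = [name for name, p in figure_4_p_values.items() if p <= ALPHA]

# The variable to remove before the next regression: the highest P-Value.
to_drop = max(figure_4_p_values, key=figure_4_p_values.get)

print("Significant at 5%:", significant)
print("Eliminate next:", to_drop)
```

After a variable is dropped, the regression has to be rerun, because the P-Values of the remaining variables generally change.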
The resulting equation had an R-Square of .49, meaning 49% of the variation in Camel prices was explained by the 6 remaining independent variables. The P-Value of the equation’s F-Statistic is .08, making the equation as a whole significant at the 10% level but still insignificant at the 5% level. The independent variable Coefficients and T-Statistic P-Values are shown in Figure 5.
Figure 5: Independent Variable Coefficients and T-Stat P-Values of 5th Regression

Independent Variable          Coefficient   P-Value of T-Stat
Price of Gallon 2% Milk        0.320        0.110
Weekly Hours Open              0.013        0.030
Number of Cigarette Posters   -0.065        0.017
Gas Sold                       0.252        0.245
Hot Food                       0.124        0.459
Distance from Campus           0.146        0.092
There are still only two variables that are significant at the 5% level: “Weekly Hours Open” and “Number of Cigarette Posters”. However, this regression shows that “Distance from Campus” is significant at the 10% level and “Price of Gallon 2% Milk” is very close to significant at the 10% level.
Running 2 more regressions eliminated the independent variables “Gas Sold” and “Hot Food Sold”. The resulting equation had an R-Square of .42, meaning the four remaining independent variables explained 42% of the variation in Camel prices. The initial equation explained 57% of the variation in price, but the 7th equation’s independent variables are more significant despite explaining less variation. The P-Value of the equation’s F-Statistic is .04, making it the first equation to be significant at the 5% level. The Coefficients and T-Statistic P-Values are shown in Figure 6.
Figure 6: Independent Variable Coefficients and T-Stat P-Values of 7th Regression

Independent Variable          Coefficient   T-Stat P-Value
Price of Gallon 2% Milk        0.181        0.252
Weekly Hours Open              0.010        0.057
Number of Cigarette Posters   -0.043        0.019
Distance from Campus           0.158        0.054
With the exception of “Price of Gallon 2% Milk”, all variables are now significant at the 10% level. “Number of Cigarette Posters” is significant at the 5% level, and “Weekly Hours Open” and “Distance from Campus” are each less than .01 away from significance at the 5% level. However, since not all variables are significant at the 5% level, “Price of Gallon 2% Milk” must be eliminated and another regression run. It is interesting to note that the T-Statistic P-Value of “Price of Gallon 2% Milk” increased from the previous regression, which suggests correlation with another variable in the equation.
The 8th regression has an R-Square of .37, meaning that 37% of the variation in Camel prices was explained by the three remaining independent variables. The equation’s F-Statistic has a P-Value of .034, well within the 5% significance level. Figure 7 shows the independent variables’ Coefficients and T-Statistic P-Values.
Figure 7: Independent Variable Coefficients and T-Stat P-Values of 8th Regression

Independent Variable          Coefficient   T-Stat P-Value
Weekly Hours Open              0.008        0.100
Number of Cigarette Posters   -0.037        0.032
Distance from Campus           0.113        0.109
Surprisingly, the T-Statistic P-Value of “Weekly Hours Open” increased from the previous equation. Still, the only variable significant at the 5% level is “Number of Cigarette Posters”. “Weekly Hours Open” (P-Value of .100) and “Distance from Campus” (P-Value of .109) are borderline and nearly significant at the 10% level, respectively. I therefore eliminated more variables in search of a more significant equation.
The next regression eliminated “Distance from Campus”. It has an R-Square of .28, meaning only 28% of the variation in price was explained by the two remaining variables. The P-Value of the F-Statistic was .04, still significant at the 5% level. Figure 8 shows the Coefficients and T-Statistic P-Values of each variable.
Figure 8: Independent Variable Coefficients and T-Stat P-Values of 9th Regression

Independent Variable          Coefficient   T-Stat P-Value
Weekly Hours Open              0.009        0.070
Number of Cigarette Posters   -0.041        0.022
“Weekly Hours Open” is not significant at the 5% level, but “Number of Cigarette Posters” is. Both variables are significant at the 10% level. At the 10% level, the final regression equation would be:

Price of Camel Cigarettes = $3.02 + $0.009*(Weekly Hours Open) - $0.041*(Number of Cigarette Posters)
This means, with a base price of $3.02, prices of Camel cigarettes will increase $0.09 for every 10 hours the store is open per week and decrease roughly $0.04 for each cigarette poster on property. I proceeded to run another regression, eliminating “Weekly Hours Open”, to see what effect it would have on “Number of Cigarette Posters.”
The resulting equation’s R-Square was .14, which is extremely low. The P-Value of the F-Statistic was .08, not significant at the 5% level. The T-Statistic P-Value of “Number of Cigarette Posters” rose to .09, again not significant at the 5% level. The equation is of little use because of the extremely low R-Square and because neither the T-Statistic P-Value nor the F-Statistic P-Value was significant at the 5% level. Therefore, the best equation from this series contained both “Weekly Hours Open” and “Number of Cigarette Posters.”
This series of regressions only utilized 22 out of 30 observations. Since “Price of Snickers Bar” and “Price of Gallon of Milk” are insignificant based on 22 observations, I will replace them with dummy variables, allowing me to use all 30 observations when performing the regression analysis.
Quantitative Analysis: Dummy Variables
Dummy variables test the impact of selling milk and Snickers bars on Camel cigarette prices by using all 30 observations instead of only 22. Each price variable was replaced with a “1” if the store sold milk or Snickers bars and a “0” if the store did not. The first regression’s R-Square is .51, meaning that 51% of the variation in Camel cigarette prices is explained by the 10 variables. It has an F-Statistic P-Value of .09, which is significant at the 10% level, but not at the 5% level. Figure 9 shows the Coefficients and T-Statistic P-Values of each independent variable.
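The recoding into dummy variables can be sketched as follows. The three stores and their prices are hypothetical and exist only to show the transformation; `None` marks an item a store does not sell, and the function name `sold_dummy` is an assumption of this example.

```python
def sold_dummy(price):
    """Return 1 if the store reported a price for the item, else 0."""
    return 0 if price is None else 1


# Hypothetical observations; None marks an item the store does not sell.
stores = [
    {"name": "Store A", "milk_price": 3.49, "snickers_price": 0.99},
    {"name": "Store B", "milk_price": None, "snickers_price": None},
    {"name": "Store C", "milk_price": None, "snickers_price": 1.19},
]

# Replace each price column with a 1/0 indicator so no store has missing data.
for store in stores:
    store["milk_sold"] = sold_dummy(store.pop("milk_price"))
    store["snickers_sold"] = sold_dummy(store.pop("snickers_price"))

for store in stores:
    print(store)
```

The trade-off is that the indicator only records whether the item is sold, so any information carried by the price level itself is no longer used.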
Figure 9: Independent Variable Coefficients and T-Stat P-Values of 1st Regression (Dummy Variables)

Independent Variable          Coefficient   T-Stat P-Value
Snickers Bar Sold             -0.297        0.602
Milk Sold                      0.082        0.867
Weekly Hours Open              0.009        0.092
Number of Cigarette Posters   -0.040        0.051
Public Restrooms               0.010        0.949
Gas Sold                       0.020        0.913
Alcohol Sold                  -0.127        0.697
Hot Food Sold                  0.129        0.538
Distance from Campus           0.117        0.261
Distance from Bar             -0.105        0.385
According to the results, no variables are significant at the 5% level. “Number of Cigarette Posters” narrowly misses the 5% level (P-Value of .051) and, along with “Weekly Hours Open”, is significant at the 10% level. However, more regressions are needed in order to develop a good equation and to see whether the addition of 8 observations changes the results from the previous series of equations.
Eliminating 6 variables resulted in a better regression equation. Two of the variables cut from the equation were the dummy variables for milk and Snickers, showing that whether a store sells gallons of 2% milk or Snickers bars does not have a significant impact on the price of Camel cigarettes. The 7th regression’s R-Square equals .49. It also has an F-Statistic P-Value of .001, making it significant well below the 5% level. Figure 10 shows the Coefficients and T-Statistic P-Values of the 4 remaining variables.
Figure 10: Independent Variable Coefficients and T-Stat P-Values of 7th Regression (Dummy Variables)

Independent Variable          Coefficient   T-Stat P-Value
Weekly Hours Open              0.0077       0.0002
Number of Cigarette Posters   -0.0397       0.0046
Distance from Campus           0.1404       0.0922
Distance from Bars            -0.1258       0.2169
The seventh regression is interesting because 3 of 4 variables are significant at the 10% level (all except “Distance from Bars”). Since not all variables are significant, I eliminated “Distance from Bars” and ran another regression. However, eliminating “Distance from Bars” caused the P-Value of the T-Statistic for “Distance from Campus” to rise to 0.22, again making it insignificant. So, after eliminating both distance variables, the best regression was formed.
The 9th and final regression has an R-Square of .43, meaning that 43% of the variation in Camel cigarette prices is explained by 2 variables: “Weekly Hours Open” and “Number of Cigarette Posters”. The equation has an F-Statistic P-Value of .0005, making the equation as a whole significant below the 1% level. Figure 11 shows the Coefficients and T-Statistic P-Values for the independent variables.
Figure 11: Independent Variable Coefficients and T-Stat P-Values of 9th Regression (Dummy Variables)

Independent Variable          Coefficient   T-Stat P-Value
Weekly Hours Open              0.007        0.0003
Number of Cigarette Posters   -0.034        0.009
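The overall significance figures can be sanity-checked from R-Square alone. The sketch below uses the standard relationship between R-Square and the F-Statistic, plus a closed form for the P-Value that holds only when there are exactly two independent variables (an F distribution with 2 numerator degrees of freedom). The function names are illustrative, and because the inputs are the rounded R-Square values quoted in the text, the results are approximate.

```python
def f_statistic(r_square, n_obs, n_vars):
    """F-Statistic implied by R-Square for n_obs observations and n_vars variables."""
    return (r_square / n_vars) / ((1 - r_square) / (n_obs - n_vars - 1))


def f_p_value_two_vars(r_square, n_obs):
    """P-Value of the F-Statistic when exactly two independent variables are used.

    For an F distribution with (2, d) degrees of freedom,
    P(F > f) = (1 + 2f/d) ** (-d/2), which simplifies to
    (1 - R^2) ** (d/2) with d = n_obs - 3.
    """
    return (1 - r_square) ** ((n_obs - 3) / 2)


# Final equation: 30 observations, 2 variables, R-Square of .43.
final_f = f_statistic(0.43, 30, 2)
final_p = f_p_value_two_vars(0.43, 30)

# Two-variable equation from the 22-observation series, R-Square of .28.
small_sample_p = f_p_value_two_vars(0.28, 22)

# First regression: 22 observations, 10 variables, R-Square of .57.
first_f = f_statistic(0.57, 22, 10)

print(f"final F = {final_f:.2f}, P-Value = {final_p:.4f}")
print(f"22-observation two-variable P-Value = {small_sample_p:.3f}")
print(f"first regression F = {first_f:.2f}")
```

Under these assumptions the final equation gives an F-Statistic of roughly 10 with a P-Value near .0005, and the two-variable equation from the 22-observation series gives a P-Value near .04, in line with the values reported. The first regression’s F-Statistic is below 1.5 despite its higher R-Square, which is consistent with it being insignificant as a whole.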
“Weekly Hours Open” and “Number of Cigarette Posters” are both significant at the 1% level, much lower than the desired 5% level. This is the best equation for predicting cigarette prices, as it is significant as a whole and each independent variable is significant. Also, the equation is based on 30 observations, instead of only 22. The final regression equation is:
Price of Camel = $3.29+$0.007*(Weekly Hours Open) - $0.035*(Number Cigarette Posters)
Therefore, the base price per pack is $3.29, and $0.07 is added to the price for every ten hours the store is open per week. In addition, $0.035 is subtracted from the price of each pack for each cigarette poster or advertisement present. These results are logical. Every hour the store is open, it incurs more variable costs. Therefore, the more hours a store is open, the more revenue it needs to generate to at least reach its breakeven point. So, the longer a store is open for business each week, the higher the price of a pack of cigarettes tends to be.
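To make the equation concrete, the sketch below evaluates it for two illustrative stores. The function name is hypothetical, and the input combinations are built from the observed ranges (92 to 168 hours, 1 to 15 posters, an average of 7 posters) rather than from any specific store in the data.

```python
def predict_camel_price(weekly_hours_open, cigarette_posters):
    """Predicted Camel price per pack from the final regression equation."""
    return 3.29 + 0.007 * weekly_hours_open - 0.035 * cigarette_posters


# A store open 24 hours a day, 7 days a week, with the average 7 posters.
always_open = predict_camel_price(168, 7)

# The fewest observed hours (92) combined with the most observed posters (15).
cheapest = predict_camel_price(92, 15)

print(f"168 hours, 7 posters: ${always_open:.2f}")
print(f"92 hours, 15 posters: ${cheapest:.2f}")
```

Since the equation explains only 43% of the variation in price, these figures are rough expectations, not exact predictions for any one store.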
The effect of the number of cigarette posters on the property could mean several things. First, stores with more advertising may be trying to attract more customers in order to sell a greater volume at a lower price. Thus, the more advertising a store uses to gain customers, the lower its price, in order to encourage a higher volume of sales. Also, cigarette manufacturers may pay a premium to have their posters displayed on the property. This additional revenue for the store would allow the vendor to lower prices slightly. Next, I performed a regression analysis for Marlboro cigarettes to see if I would obtain different results.
Quantitative Analysis: Marlboro Cigarettes
The regression equation for Marlboro cigarettes is almost identical to the one for Camel cigarettes. I used the same dummy variables and independent variables for each brand and tested each using 22 and 30 observations. The same process yielded almost exactly the same results for Marlboro cigarettes. The regression for Marlboro cigarettes resulted in the same two significant variables with P-Values below .05: “Weekly Hours Open” and “Number of Cigarette Posters”.
The final equation’s R-Square was .43, the same as the R-Square for the final Camel regression. It had an F-Statistic P-Value of .006, slightly higher than Camel cigarettes, but not significantly so. The final regression equation for Marlboro cigarettes is:
Price Marlboro = $3.29 + $0.007*(Weekly Hours Open) - $0.034*(Number Cigarette Posters)
This equation is almost identical to the equation for Camel cigarettes, except that the coefficient on the number of cigarette posters is .001 smaller in magnitude for Marlboro cigarettes. Thus, with a base price of $3.29, the price increases $0.07 for every 10 hours the location is open per week and decreases $0.034 per pack for each cigarette poster at the location. The logic behind this is the same as for Camel cigarettes.
Conclusion
After gathering data and performing regression analysis, I concluded that the two main factors that drive cigarette prices in Bloomington, IN are “Weekly Hours Open” and “Number of Cigarette Posters.” The best regression equation, using only significant variables and all observations, has an R-Square value of .43, meaning 43% of the variation in cigarette prices could be explained by the “Number of Cigarette Posters” and “Weekly Hours Open.”
For further research, I would suggest identifying and testing other, more in-depth variables, such as the number of employees per location, the average rate of pay for employees per location, the number of packs of cigarettes on hand, and the prices of other tobacco products, or visiting the locations on multiple dates to identify any seasonal effects. However, there may be unobservable factors that contribute to pricing, including the attitudes of individual store proprietors towards cigarettes and contracts between franchise stores that fix cigarette prices. For now, to find the cheapest pack of cigarettes, look for the store with the fewest weekly hours of operation and the most cigarette posters.