Usage of Scraped Data in Price Dynamic Measurement
DOI:
https://doi.org/10.18778/0208-6018.352.02
Keywords:
inflation, CPI, Web‑scraping, GEKS‑J, Jevons, Dutot, GEKS‑D, Chained Jevons, online shopping, Big data
Abstract
Web‑scraping is a technique for automatically extracting data from websites. With the rise of online shopping, more shops post their full price offer on their websites, so web‑scraping makes it possible to collect the prices of goods sold by retailers such as supermarkets or internet shops. Using web‑scraped data can lower costs, improve measurement quality and allow price changes to be monitored in real time. For these reasons the method has become an object of research at both statistical offices (Eurostat, the British Office for National Statistics, Belgium's Statbel) and universities (e.g. the Billion Prices Project conducted at MIT). However, using scraped data for CPI calculation entails multiple challenges in data collection, processing and aggregation. The purpose of this article is to examine the possibility of using scraped data in the analysis of toy price dynamics, and in particular to compare the results obtained from different index formulas. The article presents an empirical study based on data from 4 different shops (53 selected products sold by Amazon, Walmart, Smarterkids and KBKids).
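To illustrate the kind of comparison between index formulas mentioned in the abstract, here is a minimal sketch of three of the elementary indices named in the keywords (Jevons, Dutot and chained Jevons) in their standard unweighted form. The price panel is hypothetical and not taken from the study, the function names are chosen for illustration only, and the GEKS variants are not covered.

```python
import math

def jevons(p0, pt):
    """Jevons index: geometric mean of the price relatives pt[i] / p0[i]."""
    logs = [math.log(b / a) for a, b in zip(p0, pt)]
    return math.exp(sum(logs) / len(logs))

def dutot(p0, pt):
    """Dutot index: ratio of the arithmetic mean prices of the two periods."""
    return (sum(pt) / len(pt)) / (sum(p0) / len(p0))

def chained_jevons(panel):
    """Chained Jevons: product of period-on-period Jevons links."""
    index = 1.0
    for prev, cur in zip(panel, panel[1:]):
        index *= jevons(prev, cur)
    return index

# Hypothetical prices of three matched products over three periods
# (illustrative values only, not data from the article).
panel = [
    [10.0, 20.0, 40.0],  # base period
    [11.0, 20.0, 38.0],
    [12.0, 22.0, 36.0],  # current period
]

direct_jevons = jevons(panel[0], panel[-1])
direct_dutot = dutot(panel[0], panel[-1])
chained = chained_jevons(panel)

print(f"Jevons (direct):  {direct_jevons:.4f}")
print(f"Dutot (direct):   {direct_dutot:.4f}")
print(f"Jevons (chained): {chained:.4f}")
```

In this toy panel the Dutot index shows no price change because total prices are equal in both periods, while the Jevons index shows an increase; the two formulas weight cheap and expensive items differently. With a fixed set of matched products the chained and direct Jevons indices coincide; they can diverge once products enter or leave the sample, which is common in scraped data.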