Data-Informed Thinking + Doing

Dimension Reduction for Reducing Complexity

Reducing the eBay auction predictors to a few uncorrelated linear combinations that capture maximum variance, using PCA in R, Python, and Julia.


Data Understanding

str(ebay_r)
'data.frame':	1972 obs. of  8 variables:
 $ category     : Factor w/ 18 levels "Antique/Art/Craft",..: 14 14 14 14 14 14 14 14 14 14 ...
 $ currency     : Factor w/ 3 levels "EUR","GBP","US": 3 3 3 3 3 3 3 3 3 3 ...
 $ seller_rating: int  3249 3249 3249 3249 3249 3249 3249 3249 3249 3249 ...
 $ duration     : int  5 5 5 5 5 5 5 5 5 5 ...
 $ end_day      : Factor w/ 7 levels "Fri","Mon","Sat",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ close_price  : num  0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ...
 $ open_price   : num  0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ...
 $ competitive  : int  0 0 0 0 0 0 0 0 0 0 ...
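The structure above mixes factors (category, currency, end_day) with numeric columns, and PCA applies only to the numeric ones. As a minimal sketch of that preparation step, the Python below standardizes four numeric columns, builds their correlation matrix, and estimates the first principal component by power iteration. The rows are hypothetical toy values, not the actual eBay data. Treating `competitive` as a 0/1 outcome and leaving it out is also an assumption. The helper names (`standardize`, `correlation_matrix`, `first_component`) are illustrative only, and a dedicated PCA routine would normally replace the hand-rolled iteration.

```python
import math

# Hypothetical toy rows mirroring the numeric columns in the str() output.
# The values are made up for illustration, not taken from the eBay data.
rows = [
    {"seller_rating": 3249, "duration": 5, "close_price": 0.01, "open_price": 0.01},
    {"seller_rating": 120, "duration": 7, "close_price": 25.50, "open_price": 9.99},
    {"seller_rating": 842, "duration": 3, "close_price": 12.00, "open_price": 12.00},
    {"seller_rating": 15, "duration": 10, "close_price": 99.00, "open_price": 50.00},
    {"seller_rating": 2045, "duration": 7, "close_price": 5.25, "open_price": 1.00},
]
# Assumption: `competitive` is the outcome, so it is excluded from the predictors.
numeric_cols = ["seller_rating", "duration", "close_price", "open_price"]


def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation_matrix(columns):
    """Correlation matrix computed from already-standardized columns."""
    n = len(columns[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
        for ci in columns
    ]


def first_component(matrix, iterations=500):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    k = len(matrix)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue estimate.
    eigenvalue = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)
    )
    return vec, eigenvalue


z_cols = [standardize([row[c] for row in rows]) for c in numeric_cols]
corr = correlation_matrix(z_cols)
loadings, eigenvalue = first_component(corr)

# The trace of a correlation matrix equals the number of variables, so this
# ratio is the share of total variance carried by the first component.
share = eigenvalue / len(numeric_cols)

for name, weight in zip(numeric_cols, loadings):
    print(f"{name:>14}: {weight:+.3f}")
print(f"variance share of first component: {share:.3f}")
```

Standardizing first matters here because seller_rating is in the thousands while duration is in single digits. Without scaling, the first component would mostly track whichever column has the largest raw variance.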

Exploratory Data Analysis


Further Readings

  • Albright, S. C., Winston, W. L., & Zappe, C. (2003). Data Analysis for Managers with Microsoft Excel (2nd ed.). South-Western College Publishing.
  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R (2nd ed.). Springer. https://doi.org/10.1007/978-1-0716-1418-1
  • Shmueli, G., Patel, N. R., & Bruce, P. C. (2007). Data Mining for Business Intelligence. Wiley.