Data-Informed Thinking + Doing
Dimension Reduction for Reducing Complexity
Subsetting eBay auction predictors by finding uncorrelated linear combinations of the data with maximum variance, using PCA in R, Python, and Julia.
Data Understanding
str(ebay_r)
'data.frame': 1972 obs. of 8 variables:
$ category : Factor w/ 18 levels "Antique/Art/Craft",..: 14 14 14 14 14 14 14 14 14 14 ...
$ currency : Factor w/ 3 levels "EUR","GBP","US": 3 3 3 3 3 3 3 3 3 3 ...
$ seller_rating: int 3249 3249 3249 3249 3249 3249 3249 3249 3249 3249 ...
$ duration : int 5 5 5 5 5 5 5 5 5 5 ...
$ end_day : Factor w/ 7 levels "Fri","Mon","Sat",..: 2 2 2 2 2 2 2 2 2 2 ...
$ close_price : num 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ...
$ open_price : num 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ...
$ competitive : int 0 0 0 0 0 0 0 0 0 0 ...
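The structure above shows four numeric predictors (`seller_rating`, `duration`, `close_price`, `open_price`) on very different scales, which matters for PCA because the component with the largest variance would otherwise be dominated by `seller_rating`. The following is a minimal Python sketch of a standardized PCA, not the post's own code. It assumes NumPy is available, uses randomly generated stand-in values rather than the real eBay data, and the helper name `pca_standardized` is hypothetical.

```python
import numpy as np


def pca_standardized(X):
    """PCA on z-scored columns via SVD (hypothetical helper, for illustration).

    Returns (loadings, explained_variance, explained_ratio, scores).
    """
    n = X.shape[0]
    # Standardize each column so that scale differences do not drive the result.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions; singular values come sorted descending.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained_variance = s**2 / (n - 1)
    explained_ratio = explained_variance / explained_variance.sum()
    scores = Z @ Vt.T  # data projected onto the principal components
    return Vt.T, explained_variance, explained_ratio, scores


# Stand-in values ONLY: these are simulated, not rows from the eBay data set.
rng = np.random.default_rng(0)
n = 200
open_price = rng.gamma(shape=2.0, scale=10.0, size=n)
close_price = open_price + rng.gamma(shape=2.0, scale=5.0, size=n)
seller_rating = rng.integers(0, 5000, size=n).astype(float)
duration = rng.choice([1, 3, 5, 7, 10], size=n).astype(float)

names = ["seller_rating", "duration", "close_price", "open_price"]
X = np.column_stack([seller_rating, duration, close_price, open_price])

loadings, explained_variance, explained_ratio, scores = pca_standardized(X)

for i, ratio in enumerate(explained_ratio, start=1):
    print(f"PC{i}: {ratio:.1%} of total variance")
```

Because the columns are standardized, the explained variances sum to the number of variables (four here), and the resulting component scores are mutually uncorrelated. With real data, the factor columns (`category`, `currency`, `end_day`) would need to be dropped or encoded before a step like this.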
Exploratory Data Analysis
Further Readings
- Albright, S. C., Winston, W. L., & Zappe, C. (2003). Data Analysis for Managers with Microsoft Excel (2nd ed.). South-Western College Publishing.
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R (2nd ed.). Springer. https://doi.org/10.1007/978-1-0716-1418-1
- Shmueli, G., Patel, N. R., & Bruce, P. C. (2007). Data Mining for Business Intelligence. Wiley.