Data-Informed Thinking + Doing

Time Series Analysis & Forecasting

Forecasting CO2 levels in Hawaii—using statistical models, machine learning, and deep learning in R, Python, and Julia.

Measuring and predicting atmospheric CO2 levels lets us monitor and understand how human activity drives climate change. CO2 is a major greenhouse gas: it traps heat in the atmosphere, which leads to global warming and its adverse effects, such as extreme weather events, rising sea levels, and ecological disruption. By tracking CO2 levels, scientists can assess trends, formulate effective climate policies, and encourage sustainable practices that limit those effects. Accurate forecasts help anticipate future changes, leaving time to act to safeguard the environment and keep the planet habitable for generations to come.

Getting Started

If you are interested in reproducing this work, the versions of R, Python, and Julia used are listed below, along with the packages pinned for each. The plots follow Leland Wilkinson’s approach to data visualization (the Grammar of Graphics). Finally, my coding style here is deliberately verbose, so that it is easy to trace where each function, method, and variable comes from, and to make this a learning experience for everyone, including me.

cat(R.version$version.string, R.version$nickname)
R version 4.2.3 (2023-03-15) Shortstop Beagle
require(devtools)
devtools::install_version("dplyr", version="1.1.2", repos="http://cran.us.r-project.org")
devtools::install_version("ggplot2", version="3.4.2", repos="http://cran.us.r-project.org")
devtools::install_version("lubridate", version="1.9.2", repos="http://cran.us.r-project.org")
library(dplyr)
library(ggplot2)
library(lubridate)
import sys
print(sys.version)
3.11.4 (v3.11.4:d2340ef257, Jun  6 2023, 19:15:51) [Clang 13.0.0 (clang-1300.0.29.30)]
!pip install pandas==2.0.3
!pip install plotnine==0.12.1
import pandas
import plotnine
using InteractiveUtils
InteractiveUtils.versioninfo()
Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin22.4.0)
  CPU: 8 × Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 1 on 8 virtual cores
Environment:
  DYLD_FALLBACK_LIBRARY_PATH = /Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server
using Pkg
Pkg.add(name="CSV", version="0.10.11")
Pkg.add(name="DataFrames", version="1.5.0")
Pkg.add(name="CategoricalArrays", version="0.10.8")
Pkg.add(name="Colors", version="0.12.10")
Pkg.add(name="Cairo", version="1.0.5")
Pkg.add(name="Gadfly", version="1.4.0")
using CSV
using DataFrames
using CategoricalArrays
using Colors
using Cairo
using Gadfly
using Dates
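
Before running the Python cells, it can help to confirm that the installed packages match the pinned versions. The sketch below is one way to do that with the standard library's importlib.metadata; the helper name find_version_mismatches and the PINNED_VERSIONS dictionary are illustrative names, not part of the original setup.

```python
from importlib import metadata

# Versions pinned in the Python setup above.
PINNED_VERSIONS = {"pandas": "2.0.3", "plotnine": "0.12.1"}


def find_version_mismatches(pinned, lookup=metadata.version):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    `lookup` is injectable so the check can be exercised without real installs.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


# Report anything that deviates from the pinned environment.
for package, (expected, installed) in find_version_mismatches(PINNED_VERSIONS).items():
    print(f"{package}: expected {expected}, found {installed}")
```

An empty result means the environment matches; any printed line names a package to reinstall at the pinned version.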

Importing and Examining Dataset

co2_r <- read.csv("../../dataset/co2-mm-mlo.csv")
str(object=co2_r)
'data.frame':	782 obs. of  8 variables:
 $ year          : int  1958 1958 1958 1958 1958 1958 1958 1958 1958 1958 ...
 $ month         : int  3 4 5 6 7 8 9 10 11 12 ...
 $ decimal.date  : num  1958 1958 1958 1958 1959 ...
 $ average       : num  316 317 318 317 316 ...
 $ deseasonalized: num  314 315 315 315 315 ...
 $ ndays         : int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ sdev          : num  -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 ...
 $ unc           : num  -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 ...
head(x=co2_r, n=8)
  year month decimal.date average deseasonalized ndays sdev   unc
1 1958     3         1958     316            314    -1  -10 -0.99
2 1958     4         1958     317            315    -1  -10 -0.99
3 1958     5         1958     318            315    -1  -10 -0.99
4 1958     6         1958     317            315    -1  -10 -0.99
5 1958     7         1959     316            315    -1  -10 -0.99
6 1958     8         1959     315            316    -1  -10 -0.99
7 1958     9         1959     313            316    -1  -10 -0.99
8 1958    10         1959     312            315    -1  -10 -0.99
tail(x=co2_r, n=8)
    year month decimal.date average deseasonalized ndays sdev  unc
775 2022     9         2023     416            420    28 0.41 0.15
776 2022    10         2023     416            419    30 0.27 0.10
777 2022    11         2023     418            420    25 0.52 0.20
778 2022    12         2023     419            420    24 0.50 0.20
779 2023     1         2023     419            419    31 0.40 0.14
780 2023     2         2023     420            419    25 0.62 0.24
781 2023     3         2023     421            420    31 0.72 0.25
782 2023     4         2023     423            421    29 0.63 0.23
co2_py = pandas.read_csv("../../dataset/co2-mm-mlo.csv")
co2_py.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 782 entries, 0 to 781
Data columns (total 8 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   year            782 non-null    int64  
 1   month           782 non-null    int64  
 2   decimal date    782 non-null    float64
 3   average         782 non-null    float64
 4   deseasonalized  782 non-null    float64
 5   ndays           782 non-null    int64  
 6   sdev            782 non-null    float64
 7   unc             782 non-null    float64
dtypes: float64(5), int64(3)
memory usage: 49.0 KB
co2_py.head(8)
   year  month  decimal date  average  deseasonalized  ndays  sdev   unc
0  1958      3     1958.2027   315.70          314.43     -1 -9.99 -0.99
1  1958      4     1958.2877   317.45          315.16     -1 -9.99 -0.99
2  1958      5     1958.3699   317.51          314.71     -1 -9.99 -0.99
3  1958      6     1958.4548   317.24          315.14     -1 -9.99 -0.99
4  1958      7     1958.5370   315.86          315.18     -1 -9.99 -0.99
5  1958      8     1958.6219   314.93          316.18     -1 -9.99 -0.99
6  1958      9     1958.7068   313.20          316.08     -1 -9.99 -0.99
7  1958     10     1958.7890   312.43          315.41     -1 -9.99 -0.99
co2_py.tail(8)
     year  month  decimal date  average  deseasonalized  ndays  sdev   unc
774  2022      9     2022.7083   415.95          419.50     28  0.41  0.15
775  2022     10     2022.7917   415.78          419.13     30  0.27  0.10
776  2022     11     2022.8750   417.51          419.51     25  0.52  0.20
777  2022     12     2022.9583   418.95          419.66     24  0.50  0.20
778  2023      1     2023.0417   419.47          419.12     31  0.40  0.14
779  2023      2     2023.1250   420.41          419.46     25  0.62  0.24
780  2023      3     2023.2083   421.00          419.53     31  0.72  0.25
781  2023      4     2023.2917   423.28          420.59     29  0.63  0.23
co2_jl = CSV.File("../../dataset/co2-mm-mlo.csv") |> DataFrames.DataFrame
782×8 DataFrame
 Row │ year   month  decimal date  average  deseasonalized  ndays  sdev     unc
     │ Int64  Int64  Float64       Float64  Float64         Int64  Float64  Float64
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │  1958      3       1958.2    315.7           314.43     -1    -9.99    -0.99
   2 │  1958      4       1958.29   317.45          315.16     -1    -9.99    -0.99
   3 │  1958      5       1958.37   317.51          314.71     -1    -9.99    -0.99
   4 │  1958      6       1958.45   317.24          315.14     -1    -9.99    -0.99
   5 │  1958      7       1958.54   315.86          315.18     -1    -9.99    -0.99
   6 │  1958      8       1958.62   314.93          316.18     -1    -9.99    -0.99
   7 │  1958      9       1958.71   313.2           316.08     -1    -9.99    -0.99
   8 │  1958     10       1958.79   312.43          315.41     -1    -9.99    -0.99
   9 │  1958     11       1958.87   313.33          315.2      -1    -9.99    -0.99
  10 │  1958     12       1958.96   314.67          315.43     -1    -9.99    -0.99
  11 │  1959      1       1959.04   315.58          315.55     -1    -9.99    -0.99
  12 │  1959      2       1959.13   316.48          315.86     -1    -9.99    -0.99
  13 │  1959      3       1959.2    316.65          315.38     -1    -9.99    -0.99
  14 │  1959      4       1959.29   317.72          315.41     -1    -9.99    -0.99
  15 │  1959      5       1959.37   318.29          315.49     -1    -9.99    -0.99
  16 │  1959      6       1959.45   318.15          316.03     -1    -9.99    -0.99
  17 │  1959      7       1959.54   316.54          315.86     -1    -9.99    -0.99
  18 │  1959      8       1959.62   314.8           316.06     -1    -9.99    -0.99
  19 │  1959      9       1959.71   313.84          316.73     -1    -9.99    -0.99
  20 │  1959     10       1959.79   313.33          316.33     -1    -9.99    -0.99
  21 │  1959     11       1959.87   314.81          316.68     -1    -9.99    -0.99
  22 │  1959     12       1959.96   315.58          316.35     -1    -9.99    -0.99
  23 │  1960      1       1960.04   316.43          316.4      -1    -9.99    -0.99
  24 │  1960      2       1960.13   316.98          316.36     -1    -9.99    -0.99
  25 │  1960      3       1960.2    317.58          316.28     -1    -9.99    -0.99
  26 │  1960      4       1960.29   319.03          316.7      -1    -9.99    -0.99
  27 │  1960      5       1960.37   320.04          317.22     -1    -9.99    -0.99
  28 │  1960      6       1960.46   319.59          317.47     -1    -9.99    -0.99
  29 │  1960      7       1960.54   318.18          317.52     -1    -9.99    -0.99
  30 │  1960      8       1960.62   315.9           317.19     -1    -9.99    -0.99
  31 │  1960      9       1960.71   314.17          317.08     -1    -9.99    -0.99
  32 │  1960     10       1960.79   313.83          316.83     -1    -9.99    -0.99
  33 │  1960     11       1960.87   315.0           316.88     -1    -9.99    -0.99
  34 │  1960     12       1960.96   316.19          316.96     -1    -9.99    -0.99
  35 │  1961      1       1961.04   316.89          316.86     -1    -9.99    -0.99
  36 │  1961      2       1961.13   317.7           317.08     -1    -9.99    -0.99
  37 │  1961      3       1961.2    318.54          317.26     -1    -9.99    -0.99
  38 │  1961      4       1961.29   319.48          317.16     -1    -9.99    -0.99
  39 │  1961      5       1961.37   320.58          317.76     -1    -9.99    -0.99
  40 │  1961      6       1961.45   319.77          317.63     -1    -9.99    -0.99
  41 │  1961      7       1961.54   318.57          317.88     -1    -9.99    -0.99
  42 │  1961      8       1961.62   316.79          318.06     -1    -9.99    -0.99
  43 │  1961      9       1961.71   314.99          317.9      -1    -9.99    -0.99
  44 │  1961     10       1961.79   315.31          318.32     -1    -9.99    -0.99
  45 │  1961     11       1961.87   316.1           317.99     -1    -9.99    -0.99
  46 │  1961     12       1961.96   317.01          317.79     -1    -9.99    -0.99
  ⋮  │   ⋮      ⋮         ⋮           ⋮           ⋮           ⋮       ⋮        ⋮
 738 │  2019      8       2019.62   410.18          412.1      29     0.33     0.12
 739 │  2019      9       2019.71   408.76          412.26     30     0.38     0.13
 740 │  2019     10       2019.79   408.75          412.1      29     0.31     0.11
 741 │  2019     11       2019.88   410.48          412.52     26     0.4      0.15
 742 │  2019     12       2019.96   411.98          412.74     31     0.4      0.14
 743 │  2020      1       2020.04   413.61          413.26     29     0.73     0.26
 744 │  2020      2       2020.12   414.34          413.39     28     0.69     0.25
 745 │  2020      3       2020.21   414.74          413.26     26     0.33     0.12
 746 │  2020      4       2020.29   416.45          413.76     28     0.66     0.24
 747 │  2020      5       2020.38   417.31          413.9      27     0.61     0.23
 748 │  2020      6       2020.46   416.6           414.23     27     0.44     0.16
 749 │  2020      7       2020.54   414.62          414.31     30     0.55     0.19
 750 │  2020      8       2020.62   412.78          414.73     25     0.25     0.1
 751 │  2020      9       2020.71   411.52          415.07     29     0.31     0.11
 752 │  2020     10       2020.79   411.51          414.86     30     0.22     0.08
 753 │  2020     11       2020.88   413.12          415.12     27     0.81     0.3
 754 │  2020     12       2020.96   414.26          414.97     30     0.47     0.17
 755 │  2021      1       2021.04   415.52          415.17     29     0.44     0.16
 756 │  2021      2       2021.12   416.75          415.8      28     1.02     0.37
 757 │  2021      3       2021.21   417.64          416.17     28     0.86     0.31
 758 │  2021      4       2021.29   419.05          416.36     24     1.12     0.44
 759 │  2021      5       2021.38   419.13          415.71     28     0.9      0.33
 760 │  2021      6       2021.46   418.94          416.57     28     0.65     0.23
 761 │  2021      7       2021.54   416.96          416.65     30     0.71     0.25
 762 │  2021      8       2021.62   414.47          416.43     26     0.72     0.27
 763 │  2021      9       2021.71   413.3           416.85     27     0.29     0.11
 764 │  2021     10       2021.79   413.93          417.28     29     0.35     0.12
 765 │  2021     11       2021.88   415.01          417.01     30     0.36     0.13
 766 │  2021     12       2021.96   416.71          417.42     28     0.48     0.17
 767 │  2022      1       2022.04   418.19          417.84     29     0.73     0.26
 768 │  2022      2       2022.12   419.28          418.32     27     0.92     0.34
 769 │  2022      3       2022.21   418.81          417.33     30     0.78     0.27
 770 │  2022      4       2022.29   420.23          417.54     28     0.85     0.31
 771 │  2022      5       2022.38   420.99          417.58     30     0.76     0.27
 772 │  2022      6       2022.46   420.99          418.62     28     0.3      0.11
 773 │  2022      7       2022.54   418.9           418.59     27     0.57     0.21
 774 │  2022      8       2022.62   417.19          419.15     27     0.37     0.14
 775 │  2022      9       2022.71   415.95          419.5      28     0.41     0.15
 776 │  2022     10       2022.79   415.78          419.13     30     0.27     0.1
 777 │  2022     11       2022.88   417.51          419.51     25     0.52     0.2
 778 │  2022     12       2022.96   418.95          419.66     24     0.5      0.2
 779 │  2023      1       2023.04   419.47          419.12     31     0.4      0.14
 780 │  2023      2       2023.12   420.41          419.46     25     0.62     0.24
 781 │  2023      3       2023.21   421.0           419.53     31     0.72     0.25
 782 │  2023      4       2023.29   423.28          420.59     29     0.63     0.23
                                                                    691 rows omitted
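
One thing the output above shows is that the early rows carry ndays = -1, sdev = -9.99, and unc = -0.99. A day count or a standard deviation cannot be negative, so these look like placeholders for missing values; that reading is an assumption drawn from the printed rows, so check it against the dataset's own documentation. A minimal sketch of masking them, using only the standard library and two rows copied from the output (the helper name mask_placeholders is illustrative):

```python
import csv
import io

# First and last rows copied from the output above; the full file has 782 rows.
SAMPLE_CSV = """year,month,decimal date,average,deseasonalized,ndays,sdev,unc
1958,3,1958.2027,315.70,314.43,-1,-9.99,-0.99
2023,4,2023.2917,423.28,420.59,29,0.63,0.23
"""

# Assumed placeholder per column (inferred from the output, not a documented fact).
ASSUMED_PLACEHOLDERS = {"ndays": "-1", "sdev": "-9.99", "unc": "-0.99"}


def mask_placeholders(rows, placeholders):
    """Return copies of the rows with placeholder values replaced by None."""
    cleaned = []
    for row in rows:
        cleaned.append({
            column: None if placeholders.get(column) == value else value
            for column, value in row.items()
        })
    return cleaned


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned_rows = mask_placeholders(rows, ASSUMED_PLACEHOLDERS)
print(cleaned_rows[0]["sdev"], cleaned_rows[1]["sdev"])
```

Masking matters mostly for summaries: averaging sdev or unc over the whole series with the placeholders left in would pull the result toward a meaningless negative number.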

Wrangling Data

co2_clean_r <- co2_r
# Build a mid-month date (day 15) from the year and month columns.
co2_clean_r$datestamp <- as.Date(sprintf("%d-%02d-15", co2_clean_r$year, co2_clean_r$month), format="%Y-%m-%d")
# Move datestamp to the first column.
co2_clean_r <- dplyr::relocate(co2_clean_r, datestamp)
str(object=co2_clean_r)
'data.frame':	782 obs. of  9 variables:
 $ datestamp     : Date, format: "1958-03-15" "1958-04-15" "1958-05-15" "1958-06-15" ...
 $ year          : int  1958 1958 1958 1958 1958 1958 1958 1958 1958 1958 ...
 $ month         : int  3 4 5 6 7 8 9 10 11 12 ...
 $ decimal.date  : num  1958 1958 1958 1958 1959 ...
 $ average       : num  316 317 318 317 316 ...
 $ deseasonalized: num  314 315 315 315 315 ...
 $ ndays         : int  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
 $ sdev          : num  -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 ...
 $ unc           : num  -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 ...
head(x=co2_clean_r, n=8)
   datestamp year month decimal.date average deseasonalized ndays sdev   unc
1 1958-03-15 1958     3         1958     316            314    -1  -10 -0.99
2 1958-04-15 1958     4         1958     317            315    -1  -10 -0.99
3 1958-05-15 1958     5         1958     318            315    -1  -10 -0.99
4 1958-06-15 1958     6         1958     317            315    -1  -10 -0.99
5 1958-07-15 1958     7         1959     316            315    -1  -10 -0.99
6 1958-08-15 1958     8         1959     315            316    -1  -10 -0.99
7 1958-09-15 1958     9         1959     313            316    -1  -10 -0.99
8 1958-10-15 1958    10         1959     312            315    -1  -10 -0.99
tail(x=co2_clean_r, n=8)
     datestamp year month decimal.date average deseasonalized ndays sdev  unc
775 2022-09-15 2022     9         2023     416            420    28 0.41 0.15
776 2022-10-15 2022    10         2023     416            419    30 0.27 0.10
777 2022-11-15 2022    11         2023     418            420    25 0.52 0.20
778 2022-12-15 2022    12         2023     419            420    24 0.50 0.20
779 2023-01-15 2023     1         2023     419            419    31 0.40 0.14
780 2023-02-15 2023     2         2023     420            419    25 0.62 0.24
781 2023-03-15 2023     3         2023     421            420    31 0.72 0.25
782 2023-04-15 2023     4         2023     423            421    29 0.63 0.23
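# Copy so the raw frame stays untouched, then build a mid-month date (day 15) from year and month.
co2_clean_py = co2_py.copy()
co2_clean_py["datestamp"] = pandas.to_datetime(co2_clean_py[["year", "month"]].assign(day=15))
# Move datestamp to the first column, matching the R and Julia versions.
co2_clean_py = co2_clean_py[["datestamp"] + [column for column in co2_clean_py.columns if column != "datestamp"]]
co2_clean_py.info()
co2_clean_py.head(n=8)
co2_clean_py.tail(n=8)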
# Copy first: plain assignment would alias co2_jl, so adding a column would also change the raw frame.
co2_clean_jl = copy(co2_jl);
co2_clean_jl.datestamp = Dates.Date.(co2_clean_jl.year, co2_clean_jl.month, 15);
# Move datestamp to the first column.
co2_clean_jl = DataFrames.select(co2_clean_jl, :datestamp, :)
782×9 DataFrame
 Row │ datestamp   year   month  decimal date  average  deseasonalized  ndays  sdev     unc
     │ Date        Int64  Int64  Float64       Float64  Float64         Int64  Float64  Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1958-03-15   1958      3       1958.2    315.7           314.43     -1    -9.99    -0.99
   2 │ 1958-04-15   1958      4       1958.29   317.45          315.16     -1    -9.99    -0.99
   3 │ 1958-05-15   1958      5       1958.37   317.51          314.71     -1    -9.99    -0.99
   4 │ 1958-06-15   1958      6       1958.45   317.24          315.14     -1    -9.99    -0.99
   5 │ 1958-07-15   1958      7       1958.54   315.86          315.18     -1    -9.99    -0.99
   6 │ 1958-08-15   1958      8       1958.62   314.93          316.18     -1    -9.99    -0.99
   7 │ 1958-09-15   1958      9       1958.71   313.2           316.08     -1    -9.99    -0.99
   8 │ 1958-10-15   1958     10       1958.79   312.43          315.41     -1    -9.99    -0.99
   9 │ 1958-11-15   1958     11       1958.87   313.33          315.2      -1    -9.99    -0.99
  10 │ 1958-12-15   1958     12       1958.96   314.67          315.43     -1    -9.99    -0.99
  11 │ 1959-01-15   1959      1       1959.04   315.58          315.55     -1    -9.99    -0.99
  12 │ 1959-02-15   1959      2       1959.13   316.48          315.86     -1    -9.99    -0.99
  13 │ 1959-03-15   1959      3       1959.2    316.65          315.38     -1    -9.99    -0.99
  14 │ 1959-04-15   1959      4       1959.29   317.72          315.41     -1    -9.99    -0.99
  15 │ 1959-05-15   1959      5       1959.37   318.29          315.49     -1    -9.99    -0.99
  16 │ 1959-06-15   1959      6       1959.45   318.15          316.03     -1    -9.99    -0.99
  17 │ 1959-07-15   1959      7       1959.54   316.54          315.86     -1    -9.99    -0.99
  18 │ 1959-08-15   1959      8       1959.62   314.8           316.06     -1    -9.99    -0.99
  19 │ 1959-09-15   1959      9       1959.71   313.84          316.73     -1    -9.99    -0.99
  20 │ 1959-10-15   1959     10       1959.79   313.33          316.33     -1    -9.99    -0.99
  21 │ 1959-11-15   1959     11       1959.87   314.81          316.68     -1    -9.99    -0.99
  22 │ 1959-12-15   1959     12       1959.96   315.58          316.35     -1    -9.99    -0.99
  23 │ 1960-01-15   1960      1       1960.04   316.43          316.4      -1    -9.99    -0.99
  24 │ 1960-02-15   1960      2       1960.13   316.98          316.36     -1    -9.99    -0.99
  25 │ 1960-03-15   1960      3       1960.2    317.58          316.28     -1    -9.99    -0.99
  26 │ 1960-04-15   1960      4       1960.29   319.03          316.7      -1    -9.99    -0.99
  27 │ 1960-05-15   1960      5       1960.37   320.04          317.22     -1    -9.99    -0.99
  28 │ 1960-06-15   1960      6       1960.46   319.59          317.47     -1    -9.99    -0.99
  29 │ 1960-07-15   1960      7       1960.54   318.18          317.52     -1    -9.99    -0.99
  30 │ 1960-08-15   1960      8       1960.62   315.9           317.19     -1    -9.99    -0.99
  31 │ 1960-09-15   1960      9       1960.71   314.17          317.08     -1    -9.99    -0.99
  32 │ 1960-10-15   1960     10       1960.79   313.83          316.83     -1    -9.99    -0.99
  33 │ 1960-11-15   1960     11       1960.87   315.0           316.88     -1    -9.99    -0.99
  34 │ 1960-12-15   1960     12       1960.96   316.19          316.96     -1    -9.99    -0.99
  35 │ 1961-01-15   1961      1       1961.04   316.89          316.86     -1    -9.99    -0.99
  36 │ 1961-02-15   1961      2       1961.13   317.7           317.08     -1    -9.99    -0.99
  37 │ 1961-03-15   1961      3       1961.2    318.54          317.26     -1    -9.99    -0.99
  38 │ 1961-04-15   1961      4       1961.29   319.48          317.16     -1    -9.99    -0.99
  39 │ 1961-05-15   1961      5       1961.37   320.58          317.76     -1    -9.99    -0.99
  40 │ 1961-06-15   1961      6       1961.45   319.77          317.63     -1    -9.99    -0.99
  41 │ 1961-07-15   1961      7       1961.54   318.57          317.88     -1    -9.99    -0.99
  42 │ 1961-08-15   1961      8       1961.62   316.79          318.06     -1    -9.99    -0.99
  43 │ 1961-09-15   1961      9       1961.71   314.99          317.9      -1    -9.99    -0.99
  44 │ 1961-10-15   1961     10       1961.79   315.31          318.32     -1    -9.99    -0.99
  45 │ 1961-11-15   1961     11       1961.87   316.1           317.99     -1    -9.99    -0.99
  46 │ 1961-12-15   1961     12       1961.96   317.01          317.79     -1    -9.99    -0.99
  ⋮  │     ⋮         ⋮      ⋮         ⋮           ⋮           ⋮           ⋮       ⋮        ⋮
 738 │ 2019-08-15   2019      8       2019.62   410.18          412.1      29     0.33     0.12
 739 │ 2019-09-15   2019      9       2019.71   408.76          412.26     30     0.38     0.13
 740 │ 2019-10-15   2019     10       2019.79   408.75          412.1      29     0.31     0.11
 741 │ 2019-11-15   2019     11       2019.88   410.48          412.52     26     0.4      0.15
 742 │ 2019-12-15   2019     12       2019.96   411.98          412.74     31     0.4      0.14
 743 │ 2020-01-15   2020      1       2020.04   413.61          413.26     29     0.73     0.26
 744 │ 2020-02-15   2020      2       2020.12   414.34          413.39     28     0.69     0.25
 745 │ 2020-03-15   2020      3       2020.21   414.74          413.26     26     0.33     0.12
 746 │ 2020-04-15   2020      4       2020.29   416.45          413.76     28     0.66     0.24
 747 │ 2020-05-15   2020      5       2020.38   417.31          413.9      27     0.61     0.23
 748 │ 2020-06-15   2020      6       2020.46   416.6           414.23     27     0.44     0.16
 749 │ 2020-07-15   2020      7       2020.54   414.62          414.31     30     0.55     0.19
 750 │ 2020-08-15   2020      8       2020.62   412.78          414.73     25     0.25     0.1
 751 │ 2020-09-15   2020      9       2020.71   411.52          415.07     29     0.31     0.11
 752 │ 2020-10-15   2020     10       2020.79   411.51          414.86     30     0.22     0.08
 753 │ 2020-11-15   2020     11       2020.88   413.12          415.12     27     0.81     0.3
 754 │ 2020-12-15   2020     12       2020.96   414.26          414.97     30     0.47     0.17
 755 │ 2021-01-15   2021      1       2021.04   415.52          415.17     29     0.44     0.16
 756 │ 2021-02-15   2021      2       2021.12   416.75          415.8      28     1.02     0.37
 757 │ 2021-03-15   2021      3       2021.21   417.64          416.17     28     0.86     0.31
 758 │ 2021-04-15   2021      4       2021.29   419.05          416.36     24     1.12     0.44
 759 │ 2021-05-15   2021      5       2021.38   419.13          415.71     28     0.9      0.33
 760 │ 2021-06-15   2021      6       2021.46   418.94          416.57     28     0.65     0.23
 761 │ 2021-07-15   2021      7       2021.54   416.96          416.65     30     0.71     0.25
 762 │ 2021-08-15   2021      8       2021.62   414.47          416.43     26     0.72     0.27
 763 │ 2021-09-15   2021      9       2021.71   413.3           416.85     27     0.29     0.11
 764 │ 2021-10-15   2021     10       2021.79   413.93          417.28     29     0.35     0.12
 765 │ 2021-11-15   2021     11       2021.88   415.01          417.01     30     0.36     0.13
 766 │ 2021-12-15   2021     12       2021.96   416.71          417.42     28     0.48     0.17
 767 │ 2022-01-15   2022      1       2022.04   418.19          417.84     29     0.73     0.26
 768 │ 2022-02-15   2022      2       2022.12   419.28          418.32     27     0.92     0.34
 769 │ 2022-03-15   2022      3       2022.21   418.81          417.33     30     0.78     0.27
 770 │ 2022-04-15   2022      4       2022.29   420.23          417.54     28     0.85     0.31
 771 │ 2022-05-15   2022      5       2022.38   420.99          417.58     30     0.76     0.27
 772 │ 2022-06-15   2022      6       2022.46   420.99          418.62     28     0.3      0.11
 773 │ 2022-07-15   2022      7       2022.54   418.9           418.59     27     0.57     0.21
 774 │ 2022-08-15   2022      8       2022.62   417.19          419.15     27     0.37     0.14
 775 │ 2022-09-15   2022      9       2022.71   415.95          419.5      28     0.41     0.15
 776 │ 2022-10-15   2022     10       2022.79   415.78          419.13     30     0.27     0.1
 777 │ 2022-11-15   2022     11       2022.88   417.51          419.51     25     0.52     0.2
 778 │ 2022-12-15   2022     12       2022.96   418.95          419.66     24     0.5      0.2
 779 │ 2023-01-15   2023      1       2023.04   419.47          419.12     31     0.4      0.14
 780 │ 2023-02-15   2023      2       2023.12   420.41          419.46     25     0.62     0.24
 781 │ 2023-03-15   2023      3       2023.21   421.0           419.53     31     0.72     0.25
 782 │ 2023-04-15   2023      4       2023.29   423.28          420.59     29     0.63     0.23
                                                                                691 rows omitted
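
The wrangling step pins every monthly row to day 15. The decimal date column offers a quick sanity check on that choice. For the rows printed earlier, the 1958 values agree with year + (day of year of the 15th) / (days in that year), while the 2022 and 2023 values agree with year + (month - 0.5) / 12. Both put the observation at mid-month. This is an observation from the handful of rows checked here, not a documented property of the dataset, and the function names are illustrative.

```python
from datetime import date


def day_of_year_fraction(year, month, day=15):
    """Decimal date as year + (day of year) / (days in that year)."""
    days_in_year = date(year, 12, 31).timetuple().tm_yday
    return year + date(year, month, day).timetuple().tm_yday / days_in_year


def mid_month_fraction(year, month):
    """Decimal date as year + (month - 0.5) / 12, the middle of the month."""
    return year + (month - 0.5) / 12


# (year, month, decimal date) triples copied from the head and tail output earlier.
early_rows = [(1958, 3, 1958.2027), (1958, 4, 1958.2877), (1958, 10, 1958.7890)]
late_rows = [(2022, 9, 2022.7083), (2023, 4, 2023.2917)]

for year, month, decimal_date in early_rows:
    print(year, month, decimal_date, round(day_of_year_fraction(year, month), 4))
for year, month, decimal_date in late_rows:
    print(year, month, decimal_date, round(mid_month_fraction(year, month), 4))
```

Because both conventions land in the middle of the month, the 15th is a reasonable anchor for plotting and for building a regular monthly series.
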
co2_clean_jl
782×9 DataFrame
 Row │ datestamp   year   month  decimal date  average  deseasonalized  ndays  sdev     unc
     │ Date        Int64  Int64  Float64       Float64  Float64         Int64  Float64  Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1958-03-15   1958      3       1958.2    315.7           314.43     -1    -9.99    -0.99
   2 │ 1958-04-15   1958      4       1958.29   317.45          315.16     -1    -9.99    -0.99
   3 │ 1958-05-15   1958      5       1958.37   317.51          314.71     -1    -9.99    -0.99
   4 │ 1958-06-15   1958      6       1958.45   317.24          315.14     -1    -9.99    -0.99
   5 │ 1958-07-15   1958      7       1958.54   315.86          315.18     -1    -9.99    -0.99
   6 │ 1958-08-15   1958      8       1958.62   314.93          316.18     -1    -9.99    -0.99
   7 │ 1958-09-15   1958      9       1958.71   313.2           316.08     -1    -9.99    -0.99
   8 │ 1958-10-15   1958     10       1958.79   312.43          315.41     -1    -9.99    -0.99
   9 │ 1958-11-15   1958     11       1958.87   313.33          315.2      -1    -9.99    -0.99
  10 │ 1958-12-15   1958     12       1958.96   314.67          315.43     -1    -9.99    -0.99
  11 │ 1959-01-15   1959      1       1959.04   315.58          315.55     -1    -9.99    -0.99
  12 │ 1959-02-15   1959      2       1959.13   316.48          315.86     -1    -9.99    -0.99
  13 │ 1959-03-15   1959      3       1959.2    316.65          315.38     -1    -9.99    -0.99
  14 │ 1959-04-15   1959      4       1959.29   317.72          315.41     -1    -9.99    -0.99
  15 │ 1959-05-15   1959      5       1959.37   318.29          315.49     -1    -9.99    -0.99
  16 │ 1959-06-15   1959      6       1959.45   318.15          316.03     -1    -9.99    -0.99
  17 │ 1959-07-15   1959      7       1959.54   316.54          315.86     -1    -9.99    -0.99
  18 │ 1959-08-15   1959      8       1959.62   314.8           316.06     -1    -9.99    -0.99
  19 │ 1959-09-15   1959      9       1959.71   313.84          316.73     -1    -9.99    -0.99
  20 │ 1959-10-15   1959     10       1959.79   313.33          316.33     -1    -9.99    -0.99
  21 │ 1959-11-15   1959     11       1959.87   314.81          316.68     -1    -9.99    -0.99
  22 │ 1959-12-15   1959     12       1959.96   315.58          316.35     -1    -9.99    -0.99
  23 │ 1960-01-15   1960      1       1960.04   316.43          316.4      -1    -9.99    -0.99
  24 │ 1960-02-15   1960      2       1960.13   316.98          316.36     -1    -9.99    -0.99
  25 │ 1960-03-15   1960      3       1960.2    317.58          316.28     -1    -9.99    -0.99
  26 │ 1960-04-15   1960      4       1960.29   319.03          316.7      -1    -9.99    -0.99
  27 │ 1960-05-15   1960      5       1960.37   320.04          317.22     -1    -9.99    -0.99
  28 │ 1960-06-15   1960      6       1960.46   319.59          317.47     -1    -9.99    -0.99
  29 │ 1960-07-15   1960      7       1960.54   318.18          317.52     -1    -9.99    -0.99
  30 │ 1960-08-15   1960      8       1960.62   315.9           317.19     -1    -9.99    -0.99
  31 │ 1960-09-15   1960      9       1960.71   314.17          317.08     -1    -9.99    -0.99
  32 │ 1960-10-15   1960     10       1960.79   313.83          316.83     -1    -9.99    -0.99
  33 │ 1960-11-15   1960     11       1960.87   315.0           316.88     -1    -9.99    -0.99
  34 │ 1960-12-15   1960     12       1960.96   316.19          316.96     -1    -9.99    -0.99
  35 │ 1961-01-15   1961      1       1961.04   316.89          316.86     -1    -9.99    -0.99
  36 │ 1961-02-15   1961      2       1961.13   317.7           317.08     -1    -9.99    -0.99
  37 │ 1961-03-15   1961      3       1961.2    318.54          317.26     -1    -9.99    -0.99
  38 │ 1961-04-15   1961      4       1961.29   319.48          317.16     -1    -9.99    -0.99
  39 │ 1961-05-15   1961      5       1961.37   320.58          317.76     -1    -9.99    -0.99
  40 │ 1961-06-15   1961      6       1961.45   319.77          317.63     -1    -9.99    -0.99
  41 │ 1961-07-15   1961      7       1961.54   318.57          317.88     -1    -9.99    -0.99
  42 │ 1961-08-15   1961      8       1961.62   316.79          318.06     -1    -9.99    -0.99
  43 │ 1961-09-15   1961      9       1961.71   314.99          317.9      -1    -9.99    -0.99
  44 │ 1961-10-15   1961     10       1961.79   315.31          318.32     -1    -9.99    -0.99
  45 │ 1961-11-15   1961     11       1961.87   316.1           317.99     -1    -9.99    -0.99
  46 │ 1961-12-15   1961     12       1961.96   317.01          317.79     -1    -9.99    -0.99
  ⋮  │     ⋮         ⋮      ⋮         ⋮           ⋮           ⋮           ⋮       ⋮        ⋮
 738 │ 2019-08-15   2019      8       2019.62   410.18          412.1      29     0.33     0.12
 739 │ 2019-09-15   2019      9       2019.71   408.76          412.26     30     0.38     0.13
 740 │ 2019-10-15   2019     10       2019.79   408.75          412.1      29     0.31     0.11
 741 │ 2019-11-15   2019     11       2019.88   410.48          412.52     26     0.4      0.15
 742 │ 2019-12-15   2019     12       2019.96   411.98          412.74     31     0.4      0.14
 743 │ 2020-01-15   2020      1       2020.04   413.61          413.26     29     0.73     0.26
 744 │ 2020-02-15   2020      2       2020.12   414.34          413.39     28     0.69     0.25
 745 │ 2020-03-15   2020      3       2020.21   414.74          413.26     26     0.33     0.12
 746 │ 2020-04-15   2020      4       2020.29   416.45          413.76     28     0.66     0.24
 747 │ 2020-05-15   2020      5       2020.38   417.31          413.9      27     0.61     0.23
 748 │ 2020-06-15   2020      6       2020.46   416.6           414.23     27     0.44     0.16
 749 │ 2020-07-15   2020      7       2020.54   414.62          414.31     30     0.55     0.19
 750 │ 2020-08-15   2020      8       2020.62   412.78          414.73     25     0.25     0.1
 751 │ 2020-09-15   2020      9       2020.71   411.52          415.07     29     0.31     0.11
 752 │ 2020-10-15   2020     10       2020.79   411.51          414.86     30     0.22     0.08
 753 │ 2020-11-15   2020     11       2020.88   413.12          415.12     27     0.81     0.3
 754 │ 2020-12-15   2020     12       2020.96   414.26          414.97     30     0.47     0.17
 755 │ 2021-01-15   2021      1       2021.04   415.52          415.17     29     0.44     0.16
 756 │ 2021-02-15   2021      2       2021.12   416.75          415.8      28     1.02     0.37
 757 │ 2021-03-15   2021      3       2021.21   417.64          416.17     28     0.86     0.31
 758 │ 2021-04-15   2021      4       2021.29   419.05          416.36     24     1.12     0.44
 759 │ 2021-05-15   2021      5       2021.38   419.13          415.71     28     0.9      0.33
 760 │ 2021-06-15   2021      6       2021.46   418.94          416.57     28     0.65     0.23
 761 │ 2021-07-15   2021      7       2021.54   416.96          416.65     30     0.71     0.25
 762 │ 2021-08-15   2021      8       2021.62   414.47          416.43     26     0.72     0.27
 763 │ 2021-09-15   2021      9       2021.71   413.3           416.85     27     0.29     0.11
 764 │ 2021-10-15   2021     10       2021.79   413.93          417.28     29     0.35     0.12
 765 │ 2021-11-15   2021     11       2021.88   415.01          417.01     30     0.36     0.13
 766 │ 2021-12-15   2021     12       2021.96   416.71          417.42     28     0.48     0.17
 767 │ 2022-01-15   2022      1       2022.04   418.19          417.84     29     0.73     0.26
 768 │ 2022-02-15   2022      2       2022.12   419.28          418.32     27     0.92     0.34
 769 │ 2022-03-15   2022      3       2022.21   418.81          417.33     30     0.78     0.27
 770 │ 2022-04-15   2022      4       2022.29   420.23          417.54     28     0.85     0.31
 771 │ 2022-05-15   2022      5       2022.38   420.99          417.58     30     0.76     0.27
 772 │ 2022-06-15   2022      6       2022.46   420.99          418.62     28     0.3      0.11
 773 │ 2022-07-15   2022      7       2022.54   418.9           418.59     27     0.57     0.21
 774 │ 2022-08-15   2022      8       2022.62   417.19          419.15     27     0.37     0.14
 775 │ 2022-09-15   2022      9       2022.71   415.95          419.5      28     0.41     0.15
 776 │ 2022-10-15   2022     10       2022.79   415.78          419.13     30     0.27     0.1
 777 │ 2022-11-15   2022     11       2022.88   417.51          419.51     25     0.52     0.2
 778 │ 2022-12-15   2022     12       2022.96   418.95          419.66     24     0.5      0.2
 779 │ 2023-01-15   2023      1       2023.04   419.47          419.12     31     0.4      0.14
 780 │ 2023-02-15   2023      2       2023.12   420.41          419.46     25     0.62     0.24
 781 │ 2023-03-15   2023      3       2023.21   421.0           419.53     31     0.72     0.25
 782 │ 2023-04-15   2023      4       2023.29   423.28          420.59     29     0.63     0.23
                                                                                691 rows omitted

Performing Exploratory Data Analysis (EDA)

linechart_co2_r <- ggplot2::ggplot(data=co2_clean_r) +
    ggplot2::geom_line(mapping=ggplot2::aes(x=datestamp, y=average), color=palette_michaelmallari_r[3]) +
    ggplot2::scale_y_continuous(expand=c(0, 0), position="right") +
    ggplot2::labs(
        title="CO2 Mole Fraction (ppm) in Hawaii, 1958-2023",
        alt="CO2 Mole Fraction (ppm) in Hawaii, 1958-2023",
        x="",
        y="",
        caption="Source: National Oceanic & Atmospheric Administration"
    ) +
    theme_michaelmallari_r()

linechart_co2_r

# Assumes co2_clean_py has the same datestamp and average columns as the R and Julia data frames
linechart_co2_py = (plotnine.ggplot(data=co2_clean_py)
    + plotnine.geoms.geom_line(mapping=plotnine.mapping.aes(x="datestamp", y="average"), color=palette_michaelmallari_py[3])
    + plotnine.labels.labs(
        title="CO2 Mole Fraction (ppm) in Hawaii, 1958-2023",
        x="",
        y=""
    )
)

linechart_co2_py

linechart_co2_jl = Gadfly.plot(
    co2_clean_jl,
    Gadfly.layer(x="datestamp", y="average", Gadfly.Geom.line),
    Gadfly.Scale.x_continuous(minticks=4, maxticks=6),
    Gadfly.Scale.y_continuous(minticks=4, maxticks=6),
    Gadfly.Guide.title("CO2 Mole Fraction (ppm) in Hawaii, 1958-2023"),
    Gadfly.Guide.xlabel(""),
    Gadfly.Guide.ylabel(""),
    theme_michaelmallari_jl
);
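
As a numeric complement to the line charts, here is a minimal Python sketch that works on a small sample: the 24 monthly averages for 2021 and 2022, copied from the table above. The column names datestamp and average follow the plotting code; the variable names (co2_sample_py, yoy_change_py, seasonal_range_py) and the change_12m column are hypothetical and used only for illustration. The 12-month difference compares each month with the same month a year earlier, which cancels the seasonal cycle and leaves the year-over-year growth, while the within-year range gives a rough sense of the seasonal swing.

```python
import pandas

# Monthly mean CO2 (ppm) for 2021 and 2022, copied from the table above.
averages_2021 = [415.52, 416.75, 417.64, 419.05, 419.13, 418.94,
                 416.96, 414.47, 413.30, 413.93, 415.01, 416.71]
averages_2022 = [418.19, 419.28, 418.81, 420.23, 420.99, 420.99,
                 418.90, 417.19, 415.95, 415.78, 417.51, 418.95]

# Mid-month datestamps, matching the table's YYYY-MM-15 convention.
datestamps = pandas.to_datetime(
    [f"{year}-{month:02d}-15" for year in (2021, 2022) for month in range(1, 13)]
)

co2_sample_py = pandas.DataFrame({
    "datestamp": datestamps,
    "average": averages_2021 + averages_2022,
})

# Year-over-year change: each month minus the same month one year earlier.
co2_sample_py["change_12m"] = co2_sample_py["average"].diff(periods=12)
yoy_change_py = co2_sample_py.dropna(subset=["change_12m"])

# Seasonal swing: highest minus lowest monthly average within each year.
seasonal_range_py = (
    co2_sample_py
    .groupby(co2_sample_py["datestamp"].dt.year)["average"]
    .agg(lambda s: s.max() - s.min())
)

print(yoy_change_py[["datestamp", "change_12m"]])
print(seasonal_range_py)
```

In this sample, every 12-month change is positive while the within-year swing is several ppm, which is the trend-plus-seasonality pattern the line chart shows. The same two calculations could be applied to the full data frame, assuming it exposes the same two columns.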

Statistical Model-Based Forecasting

Machine Learning-Based Forecasting

Deep Learning-Based Forecasting


References

  • National Oceanic & Atmospheric Administration. (n.d.). Trends in Atmospheric Carbon Dioxide (Mauna Loa Observatory, Hawaii). Global Monitoring Laboratory. Retrieved May 12, 2023, from https://gml.noaa.gov/ccgg/trends/mlo.html
  • Tsay, R. S. (2010). Analysis of Financial Time Series (3rd ed.). Wiley.