Time Series Analysis & Forecasting
Forecasting CO2 levels in Hawaii—using statistical models, machine learning, and deep learning in R, Python, and Julia.
Measuring and predicting CO2 levels lets us monitor and understand how human activity drives climate change. CO2 is a major greenhouse gas: it traps heat in the atmosphere, which leads to global warming and its adverse effects, such as extreme weather events, rising sea levels, and ecological disruption. By tracking CO2 levels, scientists can assess trends, inform climate policy, and encourage sustainable practices. Accurate forecasts help anticipate future changes, so that action can be taken in time to safeguard the environment and keep the planet habitable for generations to come.
Getting Started
To reproduce this work, use the versions of R, Python, and Julia shown below, along with the pinned packages for each. The charts follow Leland Wilkinson’s Grammar of Graphics approach to data visualization. Finally, my coding style here is deliberately verbose: functions and methods are qualified with their package names so that you can trace where each one originates, which makes this a learning experience for everyone, including me.
cat(R.version$version.string, R.version$nickname)
R version 4.2.3 (2023-03-15) Shortstop Beagle
library(devtools)
devtools::install_version("dplyr", version="1.1.2", repos="https://cran.us.r-project.org")
devtools::install_version("ggplot2", version="3.4.2", repos="https://cran.us.r-project.org")
devtools::install_version("lubridate", version="1.9.2", repos="https://cran.us.r-project.org")
library(dplyr)
library(ggplot2)
library(lubridate)
import sys
print(sys.version)
3.11.4 (v3.11.4:d2340ef257, Jun 6 2023, 19:15:51) [Clang 13.0.0 (clang-1300.0.29.30)]
!pip install pandas==2.0.3
!pip install plotnine==0.12.1
import pandas
import plotnine
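Since reproducibility here depends on the pinned package versions, a quick check of what is actually installed can save some debugging. The following is a minimal sketch using only the Python standard library; the helper name `find_version_mismatches` is hypothetical, and the pins are copied from the pip commands above.

```python
from importlib import metadata

# Pins copied from the pip commands above.
PINNED_VERSIONS = {"pandas": "2.0.3", "plotnine": "0.12.1"}


def find_version_mismatches(pinned):
    """Return {package: (expected, installed)} for every pin that is not met.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


# Report every package whose installed version differs from its pin.
for package, (expected, installed) in find_version_mismatches(PINNED_VERSIONS).items():
    print(f"{package}: expected {expected}, found {installed}")
```

An empty report means the environment matches the pins; the same idea carries over to the R and Julia package lists.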
using InteractiveUtils
InteractiveUtils.versioninfo()
Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin22.4.0)
CPU: 8 × Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
Threads: 1 on 8 virtual cores
Environment:
DYLD_FALLBACK_LIBRARY_PATH = /Library/Frameworks/R.framework/Resources/lib:/Library/Java/JavaVirtualMachines/jdk1.8.0_241.jdk/Contents/Home/jre/lib/server
using Pkg
Pkg.add(name="CSV", version="0.10.11")
Pkg.add(name="DataFrames", version="1.5.0")
Pkg.add(name="CategoricalArrays", version="0.10.8")
Pkg.add(name="Colors", version="0.12.10")
Pkg.add(name="Cairo", version="1.0.5")
Pkg.add(name="Gadfly", version="1.4.0")
using CSV
using DataFrames
using CategoricalArrays
using Colors
using Cairo
using Gadfly
using Dates
Importing and Examining Dataset
co2_r <- read.csv("../../dataset/co2-mm-mlo.csv")
str(object=co2_r)
'data.frame': 782 obs. of 8 variables:
$ year : int 1958 1958 1958 1958 1958 1958 1958 1958 1958 1958 ...
$ month : int 3 4 5 6 7 8 9 10 11 12 ...
$ decimal.date : num 1958 1958 1958 1958 1959 ...
$ average : num 316 317 318 317 316 ...
$ deseasonalized: num 314 315 315 315 315 ...
$ ndays : int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
$ sdev : num -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 ...
$ unc : num -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 ...
head(x=co2_r, n=8)
year month decimal.date average deseasonalized ndays sdev unc
1 1958 3 1958 316 314 -1 -10 -0.99
2 1958 4 1958 317 315 -1 -10 -0.99
3 1958 5 1958 318 315 -1 -10 -0.99
4 1958 6 1958 317 315 -1 -10 -0.99
5 1958 7 1959 316 315 -1 -10 -0.99
6 1958 8 1959 315 316 -1 -10 -0.99
7 1958 9 1959 313 316 -1 -10 -0.99
8 1958 10 1959 312 315 -1 -10 -0.99
tail(x=co2_r, n=8)
year month decimal.date average deseasonalized ndays sdev unc
775 2022 9 2023 416 420 28 0.41 0.15
776 2022 10 2023 416 419 30 0.27 0.10
777 2022 11 2023 418 420 25 0.52 0.20
778 2022 12 2023 419 420 24 0.50 0.20
779 2023 1 2023 419 419 31 0.40 0.14
780 2023 2 2023 420 419 25 0.62 0.24
781 2023 3 2023 421 420 31 0.72 0.25
782 2023 4 2023 423 421 29 0.63 0.23
co2_py = pandas.read_csv("../../dataset/co2-mm-mlo.csv")
co2_py.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 782 entries, 0 to 781
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 year 782 non-null int64
1 month 782 non-null int64
2 decimal date 782 non-null float64
3 average 782 non-null float64
4 deseasonalized 782 non-null float64
5 ndays 782 non-null int64
6 sdev 782 non-null float64
7 unc 782 non-null float64
dtypes: float64(5), int64(3)
memory usage: 49.0 KB
co2_py.head(8)
year month decimal date average deseasonalized ndays sdev unc
0 1958 3 1958.2027 315.70 314.43 -1 -9.99 -0.99
1 1958 4 1958.2877 317.45 315.16 -1 -9.99 -0.99
2 1958 5 1958.3699 317.51 314.71 -1 -9.99 -0.99
3 1958 6 1958.4548 317.24 315.14 -1 -9.99 -0.99
4 1958 7 1958.5370 315.86 315.18 -1 -9.99 -0.99
5 1958 8 1958.6219 314.93 316.18 -1 -9.99 -0.99
6 1958 9 1958.7068 313.20 316.08 -1 -9.99 -0.99
7 1958 10 1958.7890 312.43 315.41 -1 -9.99 -0.99
co2_py.tail(8)
year month decimal date average deseasonalized ndays sdev unc
774 2022 9 2022.7083 415.95 419.50 28 0.41 0.15
775 2022 10 2022.7917 415.78 419.13 30 0.27 0.10
776 2022 11 2022.8750 417.51 419.51 25 0.52 0.20
777 2022 12 2022.9583 418.95 419.66 24 0.50 0.20
778 2023 1 2023.0417 419.47 419.12 31 0.40 0.14
779 2023 2 2023.1250 420.41 419.46 25 0.62 0.24
780 2023 3 2023.2083 421.00 419.53 31 0.72 0.25
781 2023 4 2023.2917 423.28 420.59 29 0.63 0.23
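In the listings above, the earliest rows carry -1, -9.99, and -0.99 in the ndays, sdev, and unc columns. These look like placeholders for values that are not available rather than real measurements; that reading is an assumption here and is worth confirming against the header of the NOAA file. Under that assumption, a minimal standard-library sketch for turning them into explicit missing values before computing any statistics might look like this (the helper name `read_with_missing` is hypothetical, and the sample rows are copied from the head and tail output):

```python
import csv
import io

# Four rows copied from the head/tail listings above.
SAMPLE_CSV = """year,month,decimal date,average,deseasonalized,ndays,sdev,unc
1958,3,1958.2027,315.70,314.43,-1,-9.99,-0.99
1958,4,1958.2877,317.45,315.16,-1,-9.99,-0.99
2023,3,2023.2083,421.00,419.53,31,0.72,0.25
2023,4,2023.2917,423.28,420.59,29,0.63,0.23
"""

# Assumption: these values mark "not available" rather than real measurements.
SENTINELS = {"ndays": -1.0, "sdev": -9.99, "unc": -0.99}


def read_with_missing(csv_text):
    """Parse the CSV into a list of dicts of floats, mapping sentinel values to None."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {column: float(value) for column, value in raw.items()}
        for column, sentinel in SENTINELS.items():
            if row[column] == sentinel:
                row[column] = None
        rows.append(row)
    return rows


rows = read_with_missing(SAMPLE_CSV)
# Only rows with a real standard deviation should feed into summary statistics.
known_sdev = [row["sdev"] for row in rows if row["sdev"] is not None]
print(len(rows), known_sdev)
```

Leaving the placeholders in place would drag down any mean or minimum computed over those three columns, which is why they are worth handling before the analysis starts.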
co2_jl = CSV.File("../../dataset/co2-mm-mlo.csv") |> DataFrames.DataFrame
782×8 DataFrame
Row │ year month decimal date average deseasonalized ndays sdev unc
│ Int64 Int64 Float64 Float64 Float64 Int64 Float64 Float64
─────┼──────────────────────────────────────────────────────────────────────────────
1 │ 1958 3 1958.2 315.7 314.43 -1 -9.99 -0.99
2 │ 1958 4 1958.29 317.45 315.16 -1 -9.99 -0.99
3 │ 1958 5 1958.37 317.51 314.71 -1 -9.99 -0.99
4 │ 1958 6 1958.45 317.24 315.14 -1 -9.99 -0.99
5 │ 1958 7 1958.54 315.86 315.18 -1 -9.99 -0.99
6 │ 1958 8 1958.62 314.93 316.18 -1 -9.99 -0.99
7 │ 1958 9 1958.71 313.2 316.08 -1 -9.99 -0.99
8 │ 1958 10 1958.79 312.43 315.41 -1 -9.99 -0.99
9 │ 1958 11 1958.87 313.33 315.2 -1 -9.99 -0.99
10 │ 1958 12 1958.96 314.67 315.43 -1 -9.99 -0.99
11 │ 1959 1 1959.04 315.58 315.55 -1 -9.99 -0.99
12 │ 1959 2 1959.13 316.48 315.86 -1 -9.99 -0.99
13 │ 1959 3 1959.2 316.65 315.38 -1 -9.99 -0.99
14 │ 1959 4 1959.29 317.72 315.41 -1 -9.99 -0.99
15 │ 1959 5 1959.37 318.29 315.49 -1 -9.99 -0.99
16 │ 1959 6 1959.45 318.15 316.03 -1 -9.99 -0.99
17 │ 1959 7 1959.54 316.54 315.86 -1 -9.99 -0.99
18 │ 1959 8 1959.62 314.8 316.06 -1 -9.99 -0.99
19 │ 1959 9 1959.71 313.84 316.73 -1 -9.99 -0.99
20 │ 1959 10 1959.79 313.33 316.33 -1 -9.99 -0.99
21 │ 1959 11 1959.87 314.81 316.68 -1 -9.99 -0.99
22 │ 1959 12 1959.96 315.58 316.35 -1 -9.99 -0.99
23 │ 1960 1 1960.04 316.43 316.4 -1 -9.99 -0.99
24 │ 1960 2 1960.13 316.98 316.36 -1 -9.99 -0.99
25 │ 1960 3 1960.2 317.58 316.28 -1 -9.99 -0.99
26 │ 1960 4 1960.29 319.03 316.7 -1 -9.99 -0.99
27 │ 1960 5 1960.37 320.04 317.22 -1 -9.99 -0.99
28 │ 1960 6 1960.46 319.59 317.47 -1 -9.99 -0.99
29 │ 1960 7 1960.54 318.18 317.52 -1 -9.99 -0.99
30 │ 1960 8 1960.62 315.9 317.19 -1 -9.99 -0.99
31 │ 1960 9 1960.71 314.17 317.08 -1 -9.99 -0.99
32 │ 1960 10 1960.79 313.83 316.83 -1 -9.99 -0.99
33 │ 1960 11 1960.87 315.0 316.88 -1 -9.99 -0.99
34 │ 1960 12 1960.96 316.19 316.96 -1 -9.99 -0.99
35 │ 1961 1 1961.04 316.89 316.86 -1 -9.99 -0.99
36 │ 1961 2 1961.13 317.7 317.08 -1 -9.99 -0.99
37 │ 1961 3 1961.2 318.54 317.26 -1 -9.99 -0.99
38 │ 1961 4 1961.29 319.48 317.16 -1 -9.99 -0.99
39 │ 1961 5 1961.37 320.58 317.76 -1 -9.99 -0.99
40 │ 1961 6 1961.45 319.77 317.63 -1 -9.99 -0.99
41 │ 1961 7 1961.54 318.57 317.88 -1 -9.99 -0.99
42 │ 1961 8 1961.62 316.79 318.06 -1 -9.99 -0.99
43 │ 1961 9 1961.71 314.99 317.9 -1 -9.99 -0.99
44 │ 1961 10 1961.79 315.31 318.32 -1 -9.99 -0.99
45 │ 1961 11 1961.87 316.1 317.99 -1 -9.99 -0.99
46 │ 1961 12 1961.96 317.01 317.79 -1 -9.99 -0.99
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
738 │ 2019 8 2019.62 410.18 412.1 29 0.33 0.12
739 │ 2019 9 2019.71 408.76 412.26 30 0.38 0.13
740 │ 2019 10 2019.79 408.75 412.1 29 0.31 0.11
741 │ 2019 11 2019.88 410.48 412.52 26 0.4 0.15
742 │ 2019 12 2019.96 411.98 412.74 31 0.4 0.14
743 │ 2020 1 2020.04 413.61 413.26 29 0.73 0.26
744 │ 2020 2 2020.12 414.34 413.39 28 0.69 0.25
745 │ 2020 3 2020.21 414.74 413.26 26 0.33 0.12
746 │ 2020 4 2020.29 416.45 413.76 28 0.66 0.24
747 │ 2020 5 2020.38 417.31 413.9 27 0.61 0.23
748 │ 2020 6 2020.46 416.6 414.23 27 0.44 0.16
749 │ 2020 7 2020.54 414.62 414.31 30 0.55 0.19
750 │ 2020 8 2020.62 412.78 414.73 25 0.25 0.1
751 │ 2020 9 2020.71 411.52 415.07 29 0.31 0.11
752 │ 2020 10 2020.79 411.51 414.86 30 0.22 0.08
753 │ 2020 11 2020.88 413.12 415.12 27 0.81 0.3
754 │ 2020 12 2020.96 414.26 414.97 30 0.47 0.17
755 │ 2021 1 2021.04 415.52 415.17 29 0.44 0.16
756 │ 2021 2 2021.12 416.75 415.8 28 1.02 0.37
757 │ 2021 3 2021.21 417.64 416.17 28 0.86 0.31
758 │ 2021 4 2021.29 419.05 416.36 24 1.12 0.44
759 │ 2021 5 2021.38 419.13 415.71 28 0.9 0.33
760 │ 2021 6 2021.46 418.94 416.57 28 0.65 0.23
761 │ 2021 7 2021.54 416.96 416.65 30 0.71 0.25
762 │ 2021 8 2021.62 414.47 416.43 26 0.72 0.27
763 │ 2021 9 2021.71 413.3 416.85 27 0.29 0.11
764 │ 2021 10 2021.79 413.93 417.28 29 0.35 0.12
765 │ 2021 11 2021.88 415.01 417.01 30 0.36 0.13
766 │ 2021 12 2021.96 416.71 417.42 28 0.48 0.17
767 │ 2022 1 2022.04 418.19 417.84 29 0.73 0.26
768 │ 2022 2 2022.12 419.28 418.32 27 0.92 0.34
769 │ 2022 3 2022.21 418.81 417.33 30 0.78 0.27
770 │ 2022 4 2022.29 420.23 417.54 28 0.85 0.31
771 │ 2022 5 2022.38 420.99 417.58 30 0.76 0.27
772 │ 2022 6 2022.46 420.99 418.62 28 0.3 0.11
773 │ 2022 7 2022.54 418.9 418.59 27 0.57 0.21
774 │ 2022 8 2022.62 417.19 419.15 27 0.37 0.14
775 │ 2022 9 2022.71 415.95 419.5 28 0.41 0.15
776 │ 2022 10 2022.79 415.78 419.13 30 0.27 0.1
777 │ 2022 11 2022.88 417.51 419.51 25 0.52 0.2
778 │ 2022 12 2022.96 418.95 419.66 24 0.5 0.2
779 │ 2023 1 2023.04 419.47 419.12 31 0.4 0.14
780 │ 2023 2 2023.12 420.41 419.46 25 0.62 0.24
781 │ 2023 3 2023.21 421.0 419.53 31 0.72 0.25
782 │ 2023 4 2023.29 423.28 420.59 29 0.63 0.23
691 rows omitted
Wrangling Data
co2_clean_r <- co2_r
co2_clean_r$datestamp <- paste0(co2_clean_r$year, "-", sprintf("%02d", co2_clean_r$month))
co2_clean_r$datestamp <- as.Date(paste0(co2_clean_r$datestamp, "-15"), format = "%Y-%m-%d")
co2_clean_r <- co2_clean_r[, c("datestamp", names(co2_clean_r)[-which(names(co2_clean_r) == "datestamp")])]
str(object=co2_clean_r)
'data.frame': 782 obs. of 9 variables:
$ datestamp : Date, format: "1958-03-15" "1958-04-15" "1958-05-15" "1958-06-15" ...
$ year : int 1958 1958 1958 1958 1958 1958 1958 1958 1958 1958 ...
$ month : int 3 4 5 6 7 8 9 10 11 12 ...
$ decimal.date : num 1958 1958 1958 1958 1959 ...
$ average : num 316 317 318 317 316 ...
$ deseasonalized: num 314 315 315 315 315 ...
$ ndays : int -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
$ sdev : num -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 -9.99 ...
$ unc : num -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 -0.99 ...
head(x=co2_clean_r, n=8)
datestamp year month decimal.date average deseasonalized ndays sdev unc
1 1958-03-15 1958 3 1958 316 314 -1 -10 -0.99
2 1958-04-15 1958 4 1958 317 315 -1 -10 -0.99
3 1958-05-15 1958 5 1958 318 315 -1 -10 -0.99
4 1958-06-15 1958 6 1958 317 315 -1 -10 -0.99
5 1958-07-15 1958 7 1959 316 315 -1 -10 -0.99
6 1958-08-15 1958 8 1959 315 316 -1 -10 -0.99
7 1958-09-15 1958 9 1959 313 316 -1 -10 -0.99
8 1958-10-15 1958 10 1959 312 315 -1 -10 -0.99
tail(x=co2_clean_r, n=8)
datestamp year month decimal.date average deseasonalized ndays sdev unc
775 2022-09-15 2022 9 2023 416 420 28 0.41 0.15
776 2022-10-15 2022 10 2023 416 419 30 0.27 0.10
777 2022-11-15 2022 11 2023 418 420 25 0.52 0.20
778 2022-12-15 2022 12 2023 419 420 24 0.50 0.20
779 2023-01-15 2023 1 2023 419 419 31 0.40 0.14
780 2023-02-15 2023 2 2023 420 419 25 0.62 0.24
781 2023-03-15 2023 3 2023 421 420 31 0.72 0.25
782 2023-04-15 2023 4 2023 423 421 29 0.63 0.23
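co2_clean_py = co2_py.copy()  # copy so that co2_py itself stays unchanged
# Mid-month date (the 15th) for each observation, matching the R and Julia versions
co2_clean_py["datestamp"] = pandas.to_datetime(co2_clean_py[["year", "month"]].assign(day=15))
# Move datestamp to the first column
co2_clean_py = co2_clean_py[["datestamp"] + [column for column in co2_clean_py.columns if column != "datestamp"]]
co2_clean_py.info()
co2_clean_py.head(n=8)
co2_clean_py.tail(n=8)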
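The wrangling step stamps each monthly observation with the 15th of its month. Most forecasting methods also assume the series is regular, so it can be worth confirming that consecutive observations are exactly one month apart. Here is a minimal standard-library sketch of both ideas; the function names `to_datestamp` and `find_month_gaps` are hypothetical, and the sample has one month removed on purpose to show what a gap looks like.

```python
from datetime import date


def to_datestamp(year, month, day=15):
    """Build a mid-month date for a (year, month) pair, as in the R and Julia code."""
    return date(year, month, day)


def find_month_gaps(datestamps):
    """Return (previous, current) pairs that are not exactly one calendar month apart."""
    gaps = []
    for previous, current in zip(datestamps, datestamps[1:]):
        months_apart = (current.year - previous.year) * 12 + (current.month - previous.month)
        if months_apart != 1:
            gaps.append((previous, current))
    return gaps


# Year/month pairs from the first rows of the dataset, with June 1958 removed on purpose.
year_months = [(1958, 3), (1958, 4), (1958, 5), (1958, 7), (1958, 8)]
datestamps = [to_datestamp(year, month) for year, month in year_months]
print(find_month_gaps(datestamps))
```

Run against the full dataset, an empty result would confirm that the 782 rows form an unbroken monthly series.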
co2_clean_jl = copy(co2_jl);  # copy so that adding datestamp does not also modify co2_jl
co2_clean_jl.datestamp = Dates.Date.(co2_clean_jl.year, co2_clean_jl.month, 15);
co2_clean_jl = co2_clean_jl[:, [names(co2_clean_jl)[end]; setdiff(names(co2_clean_jl), [names(co2_clean_jl)[end]])]]
782×9 DataFrame
Row │ datestamp year month decimal date average deseasonalized ndays sdev unc
│ Date Int64 Int64 Float64 Float64 Float64 Int64 Float64 Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────
1 │ 1958-03-15 1958 3 1958.2 315.7 314.43 -1 -9.99 -0.99
2 │ 1958-04-15 1958 4 1958.29 317.45 315.16 -1 -9.99 -0.99
3 │ 1958-05-15 1958 5 1958.37 317.51 314.71 -1 -9.99 -0.99
4 │ 1958-06-15 1958 6 1958.45 317.24 315.14 -1 -9.99 -0.99
5 │ 1958-07-15 1958 7 1958.54 315.86 315.18 -1 -9.99 -0.99
6 │ 1958-08-15 1958 8 1958.62 314.93 316.18 -1 -9.99 -0.99
7 │ 1958-09-15 1958 9 1958.71 313.2 316.08 -1 -9.99 -0.99
8 │ 1958-10-15 1958 10 1958.79 312.43 315.41 -1 -9.99 -0.99
9 │ 1958-11-15 1958 11 1958.87 313.33 315.2 -1 -9.99 -0.99
10 │ 1958-12-15 1958 12 1958.96 314.67 315.43 -1 -9.99 -0.99
11 │ 1959-01-15 1959 1 1959.04 315.58 315.55 -1 -9.99 -0.99
12 │ 1959-02-15 1959 2 1959.13 316.48 315.86 -1 -9.99 -0.99
13 │ 1959-03-15 1959 3 1959.2 316.65 315.38 -1 -9.99 -0.99
14 │ 1959-04-15 1959 4 1959.29 317.72 315.41 -1 -9.99 -0.99
15 │ 1959-05-15 1959 5 1959.37 318.29 315.49 -1 -9.99 -0.99
16 │ 1959-06-15 1959 6 1959.45 318.15 316.03 -1 -9.99 -0.99
17 │ 1959-07-15 1959 7 1959.54 316.54 315.86 -1 -9.99 -0.99
18 │ 1959-08-15 1959 8 1959.62 314.8 316.06 -1 -9.99 -0.99
19 │ 1959-09-15 1959 9 1959.71 313.84 316.73 -1 -9.99 -0.99
20 │ 1959-10-15 1959 10 1959.79 313.33 316.33 -1 -9.99 -0.99
21 │ 1959-11-15 1959 11 1959.87 314.81 316.68 -1 -9.99 -0.99
22 │ 1959-12-15 1959 12 1959.96 315.58 316.35 -1 -9.99 -0.99
23 │ 1960-01-15 1960 1 1960.04 316.43 316.4 -1 -9.99 -0.99
24 │ 1960-02-15 1960 2 1960.13 316.98 316.36 -1 -9.99 -0.99
25 │ 1960-03-15 1960 3 1960.2 317.58 316.28 -1 -9.99 -0.99
26 │ 1960-04-15 1960 4 1960.29 319.03 316.7 -1 -9.99 -0.99
27 │ 1960-05-15 1960 5 1960.37 320.04 317.22 -1 -9.99 -0.99
28 │ 1960-06-15 1960 6 1960.46 319.59 317.47 -1 -9.99 -0.99
29 │ 1960-07-15 1960 7 1960.54 318.18 317.52 -1 -9.99 -0.99
30 │ 1960-08-15 1960 8 1960.62 315.9 317.19 -1 -9.99 -0.99
31 │ 1960-09-15 1960 9 1960.71 314.17 317.08 -1 -9.99 -0.99
32 │ 1960-10-15 1960 10 1960.79 313.83 316.83 -1 -9.99 -0.99
33 │ 1960-11-15 1960 11 1960.87 315.0 316.88 -1 -9.99 -0.99
34 │ 1960-12-15 1960 12 1960.96 316.19 316.96 -1 -9.99 -0.99
35 │ 1961-01-15 1961 1 1961.04 316.89 316.86 -1 -9.99 -0.99
36 │ 1961-02-15 1961 2 1961.13 317.7 317.08 -1 -9.99 -0.99
37 │ 1961-03-15 1961 3 1961.2 318.54 317.26 -1 -9.99 -0.99
38 │ 1961-04-15 1961 4 1961.29 319.48 317.16 -1 -9.99 -0.99
39 │ 1961-05-15 1961 5 1961.37 320.58 317.76 -1 -9.99 -0.99
40 │ 1961-06-15 1961 6 1961.45 319.77 317.63 -1 -9.99 -0.99
41 │ 1961-07-15 1961 7 1961.54 318.57 317.88 -1 -9.99 -0.99
42 │ 1961-08-15 1961 8 1961.62 316.79 318.06 -1 -9.99 -0.99
43 │ 1961-09-15 1961 9 1961.71 314.99 317.9 -1 -9.99 -0.99
44 │ 1961-10-15 1961 10 1961.79 315.31 318.32 -1 -9.99 -0.99
45 │ 1961-11-15 1961 11 1961.87 316.1 317.99 -1 -9.99 -0.99
46 │ 1961-12-15 1961 12 1961.96 317.01 317.79 -1 -9.99 -0.99
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
738 │ 2019-08-15 2019 8 2019.62 410.18 412.1 29 0.33 0.12
739 │ 2019-09-15 2019 9 2019.71 408.76 412.26 30 0.38 0.13
740 │ 2019-10-15 2019 10 2019.79 408.75 412.1 29 0.31 0.11
741 │ 2019-11-15 2019 11 2019.88 410.48 412.52 26 0.4 0.15
742 │ 2019-12-15 2019 12 2019.96 411.98 412.74 31 0.4 0.14
743 │ 2020-01-15 2020 1 2020.04 413.61 413.26 29 0.73 0.26
744 │ 2020-02-15 2020 2 2020.12 414.34 413.39 28 0.69 0.25
745 │ 2020-03-15 2020 3 2020.21 414.74 413.26 26 0.33 0.12
746 │ 2020-04-15 2020 4 2020.29 416.45 413.76 28 0.66 0.24
747 │ 2020-05-15 2020 5 2020.38 417.31 413.9 27 0.61 0.23
748 │ 2020-06-15 2020 6 2020.46 416.6 414.23 27 0.44 0.16
749 │ 2020-07-15 2020 7 2020.54 414.62 414.31 30 0.55 0.19
750 │ 2020-08-15 2020 8 2020.62 412.78 414.73 25 0.25 0.1
751 │ 2020-09-15 2020 9 2020.71 411.52 415.07 29 0.31 0.11
752 │ 2020-10-15 2020 10 2020.79 411.51 414.86 30 0.22 0.08
753 │ 2020-11-15 2020 11 2020.88 413.12 415.12 27 0.81 0.3
754 │ 2020-12-15 2020 12 2020.96 414.26 414.97 30 0.47 0.17
755 │ 2021-01-15 2021 1 2021.04 415.52 415.17 29 0.44 0.16
756 │ 2021-02-15 2021 2 2021.12 416.75 415.8 28 1.02 0.37
757 │ 2021-03-15 2021 3 2021.21 417.64 416.17 28 0.86 0.31
758 │ 2021-04-15 2021 4 2021.29 419.05 416.36 24 1.12 0.44
759 │ 2021-05-15 2021 5 2021.38 419.13 415.71 28 0.9 0.33
760 │ 2021-06-15 2021 6 2021.46 418.94 416.57 28 0.65 0.23
761 │ 2021-07-15 2021 7 2021.54 416.96 416.65 30 0.71 0.25
762 │ 2021-08-15 2021 8 2021.62 414.47 416.43 26 0.72 0.27
763 │ 2021-09-15 2021 9 2021.71 413.3 416.85 27 0.29 0.11
764 │ 2021-10-15 2021 10 2021.79 413.93 417.28 29 0.35 0.12
765 │ 2021-11-15 2021 11 2021.88 415.01 417.01 30 0.36 0.13
766 │ 2021-12-15 2021 12 2021.96 416.71 417.42 28 0.48 0.17
767 │ 2022-01-15 2022 1 2022.04 418.19 417.84 29 0.73 0.26
768 │ 2022-02-15 2022 2 2022.12 419.28 418.32 27 0.92 0.34
769 │ 2022-03-15 2022 3 2022.21 418.81 417.33 30 0.78 0.27
770 │ 2022-04-15 2022 4 2022.29 420.23 417.54 28 0.85 0.31
771 │ 2022-05-15 2022 5 2022.38 420.99 417.58 30 0.76 0.27
772 │ 2022-06-15 2022 6 2022.46 420.99 418.62 28 0.3 0.11
773 │ 2022-07-15 2022 7 2022.54 418.9 418.59 27 0.57 0.21
774 │ 2022-08-15 2022 8 2022.62 417.19 419.15 27 0.37 0.14
775 │ 2022-09-15 2022 9 2022.71 415.95 419.5 28 0.41 0.15
776 │ 2022-10-15 2022 10 2022.79 415.78 419.13 30 0.27 0.1
777 │ 2022-11-15 2022 11 2022.88 417.51 419.51 25 0.52 0.2
778 │ 2022-12-15 2022 12 2022.96 418.95 419.66 24 0.5 0.2
779 │ 2023-01-15 2023 1 2023.04 419.47 419.12 31 0.4 0.14
780 │ 2023-02-15 2023 2 2023.12 420.41 419.46 25 0.62 0.24
781 │ 2023-03-15 2023 3 2023.21 421.0 419.53 31 0.72 0.25
782 │ 2023-04-15 2023 4 2023.29 423.28 420.59 29 0.63 0.23
691 rows omitted
Performing Exploratory Data Analysis (EDA)
# palette_michaelmallari_r and theme_michaelmallari_r() are a custom palette and theme that are not defined in this section.
linechart_co2_r <- ggplot2::ggplot(data=co2_clean_r) +
  ggplot2::geom_line(mapping=ggplot2::aes(x=datestamp, y=average), color=palette_michaelmallari_r[3]) +
  ggplot2::scale_y_continuous(expand=c(0, 0), position="right") +
  ggplot2::labs(
    title="CO2 Mole Fraction (ppm) in Hawaii, 1958-2023",
    alt="CO2 Mole Fraction (ppm) in Hawaii, 1958-2023",
    x="",
    y="",
    caption="Source: National Oceanic & Atmospheric Administration"
  ) +
  theme_michaelmallari_r()
linechart_co2_r
# Requires a datestamp column in co2_clean_py; the CO2 values are in the "average" column.
# linechart_co2_py = (plotnine.ggplot(data=co2_clean_py)
#     + plotnine.geoms.geom_line(mapping=plotnine.mapping.aes(x="datestamp", y="average"), color=palette_michaelmallari_py[3])
#     + plotnine.labels.labs(
#         title="CO2 Mole Fraction (ppm) in Hawaii, 1958-2023",
#         x="",
#         y=""
#     )
# )
#
# linechart_co2_py
linechart_co2_jl = Gadfly.plot(
    co2_clean_jl,
    Gadfly.layer(x="datestamp", y="average", Gadfly.Geom.line),
    Gadfly.Scale.x_continuous(minticks=4, maxticks=6),
    Gadfly.Scale.y_continuous(minticks=4, maxticks=6),
    Gadfly.Guide.title("CO2 Mole Fraction (ppm) in Hawaii, 1958-2023"),
    Gadfly.Guide.xlabel(""),
    Gadfly.Guide.ylabel(""),
    theme_michaelmallari_jl
);
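The dataset carries both an average and a deseasonalized column, so a rough view of the within-year cycle is available by subtracting one from the other. The sketch below assumes that deseasonalized is the monthly average with the seasonal cycle removed, and uses the twelve 2022 rows copied from the listing above; the function name `seasonal_component` is hypothetical.

```python
# (month, average, deseasonalized) for 2022, copied from the listing above.
ROWS_2022 = [
    (1, 418.19, 417.84), (2, 419.28, 418.32), (3, 418.81, 417.33),
    (4, 420.23, 417.54), (5, 420.99, 417.58), (6, 420.99, 418.62),
    (7, 418.90, 418.59), (8, 417.19, 419.15), (9, 415.95, 419.50),
    (10, 415.78, 419.13), (11, 417.51, 419.51), (12, 418.95, 419.66),
]


def seasonal_component(rows):
    """Map month -> average minus deseasonalized, rounded to 2 decimals (ppm)."""
    return {month: round(average - deseasonalized, 2) for month, average, deseasonalized in rows}


seasonal_2022 = seasonal_component(ROWS_2022)
# Months where the seasonal offset is largest and smallest.
peak_month = max(seasonal_2022, key=seasonal_2022.get)
trough_month = min(seasonal_2022, key=seasonal_2022.get)
print(peak_month, trough_month)
```

For 2022 the offset peaks in May and bottoms out in September, which is the yearly oscillation visible on top of the long-term upward trend in the line chart.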
Statistical Model-Based Forecasting
Machine Learning-Based Forecasting
Deep Learning-Based Forecasting
References
- National Oceanic & Atmospheric Administration. (n.d.). Trends in Atmospheric Carbon Dioxide (Mauna Loa Observatory, Hawaii). Global Monitoring Laboratory. Retrieved May 12, 2023, from https://gml.noaa.gov/ccgg/trends/mlo.html
- Tsay, R. S. (2010). Analysis of Financial Time Series (3rd ed.). Wiley.