Crude oil prices by delivery period define the term structure of the market. The term structure changes shape over time as the price level and slope shift. Its behavior becomes clear when discrete futures contracts with similar maturities are combined into a continuous time series. R code is supplied to create continuous prices by delivery period. The purpose is to show term structure behavior and to derive risk and profitability measures for oil production, marketing and trading strategies. The resulting data is tidy and well suited for model training and out-of-sample testing.
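As a rough illustration of the idea, the sketch below ranks live contracts by expiry on each trade date and spreads them into one column per delivery period. It is a minimal base R sketch, not the code from the post: the `quotes` data frame, its column names and all prices are invented for the example.

```r
# Toy long-format futures data: one row per trade date and contract.
# All values are invented for illustration.
quotes <- data.frame(
  date   = as.Date(rep(c("2016-01-04", "2016-01-05"), each = 3)),
  expiry = as.Date(rep(c("2016-01-20", "2016-02-22", "2016-03-21"), times = 2)),
  price  = c(36.8, 38.0, 39.1, 36.0, 37.3, 38.5)
)

# Keep contracts that have not expired, then rank them by expiry within each
# trade date: rank 1 is the front month (M1), rank 2 is M2, and so on.
live <- quotes[quotes$expiry >= quotes$date, ]
live$period <- ave(as.numeric(live$expiry), format(live$date), FUN = rank)

# One column per delivery period gives a continuous series for each maturity.
continuous <- reshape(live[, c("date", "period", "price")],
                      idvar = "date", timevar = "period",
                      direction = "wide")

# Slope of the curve: positive means contango, negative means backwardation.
continuous$slope <- continuous$price.3 - continuous$price.1
continuous
```

Ranking by expiry rather than by contract code means the front-month column rolls to the next contract automatically once the nearest one expires. A production version would also need a rule for the roll date, which this sketch leaves out.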
An animation showing the term structure of NYMEX crude oil. For source code, go here.
Robert Hyndman is the author of the forecast package in R. I’ve been using the package for long-term time series forecasts. The package comes with some built-in methods for plotting forecast data objects in R that I’ve wanted to customize for improved clarity and presentation. The following article achieves that goal and shares two scripts for plotting forecast data objects using ggplot.
Slides presented at a recent meeting of Doha R users.
Geospatial Data and Mapping in R
Technological progress is key to solar growth and pricing. By extension, the ability to model technological progress is essential to understanding future energy supply and demand.
Solar innovation is widespread. Examples include gains in solar cell efficiency, module manufacturing, and learning-by-doing in solar system installation and operation. Solar pricing and growth are also supported by innovations in enabling technology, such as battery storage, smart grids and electric vehicles.
Best subset regression is a technique for model building and variable selection. The method looks at all combinations of independent predictor variables for use in a multiple regression model. Model developers and analysts often struggle with variable selection, especially when the number of predictors is high. Ideally, each set of predictors is run and the best set is selected using a criterion for model performance. The following article provides custom functions for best subset selection that are fast and easy to use.
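To make the idea concrete, here is a minimal base R sketch of an exhaustive search. It is not the custom functions from the article: `best_subset` is a hypothetical helper, BIC is an assumed choice of performance criterion, and the built-in `mtcars` data stands in for a real problem.

```r
# Exhaustive best subset search using base R only.
# `best_subset` is a hypothetical helper written for this illustration.
best_subset <- function(data, response) {
  predictors <- setdiff(names(data), response)
  results <- list()
  for (k in seq_along(predictors)) {
    # Fit one model for every combination of k predictors.
    for (vars in combn(predictors, k, simplify = FALSE)) {
      fit <- lm(reformulate(vars, response = response), data = data)
      results[[length(results) + 1]] <- data.frame(
        model = paste(vars, collapse = " + "),
        size  = k,
        bic   = BIC(fit)
      )
    }
  }
  out <- do.call(rbind, results)
  out[order(out$bic), ]   # lowest BIC first
}

ranked <- best_subset(mtcars[, c("mpg", "wt", "hp", "qsec", "drat")], "mpg")
head(ranked, 3)
```

With p predictors the loop fits 2^p - 1 models, so this brute-force form is only practical for a modest number of candidate variables.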
A new method to extract data tables from PDF files is introduced. The solution combines the R programming language with the open-source Java program Tabula. The result is a convenient method that transforms documents into databases.
The ability to train a machine to extract data tables from PDF files has several benefits:
A common task in spatial data analysis is extracting SpatialPoints inside a set of polygons or buffer zones. Analysts can use standard GIS or map tools to extract a set of points within an area of interest using manual “point-and-click” routines. This method is easy but quickly becomes impractical, especially in cases involving big data. The alternative is to train a machine to automatically extract the points in a polygon or buffer zone. This post achieves that task and presents a case study with R code.
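The underlying test can be sketched without any spatial packages. The example below uses the standard ray-casting rule on plain coordinate vectors instead of SpatialPoints objects, so it is a stand-in for the post's approach rather than a copy of it. The function name `point_in_polygon` and the sample coordinates are assumptions made for illustration.

```r
# Ray-casting point-in-polygon test in base R.
# px, py: point coordinates; vx, vy: polygon vertices in order (not closed).
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- rep(FALSE, length(px))
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i straddle the horizontal
    # line through each point?
    crosses <- (vy[i] > py) != (vy[j] > py)
    # x coordinate where that edge meets the horizontal line
    x_at_y <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
    # Each crossing to the right of the point flips its inside/outside state.
    inside <- xor(inside, crosses & (px < x_at_y))
    j <- i
  }
  inside
}

# A 4 x 4 square and four test points.
square_x <- c(0, 4, 4, 0)
square_y <- c(0, 0, 4, 4)
pts <- data.frame(x = c(1, 5, 2, -1), y = c(1, 1, 3, 2))

pts$inside <- point_in_polygon(pts$x, pts$y, square_x, square_y)
pts[pts$inside, ]   # the extracted points
```

The function is vectorized over points, so a large point set is tested in one pass per polygon edge. For real Spatial* objects, the `over()` function in the sp package performs this kind of overlay and also handles holes and multiple polygons, which this sketch does not.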
R is rapidly gaining popularity and is well on its way to becoming a top 10 programming language. The TIOBE index is a standard indicator of the popularity of programming languages. The index confirms that a subset of languages – those for computational statistics and data analysis – is gaining increased attention. The clear winner of the pack is the open source programming language R.
A common question concerning the safety of photovoltaic (PV) power systems is the impact of reflected sunlight. PV modules have the potential to impact neighboring structures or activities, notably aviation. It is important to know where the reflected light will go and what the intensity of the light will be at any point in time.
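The direction of the reflected light follows from the law of specular reflection: for a unit vector s pointing toward the sun and a unit panel normal n, the reflected ray is r = 2(s·n)n - s. The base R sketch below applies that formula. The function names, the axis convention (x = east, y = north, z = up) and the treatment of the module as a perfectly flat mirror are assumptions of this example; it does not estimate intensity.

```r
deg2rad <- function(d) d * pi / 180
clamp   <- function(v) max(-1, min(1, v))   # guards asin/acos against rounding

# Unit vector from azimuth (degrees clockwise from north) and elevation.
# Axes: x = east, y = north, z = up.
direction <- function(azimuth, elevation) {
  az <- deg2rad(azimuth)
  el <- deg2rad(elevation)
  c(cos(el) * sin(az), cos(el) * cos(az), sin(el))
}

# Hypothetical helper: direction of sunlight reflected off a flat panel.
reflect_sunlight <- function(sun_az, sun_el, panel_az, panel_tilt) {
  s <- direction(sun_az, sun_el)              # points toward the sun
  n <- direction(panel_az, 90 - panel_tilt)   # panel normal
  cos_inc <- sum(s * n)
  if (cos_inc <= 0) return(NULL)              # sun is behind the panel
  r <- 2 * cos_inc * n - s                    # reflected ray direction
  list(vector    = r,
       azimuth   = (atan2(r[1], r[2]) * 180 / pi) %% 360,
       elevation = asin(clamp(r[3])) * 180 / pi,
       incidence = acos(clamp(cos_inc)) * 180 / pi)
}

# Horizontal panel, sun due south at 30 degrees elevation:
# the reflection leaves toward the north at the same elevation.
res <- reflect_sunlight(sun_az = 180, sun_el = 30, panel_az = 180, panel_tilt = 0)
res$vector
```

Running the function over a day or year of sun positions traces where the reflection travels over time. Intensity would additionally depend on the module's surface properties and the incidence angle, which are outside the scope of this sketch.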
Aerosol Optical Depth (AOD) defines the degree to which aerosols prevent the transmission of sunlight by absorption or scattering. AOD is measured using an integrated extinction coefficient over a vertical column of air. The extinction coefficient can be used to analyze solar extinction and the performance of solar power systems as a function of location and time.
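As a small worked example, the integral can be approximated numerically in base R. The extinction profile here is an assumed exponential decay with invented parameters (0.1 per km at the surface, 2 km scale height), and the slant-path transmittance uses the simple plane-parallel 1/cos(zenith) air mass; neither comes from the post.

```r
# Assumed extinction profile: exponential decay with altitude.
z    <- seq(0, 20, by = 0.1)     # altitude, km
beta <- 0.1 * exp(-z / 2)        # extinction coefficient, 1/km (invented)

# AOD is the vertical integral of the extinction coefficient (trapezoid rule).
aod <- sum(diff(z) * (head(beta, -1) + tail(beta, -1)) / 2)

# Beer-Lambert transmittance of the direct beam along a slant path,
# using a plane-parallel air mass of 1 / cos(zenith).
transmittance <- function(aod, zenith_deg) {
  exp(-aod / cos(zenith_deg * pi / 180))
}

aod                      # close to 0.1 * 2 = 0.2 for this profile
transmittance(aod, 0)    # sun overhead
transmittance(aod, 60)   # lower sun, longer path, less direct light
```

The plane-parallel air mass becomes inaccurate near the horizon, so a refined air mass formula would be needed for low sun angles.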
The maptools package has a pruneMap() function to crop map objects in R. In practice, the function extracts data from SpatialPolygon or SpatialLine objects given a bounding box or specific area of interest. Unfortunately, there is no equivalent function for large, high-resolution raster images, which are common in many Earth Science applications. The following post defines a custom function to crop raster images in R and to extract data from SpatialGridDataFrames. The function is tested using a raster image from the Shuttle Radar Topography Mission (SRTM; shown at left). The resulting data is then mapped using the image() function in R.
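The cropping step itself reduces to subsetting a grid by coordinate ranges. The sketch below shows that on a plain matrix rather than a SpatialGridDataFrame, so it illustrates the idea and is not the post's function. `crop_grid` is a hypothetical name, and R's built-in `volcano` elevation matrix stands in for the SRTM image.

```r
# Crop a gridded image to a bounding box.
# x, y: cell-centre coordinates for the rows and columns of matrix z.
crop_grid <- function(x, y, z, xlim, ylim) {
  keep_x <- x >= xlim[1] & x <= xlim[2]
  keep_y <- y >= ylim[1] & y <= ylim[2]
  list(x = x[keep_x],
       y = y[keep_y],
       z = z[keep_x, keep_y, drop = FALSE])
}

# Built-in elevation grid used as a stand-in raster; coordinates are cell indices.
x <- seq_len(nrow(volcano))
y <- seq_len(ncol(volcano))

cropped <- crop_grid(x, y, volcano, xlim = c(20, 60), ylim = c(10, 40))
dim(cropped$z)

# The result has the x, y, z components that image() accepts:
# image(cropped)
```

Logical indexing avoids copying the grid cell by cell, which matters for large rasters. A version for SpatialGridDataFrames would also have to carry the grid topology and coordinate reference system along with the cropped values.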