How to Build A Data Visualization Dashboard using D3.JS & JavaScript: Upstream Oil & Gas Data (Part 2 of 2)
In the first part of this article (Link), we reviewed the terminology of upstream data and the data required to build this case study. In this part, we go into the details of how to build the dashboard. Just to refresh your memory, we are going to build the following dashboard:
First Panel
This panel shows the number of wells in the top counties. Since the wells are scattered over the entire region, I limited the bar plot to counties with at least 700 wells. To accomplish this, I use the d3.nest() command as demonstrated below. d3.nest is applied to EfData with the county name as the key (the Location_County data column contains the county name of each well). Then, using the rollup command in conjunction with length, we can calculate the number of wells within each county. The next step is to sort the nested data using the .sort function. Finally, I excluded counties with fewer than 700 wells. The variable newCount_ contains the list of counties with at least 700 wells.
var Count_ = d3.nest()
  .key(function(d) {return d.Location_County;})
  .rollup(function(v) {return v.length;})
  .entries(EfData);

//sort bars based on values
Count_ = Count_.sort(function(a, b) {
  return d3.ascending(a.value, b.value);});

var newCount_ = Count_.filter(function(x) {
  return x.value >= 700;});
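Note that d3.nest() belongs to D3 v5 and earlier; it was removed from the default bundle in v6. As a rough sketch of what the nest, rollup, sort, and filter steps compute, the same result can be produced in plain JavaScript. The sample rows, the countWellsByCounty helper, and its minWells parameter are hypothetical and only illustrate the logic; they are not part of the dashboard code.

```javascript
// Hypothetical sample rows; the real EfData has one row per well.
var sampleWells = [
  {Location_County: "KARNES"},
  {Location_County: "KARNES"},
  {Location_County: "KARNES"},
  {Location_County: "DIMMIT"},
  {Location_County: "DIMMIT"},
  {Location_County: "WEBB"}
];

// Count wells per county, sort ascending by count, and keep counties
// with at least minWells wells. Returns [{key, value}, ...], the same
// shape the d3.nest().rollup() call above produces.
function countWellsByCounty(rows, minWells) {
  var counts = {};
  rows.forEach(function(d) {
    counts[d.Location_County] = (counts[d.Location_County] || 0) + 1;
  });
  return Object.keys(counts)
    .map(function(county) {return {key: county, value: counts[county]};})
    .sort(function(a, b) {return a.value - b.value;})
    .filter(function(x) {return x.value >= minWells;});
}

// A threshold of 2 stands in for the 700 used with the full dataset.
var topCounties = countWellsByCounty(sampleWells, 2);
console.log(topCounties);
```

With the six sample rows, only the two counties with at least two wells survive the filter, ordered from the smaller count to the larger one.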
The next step is to plot a horizontal bar chart from the nested data. The first step in plotting the data is to create an svg. Panel 1 is located by its id (top_left), and an svg is then appended to this panel using select and append("svg"). To make the bar chart fill the entire panel, I used clientWidth and clientHeight to get the panel width and height.
var TL_Size = document.getElementById("top_left");
var TL_width = TL_Size.clientWidth - margin.left - margin.right;
var TL_height = TL_Size.clientHeight - margin.top - margin.bottom;

var TL_svg = d3.select("#top_left").append("svg")
  .attr("width", TL_width + margin.left + margin.right)
  .attr("height", TL_height + margin.top + margin.bottom)
  .append("g")
  .attr("transform", "translate(" + margin.left + "," + margin.top + ")");
Finally, we can append the bars and the text labels. A bar chart in d3.js is composed of rectangles and text. All we need to do is map the x and y data from our input data (newCount_) and plot it. The X data corresponds to the number of wells in each county and is mapped onto [0, max(number of wells)]. This is easily accomplished using d3.max on value, where value represents the number of wells. The Y data contains the counties, extracted with the map function as seen below. Once the X and Y data are mapped, we can construct the rectangles using append("rect"). The other lines of code add color and text to the bar plot.
var x = d3.scaleLinear().range([0, TL_width]);
var y = d3.scaleBand().range([TL_height, 0]).padding(0.1);
x.domain([0, d3.max(newCount_, function(d) {return d.value;})]);
y.domain(newCount_.map(function(d) {return d.key;}));
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);

var TL_g = TL_svg.append("g").attr("class", "y axis").call(yAxis);
var TL_bars = TL_svg.selectAll(".bar").data(newCount_).enter().append("g");

//append rectangles (bars)
TL_bars.append("rect")
  .attr("class", "bar")
  .attr("y", function(d) {return y(d.key);})
  .attr("height", y.bandwidth())
  .attr("x", 0)
  .attr("fill", function(d, i) {return c10(i);})
  .attr("width", function(d) {return x(d.value);});

//add a values label to the right of each bar
TL_bars.append("text")
  .attr("class", "label")
  .attr("y", function(d) {
    return y(d.key) + y.bandwidth() / 2 + 4;})
  .attr("x", function(d) {return x(d.value) + 3;})
  .text(function(d) {return formatComma(d.value);});

// add text labels
TL_svg.append("text")
  .attr("x", (TL_width / 2))
  .attr("y", 0 - (margin.top / 10))
  .attr("text-anchor", "middle")
  .style("font-size", "18px")
  .style("font-weight","bold")
  .style("text-decoration", "underline")
  .text("Total Number of Wells For Each County ");
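To make the scale mapping concrete, here is a small plain-JavaScript sketch of what a linear scale does with a domain of [0, max] and a range of [0, width]. The linearScale helper, the sample counts, and the 400-pixel width are hypothetical; d3.scaleLinear() offers more than this sketch (clamping, ticks, inversion), so treat it only as an illustration of the arithmetic.

```javascript
// Build a function that maps a value from [d0, d1] to [r0, r1] linearly.
function linearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1];
  var r0 = range[0], r1 = range[1];
  return function(v) {
    return r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
  };
}

// Hypothetical well counts in the {key, value} shape used by newCount_.
var sampleCounts = [
  {key: "A", value: 800},
  {key: "B", value: 1200},
  {key: "C", value: 1600}
];

// Largest count defines the upper end of the domain, as d3.max does above.
var maxCount = Math.max.apply(null, sampleCounts.map(function(d) {return d.value;}));
var xSketch = linearScale([0, maxCount], [0, 400]);

// Bar widths in pixels for a 400px-wide plotting area.
var barWidths = sampleCounts.map(function(d) {return xSketch(d.value);});
console.log(barWidths); // [200, 300, 400]
```

The county with the most wells always gets the full panel width, and every other bar is sized in proportion to it.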
The resulting visualization can be seen below:
Second Panel
This panel contains a map of the wells, color coded by produced fluid (oil, gas condensate, and so on). We start by plotting the shapefile of Texas counties (in the interest of time, I am going to skip this step). Then, for each well in the database, one circle is plotted on the map using the following code. The locations of the wells are given by the Location_SLatitude and Location_SLongitude data columns in the csv dataset. These are the surface locations of the wellbores. The color of each circle is then defined based on the Prod_Type data column.
TR_g.selectAll("circle")
  .data(EfData)
  .enter()
  .append("circle")
  .attr("cx", function(d) {return projection([d.Location_SLongitude, d.Location_SLatitude])[0];})
  .attr("cy", function(d) {return projection([d.Location_SLongitude, d.Location_SLatitude])[1];})
  .attr("r", "2px")
  .style("stroke", "black")
  .style("stroke-width", "0.1")
  .style("fill", function(d) {
    var value = d.Prod_Type;
    if (value === 'OIL ') {
      return "#FF0000";
    } else if (value === 'LIQUIDS RICH GAS ') {
      return "#ff9900";
    } else if (value === 'DRY GAS ') {
      return "#ffff00";
    } else if (value === 'WET GAS ') {
      return "#d9b38c";
    } else {return "#cccccc"; }});
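The if/else chain can also be written as a lookup table, which may be easier to extend if more fluid types appear. This is only a sketch: the colorForProdType helper is hypothetical, and it trims the value first, on the assumption that the trailing spaces in the comparisons above come from padding in the csv. Unlike the original chain, it therefore also matches values without the trailing space.

```javascript
// Colors copied from the if/else chain above, keyed by trimmed Prod_Type.
var prodTypeColors = {
  "OIL": "#FF0000",
  "LIQUIDS RICH GAS": "#ff9900",
  "DRY GAS": "#ffff00",
  "WET GAS": "#d9b38c"
};
var defaultColor = "#cccccc";

// Return the fill color for a Prod_Type value, or gray for anything else.
function colorForProdType(value) {
  var key = String(value == null ? "" : value).trim();
  return Object.prototype.hasOwnProperty.call(prodTypeColors, key)
    ? prodTypeColors[key]
    : defaultColor;
}

console.log(colorForProdType("OIL "));     // "#FF0000"
console.log(colorForProdType("DRY GAS ")); // "#ffff00"
console.log(colorForProdType("UNKNOWN"));  // "#cccccc"
```

In the map code it would be used as .style("fill", function(d) {return colorForProdType(d.Prod_Type);}).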
Finally, we can add zoom behavior to the map so that the user can zoom in and pan around. This is accomplished by defining the zoom behavior using d3.zoom() and then attaching it to the svg containing the map (Mapsvg). In the zoom function, I also attempt to adjust the font size of the map labels: the font size should shrink as the user zooms in, so that the labels do not take over the entire map.
var zoom = d3.zoom().scaleExtent([1, 8]).on("zoom", zoomed);

function zoomed() {
  TR_g.style("stroke-width", 1.0 / d3.event.transform.k + "px");
  TR_g.attr("transform", d3.event.transform);
  TR_g.selectAll("text")
    .style("font-size", 11 - 1 * d3.event.transform.k + "px");
  TR_g.selectAll("circle")
    .attr("r", 3 / d3.event.transform.k);
}

Mapsvg.call(zoom);
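To see how these expressions behave across the allowed zoom range, the arithmetic can be pulled into a small helper and evaluated for a few scale factors. The zoomStyles function is hypothetical and only mirrors the formulas in zoomed(); it does not touch the DOM.

```javascript
// Mirror the formulas used in zoomed() for a given zoom scale factor k.
function zoomStyles(k) {
  return {
    strokeWidth: 1.0 / k, // px, set on the TR_g group
    fontSize: 11 - 1 * k, // px, set on the map labels
    radius: 3 / k         // circle radius before the group transform is applied
  };
}

// scaleExtent([1, 8]) limits k to the range 1..8.
[1, 2, 4, 8].forEach(function(k) {
  console.log(k, zoomStyles(k));
});
```

Because TR_g itself is scaled by k, a radius of 3 / k keeps each circle at about 3 pixels on screen at every zoom level. The label font size, before that scaling, drops from 10px at k = 1 to 3px at k = 8.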
The resulting visualization can be seen in the following picture. The map is zoomable. In this version of the dashboard, the user cannot select wells on the map.
You can view my Block visualization and try the map's zoom functionality online (link).
Third Panel
This panel contains the average cumulative production of the wells drilled each year. I again use a horizontal bar chart to show the data, so I will only discuss how to summarize the data, since Panel 1 already showed how to plot a horizontal bar chart. Once more we can use d3.nest() to summarize the data. Here, I first group on whether the well has a valid year (Year > 0 keeps wells with an empty year out of the result) and then take the mean 6-month production calculated from the Prod_BOE6Month data column. Finally, we extract the average production per year and sort it by year to show it in Panel 3 as a horizontal bar chart, as seen below.
var NestedProduction = d3.nest().key(function(d) {
  return d.Year > 0;
}).key(function(d) {
  return d.Year;
}).rollup(function(v) {
  return d3.mean(v, function(d) {
    return d.Prod_BOE6Month;
  });
}).entries(EfData);

// NestedProduction[0] is the Year > 0 group only if the first row of EfData
// has a valid year, because d3.nest() lists keys in order of first appearance
var BOEPerWellPerYear = NestedProduction[0].values;

//sort bars by year
BOEPerWellPerYear = BOEPerWellPerYear.sort(function(a, b) {
  return d3.ascending(a.key, b.key);
});
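For readers on a D3 version without d3.nest(), the same summary can be sketched in plain JavaScript: skip rows without a valid year, average Prod_BOE6Month per year, and sort by year. The sample rows and the meanProductionByYear helper are hypothetical.

```javascript
// Hypothetical rows; a Year of 0 marks a well with no year recorded.
var sampleProduction = [
  {Year: 2012, Prod_BOE6Month: 100},
  {Year: 2012, Prod_BOE6Month: 300},
  {Year: 2011, Prod_BOE6Month: 50},
  {Year: 0, Prod_BOE6Month: 999}
];

// Returns [{key: "<year>", value: <mean 6-month BOE>}, ...] sorted by year.
function meanProductionByYear(rows) {
  var sums = {};
  rows.forEach(function(d) {
    if (!(d.Year > 0)) {return;} // drop rows with no valid year
    var s = sums[d.Year] || (sums[d.Year] = {total: 0, n: 0});
    s.total += +d.Prod_BOE6Month;
    s.n += 1;
  });
  return Object.keys(sums)
    .map(function(year) {return {key: year, value: sums[year].total / sums[year].n};})
    .sort(function(a, b) {return +a.key - +b.key;});
}

var perYear = meanProductionByYear(sampleProduction);
console.log(perYear);
```

Unlike d3.mean, this sketch does not skip missing or non-numeric production values, so the input would need to be cleaned first.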
Fourth Panel
The last panel of our visualization lists the most successful operator of each year in terms of average production. To accomplish this, we need to find the list of operators for each year, calculate their respective average production, and then pick the highest average. This is a multi-step filtering of the data.
First, we calculate the variable NestedProduction2 using d3.nest(). It groups the wells by year (Year) and then by operator (Info_Operator), and returns the average 6-month production of each operator. The rollup function returns an object containing AVG, the average 6-month production, and Count, the number of wells each operator has in that year.
var NestedProduction2 = d3.nest()
  .key(function(d) {return d.Year > 0;})
  .key(function(d) {return d.Year;})
  .key(function(d) {return d.Info_Operator;})
  .rollup(function(leaves) {
    var AVG = d3.mean(leaves, function(d) {
      return d.Prod_BOE6Month;});
    // leaves holds every well of one operator in one year
    var Count = leaves.length;
    return {AVG: AVG, Count: Count};})
  .entries(EfData);
Finally, by filtering NestedProduction2, we can find the operator with the highest average production each year among those with at least 20 wells in the database. The result is shown as a table in the fourth panel.
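The filtering code itself is not listed in this article, so here is a minimal sketch of how that step might look. The sample data is hypothetical and hand-built in the shape d3.nest() (v4/v5) returns for NestedProduction2; the topOperatorPerYear helper and its minWells parameter are assumptions for illustration, not the code used in the dashboard.

```javascript
// Hypothetical data in the nested shape of NestedProduction2:
// Year > 0 flag -> Year -> operator -> {AVG, Count}.
var sampleNested = [
  {key: "true", values: [
    {key: "2011", values: [
      {key: "Operator A", value: {AVG: 120, Count: 25}},
      {key: "Operator B", value: {AVG: 300, Count: 5}},
      {key: "Operator C", value: {AVG: 90, Count: 40}}
    ]},
    {key: "2012", values: [
      {key: "Operator A", value: {AVG: 150, Count: 30}},
      {key: "Operator C", value: {AVG: 180, Count: 22}}
    ]}
  ]}
];

// For each year, keep operators with at least minWells wells and
// return the one with the highest average production.
function topOperatorPerYear(nested, minWells) {
  // Look up the "true" (valid year) group by key rather than by position.
  var valid = nested.filter(function(g) {return g.key === "true";})[0];
  if (!valid) {return [];}
  var rows = [];
  valid.values.forEach(function(yearGroup) {
    var best = null;
    yearGroup.values.forEach(function(op) {
      if (op.value.Count < minWells) {return;}
      if (best === null || op.value.AVG > best.value.AVG) {best = op;}
    });
    if (best !== null) {
      rows.push({Year: yearGroup.key, Operator: best.key,
                 AVG: best.value.AVG, Count: best.value.Count});
    }
  });
  return rows.sort(function(a, b) {return +a.Year - +b.Year;});
}

var topOperators = topOperatorPerYear(sampleNested, 20);
console.log(topOperators);
```

In the sample, Operator B has the highest average in 2011 but only 5 wells, so the 20-well threshold hands that year to Operator A.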
Conclusion & Future Work
In this two-article series, I attempted to show you how to use D3.js to build a dashboard for upstream oil and gas data. I know I skipped a lot of details, but my goal was to show you the overall process. You are welcome to use the data and build your own dashboard. I am also happy to answer any questions about the data and the methodology.
The data and code can be found on my GitHub page.
The data was scraped from the Railroad Commission of Texas; I scraped it and cleaned it up. You can use it freely. In return, please share with me any improvements you make to my version. Remember, sharing is caring!
This work can be greatly improved by adding interactive functionality. For example, one could improve the map panel so that clicking on a well displays information about that well.