How to Build A Data Visualization Dashboard using D3.JS & JavaScript: Upstream Oil & Gas Data (Part 2 of 2)

Amir Nejad
7 min read · Dec 13, 2018

In the first part of this article (Link), we went over the terminology of upstream data and the data required to build this case study. In this part, we go into the details of how to build the dashboard. Just to refresh your memory, we are going to build the following dashboard:

Upstream Oil and Gas Dashboard

First Panel

This panel shows the number of wells in the top counties. Since the wells are scattered over the entire region, I limited my bar plot to counties with at least 700 wells. To accomplish this, I use the d3.nest() command, as demonstrated below. d3.nest() is applied to EfData with the county name as the key (the Location_County data column contains the county name of each well). Then, using the rollup command in conjunction with length, we can calculate the number of wells within each county. The next step is to sort the nested data using the .sort function. Finally, I exclude counties with fewer than 700 wells. The variable newCount_ contains the list of counties with at least 700 wells.

var Count_ = d3.nest()
.key(function(d) {return d.Location_County;})
.rollup(function(v) {return v.length;})
.entries(EfData);
//sort bars based on values
Count_ = Count_.sort(function(a, b) {
return d3.ascending(a.value, b.value);})
var newCount_ = Count_.filter(function(x) {
return x.value >= 700;});
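If you want to see what the nest, rollup, sort, and filter steps compute without loading D3, here is a minimal sketch in plain JavaScript. It is not the code used in the dashboard: the sample rows, the function name countWellsByCounty, and the threshold of 2 are made up for illustration. It may also be useful because d3.nest() ships with D3 v4 and v5 and was removed in v6 (replaced by d3.group and d3.rollup).

```javascript
// Hypothetical sample rows; only the Location_County column matters here.
const sampleWells = [
  { Location_County: "KARNES" },
  { Location_County: "KARNES" },
  { Location_County: "KARNES" },
  { Location_County: "DIMMIT" },
  { Location_County: "DIMMIT" },
  { Location_County: "WEBB" },
];

// Count wells per county, sort ascending by count, and keep counties with
// at least `minWells` wells. The result has the same { key, value } shape
// that d3.nest().rollup().entries() produces.
function countWellsByCounty(rows, minWells) {
  const counts = new Map();
  for (const row of rows) {
    const county = row.Location_County;
    counts.set(county, (counts.get(county) || 0) + 1);
  }
  return Array.from(counts, ([key, value]) => ({ key, value }))
    .sort((a, b) => a.value - b.value)
    .filter((entry) => entry.value >= minWells);
}

const topCounties = countWellsByCounty(sampleWells, 2);
console.log(topCounties);
```

The ascending sort matters for the plot: with a band scale whose range runs from the panel height down to 0, the first entry is drawn at the bottom, so the county with the most wells ends up at the top of the chart.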

The next step is to plot a horizontal bar chart of the nested data. The first step in plotting the data is to create an svg. Panel 1 is located by its id (top_left), then an svg is appended to this panel using select and append("svg"). So that the bar chart fills the entire panel, I used clientWidth and clientHeight to get the panel width and height.

var TL_Size = document.getElementById("top_left")
var TL_width = TL_Size.clientWidth - margin.left - margin.right
var TL_height = TL_Size.clientHeight - margin.top - margin.bottom
var TL_svg = d3.select("#top_left").append("svg")
.attr("width", TL_width + margin.left + margin.right)
.attr("height", TL_height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
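The snippet above relies on a margin object that is defined elsewhere in the full code. As a rough sketch of the arithmetic, assuming some made-up margin values and a hypothetical helper named innerSize, the drawing area is simply the panel size minus the margins on each side:

```javascript
// Assumed margin values for illustration; the dashboard's actual `margin`
// object is defined elsewhere and may differ.
const margin = { top: 30, right: 40, bottom: 20, left: 90 };

// Given the panel's clientWidth and clientHeight, return the size of the
// inner drawing area that is left after removing the margins.
function innerSize(clientWidth, clientHeight, m) {
  return {
    width: clientWidth - m.left - m.right,
    height: clientHeight - m.top - m.bottom,
  };
}

// Example: a 600 x 400 pixel panel.
const size = innerSize(600, 400, margin);
console.log(size); // { width: 470, height: 350 }
```

The svg itself is then given the full panel size again (inner size plus margins), and the inner g element is translated by the left and top margins, so everything drawn inside it uses coordinates that start at (0, 0).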

Finally, we can append the bar chart and text labels. A bar chart in d3.js is composed of rectangles and text. All we need to do is map the x and y data from our input data (newCount_) and plot it. The x data corresponds to the number of wells in each county and is mapped from [0, max(number of wells)]. This is easily accomplished using d3.max() with an accessor that returns value, the number of wells. The y data contains the counties, mapped simply with the map function as seen below. Once the x and y data are mapped, we can construct the rectangles using append("rect"). The other lines of code add color and text to the bar plot; c10 (a color scale) and formatComma (a number formatter) are helpers defined elsewhere in the full code.

var x = d3.scaleLinear().range([0, TL_width]);
var y = d3.scaleBand().range([TL_height, 0]).padding(0.1);
x.domain([0, d3.max(newCount_, function(d) {return d.value;})]);
y.domain(newCount_.map(function(d) {return d.key;}));
var xAxis = d3.axisBottom(x);
var yAxis = d3.axisLeft(y);
var TL_g = TL_svg.append("g").attr("class", "y axis").call(yAxis)
var TL_bars = TL_svg.selectAll(".bar").data(newCount_).enter().append("g")
//append rectangles (bars)
TL_bars.append("rect")
.attr("class", "bar")
.attr("y", function(d) {return y(d.key);})
.attr("height", y.bandwidth())
.attr("x", 0)
.attr("fill", function(d, i) {return c10(i);})
.attr("width", function(d) {return x(d.value);});
//add a values label to the right of each bar
TL_bars.append("text")
.attr("class", "label")
.attr("y", function(d) {
return y(d.key) + y.bandwidth() / 2 + 4;})
.attr("x", function(d) {return x(d.value) + 3;})
.text(function(d) {return formatComma(d.value);});
// add text labels
TL_svg.append("text")
.attr("x", (TL_width / 2))
.attr("y", 0 - (margin.top / 10))
.attr("text-anchor", "middle")
.style("font-size", "18px")
.style("font-weight","bold")
.style("text-decoration", "underline")
.text("Total Number of Wells For Each County ");
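To make the x mapping more concrete, here is a small plain-JavaScript sketch of what a linear scale does with the well counts. It is a simplified stand-in for d3.scaleLinear, not its implementation; the function name linearScale, the sample counts, and the 400-pixel width are assumptions for illustration.

```javascript
// Return a function that maps the interval `domain` linearly onto `range`.
// This mimics the basic behavior of d3.scaleLinear().domain(...).range(...).
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical nested data in the same { key, value } shape as newCount_.
const wellCounts = [
  { key: "COUNTY A", value: 750 },
  { key: "COUNTY B", value: 1500 },
  { key: "COUNTY C", value: 3000 },
];

// The domain is [0, max(number of wells)]; the range is the panel width.
const maxWells = Math.max(...wellCounts.map((d) => d.value));
const xSketch = linearScale([0, maxWells], [0, 400]);

// Bar widths in pixels for each county.
const barWidths = wellCounts.map((d) => xSketch(d.value));
console.log(barWidths); // [ 100, 200, 400 ]
```

The county with the most wells always fills the full panel width, and every other bar is drawn in proportion to it.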

The resulting visualization can be seen below:

Panel 1 Horizontal Bar Chart of #Wells for each Texas County

Second Panel

This panel contains a map of the wells, color coded by producing fluid (oil, gas condensate, ...). We start by plotting the shapefile of Texas counties (in the interest of time, I am going to skip this section). Then, for each well in the database, one circle is plotted on the map using the following code. The well locations are given by the Location_SLatitude and Location_SLongitude data columns in the csv dataset. These are the surface locations of the wellbores. The color of each circle is then defined by the Prod_Type data column.

TR_g.selectAll("circle")
.data(EfData)
.enter()
.append("circle")
.attr("cx", function(d) {return projection([d.Location_SLongitude, d.Location_SLatitude])[0];})
.attr("cy", function(d) {return projection([d.Location_SLongitude, d.Location_SLatitude])[1];})
.attr("r", "2px").style("stroke", "black")
.style("stroke-width", "0.1")
.style("fill", function(d) {
var value = d.Prod_Type;
if (value === 'OIL ') {
return "#FF0000";
} else if (value === 'LIQUIDS RICH GAS ') {
return "#ff9900";
} else if (value === 'DRY GAS ') {
return "#ffff00";
} else if (value === 'WET GAS ') {
return "#d9b38c";
} else {return "#cccccc"; }})
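The if/else chain above can also be written as a lookup table. The sketch below is one possible alternative, not the code used in the dashboard: the names PROD_TYPE_COLORS, DEFAULT_COLOR, and colorForProdType are made up, and unlike the original it trims whitespace first, on the assumption that the Prod_Type values may or may not carry a trailing space.

```javascript
// Colors per producing-fluid type, taken from the if/else chain above.
const PROD_TYPE_COLORS = {
  "OIL": "#FF0000",
  "LIQUIDS RICH GAS": "#ff9900",
  "DRY GAS": "#ffff00",
  "WET GAS": "#d9b38c",
};
const DEFAULT_COLOR = "#cccccc";

// Return the fill color for a Prod_Type value. Leading and trailing
// whitespace is ignored; unknown or missing values get the default color.
function colorForProdType(value) {
  const key = String(value == null ? "" : value).trim();
  return Object.prototype.hasOwnProperty.call(PROD_TYPE_COLORS, key)
    ? PROD_TYPE_COLORS[key]
    : DEFAULT_COLOR;
}

console.log(colorForProdType("OIL "));    // "#FF0000"
console.log(colorForProdType("DRY GAS")); // "#ffff00"
console.log(colorForProdType("UNKNOWN")); // "#cccccc"
```

With this helper, the style call would reduce to something like .style("fill", function(d) {return colorForProdType(d.Prod_Type);}), and adding a new fluid type only requires a new entry in the table.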

Finally, we can add zoom behavior to the map so the user can zoom in and move around the map. This can be accomplished by defining the zoom behavior using d3.zoom() and then attaching it to the svg containing the map (Mapsvg). In the zoom function, I also adjust the font size of the map labels (the font should become smaller as the user zooms in, so the labels do not take over the entire map).

var zoom = d3.zoom().scaleExtent([1, 8]).on("zoom", zoomed);
function zoomed() {
TR_g.style("stroke-width", 1.0 / d3.event.transform.k + "px");
TR_g.attr("transform", d3.event.transform);
TR_g.selectAll("text")
.style("font-size", 11 - 1 * d3.event.transform.k + "px");
TR_g.selectAll("circle")
.attr("r", 3 / d3.event.transform.k);}
Mapsvg.call(zoom);
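To see how the sizes change with the zoom level, here is a small sketch that pulls the arithmetic out of zoomed() into a pure function. The name zoomStyles is made up for illustration; k stands for d3.event.transform.k, which the scaleExtent above limits to values between 1 and 8.

```javascript
// Compute the style values that zoomed() applies for a given zoom factor k.
// Dividing by k cancels out the magnification applied by the transform, so
// strokes and circles keep roughly the same size on screen.
function zoomStyles(k) {
  return {
    strokeWidth: 1.0 / k + "px",
    fontSize: 11 - 1 * k + "px",
    radius: 3 / k,
  };
}

console.log(zoomStyles(1)); // { strokeWidth: '1px', fontSize: '10px', radius: 3 }
console.log(zoomStyles(4)); // { strokeWidth: '0.25px', fontSize: '7px', radius: 0.75 }
console.log(zoomStyles(8)); // { strokeWidth: '0.125px', fontSize: '3px', radius: 0.375 }
```

Two things are worth keeping in mind. The circles are first drawn with a radius of 2px, so they change size slightly on the first zoom event, when the radius becomes 3 / k. Also, d3.event exists in D3 v4 and v5; from v6 on, the event is passed to the listener as an argument instead.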

The resulting visualization can be seen in the following picture. The map is zoomable. In this version of the dashboard, the wells on the map cannot be selected by the user.

Panel 2 visualization containing map and color coded well locations

You can look at my Block visualization and interact with the zoom functionality of the map online (link).

Third Panel

This panel shows the average cumulative production of the wells drilled each year. I am again going to use a horizontal bar chart, so I will only discuss how to summarize the data, since I already showed how to plot a horizontal bar chart in Panel 1. Again, we can use d3.nest() to summarize the data. Here, I first key on the well year (Year > 0 separates the wells with a valid year from those with an empty year) and then calculate the mean 6-month production from the Prod_BOE6Month data column. Finally, we extract the average production per year and sort by year to show it in Panel 3 as a horizontal bar chart, as seen below.

var NestedProduction = d3.nest().key(function(d) {
return d.Year > 0
}).key(function(d) {
return d.Year;
}).rollup(function(v) {
return d3.mean(v, function(d) {
return d.Prod_BOE6Month
});
}).entries(EfData);
// NestedProduction[0] is the first group encountered; this assumes it is the Year > 0 group
var BOEPerWellPerYear = NestedProduction[0].values;
// sort bars by year
BOEPerWellPerYear = BOEPerWellPerYear.sort(function(a, b) {
return d3.ascending(a.key, b.key);
});

Panel 3 visualization
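For readers who want to check the summary step without D3, here is a minimal plain-JavaScript sketch of the same idea: drop the rows without a valid year, average Prod_BOE6Month per year, and sort by year. The sample rows and the function name meanProductionByYear are made up for illustration; this is not the code used in the dashboard.

```javascript
// Hypothetical sample rows with only the two columns used here.
const sampleRows = [
  { Year: 2012, Prod_BOE6Month: 100 },
  { Year: 2011, Prod_BOE6Month: 50 },
  { Year: 2012, Prod_BOE6Month: 300 },
  { Year: 0, Prod_BOE6Month: 999 }, // no valid year, should be ignored
  { Year: 2011, Prod_BOE6Month: 150 },
];

// Average 6-month production per year, sorted by year. The result uses the
// same { key, value } shape as the nested D3 output.
function meanProductionByYear(rows) {
  const totals = new Map(); // year -> { sum, count }
  for (const row of rows) {
    const year = Number(row.Year);
    const boe = Number(row.Prod_BOE6Month);
    // Skip rows without a positive year or with non-numeric production.
    if (!(year > 0) || Number.isNaN(boe)) continue;
    const t = totals.get(year) || { sum: 0, count: 0 };
    t.sum += boe;
    t.count += 1;
    totals.set(year, t);
  }
  return Array.from(totals, ([year, t]) => ({
    key: String(year),
    value: t.sum / t.count,
  })).sort((a, b) => Number(a.key) - Number(b.key));
}

const perYear = meanProductionByYear(sampleRows);
console.log(perYear);
```

Filtering on the year explicitly avoids having to guess which of the two top-level groups (valid year or not) comes first in the nested result.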

Fourth Panel

The last panel of our visualization lists the most successful operator for each year in terms of average production. To accomplish this, we need to find the list of operators for each year, calculate their respective average production, and then find the highest average. This is a multi-step filtering of the data.

First, we calculate the variable NestedProduction2 using d3.nest(). It groups the operators (Info_Operator) by year (Year) and returns the average 6-month production of each operator. The rollup function produces an object containing AVG, the average 6-month production, and Count, the number of wells owned by each operator in that year.

var NestedProduction2 = d3.nest()
.key(function(d) {return d.Year > 0;})
.key(function(d) {return d.Year;})
.key(function(d) {return d.Info_Operator;})
.rollup(function(leaves) {
// AVG: mean 6-month production; Count: number of wells for this operator and year
var AVG = d3.mean(leaves, function(d) {return d.Prod_BOE6Month;});
return {AVG: AVG, Count: leaves.length};})
.entries(EfData);

Finally, by filtering NestedProduction2, we can find the operator with the highest average production each year among those with at least 20 wells in the database. The resulting table is shown below:

Table of most successful operators each year by production with at least 20 wells in the dataset.
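The final filtering step is not shown above, so here is one way it could look in plain JavaScript. This is only a sketch under my own assumptions, not the code behind the table: the sample rows, the operator names, the function name topOperatorPerYear, and the threshold of 2 wells (instead of 20) are all made up to keep the example small.

```javascript
// Hypothetical sample rows with the three columns used in this panel.
const sampleProduction = [
  { Year: 2012, Info_Operator: "OPERATOR A", Prod_BOE6Month: 100 },
  { Year: 2012, Info_Operator: "OPERATOR A", Prod_BOE6Month: 300 },
  { Year: 2012, Info_Operator: "OPERATOR B", Prod_BOE6Month: 500 },
  { Year: 2011, Info_Operator: "OPERATOR B", Prod_BOE6Month: 50 },
  { Year: 2011, Info_Operator: "OPERATOR B", Prod_BOE6Month: 70 },
  { Year: 2011, Info_Operator: "OPERATOR A", Prod_BOE6Month: 40 },
  { Year: 2011, Info_Operator: "OPERATOR A", Prod_BOE6Month: 60 },
  { Year: 2011, Info_Operator: "OPERATOR A", Prod_BOE6Month: 20 },
];

// For each year, return the operator with the highest average 6-month
// production among operators with at least `minWells` wells in that year.
function topOperatorPerYear(rows, minWells) {
  const stats = new Map(); // year -> Map(operator -> { sum, count })
  for (const row of rows) {
    const year = Number(row.Year);
    if (!(year > 0)) continue; // skip rows without a valid year
    if (!stats.has(year)) stats.set(year, new Map());
    const byOperator = stats.get(year);
    const s = byOperator.get(row.Info_Operator) || { sum: 0, count: 0 };
    s.sum += Number(row.Prod_BOE6Month);
    s.count += 1;
    byOperator.set(row.Info_Operator, s);
  }
  const result = [];
  for (const [year, byOperator] of stats) {
    let best = null;
    for (const [operator, s] of byOperator) {
      if (s.count < minWells) continue; // too few wells to qualify
      const avg = s.sum / s.count;
      if (best === null || avg > best.AVG) {
        best = { Year: year, Operator: operator, AVG: avg, Count: s.count };
      }
    }
    if (best !== null) result.push(best);
  }
  return result.sort((a, b) => a.Year - b.Year);
}

const winners = topOperatorPerYear(sampleProduction, 2);
console.log(winners);
```

In the sample data, OPERATOR B has the highest 2012 average but only one well that year, so it is excluded by the minimum-well rule and OPERATOR A is reported instead. The same rule in the dashboard keeps operators with a handful of exceptional wells from dominating the table.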

Conclusion & Future Work

In this two-article series, I attempted to show you how to use D3.js to build a dashboard for upstream oil and gas data. I know I skipped a lot of details, but my goal was to show you the overall process. You are welcome to use the data and build your own dashboard. I am happy to answer any questions regarding the data and methodology.

The data and code can be found on my Github page.

This data was scraped from the Railroad Commission of Texas; I scraped it and cleaned it up. You can use it freely. In return, please share with me any improvements you make to my version. Remember, sharing is caring!

This work can be greatly improved by adding interactive functionality. For example, one could improve the map panel to let the user click on a well and display information about that well.

Thanks for reading! My name is Amir Nejad, PhD. I'm a data scientist and editor of QuantJam, and I love to share my ideas and collaborate with fellow data scientists. You can connect with me on Github, Twitter, and LinkedIn.

Amir Nejad

PhD. Engineer | Data Scientist | Problem Solver | Solution Oriented (twitter: @Dr_Nejad)