CS 526 - Computer Graphics II                                           -- Yiwen Sun


Final Exam: Household Energy Consumption Analysis

In this project I visualized and analyzed the electricity, water, and natural gas usage of Andy's household. I show the data itself, the relationships between attributes, the repeating seasonal patterns, and the long-term trends and short-term variations, and I offer some hypotheses about what changes in the real world could have caused them.

The visualization and analysis were produced with three tools: my own program written in Java using JFreeChart, XmdvTool, and Microsoft Excel.


Data

The data set gives the average daily usage for each month from August 1998 to March 2009. Here is an example:

Year  Month  Electricity (kWh per day)  Temp (F)  Natural Gas (therms per day)  Water (gallons per day)
1998  8      22.4                       70        0.5                           61.5
1998  9      22.4                       70        0.5                           61.5
1998  10     8.6                        53        2.2                           61.5
1998  11     7.6                        44        3.7                           61.5
1998  12     10                         35        5.9                           61.5
1999  1      11.6                       22        8.2                           61.5
1999  2      8.4                        34        5.8                           74.8
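
As a minimal sketch of how rows in this layout could be read into a program, the class below parses one whitespace-separated line into a record. The class and field names are illustrative assumptions and are not taken from the original Java program.

```java
import java.util.ArrayList;
import java.util.List;

/** One monthly record. Names are illustrative, not from the original program. */
final class UsageRecord {
    final int year;
    final int month;
    final double electricityKwhPerDay;
    final double tempF;
    final double gasThermsPerDay;
    final double waterGallonsPerDay;

    UsageRecord(int year, int month, double electricity, double tempF,
                double gas, double water) {
        this.year = year;
        this.month = month;
        this.electricityKwhPerDay = electricity;
        this.tempF = tempF;
        this.gasThermsPerDay = gas;
        this.waterGallonsPerDay = water;
    }

    /** Parses one whitespace-separated row such as "1998 8 22.4 70 0.5 61.5". */
    static UsageRecord parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + line);
        }
        return new UsageRecord(
                Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                Double.parseDouble(f[2]), Double.parseDouble(f[3]),
                Double.parseDouble(f[4]), Double.parseDouble(f[5]));
    }

    /** Parses every data row; the header line must be removed by the caller. */
    static List<UsageRecord> parseAll(List<String> lines) {
        List<UsageRecord> records = new ArrayList<>();
        for (String line : lines) {
            records.add(parse(line));
        }
        return records;
    }
}
```

Splitting on runs of whitespace means the same parser accepts both single-space and column-aligned rows.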


In this work, I mainly visualize three measures (electricity, natural gas, and water usage) and use the visualization together with the other attributes (temperature, year, month) to help analyze the usage patterns and trends of the three measures.


Visualization and Analysis

In this scenario, the three measures and temperature can be visualized as time-series plots over the roughly 10 years of data.

   

All of them show a seasonal pattern over time. When temperature is high, electricity and water are high, but gas is low, and vice versa.

To look at the seasonal pattern in detail, I plotted the measures over a one-year time frame using Excel. Here each line represents one year's data.

  • Electricity usage is high from June to September, especially in July. Usage drops from November to April and is generally lowest in March and April.

  • Gas usage is high from November to March, especially in January. Usage drops from May to September. The low usage is the same in June and July, at almost 1 therm/day.

  • Water usage shows a twin-month phenomenon, because water is typically billed every 2 months. Water usage is relatively high from June to September and drops from October to January, with an outlier in August and September 2002.


1. To find pair-wise relationships, use a scatter plot matrix

The snapshot generated by XmdvTool shows a 5x5 scatter plot matrix of the three measures, temperature and month. 

This scatter plot matrix shows that:

  • Gas and temperature are highly negatively correlated, which leads to high gas usage in winter, because gas is used for heating.

  • Electricity and temperature are positively correlated, which explains the high electricity usage in summer, because the air conditioner is used for cooling.

  • Water and temperature are weakly positively correlated, which explains the relatively high water usage in summer, probably because the pond water is changed more often. Water is also weakly correlated with electricity, because the pond needs more electricity in summer.
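
The strength of these pair-wise relationships can also be expressed as a number. The sketch below computes the Pearson correlation coefficient in plain Java; it is an illustration added here, not something XmdvTool or the original program is stated to report, and the class name is an assumption. The example in `main` uses only the seven sample rows from the Data section.

```java
/** Pearson correlation between two equally long series (plain Java, no libraries). */
final class Correlation {
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        // A constant series makes the denominator 0 and the result NaN.
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Temperature and gas columns of the seven sample rows shown under "Data".
        double[] tempF = {70, 70, 53, 44, 35, 22, 34};
        double[] gas = {0.5, 0.5, 2.2, 3.7, 5.9, 8.2, 5.8};
        System.out.printf("temperature vs. gas: r = %.2f%n", pearson(tempF, gas));
    }
}
```

On those seven rows the coefficient is strongly negative, consistent with the first bullet; a value for the full data set would need all the monthly records.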


2. To view relationships among all the attributes, use parallel coordinates

High (in red) and low (in yellow) values are brushed for electricity, gas, and water.

The plot also shows that temperature and electricity have a very strong positive correlation, while temperature and gas have a negative (inverse) correlation.

An outlier (highlighted in white) in water is detected in August and September 2002.


I use two models to show the trends.

The first is a statistical baseline model, in which the data are accumulated and averaged by month over the 12 months. In the snapshot, the dark red line shows the baseline model and the shaded area shows the minimum and maximum.

It reveals that:

  • electricity usage went up in 2000, 2001, and 2002, began to drop in 2004, and kept going lower from then on.
  • gas usage did not change dramatically, but was higher than average in 1999, 2000, and 2001, and has been lower than average since 2006.
  • water usage went up in 2000, reached extremes in 2002, and started to drop in 2008.
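
A baseline of this kind can be sketched as follows. The code assumes the baseline is the mean for each calendar month taken over all years, with the minimum and maximum forming the shaded band; that reading of the description, and the class name, are assumptions rather than the original implementation.

```java
import java.util.Arrays;

/** Per-calendar-month baseline: mean, minimum and maximum over all years. */
final class MonthlyBaseline {
    final double[] mean = new double[12];
    final double[] min = new double[12];
    final double[] max = new double[12];

    /** months[i] is 1..12; values[i] is the measure recorded for that month. */
    MonthlyBaseline(int[] months, double[] values) {
        if (months.length != values.length) {
            throw new IllegalArgumentException("months and values differ in length");
        }
        double[] sum = new double[12];
        int[] count = new int[12];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);

        for (int i = 0; i < months.length; i++) {
            if (months[i] < 1 || months[i] > 12) {
                throw new IllegalArgumentException("bad month: " + months[i]);
            }
            int m = months[i] - 1;
            sum[m] += values[i];
            count[m]++;
            min[m] = Math.min(min[m], values[i]);
            max[m] = Math.max(max[m], values[i]);
        }
        for (int m = 0; m < 12; m++) {
            if (count[m] == 0) {
                // No data for this calendar month: mark all three as undefined.
                mean[m] = min[m] = max[m] = Double.NaN;
            } else {
                mean[m] = sum[m] / count[m];
            }
        }
    }
}
```

Comparing a given year's value for a month against `mean`, `min`, and `max` for that month is one way to read statements such as "higher than average" off the data.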


Second, a regression model is built to show the expected values and to detect short term variations.

Using MS Excel, I tried each of 5 independent variables (temperature, month, electricity, gas, water) in turn to find the best-fit function for electricity, gas, and water, as shown below:

I then use these functions to calculate the expected values and plot them on the time series in my program (shown as a red line):

Electricity expected by temperature

Gas expected by temperature

Gas expected by month (appears to fit better than the one above)

Water expected by Temperature

Water expected by electricity

    These plots help to detect the variations and explain the cause, which is described below.
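
To illustrate how expected values and variations relate, here is a minimal sketch that fits a straight line by least squares and flags records that lie far from it. The straight-line form, the RMS-based threshold, and the class and method names are all assumptions made for illustration; the actual best-fit functions came from Excel and may have a different form.

```java
/** Straight-line least-squares fit y = intercept + slope * x, plus residual flagging. */
final class LinearFit {
    final double slope;
    final double intercept;

    LinearFit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
    }

    /** Expected value of the dependent measure for a given x. */
    double expected(double x) {
        return intercept + slope * x;
    }

    /** Flags records whose residual exceeds k times the RMS residual (k is a free choice). */
    boolean[] flagVariations(double[] x, double[] y, double k) {
        double[] residual = new double[x.length];
        double sumSq = 0;
        for (int i = 0; i < x.length; i++) {
            residual[i] = y[i] - expected(x[i]);
            sumSq += residual[i] * residual[i];
        }
        double rms = Math.sqrt(sumSq / x.length);
        boolean[] flagged = new boolean[x.length];
        for (int i = 0; i < x.length; i++) {
            flagged[i] = Math.abs(residual[i]) > k * rms;
        }
        return flagged;
    }
}
```

For example, `new LinearFit(tempF, gas).expected(50)` would give the gas usage the fitted line predicts at 50 F, and the flagged months are candidates for a closer look.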

    The detected variations are:

 


    Using Andy's measurements and the U.S. HOUSEHOLD ELECTRICITY REPORT for 2001, I came up with a basic electricity usage list:

Electricity (kWh/day)  Usage type                          Time
3.5                    Kitchen & Laundry
1                      Space Heating                       winter
3                      Furnace Fan                         winter
6.5                    Home Electronics
2                      Pond                                winter
4.5                    Pond                                summer
                       Air Conditioning, Lighting & Other  summer
                       Lighting & Others                   winter

    Using the information Andy provided and the Water Consumption Calculator, I also came up with a basic water usage list:

Daily usage (gallons)  Unit (gallons)                         Usage type                                Time
14                     50 * 2 persons                         bath                                      weekly
24                     2 gallons/min * 6 min * 2 persons      shower                                    daily
24                     3 gallons/flush * 4 flushes * 2 persons  toilet flushing                         daily
20                     20                                     dishwasher                                daily
8.5                    60                                     top-loading washing machine               weekly
4.3                    30                                     front-loading washing machine             weekly
10                     10                                     drinking, food preparation, other indoor  daily
                                                              outdoor (gardening, pond, etc.)
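
The arithmetic behind the daily figures can be sketched as below: weekly amounts are divided by 7, and whatever the listed indoor uses do not account for is attributed to outdoor use. Treating outdoor use as the remainder is an assumption suggested by the empty cell in the list, and the measured total in `main` is a hypothetical placeholder, not a value from the data set.

```java
/** Arithmetic behind a daily water budget; all inputs are supplied by the caller. */
final class WaterBudget {
    /** Converts a weekly amount to its daily equivalent. */
    static double weeklyToDaily(double gallonsPerWeek) {
        return gallonsPerWeek / 7.0;
    }

    /** Assumption: usage not explained by the indoor items is outdoor usage (never negative). */
    static double outdoorRemainder(double measuredGallonsPerDay, double[] indoorGallonsPerDay) {
        double indoor = 0;
        for (double g : indoorGallonsPerDay) {
            indoor += g;
        }
        return Math.max(0.0, measuredGallonsPerDay - indoor);
    }

    public static void main(String[] args) {
        double bath = weeklyToDaily(50 * 2);   // 50 gallons * 2 persons, once a week
        double shower = 2 * 6 * 2;             // 2 gallons/min * 6 min * 2 persons, daily
        double toilet = 3 * 4 * 2;             // 3 gallons/flush * 4 flushes * 2 persons, daily
        double dishwasher = 20;                // daily
        double topLoader = weeklyToDaily(60);  // one 60-gallon load per week
        double otherIndoor = 10;               // drinking, food preparation, other indoor

        double[] indoor = {bath, shower, toilet, dishwasher, topLoader, otherIndoor};
        double measured = 150.0;               // HYPOTHETICAL monthly average, gallons per day
        System.out.printf("outdoor remainder: %.1f gallons/day%n",
                outdoorRemainder(measured, indoor));
    }
}
```

The list rounds the weekly items (for example 100 / 7 to 14), so sums computed this way differ slightly from sums of the tabulated daily values.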

I used these lists to draw pie charts of the daily composition. Here is an example showing the July 2002 usage:


by Yiwen Sun