CS 526 - Computer Graphics II -- Yiwen Sun
Final Exam: Household Energy Consumption Analysis
In this project I visualized and analyzed electricity, water, and natural gas usage for Andy's household. I show the data itself, the relationships between attributes, the repeating seasonal patterns, the long-term trends, and the short-term variations, and I form some hypotheses about what real-world changes could have caused them.
The visualizations and analysis were produced with three tools: my own program written in Java using JFreeChart, XmdvTool, and Microsoft Excel.
Data
The data set is given as the average daily usage for each month from August 1998 to March 2009. Here is an example:
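| Year | Month | Electricity (kWh per day) | Temp (F) | Natural Gas (therms per day) | Water (gallons per day) |
|------|-------|---------------------------|----------|------------------------------|-------------------------|
| 1998 | 8     | 22.4                      | 70       | 0.5                          | 61.5                    |
| 1998 | 9     | 22.4                      | 70       | 0.5                          | 61.5                    |
| 1998 | 10    | 8.6                       | 53       | 2.2                          | 61.5                    |
| 1998 | 11    | 7.6                       | 44       | 3.7                          | 61.5                    |
| 1998 | 12    | 10                        | 35       | 5.9                          | 61.5                    |
| 1999 | 1     | 11.6                      | 22       | 8.2                          | 61.5                    |
| 1999 | 2     | 8.4                       | 34       | 5.8                          | 74.8                    |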
In this work, I mainly visualize three measures:
Electricity
Natural gas
Water
and use the visualizations together with the other attributes (temperature, year, month) to help analyze the usage patterns and trends of the three measures.
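As a minimal sketch of how one row of this data set could be held in memory (this is not the code of my program; the class name `MonthlyUsage` and its field names are assumptions made for illustration), each line of the example above can be parsed into a small immutable object:

```java
import java.util.ArrayList;
import java.util.List;

/** One row of the data set: average daily usage for a single month (hypothetical class). */
final class MonthlyUsage {
    final int year;
    final int month;             // 1 = January ... 12 = December
    final double electricityKwh; // kWh per day
    final double tempF;          // degrees Fahrenheit
    final double gasTherms;      // therms per day
    final double waterGallons;   // gallons per day

    MonthlyUsage(int year, int month, double electricityKwh,
                 double tempF, double gasTherms, double waterGallons) {
        this.year = year;
        this.month = month;
        this.electricityKwh = electricityKwh;
        this.tempF = tempF;
        this.gasTherms = gasTherms;
        this.waterGallons = waterGallons;
    }

    /** Parses one whitespace-separated row such as "1998 10 8.6 53 2.2 61.5". */
    static MonthlyUsage parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + line);
        }
        return new MonthlyUsage(
                Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                Double.parseDouble(f[2]), Double.parseDouble(f[3]),
                Double.parseDouble(f[4]), Double.parseDouble(f[5]));
    }

    /** Parses several rows, keeping their order. */
    static List<MonthlyUsage> parseAll(String... lines) {
        List<MonthlyUsage> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(parse(line));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two rows copied from the example table.
        List<MonthlyUsage> rows = parseAll("1998 8 22.4 70 0.5 61.5",
                                           "1999 2 8.4 34 5.8 74.8");
        for (MonthlyUsage r : rows) {
            System.out.printf("%d-%02d: %.1f kWh/day, %.1f therms/day, %.1f gal/day%n",
                    r.year, r.month, r.electricityKwh, r.gasTherms, r.waterGallons);
        }
    }
}
```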
Visualization and Analysis
- Visualize Data in Time-series Plot
In this scenario, the three measures and temperature are visualized as time-series plots over the 10 years.
All of them show a seasonal pattern over time. When temperature is high, electricity and water usage are high but gas usage is low, and vice versa.
To look at the seasonal pattern in detail, I plot the measures over a one-year time frame using Excel. Here each line represents one year's data.
- Electricity usage is high from June to September, especially in July. Usage drops from November to April and is generally lowest in March and April.
- Gas usage is high from November to March, especially in January. Usage drops from May to September. The lowest usage is identical in June and July, at almost 1 therm/day.
- Water usage shows a twin-month pattern (pairs of consecutive months with identical values) because water is typically billed every 2 months. Water usage is relatively high from June to September and drops from October to January, with an outlier in August and September 2002.
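The per-year lines in these plots come from rearranging the single monthly series into one row per year. I did this step in Excel; the following is only a sketch of the same rearrangement in Java, with the class and method names (`SeasonalPivot`, `byYear`) chosen for illustration:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Rearranges a monthly series into one 12-value row per year (hypothetical helper). */
final class SeasonalPivot {

    /**
     * Returns year -> 12 monthly values (index 0 = January).
     * Months with no data stay NaN, so a partial first or last year is still usable.
     */
    static Map<Integer, double[]> byYear(int[] years, int[] months, double[] values) {
        if (years.length != months.length || years.length != values.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        Map<Integer, double[]> table = new TreeMap<>();
        for (int i = 0; i < years.length; i++) {
            double[] row = table.computeIfAbsent(years[i], y -> {
                double[] r = new double[12];
                Arrays.fill(r, Double.NaN);
                return r;
            });
            row[months[i] - 1] = values[i];
        }
        return table;
    }

    public static void main(String[] args) {
        // Electricity values (kWh/day) from the example rows in the Data section.
        int[] years = {1998, 1998, 1998, 1998, 1998, 1999, 1999};
        int[] months = {8, 9, 10, 11, 12, 1, 2};
        double[] electricity = {22.4, 22.4, 8.6, 7.6, 10, 11.6, 8.4};
        for (Map.Entry<Integer, double[]> e : byYear(years, months, electricity).entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(e.getValue()));
        }
    }
}
```

Each row of the result can then be drawn as one line of the seasonal plot, with the month on the horizontal axis.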
- Explore the Relationship between Attributes
1. To find pairwise relationships, use a scatter plot matrix
The snapshot generated by XmdvTool shows a 5x5 scatter plot matrix of the three measures, temperature and month.
This scatter plot matrix shows that:
Gas and temperature are strongly negatively correlated, which explains the high gas usage in winter, when gas is used for heating.
Electricity and temperature are positively correlated, which explains the high electricity usage in summer, when the air conditioner is used for cooling.
Water and temperature are weakly positively correlated, which explains the relatively high water usage in summer, probably because of more frequent pond water changes. Water is also weakly correlated with electricity, because the pond needs more electricity in summer.
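The correlations above were read off the scatter plots visually. As a hedged sketch of how they could also be quantified, the Pearson correlation coefficient can be computed for any pair of attributes; the class name `Correlation` is an assumption for illustration, and the demo uses only the seven example rows from the Data section, so its result describes those rows rather than the full data set:

```java
/** Pearson correlation between two equally long series (hypothetical helper). */
final class Correlation {

    /**
     * Returns a value in [-1, 1]: near +1 for a strong positive linear relationship,
     * near -1 for a strong negative one. Returns NaN if either series is constant.
     */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Example rows from the Data section (August 1998 to February 1999).
        double[] temp = {70, 70, 53, 44, 35, 22, 34};
        double[] gas = {0.5, 0.5, 2.2, 3.7, 5.9, 8.2, 5.8};
        double[] electricity = {22.4, 22.4, 8.6, 7.6, 10, 11.6, 8.4};
        System.out.printf("temperature vs gas:         %.3f%n", pearson(temp, gas));
        System.out.printf("temperature vs electricity: %.3f%n", pearson(temp, electricity));
    }
}
```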
2. To view the relationships among all the attributes, use parallel coordinates
High (in red) and low (in yellow) values are brushed for electricity, gas, and water.
The plots also show that temperature and electricity have a very strong positive correlation, while temperature and gas have a negative (inverse) correlation.
![]()
![]()
An outlier (highlighted in white) in water usage is detected in August and September 2002.
- Visualize long term trends and short term variations
I use two models to show the trends.
The first is a statistical baseline model, in which the data are accumulated and averaged for each of the 12 months. In the snapshot, the dark red line shows the baseline, and the shaded area shows the minimum and maximum.
It reveals that,
- electricity usage went up in 2000, 2001, and 2002, began to drop in 2004, and kept falling afterwards.
- gas usage does not change dramatically, but it was higher than average in 1999, 2000, and 2001, and has been lower than average since 2006.
- water usage went up in 2000, reached an extreme in 2002, and started to drop in 2008.
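A baseline of this kind can be sketched as follows, assuming it is the mean of each calendar month across all years, with the minimum and maximum giving the shaded band. The class name `MonthlyBaseline` and the array layout are illustrative choices, not taken from my program:

```java
import java.util.Arrays;

/** Per-calendar-month mean, minimum, and maximum across all years (hypothetical helper). */
final class MonthlyBaseline {

    /**
     * Returns a 12 x 3 array: result[m] = {mean, min, max} for calendar month m + 1.
     * A month with no observations gets {NaN, NaN, NaN}.
     */
    static double[][] compute(int[] months, double[] values) {
        if (months.length != values.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double[] sum = new double[12];
        int[] count = new int[12];
        double[] min = new double[12];
        double[] max = new double[12];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);

        for (int i = 0; i < months.length; i++) {
            int m = months[i] - 1;
            sum[m] += values[i];
            count[m]++;
            min[m] = Math.min(min[m], values[i]);
            max[m] = Math.max(max[m], values[i]);
        }

        double[][] result = new double[12][3];
        for (int m = 0; m < 12; m++) {
            if (count[m] == 0) {
                Arrays.fill(result[m], Double.NaN);
            } else {
                result[m][0] = sum[m] / count[m];
                result[m][1] = min[m];
                result[m][2] = max[m];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Gas values (therms/day) from the example rows in the Data section.
        // Each month occurs only once here, so mean, min, and max coincide.
        int[] months = {8, 9, 10, 11, 12, 1, 2};
        double[] gas = {0.5, 0.5, 2.2, 3.7, 5.9, 8.2, 5.8};
        double[][] baseline = compute(months, gas);
        for (int m = 0; m < 12; m++) {
            System.out.println((m + 1) + " " + Arrays.toString(baseline[m]));
        }
    }
}
```

A year whose values sit above the mean line for most months is then "higher than average" in the sense used in the list above.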
Second, a regression model is built to show the expected values and to detect short-term variations. Using MS Excel, I tried each of 5 independent variables (temperature, month, electricity, gas, water) in turn to find the best-fit function for electricity, gas, and water, as shown below:
![]()
![]()
I then use these functions to calculate the expected values and plot them on the time series in my program (shown as a red line):
Electricity expected by temperature
Gas expected by temperature
Gas expected by month (a better fit than the one above)
Water expected by temperature
Water expected by electricity
These plots help to detect the variations and explain their causes, as described below.
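The idea of "expected value versus actual value" can be illustrated with an ordinary least-squares straight line. This is only a sketch: the functions I actually used were fitted in Excel and are not necessarily linear, and the class name `LinearFit` is an assumption. The residual (actual minus expected) is what marks a month as above or below expectation:

```java
/** Least-squares straight line y = slope * x + intercept (hypothetical helper). */
final class LinearFit {
    final double slope;
    final double intercept;

    private LinearFit(double slope, double intercept) {
        this.slope = slope;
        this.intercept = intercept;
    }

    /** Fits the line that minimizes the sum of squared vertical errors. */
    static LinearFit fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x values are all identical");
        }
        double slope = sxy / sxx;
        return new LinearFit(slope, meanY - slope * meanX);
    }

    /** Expected value of the dependent variable for a given x. */
    double predict(double x) {
        return slope * x + intercept;
    }

    /** Actual minus expected for every observation; positive means above expectation. */
    double[] residuals(double[] x, double[] y) {
        double[] r = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            r[i] = y[i] - predict(x[i]);
        }
        return r;
    }

    public static void main(String[] args) {
        // Temperature (F) and gas (therms/day) from the example rows in the Data section.
        double[] temp = {70, 70, 53, 44, 35, 22, 34};
        double[] gas = {0.5, 0.5, 2.2, 3.7, 5.9, 8.2, 5.8};
        LinearFit f = fit(temp, gas);
        System.out.printf("gas = %.4f * temp + %.4f%n", f.slope, f.intercept);
        double[] r = f.residuals(temp, gas);
        for (int i = 0; i < r.length; i++) {
            System.out.printf("temp %.0f F: residual %+.2f therms/day%n", temp[i], r[i]);
        }
    }
}
```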
- What Lies Beneath - Link the Human Activity with the Trends and Variations
The detected variations are:
Started Pond: a jump in water usage in Aug 2000
Expanded Pond: an extremely high water usage in Aug 2002
Pond Water Changes: higher-than-expected usage in the summers of 2003, 2004, 2005, and 2007
Changed washing machine: a top-loading clothes washing machine uses 60 gallons per load and a front-loading machine uses 30 gallons, so the change should reduce daily usage by roughly 5 gallons. In Dec. 2001, the decrease from the previous December is close to this, and usage in the following Decembers stays at almost that level.
Started gardening aggressively: this would cause a long-term increase in water usage. From Oct. 2006 onward, usage is unexpectedly high.
Started using mulch: this would cause a long-term drop in water usage. In Oct. 2007, water usage drops below the expected value and stays below it afterwards.
Started using a thermostat: this would cause lower electricity usage in summer and lower gas usage in winter from then on. The electricity plot shows a big drop in June 2005, and the gas plot shows a noticeably lower-than-expected value in Dec. 2005.
Changed the air conditioner: this would cause lower electricity usage in summer. Both July 2004 and June 2007 show a decrease from the previous year and are below the expected value, so I guess one of them is the right time.
Switched to CF bulbs: According to Wikipedia, "CFLs use between one fifth and one third of the power of equivalent incandescent lamps. Since lighting accounted for approximately 9% of household electricity usage in the United States in 2001, widespread use of CFLs could save as much as 7% of total US household usage". So I looked at periods without extreme weather to find a decrease of similar size that persists in the following years, which led to my guess of Oct. 2003.
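The expected saving from the washing machine change can be checked with a short calculation. The sketch below assumes one load per week, which is how the washing machine appears in the water usage list in the pie chart section; the class name `WasherSavings` is chosen for illustration:

```java
/** Converts per-load water use into an average daily figure (hypothetical helper). */
final class WasherSavings {

    /** Average gallons per day for a machine using gallonsPerLoad, run loadsPerWeek times. */
    static double dailyGallons(double gallonsPerLoad, double loadsPerWeek) {
        return gallonsPerLoad * loadsPerWeek / 7.0;
    }

    public static void main(String[] args) {
        double loadsPerWeek = 1.0;                             // assumption: one load per week
        double topLoading = dailyGallons(60, loadsPerWeek);    // 60 gallons per load
        double frontLoading = dailyGallons(30, loadsPerWeek);  // 30 gallons per load
        System.out.printf("top-loading:   %.1f gallons/day%n", topLoading);
        System.out.printf("front-loading: %.1f gallons/day%n", frontLoading);
        System.out.printf("saving:        %.1f gallons/day%n", topLoading - frontLoading);
    }
}
```

Under that assumption the saving is 30 / 7 gallons per day, a little over 4, which is in the same range as the rough figure quoted above.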
- Show the Composition of Daily Usage in Pie Chart
Using Andy's measurements and the U.S. Household Electricity Report for 2001, I came up with a basic electricity usage list:
| Electricity (kWh/day) | Usage type                         | Time   |
|-----------------------|------------------------------------|--------|
| 3.5                   | Kitchen & Laundry                  |        |
| 1                     | Space Heating                      | winter |
| 3                     | Furnace Fan                        | winter |
| 6.5                   | Home Electronics                   |        |
| 2                     | Pond                               | winter |
| 4.5                   | Pond                               | summer |
|                       | Air Conditioning, Lighting & Other | summer |
|                       | Lighting & Other                   | winter |

Using the information Andy provided and the Water Consumption Calculator, I also came up with a basic water usage list:
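| Daily usage (gallons) | Unit (gallons)                         | Usage type                              | Time   |
|-----------------------|----------------------------------------|-----------------------------------------|--------|
| 14                    | 50 * 2 persons                         | bath                                    | weekly |
| 24                    | 2 gallons/min * 6 min * 2 persons      | shower                                  | daily  |
| 24                    | 3 gallons/flush * 4 flushes * 2 persons | toilet flushing                         | daily  |
| 20                    | 20                                     | dishwasher                              | daily  |
| 8.5                   | 60                                     | top-loading washing machine             | weekly |
| 4.3                   | 30                                     | front-loading washing machine           | weekly |
| 10                    | 10                                     | drinking, food preparing, other indoor  | daily  |
|                       |                                        | outdoor (gardening, pond, etc.)         |        |

I use these lists to draw pie charts of the daily composition. Here is an example showing the July 2002 usage: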
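The slices of such a pie chart are each category's share of the daily total. As a sketch of that step only (the chart itself was drawn in my program, and the class name `UsageShares` is an assumption), the shares of the indoor water items from the list above can be computed like this:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Turns daily amounts per category into fractions of the total (hypothetical helper). */
final class UsageShares {

    /** Returns category -> fraction of the total, in the same order as the input. */
    static Map<String, Double> shares(Map<String, Double> dailyAmounts) {
        double total = 0;
        for (double v : dailyAmounts.values()) {
            if (v < 0) {
                throw new IllegalArgumentException("amounts must not be negative");
            }
            total += v;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : dailyAmounts.entrySet()) {
            result.put(e.getKey(), e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Indoor water items (gallons/day) from the water usage list,
        // using the front-loading washing machine. Outdoor use is not listed
        // with a value there, so it is left out of this demo.
        Map<String, Double> indoor = new LinkedHashMap<>();
        indoor.put("bath", 14.0);
        indoor.put("shower", 24.0);
        indoor.put("toilet flushing", 24.0);
        indoor.put("dishwasher", 20.0);
        indoor.put("front-loading washing machine", 4.3);
        indoor.put("drinking, food preparing, other indoor", 10.0);

        for (Map.Entry<String, Double> e : shares(indoor).entrySet()) {
            System.out.printf("%-40s %5.1f%%%n", e.getKey(), 100.0 * e.getValue());
        }
    }
}
```

For a specific month, the outdoor slice would have to be estimated separately, for example as the measured monthly value minus the indoor total.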
by Yiwen Sun