The next tab contains data on houses sold in Baton Rouge, Louisiana in mid-2005. The dataset contains the following variables:

Price: The price the house sold for

Square Feet: The size of the house in square feet

Beds: The number of bedrooms in the house

Baths: The number of bathrooms in the house

Year Built: The year in which the house was built

Age: The age of the house in years

Pool: Does the house have a pool (yes/no)

Fireplace: Does the house have a fireplace (yes/no)

Waterfront: Is the house on the waterfront (yes/no)

Days on Market: How many days the house spent on the market before it was ultimately sold

Occupancy: Is the house occupied by the owner, a tenant, or is it vacant

Style: What is the architectural style of the house?

Use the data provided to answer the following questions. Problem 1.1 comes from material in week 1, Problems 1.2 and 1.3 relate to material from week 2, and problem 1.4 is related to the concepts and topics from week 3. Should you encounter any difficulties with these problems, the optional problems below are very similar to the questions in this problem set, and the answers to these questions can be found in the back of the textbook. You can also request that the tutor work extensively with you on the optional problems.

Problem 1.1

“Use the Baton Rouge house price data to do the following:

(a) Which of the variables are categorical and which are numerical?

(b) Which of the categorical variables are nominal and which are ordinal?

(c) Which of the numerical variables are ratio and which are interval?

”

Problem 1.2

“Use the Baton Rouge house price data to do the following:

(a) Create a frequency table of the architectural styles of the homes sold.

(b) Construct a bar chart, a pie chart, and a Pareto diagram.

(c) Which graphical method do you think is best to portray these data?

(d) Based on this data, what conclusions can you make about architectural styles in Baton Rouge?

Excel tips:

The easiest way to create the frequency summary table is with the COUNTIF command. The command works like this: =COUNTIF(x,y), where x is the set of cells you want to look in for a particular value, and y is the value you are looking for. For example, =COUNTIF(K:K, “”Vacant””) would look in the K column for all instances of the term “”Vacant””, count them up, and give you the number.

Excel doesn’t have a “”canned”” option to create a Pareto Diagram, so this is what you need to do: First, construct a column chart with data for both percentage and cumulative percentage. Then, click on one of the columns that represents data from one of the cumulative percentages, choose “”change series chart type,”” and set it to line. ”

Problem 1.3

“Use the Baton Rouge house price data to do the following:

(a) Construct a frequency distribution and a percentage distribution of the number of bedrooms in the sold houses.

(b) Construct a histogram and a percentage polygon.

(c) Plot a cumulative percentage polygon.

Excel tips:

As a general rule of thumb, the fewer times you type out a formula, the better. If you can accomplish a task by writing one formula and then filling it down with the fill bar, do it that way so you can minimize your chances of making mistakes. One feature that is very useful for accomplishing this task is making use of relative and absolute cell references. Say, for example, in cell C1 you have the expression =A1+B1. If you fill C1 down to C2, C2 will have the equation =A2+B2…when filling down, Excel viewed your cell references as relative, as if you said in C1 to make C1 equal to the sum of the two cells to the left of it. When you fill down to C2, Excel said that C2 should equal the sum of the two cells to the left of it, in this case A2 and B2. In some cases this is exactly what you want Excel to do, but in others you do not want relative cell references. For example, cell D1 may have total nationwide sales for your company, A1:A51 may have the names of the 50 states (plus DC!), and B1:B51 may have total sales within each state. In column C you want to have the percentage of total sales within that state. If in C1 you type =B1/D1, you will get the correct result, but then if you fill D1 down to D51, you will get error messages (or wrong answers) everywhere else. The easiest fix is to use an absolute reference in your equation, which you accomplish with the dollar sign ($). The $ is essentially a way of “”locking”” in either a row or column in a cell reference. If you type in C1 =B1/D$1, and then fill down, Excel will “”lock in”” the first row in the reference, so all of your formulae will compute correctly! With knowledge and appropriate application of relative and absolute references, it is possible to create the table in part (a) by typing exactly 4 equations (and filling them down) and nothing else! ”

Problem 1.4

“Use the Baton Rouge house price data to do the following:

(a) Compute the mean, median, first quartile, and third quartile for the price variable.

(b) Compute the variance, standard deviation, range, interquartile range, coefficient of variation, skewness, and Z scores for the price variable.

(c) Are the data skewed? If so, how?

(d) Based on the results of (a) through (c), what conclusions can you reach concerning price?

(e) Calculate the proportion of house prices that are +/- 1, +/- 2, and +/- 3 standard deviations of the mean.

(f) Compare and contrast your findings with what would be expected on the basis of the empirical rule.

Excel Tips:

Excel has built-in functions to calculate the mean (AVERAGE), median (MEDIAN), quartiles (QUARTILE), variance (VAR for samples, VARP for populations), and standard deviation (STDEV for samples, STDEVP for populations). You might also think to use the MIN and MAX commands in calculating range. Search the Excel helpfile for the appropriate syntax of these commands. You should note that the method Excel uses to calculate quartiles differs slightly from the method outlined in the book.

I mentioned the COUNTIF command above, and you may have thought to use COUNTIF to accomplish part (e). However, a single COUNTIF command cannot handle more than one condition, so if you want to use COUNTIF you should absolutely (hint hint) think outside the box a bit. Or, if you are using Excel 2007 or later, there is a COUNTIFS command that handles multiple conditions.”