# 20.8 Statistics

## This activity is from the Sourcebook for Teaching Science.

- Introduction: What is the difference between linear, polynomial, and exponential functions?
- Use Desmos to plot linear, polynomial and exponential functions.
- What is a best fit?
- What is the meaning of R²?

**Best Fit**

Obtain the best-fit equations for the following using the spreadsheet for 20.8 (copy the spreadsheet):

- Statistics on air quality (20.8.1)
- Car acceleration in 20.8.2
- Pendulums in 20.8.2
- Relationship of temperature and pressure in gases
- Relationship of atmospheric carbon dioxide to time

In 1989, two researchers announced that they had achieved nuclear fusion with a simple apparatus at room temperature. Fusion, the process in which two atomic nuclei combine to form a larger nucleus, has been touted as the answer to the world’s energy problems, but it has only been achieved in high-temperature, high-energy environments. The announcement of “cold fusion” was of interest to scientists and energy planners worldwide, but unfortunately, no one was ever able to replicate the researchers’ purported findings. Although the researchers may have been earnest in their report, they did not have any independent confirmation of their work. A sample size of one is not sufficient to establish anything in science, and the researchers should not have presented their findings to the media without sufficient verification from repeated experimentation.

Scientific research relies on independent confirmation and *statistics*, the branch of mathematics that deals with the analysis and interpretation of numerical data. The school science laboratory is an excellent place to employ statistics because many students and lab groups may collect data on the same experiment. Rather than relying on one data point from one group, it is better to take the mean of all groups. An average (or mean) is perhaps the most common statistical measure, but there are others that can also assist scientists in their interpretation of data. Spreadsheet programs provide tools to perform many statistical tests, but we shall focus on those most commonly used in science, namely basic descriptive measures (*percent, per capita, mean, median, mode, maximum, minimum*) and *curve fitting*.

**Activity 20.8.1 – Descriptive statistics: Making sense of the data**

In 1952, a sulfur-laden smog covered London, England, leading to the deaths of approximately 4000 people. In 1963, an air pollution inversion occurred in New York City, leading to 168 deaths. Shocking tragedies such as these led to the passage of the Air Quality Control Act in the United States, and similar measures in other parts of the world. Since the passage of this landmark act in 1967, agencies have been commissioned to measure pollution and set standards.

Figure 20.30[i] shows the number of “unhealthful air” days per year in some of the major cities in America in 1999. To determine the percentage of days that are considered to have “unhealthful air”, divide the number of unhealthful days by 365 days per year and multiply by 100 to convert to percent. Once the formula has been entered in the top cell, it can be copied to the remaining cells. When you have completed this calculation, determine the *average* (=AVERAGE(*first cell: last cell*)) and *median* (=MEDIAN(*first cell: last cell*)) number of unhealthful days for the cities listed. Finally, determine the city with the largest number of unhealthful days (=MAX(*first cell: last cell*)) and the city with the fewest (=MIN(*first cell: last cell*)).
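The same calculations can be checked outside a spreadsheet. The following is a minimal Python sketch, not part of the original activity. The city names and day counts are hypothetical placeholders, not the values from Figure 20.30; substitute the real figures before drawing any conclusions.

```python
from statistics import mean, median

# HYPOTHETICAL counts of "unhealthful air" days per year.
# These are placeholders, NOT the data from Figure 20.30.
unhealthful_days = {
    "City A": 10,
    "City B": 25,
    "City C": 40,
    "City D": 5,
    "City E": 73,
}

DAYS_PER_YEAR = 365

# Percent of the year with unhealthful air (spreadsheet: =B2/365*100).
percent_unhealthful = {
    city: 100 * days / DAYS_PER_YEAR
    for city, days in unhealthful_days.items()
}

counts = list(unhealthful_days.values())
average_days = mean(counts)      # spreadsheet: =AVERAGE(first cell:last cell)
median_days = median(counts)     # spreadsheet: =MEDIAN(first cell:last cell)

# MAX and MIN return the value; here we also look up which city it belongs to.
worst_city = max(unhealthful_days, key=unhealthful_days.get)
best_city = min(unhealthful_days, key=unhealthful_days.get)

for city, pct in percent_unhealthful.items():
    print(f"{city}: {pct:.1f}% of days unhealthful")
print("Average:", average_days, "Median:", median_days)
print("Most unhealthful days:", worst_city, "Fewest:", best_city)
```

With skewed data such as these placeholders, the mean is pulled toward the largest value while the median is not, which is one reason to report both.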

**Activity 20.8.2 – Trendlines: Discovering relationships in the data**

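A *trendline* is a best-fit line or curve through a series of data points. A trendline can be a *linear*, *exponential*, *power*, *logarithmic*, or *polynomial* function. Trendlines help researchers visualize relationships. The best trendline is the one that fits the data most closely. A common measure of fit is the coefficient of determination, R²: the closer R² is to 1, the more closely the trendline follows the data.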
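To make the idea of a best fit concrete, the following Python sketch fits a linear trendline by ordinary least squares and computes R² as 1 minus the ratio of the residual sum of squares to the total sum of squares. It is an illustration under stated assumptions, not the method any particular spreadsheet uses internally. The function names `fit_line` and `r_squared` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# HYPOTHETICAL data points, roughly following y = 2x with small scatter.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
r2 = r_squared(ys, predicted)

print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r2:.4f}")
```

An R² near 1 means the line accounts for nearly all of the variation in the data. It does not by itself show that a linear model is the right one, so it is still worth comparing other trendline types.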

(1) __Motion__ – Table 20.7A lists time and distance data for an accelerating automobile. Graph these data and determine the best trendline (linear, polynomial, or exponential). Try all types to see which fits the data best. You can hover over the trendline to see the equation calculated by Google Charts.
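As a rough check on what to expect, the sketch below compares a linear fit with a quadratic fit on hypothetical data. It assumes the automobile starts from rest with constant acceleration, so that distance follows d = ½at². That assumption, the acceleration of 3 m/s², and the generated data are illustrative only and are not the values in Table 20.7A. For simplicity the quadratic model is d = c·t² through the origin, which is narrower than the general second-order polynomial a spreadsheet fits.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# HYPOTHETICAL data: constant acceleration of 3 m/s^2 from rest (NOT Table 20.7A).
times = [0, 1, 2, 3, 4, 5]                          # seconds
distances = [0.5 * 3.0 * t ** 2 for t in times]     # meters

# Linear trendline: d = m*t + b
m, b = fit_line(times, distances)
r2_linear = r_squared(distances, [m * t + b for t in times])

# Quadratic trendline through the origin: d = c*t^2,
# with least-squares c = sum(d * t^2) / sum(t^4).
c = sum(d * t ** 2 for t, d in zip(times, distances)) / sum(t ** 4 for t in times)
r2_quadratic = r_squared(distances, [c * t ** 2 for t in times])

acceleration = 2 * c  # because d = (1/2) * a * t^2

print(f"Linear R^2 = {r2_linear:.4f}, quadratic R^2 = {r2_quadratic:.4f}")
print(f"Estimated acceleration = {acceleration:.2f} m/s^2")
```

Under these assumptions the quadratic model fits the generated data essentially exactly, while the linear model does not. Real measurements will include scatter, so neither R² will be exactly 1.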

(2) __Pendulums__ – In 1656, Christiaan Huygens, a Dutch scientist, invented the first pendulum clock. What formulas govern the movement of pendulums? Plot the experimental data from Table 20.7B and determine the best trendline. Is the relationship linear, exponential, power, logarithmic, or polynomial? What is the basic equation of the pendulum?
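One way to test for a power relationship is to fit a straight line to the logarithms of the data: if y = k·xⁿ, then log y = log k + n·log x, so the slope of the log-log line is the exponent n. The sketch below applies this to hypothetical pendulum data. It assumes the table relates period to length and generates the data from the standard small-angle formula T = 2π√(L/g). The lengths and the value g = 9.81 m/s² are illustrative assumptions, not the values in Table 20.7B.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


G = 9.81  # m/s^2, assumed gravitational acceleration

# HYPOTHETICAL data generated from T = 2*pi*sqrt(L/g) (NOT Table 20.7B).
lengths = [0.25, 0.5, 1.0, 1.5, 2.0]                            # meters
periods = [2 * math.pi * math.sqrt(L / G) for L in lengths]     # seconds

# Power-law fit T = k * L**n via a straight-line fit in log-log space.
log_lengths = [math.log(L) for L in lengths]
log_periods = [math.log(T) for T in periods]
exponent, log_coefficient = fit_line(log_lengths, log_periods)
coefficient = math.exp(log_coefficient)

print(f"T = {coefficient:.3f} * L^{exponent:.3f}")
```

An exponent close to 0.5 indicates that the period grows with the square root of the length. With measured data the fitted exponent will differ somewhat from 0.5 because of timing error and larger swing angles.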

[i] Environmental Protection Agency. (2000). *National Air Quality and Emissions Trends Report, 1999*. Research Triangle Park, NC: Air Quality Trends Analysis Group.