
Sunday, March 17, 2013

Extracting Information From Objects Using Names()

One of the big differences between R and a language like Stata is R's ability to handle many different types of objects at once, and to combine them or pull them apart.  I had a post about objects last year, but in this post I'll show how to extract information from the objects you create in R.

For this example, I'll go back to a dataset I've used in the past, mydata.Rdata, which is available on the Code and Data Download site.

One function that is extremely useful to know is names().  The names() function shows you the names of the components stored under an object.  So, for example, if you do

names(mydata)

where mydata is a dataframe object, you will get the names of the columns, which are the vectors that comprise the dataframe. Note that names(mydata) is an object itself (because everything is an object in R) - it is a character vector of length 7.  You can save this vector and print out the class to verify this.
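
As a minimal sketch of that check, assuming a small made-up data frame standing in for mydata (the column names and values below are invented for illustration and are not taken from mydata.Rdata), saving the names and verifying the class might look like this:

```r
# Hypothetical stand-in for mydata: 7 invented columns, not the real dataset
mydata <- data.frame(
  ID     = 1:6,
  Age    = c(23, 35, 41, 29, 52, 38),
  Sex    = factor(c("F", "M", "F", "M", "F", "M")),
  Weight = c(130, 180, 145, 200, 150, 175),
  Height = c(64, 70, 66, 72, 65, 69),
  Group  = factor(c("A", "A", "B", "B", "A", "B")),
  Score  = c(7.1, 6.4, 8.2, 5.9, 7.7, 6.8)
)

# Save the column names into their own object (col.names is an assumed name)
col.names <- names(mydata)

class(col.names)   # "character"
length(col.names)  # 7
```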








But names() can be useful for much more than just column names, as we'll see in a moment.

Before we go on, let's take a moment to remember how subsetting works. In subsetting, you use square brackets to pull out exactly the element of an object that you want. So if I want to subset a dataframe, I can say

mydata.subset <- mydata[, 1:2]

which saves all the rows and only the first two columns of the mydata dataframe into the new object mydata.subset.

Now, let's combine the concept of using the names() function with the concept of subsetting to change one of the column names of our dataset:

names(mydata)[4]<-"Weight_lbs"

Here we are saying, of the names(mydata) object, take the fourth element and make it "Weight_lbs".  Now, if you run names() on the dataframe, you'll see that the change has been made:
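
A minimal sketch of the rename, assuming a made-up data frame in place of mydata (only the replacement name "Weight_lbs" comes from the text; the other column names are invented):

```r
# Hypothetical stand-in for mydata; the fourth column is the one renamed
mydata <- data.frame(
  ID     = 1:3,
  Age    = c(23, 35, 41),
  Sex    = factor(c("F", "M", "F")),
  Weight = c(130, 180, 145)
)

names(mydata)[4]                  # the old name of the fourth column
names(mydata)[4] <- "Weight_lbs"  # replace only the fourth name
names(mydata)                     # the fourth entry is now "Weight_lbs"
```

Only the name changes; the values stored in the column are untouched.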




Ok, so now we'll see how the names() function can be used in other applications.

1. Summary objects

There are two common ways to extract information from objects in R: subsetting with square brackets and using the "$" operator.

Below, we summarize the Age vector and store the results in sum.vec.  We print out the sum.vec object and then print out the corresponding names.  Now we can extract the first element of the summary vector of Age using the [ ] operator.
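
A minimal sketch of these steps, assuming a short made-up Age vector (the values are invented and are not from mydata):

```r
# Hypothetical ages standing in for mydata$Age
Age <- c(23, 35, 41, 29, 52, 38)

sum.vec <- summary(Age)  # six-number summary of a numeric vector
sum.vec                  # print the summary
names(sum.vec)           # "Min." "1st Qu." "Median" "Mean" "3rd Qu." "Max."

sum.vec[1]               # first element, labelled "Min."
sum.vec["Median"]        # elements can also be pulled out by name
```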













This gives us the first element, which is the minimum. We could also do:

sum.vec[c(2,3,5)] 

for the 25th, 50th, and 75th percentiles.


The other way to extract is by using "$".  For example, the summary() function on a table object gives you a chi-squared test:












Here, you can extract any of the pieces of information that came out of the test, including the number of cases, the number of variables, the test statistic, etc.  We can extract the p-value of the test by using the "$" operator, like this:
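
A minimal sketch of both steps, assuming two made-up categorical vectors (the names sex, group, and sum.table and all the values are invented for illustration):

```r
# Hypothetical categorical data; far too few cases for a trustworthy test
sex   <- c("F", "M", "F", "M", "F", "M", "F", "M")
group <- c("A", "A", "A", "B", "B", "B", "A", "B")

tab       <- table(sex, group)  # two-way contingency table
sum.table <- summary(tab)       # chi-squared test of independence

names(sum.table)     # includes "n.vars", "n.cases", "statistic", "p.value"
sum.table$n.cases    # number of cases in the table
sum.table$p.value    # p-value of the chi-squared test
sum.table$approx.ok  # FALSE when expected counts are too small
```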






Let's see how this can be useful in the next example.

2. Regressions and statistical tests

The standard way to run a linear regression in R is with lm().  It looks like this:











But there's a lot more that R has calculated that is not shown here. We can see this by saving this linear regression as an object and running names() on it:
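
A minimal sketch, assuming a made-up data frame and an invented model formula (the original regression is not reproduced here; only the object name reg.object comes from the text):

```r
# Hypothetical data standing in for mydata
mydata <- data.frame(
  Age    = c(23, 35, 41, 29, 52, 38),
  Weight = c(130, 180, 145, 200, 150, 175),
  Height = c(64, 70, 66, 72, 65, 69)
)

# Save the fitted model as an object instead of only printing it
reg.object <- lm(Weight ~ Age + Height, data = mydata)

reg.object         # the short printout: call and coefficients
names(reg.object)  # includes "coefficients", "residuals", "fitted.values"
```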




So we see that saved under reg.object are the coefficients, the residuals, the fitted values, the residual degrees of freedom, and a lot more.   To find out what each of these components is, look at the Value section of the help page by doing ?lm.  Now, to extract any of these components, like the residuals, use the "$" operator like this:

reg.object$residuals

You can make use of this extraction by taking the mean of the residuals





or plotting their distribution:

hist(reg.object$residuals, main="Distribution of Residuals" ,xlab="Residuals")
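
The mean of the residuals mentioned above can be checked with a minimal sketch like the following, assuming made-up data and an invented formula in place of the original regression:

```r
# Hypothetical data and model; only the name reg.object comes from the text
mydata <- data.frame(
  Age    = c(23, 35, 41, 29, 52, 38),
  Weight = c(130, 180, 145, 200, 150, 175),
  Height = c(64, 70, 66, 72, 65, 69)
)
reg.object <- lm(Weight ~ Age + Height, data = mydata)

# For a least-squares fit with an intercept, the residuals average to zero
# up to floating-point error
mean(reg.object$residuals)

# residuals() is the accessor function for the same component
all.equal(residuals(reg.object), reg.object$residuals)
```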

Don't forget that you can summarize regression objects using summary(), and get the names() of that summary too, like this:

summary(reg.object)
names(summary(reg.object))

which will give you more components you can extract from your regression. You can use the names() function on the output of most statistical models and tests, such as aov(), t.test(), chisq.test(), etc.
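
A minimal sketch of extracting from a regression summary and from a test object, assuming made-up data and an invented model (reg.summary and t.out are assumed names):

```r
# Hypothetical data standing in for mydata
mydata <- data.frame(
  Age    = c(23, 35, 41, 29, 52, 38),
  Sex    = factor(c("F", "M", "F", "M", "F", "M")),
  Weight = c(130, 180, 145, 200, 150, 175),
  Height = c(64, 70, 66, 72, 65, 69)
)
reg.object  <- lm(Weight ~ Age + Height, data = mydata)
reg.summary <- summary(reg.object)

names(reg.summary)        # includes "coefficients", "sigma", "r.squared"
reg.summary$r.squared     # R-squared of the fit
reg.summary$coefficients  # matrix of estimates, std. errors, t and p values

# The same idea works for a test object such as the one from t.test()
t.out <- t.test(Weight ~ Sex, data = mydata)
names(t.out)              # includes "statistic", "p.value", "conf.int"
t.out$p.value
```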

3. Histograms and boxplots

Finally, let's go back to that histogram and save it into an object. Running names() on the histogram object shows the components stored in it:
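
A minimal sketch, assuming a made-up numeric vector in place of the residuals (hist.object is an assumed name, and plot = FALSE is used here only so that the sketch runs without drawing anything):

```r
# Hypothetical values standing in for the regression residuals
x <- c(-12.4, 3.1, -5.8, 9.6, 0.7, 4.8, -2.2, 2.2)

# hist() returns its computations as a list; plot = FALSE skips the drawing
hist.object <- hist(x, plot = FALSE)

names(hist.object)   # "breaks" "counts" "density" "mids" "xname" "equidist"
hist.object$breaks   # bin boundaries
hist.object$counts   # number of observations in each bin
```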





I showed how you can manipulate those in my post on histograms.

Similarly, for boxplot:













Here I've extracted the stats component, a matrix that gives you the lower whisker, the lower hinge, the median, the upper hinge, and the upper whisker for each group, with one column per group.
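
A minimal sketch of that extraction, assuming made-up data and an invented grouping variable (bp is an assumed name, and plot = FALSE is used only so that the sketch runs without drawing anything):

```r
# Hypothetical data standing in for mydata
mydata <- data.frame(
  Age = c(23, 35, 41, 29, 52, 38, 47, 31),
  Sex = factor(c("F", "M", "F", "M", "F", "M", "F", "M"))
)

# boxplot() returns the statistics it computes for each group
bp <- boxplot(Age ~ Sex, data = mydata, plot = FALSE)

names(bp)   # "stats" "n" "conf" "out" "group" "names"
bp$stats    # 5 rows (whisker, hinge, median, hinge, whisker), 1 column per group
bp$names    # group labels, in the same order as the columns of bp$stats
```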



Monday, March 11, 2013

Salmonella

Salmonella infections don’t just come from contaminated food; they can come from contact with animals, too. Many Salmonella infections occur in people who have contact with certain types of animals. In 2012, two records were set for outbreaks of human Salmonella infections linked to live poultry:
  1. Eight outbreaks were reported, more than in any previous year, and these outbreaks resulted in more than 450 illnesses.
  2. The largest outbreak of human Salmonella infections linked to backyard flocks in a single year occurred.
Chicks, ducklings, and other poultry can carry Salmonella. Live poultry may have Salmonella germs in their droppings and on their bodies (feathers, feet, and beaks) even when they appear healthy and clean.
 
While it usually doesn't make the birds sick, Salmonella can cause serious illness when it is passed to people. Salmonella germs can cause a diarrheal illness in people that can be mild, severe, or even life threatening. Infants, seniors, and those with weakened immune systems are more likely than others to develop severe illness. These simple steps will help protect you and others from getting sick:
  • Wash hands thoroughly with soap and water right after touching live poultry or anything in the area where they live and roam. Adults should supervise hand washing for young children.
  • Clean any equipment or materials associated with raising or caring for live poultry outside the house, such as cages or feed or water containers.
  • Never bring live poultry inside the house, into bathrooms, or especially into areas where food or drink is prepared, served, or stored, such as kitchens or outdoor patios.
 
Learn more about the risk of human Salmonella infections from live poultry here.
 
 

Friday, March 8, 2013

Groundwater: Out of sight, but not out of mind

National Groundwater Awareness Week, March 10-16, 2013, is a good time for the owners of household drinking water wells to test their water as managers of their own, personal drinking water system. 
 
The Maine CDC recommends that private well owners test their water annually for bacteria, nitrate, and nitrite and every three to five years for arsenic, radon, uranium, lead, and fluoride.  
 
Well owners should check their water more often than annually if:
  • There is a change in the taste, odor, or appearance of the water
  • A problem occurs such as a broken well cap or a new contamination source
  • Family members or houseguests have recurrent incidents of gastrointestinal illness
  • An infant is living in the home
  • There is a need to monitor the efficiency and performance of home water treatment equipment.
 
For a list of certified drinking water testing laboratories in Maine, see: Maine Certified Commercial Laboratories.  
 
If your drinking water is supplied by a public water system, you can be assured that the water you receive is regularly monitored and tested to ensure that it meets federal and state drinking water standards and is safe to drink.   
 
Whether you have your own private well or are supplied by a public water system, there are several things you can do to protect groundwater: 
  • Properly maintain your septic system: make sure to have your septic tank pumped every 3 to 5 years and check for signs that your septic system is not working
  • Handle gasoline, motor oil, fertilizers, pesticides and other hazardous chemicals with care, making sure not to dump them on the ground or pour them down the sink. When you’re done with them, dispose of them properly at a recycling center
  • Inspect your heating oil tank and its piping to make sure it’s not leaking, starting to corrode or rust, or in danger of tipping over
  • Don’t throw unused or unwanted medications in the trash or flush them down the drain. There are several law enforcement agencies throughout the state that will accept unused prescription drugs for proper and safe disposal. For more information, visit: Maine State Map of Law Enforcement Agencies Accepting Unused, Unwanted Consumer Prescription Drugs for Disposal
For more information about private wells, visit http://wellwater.maine.gov. For information about public water systems, visit www.medwp.com

Friday, March 1, 2013

March is Colorectal Cancer Awareness Month

We know that colorectal cancer is one of the few cancers that can be prevented as well as detected early with screening. Colon cancer starts as a polyp, or small collection of abnormal cells. Colon polyps become more common as we age. The recommendation for screening at age 50 is based upon this science. Don’t delay your screening appointment if you are turning 50 – and consult your doctor about screening if you are younger than 50 but have a family history of colon cancer or precancerous polyps.

 
Colon cancer is most treatable when found in the earliest stages. Often, people have few symptoms until polyps have progressed to cancer.
 
Screening saves lives, so get screened and encourage others to be screened as well.
 
The United States Preventive Services Task Force and the American Cancer Society recommend three types of tests as options for people without a family history of colon cancer:
 
  • High-sensitivity fecal occult blood testing or fecal immunochemical testing (FIT) every year: This can be obtained from your doctor’s office and can be done in the privacy of your home.
  • Flexible sigmoidoscopy every 5 years combined with a high-sensitivity stool test or FIT every 3 years
  • Colonoscopy every 10 years
For more information about colon cancer prevention visit the Maine CDC Colorectal Cancer Control Program.