Thursday, 23 April 2015

Importing a set of csv files and putting them together in a list of data frames...

I now know how to do this.
This is the code:


# tell me where the files are
setwd("")

# give me a list of file names (list.files() could get these too)
file.list <- c("Drug1.csv", "Drug2.csv", "Drug3.csv", "Drug4.csv")

# with a loop (lapply() would be a tidier way), go through file.list
# read the csv file using the name in file.list
# store it in a predefined list called data.list (created first)
# double square brackets put each data frame in its own slot of the list
data.list <- list()
for (i in seq_along(file.list)){
  data.list[[i]] <- read.csv(file.list[i])
}

This seems to work, which is useful.
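
Here is a sketch of a possible alternative that gets the file names with list.files() and replaces the loop with lapply(). The temporary folder and the two made-up Drug files are my assumptions, there only so that the example runs on its own.

```r
# Set up: write two small made-up csv files into a temporary folder
# (stand-ins for the real Drug1.csv, Drug2.csv, ...)
tmp.dir <- file.path(tempdir(), "drug_files")
dir.create(tmp.dir, showWarnings = FALSE)
write.csv(data.frame(conc = c(1, 10), response = c(90, 40)),
          file.path(tmp.dir, "Drug1.csv"), row.names = FALSE)
write.csv(data.frame(conc = c(1, 10), response = c(95, 60)),
          file.path(tmp.dir, "Drug2.csv"), row.names = FALSE)

# Get the file names from the folder instead of typing them in
file.list <- list.files(tmp.dir, pattern = "\\.csv$", full.names = TRUE)

# lapply() reads each file and returns a list of data frames
data.list <- lapply(file.list, read.csv)

# Name each list element after its file (without the .csv ending)
names(data.list) <- tools::file_path_sans_ext(basename(file.list))

str(data.list)
```

Naming the elements means a data frame can be picked out with data.list$Drug1 rather than by position.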

Lists of data.frames

So it is possible to make a list of data frames in R.

A single square bracket returns a list that contains just that data frame - list[1]

Double square brackets return the data frame itself, and its contents can then be indexed - list[[1]][1,1]
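
The difference between single and double square brackets is easy to check with a quick sketch. The two small data frames and the name my.list are made up for illustration.

```r
# Two small made-up data frames
df1 <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))
df2 <- data.frame(x = 4:6, y = c(5.5, 6.5, 7.5))
my.list <- list(df1, df2)

class(my.list[1])     # "list": single brackets give a shorter list
class(my.list[[1]])   # "data.frame": double brackets give the data frame
my.list[[1]][1, 1]    # first row, first column of the first data frame
my.list[[2]]$y        # a whole column from the second data frame
```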

Interesting. 

source: http://www.r-tutor.com/r-introduction/list

Reading in a list of files...

Sometimes my data is in a group of separate files.
For that reason, I want to find a way to read in multiple files.
These two posts seem to provide some useful code for me to try:

  • http://brianmannmath.github.io/blog/2014/01/20/using-lapply-to-import-files-to-r/
  • http://www.ats.ucla.edu/stat/r/pages/read_multiple.htm
I wondered if you could have a list of data frames and the second link seems to suggest that you can. Interesting. 


Wednesday, 22 April 2015

Learning more about fitting non-linear lines...

Many of the graphs used in biochemistry are non-linear curves. These include the calculation of LD50 and IC50 as well as the majority of curves associated with enzyme kinetics.

For these reasons, I am trying to understand more about the nls() function in R.

A Google search has revealed the following links which I am reading through:

  • https://stat.ethz.ch/R-manual/R-patched/library/stats/html/nls.html
    • key documentation supplied with R. 
  • http://robinlovelace.net/2013/10/23/nls-demonstation.html
    • contains info about how to calculate an r^2 value. 
  • http://www.walkingrandomly.com/?p=5254
    • Shows an example of an irregular set of points and a nicely fitted line.
  • https://rmazing.wordpress.com/2012/07/05/a-better-nls/#comments
Hmm, this is all quite technical and maybe even a bit more complicated than I need but hey....
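
To make this less abstract, here is a minimal sketch of nls() fitting a Michaelis-Menten curve. The data are simulated, and the "true" parameter values, the noise level and the starting guesses are all my assumptions, not taken from the links above.

```r
# Made-up enzyme kinetics data: rate = Vmax * S / (Km + S) plus some noise
set.seed(1)
S <- c(0.5, 1, 2, 5, 10, 20, 50)
true.Vmax <- 100
true.Km <- 5
rate <- true.Vmax * S / (true.Km + S) + rnorm(length(S), sd = 2)
kinetics <- data.frame(S = S, rate = rate)

# Fit the curve; nls() needs rough starting guesses for the parameters
fit <- nls(rate ~ Vmax * S / (Km + S),
           data = kinetics,
           start = list(Vmax = 80, Km = 3))
coef(fit)

# Plot the points and add the fitted curve
plot(kinetics$S, kinetics$rate, xlab = "[S]", ylab = "Rate")
S.new <- seq(min(S), max(S), length.out = 100)
lines(S.new, predict(fit, newdata = data.frame(S = S.new)))
```

The starting guesses matter: if they are a long way from sensible values, nls() may fail to converge.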

Changing from Scientific Notation

This looks relevant:
http://stackoverflow.com/questions/5352099/how-to-disable-scientific-notation-in-r
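
A short sketch of two ways to do this in base R: format() for a single value, or the scipen option for the whole session. The numbers are made up.

```r
x <- 0.00001234
y <- 123456789012

print(x)                                   # may be shown in scientific notation

# Format one value at a time (the result is a character string)
format(x, scientific = FALSE)
format(y, scientific = FALSE, big.mark = ",")

# Or make R reluctant to use scientific notation for the rest of the session
old <- options(scipen = 999)
print(x)
options(old)                               # put the previous setting back
```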


Monday, 20 April 2015

Adding text to graphs...

This is more of a challenge than it should be, I suspect, but I found this link which is very interesting:
http://lukemiller.org/index.php/2012/10/adding-p-values-and-r-squared-values-to-a-plot-using-expression/
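
Here is a minimal sketch of the general idea, using text() with bquote() to put an R-squared value on a plot. The data are simulated and the position of the label is an arbitrary choice of mine.

```r
# Made-up data with a roughly linear relationship
set.seed(42)
x <- 1:20
y <- 3 * x + rnorm(20, sd = 4)
fit <- lm(y ~ x)
r2 <- summary(fit)$r.squared

plot(x, y)
abline(fit)

# bquote() builds a plotmath expression; .() inserts the computed value
label <- bquote(R^2 == .(round(r2, 3)))
text(x = 2, y = max(y), labels = label, adj = c(0, 1))
```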

What all data analysis scripts should do....

They should read in some data at the beginning:
MyData <- read.csv("myData.csv")

They should do some analysis and/or draw some graphs.

They should save some output:
write.csv(MyData, file = "MyData.csv")

from here:
http://rprogramming.net/write-csv-in-r/
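
Put together, a skeleton script might look like the sketch below. The file names, the columns and the per-group mean are all made up, and the files go into a temporary folder so that the example runs on its own.

```r
# Set up: a made-up input file so the sketch runs on its own
in.file  <- file.path(tempdir(), "myData.csv")
out.file <- file.path(tempdir(), "myResults.csv")
write.csv(data.frame(group = c("A", "A", "B", "B"),
                     value = c(1.2, 1.4, 2.9, 3.1)),
          in.file, row.names = FALSE)

# 1. Read in some data and keep it in an object
my.data <- read.csv(in.file)

# 2. Do some analysis: here, the mean value per group
results <- aggregate(value ~ group, data = my.data, FUN = mean)

# 3. Save the output under a different name from the input
write.csv(results, file = out.file, row.names = FALSE)
```

Saving under a different name from the input means the raw data never gets overwritten by accident.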

Tuesday, 14 April 2015

Plotting Venn Diagrams in R....

One of the challenges with R is that there is always more than one way to do things. Often there are many ways to do things. Today, I have set myself the task of trying to work out how to draw Venn Diagrams in R.

A Google search using "r", "venn" and "diagram" revealed the following options:

With five ways to do something, which is the best? The short answer is the one that you like, so personal preference, which is key to R, becomes the most important issue. 

I have spent a little while trying to work out the differences between the various packages. 

There is one key point: do you know the numbers for your Venn Diagram, or do you want to find them out?
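
If the numbers are not known yet, base R set functions can count them from the raw lists. This is only a sketch with two made-up sets of protein names.

```r
# Made-up sets of protein names
set1 <- c("P1", "P2", "P3", "P4", "P5")
set2 <- c("P4", "P5", "P6", "P7")

area1 <- length(set1)                    # size of the first set
area2 <- length(set2)                    # size of the second set
n12   <- length(intersect(set1, set2))   # how many are in both

c(area1 = area1, area2 = area2, n12 = n12)
```

Counts like these are the kind of numbers that a package expecting pre-computed areas needs as input.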

The package VennDiagram generally requires you to have all the numbers already. It produces nice diagrams and the examples include ways to generate high quality TIFF plots. 
Here is an example of the four way Venn Diagram made with the example code:


I think it's pretty and I can appreciate how to change and add the information. This is the code:

# Reference four-set diagram
install.packages("VennDiagram")

library(VennDiagram)

venn.plot <- draw.quad.venn( area1 = 72, 
                             area2 = 86,
                             area3 = 50, 
                             area4 = 52, 
                             n12 = 44, 
                             n13 = 27, 
                             n14 = 32, 
                             n23 = 38, 
                             n24 = 32,
                             n34 = 20,
                             n123 = 18,
                             n124 = 17,
                             n134 = 11, 
                             n234 = 13,
                             n1234 = 6,
                             category = c("First", "Second", "Third", "Fourth"),
                             fill = c("orange", "red", "green", "blue"),
                             lty = "dashed",
                             cex = 2, cat.cex = 2, cat.col = c("orange", "red", "green", "blue"))
                             


 

Thursday, 9 April 2015

Plotting inhibitory data....

In biochemistry, we often investigate drug effects on enzymes or cells. These drug effects are often inhibitory effects.

I have been trying to find good resources about how to do this in R.

The key seems to be to use the Nonlinear Least Squares function:

nls()

This is going to take practice to understand and appreciate.

I found this link: http://www.carlyhuitema.com/r.html which looks useful. It deals with IC50, has example data and some example scripts. Nice.

Resources:
https://stat.ethz.ch/R-manual/R-patched/library/stats/html/nls.html
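
As a practice example, here is a minimal sketch of an IC50 fit with nls(). It is not the script from the link above. The data are simulated, and the two-parameter model (response falling from 100% to 0%), the "true" values and the starting guesses are my assumptions.

```r
# Made-up inhibition data: % response falls from 100 to 0 as drug increases
set.seed(2)
conc <- c(0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100)
true.IC50 <- 1.5
true.hill <- 1.2
response <- 100 / (1 + (conc / true.IC50)^true.hill) +
  rnorm(length(conc), sd = 3)
inhib <- data.frame(conc = conc, response = response)

# Fit IC50 and the Hill slope, starting from rough guesses
fit <- nls(response ~ 100 / (1 + (conc / IC50)^hill),
           data = inhib,
           start = list(IC50 = 1, hill = 1))
coef(fit)

# Plot on a log concentration axis and add the fitted curve
plot(inhib$conc, inhib$response, log = "x",
     xlab = "Concentration", ylab = "% response")
conc.new <- 10^seq(-2, 2, length.out = 100)
lines(conc.new, predict(fit, newdata = data.frame(conc = conc.new)))
```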

Here's something I managed to draw:


I like the look of this. It implies that I have the correct script and info in place. 
Good :-)



Wednesday, 8 April 2015

Trending in github...

This is an interesting list.
https://github.com/trending?l=r

Looping through a data frame to make graphs with error bars...

Inspired by Jo Welton, I have written an R script to allow me to loop through and make lots of bar plots with error bars.

The following links were useful in working all this out:



The data is arranged in rows with the name of the protein on the left and with means and SDs of two different groups in the data frame.


File called sampleData.csv
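
For anyone without the file, a data frame with the same layout can be sketched like this. The column names and all of the values are invented. Only the column order (name, control mean, treated mean, control SD, treated SD) follows what the script expects. It is written to a temporary folder here.

```r
# Invented values, only to illustrate the column layout used by the script
sample.data <- data.frame(
  Protein     = c("ProteinA", "ProteinB", "ProteinC"),
  ControlMean = c(10.0, 5.2, 8.1),
  TreatedMean = c(14.5, 4.8, 12.3),
  ControlSD   = c(1.2, 0.6, 0.9),
  TreatedSD   = c(1.8, 0.5, 1.4)
)

# Save it as a csv file (in a temporary folder for this sketch)
sample.file <- file.path(tempdir(), "sampleData.csv")
write.csv(sample.data, sample.file, row.names = FALSE)

str(read.csv(sample.file))
```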



I have written a script that produces these graphs in one output.


Here is the script:

# I used the Excel file to create a csv file that I then imported into R. 
data <- read.csv("sampleData.csv")
#This creates a data frame.
str(data)

#Based on the data frame:
#create a vector m, with the data in it 
m<-c(data[1,2], data[1,3])
#then draw a bar plot
barplot(m, horiz=TRUE, 
        main = data[1,1], 
        names.arg=c("Control", "Treated"), 
        cex.names=1 )

#this arrows function looks useful
#draw Control error bar first. 
arrows(data[1,2]-data[1,4], # says where to START the line across the bottom
       0.7,  # says where to start the line on the y co-ordinate
       data[1,2]+data[1,4],# says where to END the line across the bottom
       0.7, 
       angle=90, #draws a line instead of an arrow
       code=3, #draws lines at both ends
       length=0.25, #size of error bar
       lwd = 1.5)  #weight of line 

#draw Treated error bar second 
arrows(data[1,3]-data[1,5],
       1.9,
       data[1,3]+data[1,5],
       1.9, 
       angle=90, 
       code=3, 
       length=0.25,
       lwd = 1.5)


par(mfrow=c(3,1)) #layout three rows one column - so all on the same page

#Then turn it into a loop and create 3 plots...
for (i in seq_len(nrow(data))){
  m <- c(data[i,3], data[i,2])
  bar.colors <- c("red", "white")
  #upper limit for the x axis: means plus SDs leaves room for the error bars
  #(called x.max so that it does not mask the built-in sum() function)
  x.max <- data[i,2] + data[i,3] + data[i,4] + data[i,5]
  barplot(m, horiz=TRUE, 
          main = data[i,1], 
          names.arg=c("Treated", "Control"),
          las = 1,
          col=bar.colors,
          cex.names=1,
          xlim = c(0, x.max))

  #draw the Control error bar (upper bar)
  arrows(data[i,2]-data[i,4],
         1.9,
         data[i,2]+data[i,4],
         1.9, 
         angle=90,
         code=3, 
         length=0.1,
         lwd = 1.5) 

  #draw the Treated error bar (lower bar)
  arrows(data[i,3]-data[i,5],
         0.7,
         data[i,3]+data[i,5],
         0.7, 
         angle=90, 
         code=3, 
         length=0.1,
         lwd = 1.5)
}

Wednesday, 1 April 2015

Adding legends to plots

Adding a legend is a key part of making graphs for publication.
Here is a nice link for that:
http://stackoverflow.com/questions/15997121/add-pch-symbol-in-r-plot-legend
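
A minimal sketch of a legend with plotting symbols, using made-up data. The series names, symbols and colours are arbitrary choices of mine.

```r
# Two made-up series plotted with different symbols
x <- 1:10
plot(x, x^1.5, pch = 16, col = "blue", xlab = "x", ylab = "y")
points(x, 2 * x, pch = 17, col = "red")

# pch and col in legend() must match what was used in the plot
legend("topleft",
       legend = c("Series A", "Series B"),
       pch = c(16, 17),
       col = c("blue", "red"),
       bty = "n")    # no box around the legend
```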

Turning negative values to zero....

I am working with Jo Welton to explore her proteomics data.
Our first step is to turn the negative values from her data into zeros. The negative values are there because she has subtracted backgrounds.

As ever, StackOverflow is our friend. I found this link:
http://stackoverflow.com/questions/11275187/r-replacing-negative-values-by-zero

Steps we followed:

  1. Convert the column of our data frame into a vector
  2. Go through the vector, find the values less than zero and make them zero
  3. Convert our vector back to a data frame. 
  4. Bind our data frames together. 
Here is the code:

mUr.zero <- dataframe$col # makes the vector
mUr.zero[mUr.zero < 0] <- 0 # picks out the values less than zero (no loop needed) and replaces them with zero
mUr.zero <- as.data.frame(mUr.zero) # converts into a data frame
newdataframe <- cbind(dataframe, mUr.zero) # binds this into the data frame

Happy days - this seems to work :-)
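
A possible shortcut, sketched with made-up data: pmax() handles one column in a single step, and logical indexing can clean every numeric column at once. The data frame and its column names are invented for illustration.

```r
# Made-up background-subtracted data
proteomics <- data.frame(protein = c("P1", "P2", "P3"),
                         sampleA = c(5.1, -0.3, 2.2),
                         sampleB = c(-1.4, 3.3, 0.0))

# One column at a time: pmax() keeps the larger of each value and 0
proteomics$sampleA <- pmax(proteomics$sampleA, 0)

# All numeric columns at once: find them, then zero the negative values
num.cols <- sapply(proteomics, is.numeric)
proteomics[num.cols][proteomics[num.cols] < 0] <- 0

proteomics
```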