Calculate differences between sequences in vector, for distance matrix in R

Hi all, I am trying to create a distance matrix from randomly generated DNA sequences.

#set the code

DNA <- c("A","G","T","C")
randomDNA <- c()

#create the vector of 64 elements

for (i in 1:64){
  # paste0() takes no sep argument; collapse = "" joins the 6 sampled letters into one string
  randomDNA[i] <- paste0(sample(DNA, 6, replace = TRUE), collapse = "")
}
sizeofDNA <- length(randomDNA)

#this is the part where I want to iterate over the vector's components

split_vector <- c()
DNAdiff <- c()
for (i in 1:length(randomDNA)){
  split_vector <- strsplit(randomDNA[i], "")[[1]]
  #print(split_vector)
  for (j in 1:length(randomDNA)){
    split_vector2 <- strsplit(randomDNA[j], "")[[1]]
    #print(split_vector2)
    DNAdiff[i,j] <- setdiff(split_vector,split_vector2)
    #or
    #DNAdiff[i] <- length(setdiff(strsplit(randomDNA[22], "")[[1]],strsplit(randomDNA[33], "")[[1]]))
  }
}

What does not work: (A) setdiff does not behave as I expect, and (B) no array is created.
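
A minimal sketch of why both symptoms are likely to appear, using two made-up sequences (seq_a, seq_b, no_dims and assign_attempt are illustrative names, not part of the code above). setdiff() treats its arguments as sets, so it ignores positions and repeated letters, and an object created with c() has no dimensions, so assigning to [i, j] raises an error.

```r
# Two hypothetical 6-letter sequences (illustrative values only)
seq_a <- strsplit("AAGTCA", "")[[1]]
seq_b <- strsplit("GATCAA", "")[[1]]

# setdiff() compares the sets of distinct letters, not positions:
# both sequences contain A, G, T and C, so the result is empty
set_result <- setdiff(seq_a, seq_b)

# A position-by-position comparison counts the mismatches instead
mismatches <- sum(seq_a != seq_b)

# c() returns an object without dimensions, so [i, j] assignment fails;
# try() captures the error instead of stopping the script
no_dims <- c()
assign_attempt <- try({ no_dims[1, 2] <- 1 }, silent = TRUE)
```

Here set_result is empty even though the two sequences differ at four positions, which is why setdiff is probably not the right tool for a distance.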

Question: how do I store the results of setdiff (if it can be made to work) in an array, so that I end up with a distance-matrix-like array? Any recommendation is highly welcome. Thank you all.
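
One possible way to get a distance-matrix-like array, sketched under the assumption that "distance" means the number of positions at which two equal-length sequences differ (the Hamming distance). This swaps setdiff for a positional comparison and is not the approach in the code above. The names hamming, dist_mat and dna_dist are hypothetical; the result is preallocated with matrix() so that [i, j] assignment works.

```r
set.seed(1)  # only so that the random sequences are reproducible
DNA <- c("A", "G", "T", "C")

# 64 random sequences of length 6
randomDNA <- replicate(64, paste0(sample(DNA, 6, replace = TRUE), collapse = ""))

# Hypothetical helper: number of positions where two equal-length strings differ
hamming <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

n <- length(randomDNA)

# Preallocate an n x n matrix so that dist_mat[i, j] can be assigned
dist_mat <- matrix(0L, nrow = n, ncol = n,
                   dimnames = list(randomDNA, randomDNA))

for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    dist_mat[i, j] <- hamming(randomDNA[i], randomDNA[j])
  }
}

# Optional: convert to a "dist" object for functions such as hclust()
dna_dist <- as.dist(dist_mat)
```

The same matrix can probably be built without explicit loops via outer(randomDNA, randomDNA, Vectorize(hamming)); the loop version is kept here because it mirrors the structure of the original attempt.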



Read more here: https://stackoverflow.com/questions/67374285/calculate-differences-between-sequences-in-vector-for-distance-matrix-in-r

Content Attribution

This content was originally published by sunta3iouxos at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
