r - Read quoted values in txt file with data.table::fread()
I have a simple txt file (values in quotes, separated by tabs):
"col1" "col2" "col3"
"a" "1,1" "c"
"b" "2,1" "c"
"c" "3,1" "c"
I read the file using fread(). Since the middle column should be numeric, I use dec = ",".
However, this command:
fread("myfile.txt", sep = "\t", dec = ",", header = TRUE, stringsAsFactors = FALSE)
fails to read col2 as numeric. Specifying colClasses = c("character", "numeric", "character") does not make a difference.
Is there a way to accurately read this file using fread() (without post-processing)?
Any help is appreciated.
I'm going to backtrack a little bit on my previous comments; it looks like read.table does handle this situation successfully.
Demonstrating with the following object,
df <- data.frame(
  col1 = letters[1:3],
  col2 = sub(".", ",", 1:3 + 0.1, fixed = TRUE),
  col3 = rep("c", 3),
  stringsAsFactors = FALSE
)
which looks like this on disk:
write.table(
  df,
  sep = "\t",
  row.names = FALSE
)
# "col1" "col2" "col3"
# "a" "1,1" "c"
# "b" "2,1" "c"
# "c" "3,1" "c"
Writing it to a temporary file,
tf <- tempfile()
write.table(
  df,
  file = tf,
  sep = "\t",
  row.names = FALSE
)
read.table will process the second column as numeric when the proper arguments are provided:
str(read.table(tf, header = TRUE, sep = "\t", dec = ","))
# 'data.frame': 3 obs. of 3 variables:
#  $ col1: chr "a" "b" "c"
#  $ col2: num 1.1 2.1 3.1
#  $ col3: chr "c" "c" "c"
More conveniently, read.delim2 may be used also:
str(read.delim2(tf, header = TRUE))
# 'data.frame': 3 obs. of 3 variables:
#  $ col1: chr "a" "b" "c"
#  $ col2: num 1.1 2.1 3.1
#  $ col3: chr "c" "c" "c"
I can't say why fread does not handle this, but if this is a sufficiently common scenario the package maintainers may want to account for it. You might consider opening an issue ticket on the GitHub repository and inquiring about this.
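If fread() has to be used and col2 still comes back as character, one fallback (which does involve post-processing) is to swap the decimal comma for a period and convert the column. The following is only a minimal sketch in base R: the helper name comma_to_numeric is hypothetical and not part of data.table, and the data frame is a stand-in for whatever fread() returns, assuming col2 arrives as strings such as "1,1".

```r
# Hypothetical helper (not part of data.table or base R): convert strings
# that use a decimal comma, e.g. "1,1", to numeric values.
comma_to_numeric <- function(x) {
  as.numeric(sub(",", ".", x, fixed = TRUE))
}

# Stand-in for a result in which col2 was read as character.
df <- data.frame(
  col1 = c("a", "b", "c"),
  col2 = c("1,1", "2,1", "3,1"),
  col3 = rep("c", 3),
  stringsAsFactors = FALSE
)

# Convert the affected column in place and inspect the result.
df$col2 <- comma_to_numeric(df$col2)
str(df)
```

The same helper could be applied to any other column that uses a decimal comma; it assumes the values contain no thousands separators.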