r - Read quoted values in txt file with data.table::fread()
I have a simple txt file (values in quotes, separated by tabs):
    "col1" "col2" "col3"
    "a"    "1,1"  "c"
    "b"    "2,1"  "c"
    "c"    "3,1"  "c"
I read the file using fread(). Since the middle column should be numeric, I use dec = ",".
However, the command:
    fread("myfile.txt", sep = "\t", dec = ",", header = TRUE, stringsAsFactors = FALSE)
fails to read col2 as numeric. Specifying colClasses = c("character", "numeric", "character") does not make a difference.
Is there a way to accurately read the file using fread() (without post-processing)?
Any help is appreciated.
I'm going to backtrack a little bit on my previous comments; it looks like read.table does handle this situation successfully.
Demonstrating with the following object,
    df <- data.frame(
      col1 = letters[1:3],
      col2 = sub(".", ",", 1:3 + 0.1, fixed = TRUE),
      col3 = rep("c", 3),
      stringsAsFactors = FALSE
    )
which looks like this on disk:
    write.table(df, sep = "\t", row.names = FALSE)
    # "col1" "col2" "col3"
    # "a"    "1,1"  "c"
    # "b"    "2,1"  "c"
    # "c"    "3,1"  "c"
Writing this to a temporary file,
    tf <- tempfile()
    write.table(df, file = tf, sep = "\t", row.names = FALSE)
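To confirm what actually lands on disk, the raw lines can be read back with readLines(). This is a minimal base R sketch that rebuilds df and tf so it runs on its own; the name raw_lines is only illustrative.

```r
# Rebuild the example object: col2 holds decimal-comma strings such as "1,1".
df <- data.frame(
  col1 = letters[1:3],
  col2 = sub(".", ",", 1:3 + 0.1, fixed = TRUE),
  col3 = rep("c", 3),
  stringsAsFactors = FALSE
)

# Write it tab-separated; write.table() quotes character columns by default.
tf <- tempfile()
write.table(df, file = tf, sep = "\t", row.names = FALSE)

# Read the file back as plain text to inspect the quoting and the separators.
raw_lines <- readLines(tf)
print(raw_lines)

# Every line should contain exactly two tab separators (three fields).
n_tabs <- lengths(regmatches(raw_lines, gregexpr("\t", raw_lines, fixed = TRUE)))
print(n_tabs)
```

Printing the vector (rather than using cat()) shows the tabs as \t, which makes it easy to tell them apart from spaces.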
read.table will process the second column as numeric when the proper arguments are provided:
    str(read.table(tf, header = TRUE, sep = "\t", dec = ","))
    # 'data.frame': 3 obs. of 3 variables:
    #  $ col1: chr "a" "b" "c"
    #  $ col2: num 1.1 2.1 3.1
    #  $ col3: chr "c" "c" "c"
More conveniently, read.delim2 may also be used:
    str(read.delim2(tf, header = TRUE))
    # 'data.frame': 3 obs. of 3 variables:
    #  $ col1: chr "a" "b" "c"
    #  $ col2: num 1.1 2.1 3.1
    #  $ col3: chr "c" "c" "c"
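As a quick sanity check, the two base R readers can be compared directly. The sketch below is self-contained (it writes the sample file itself with writeLines()); the names via_table and via_delim2 are illustrative only.

```r
# Recreate the quoted, tab-separated sample file from the question.
tf <- tempfile(fileext = ".txt")
writeLines(
  c("\"col1\"\t\"col2\"\t\"col3\"",
    "\"a\"\t\"1,1\"\t\"c\"",
    "\"b\"\t\"2,1\"\t\"c\"",
    "\"c\"\t\"3,1\"\t\"c\""),
  tf
)

# read.table() needs the separator and decimal mark spelled out ...
via_table <- read.table(tf, header = TRUE, sep = "\t", dec = ",")

# ... while read.delim2() already defaults to header = TRUE, sep = "\t", dec = ",".
via_delim2 <- read.delim2(tf)

# Both readers should give a numeric col2.
stopifnot(is.numeric(via_table$col2), is.numeric(via_delim2$col2))
print(all.equal(via_table, via_delim2))
```

Note that this relies on the reader guessing the column type. According to the read.table documentation, quoting is only considered for columns read as character, so forcing col2 to "numeric" through colClasses is not a drop-in alternative for quoted values.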
I can't say why fread does not handle this, but if this is a sufficiently common scenario, the package maintainers may want to account for it. You might consider opening an issue ticket on the GitHub repository, inquiring about this.
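If fread has to be used in the meantime, one fallback (which does involve the post-processing the question hoped to avoid) is to read col2 as character and convert it afterwards. This is only a sketch: parse_decimal_comma is a hypothetical helper, not part of data.table, and the fread part runs only if the package is installed.

```r
# Hypothetical helper (not part of data.table): turn decimal-comma
# strings such as "1,1" into numeric values.
parse_decimal_comma <- function(x) {
  if (is.numeric(x)) return(x)  # already numeric, nothing to convert
  as.numeric(sub(",", ".", x, fixed = TRUE))
}

# Recreate the quoted, tab-separated sample file from the question.
tf <- tempfile(fileext = ".txt")
writeLines(
  c("\"col1\"\t\"col2\"\t\"col3\"",
    "\"a\"\t\"1,1\"\t\"c\"",
    "\"b\"\t\"2,1\"\t\"c\"",
    "\"c\"\t\"3,1\"\t\"c\""),
  tf
)

if (requireNamespace("data.table", quietly = TRUE)) {
  # Read col2 as character on purpose, then convert the column by reference.
  dt <- data.table::fread(tf, sep = "\t", header = TRUE,
                          colClasses = list(character = "col2"))
  data.table::set(dt, j = "col2", value = parse_decimal_comma(dt[["col2"]]))
  str(dt)
}
```

The helper replaces only the first comma in each value, which is enough for plain decimals like these; values with thousands separators would need extra handling.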