read.csv warning 'EOF within quoted string' in R, but the file reads successfully in Excel

csv, r

I am trying to read a CSV file downloaded from here

I read it with the following code:

storm_data = read.csv('./data/repdata/StormData.csv', sep=",", stringsAsFactors=FALSE)

It returns 692288 observations and a warning message:

Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  EOF within quoted string

The result is also wrong, because column values are mixed up with each other.

Then I tried read.table:

storm_data = read.table('./data/repdata/StormData.csv', sep=",", header=TRUE, stringsAsFactors=FALSE)

It fails with this error message:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : line 547364 did not have 37 elements

I copied the lines around line 547364 into a separate text file, and that file reads fine. So the problem is not really at that line but somewhere above it.

Finally, I tried opening the file in Excel, and it reads just fine (my Coursera TA can also read it with read.csv). It loads without problems and gives 903871 lines.
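Since the reported line is not necessarily where the damage starts, one way to look "somewhere above" is to scan for lines with an odd number of double quotes. The following is only a rough sketch on a small made-up file (the sample rows are hypothetical, not taken from StormData.csv). A legitimate quoted field that spans several lines, or an escaped quote, would also be flagged, so the result is a list of candidates to inspect rather than a verdict.

```r
# Build a tiny, made-up CSV in which line 3 opens a quote and never closes it.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  'id,remark,value',
  '1,"fine",10',
  '2,"unterminated,20',
  '3,"also fine",30'
), tmp)

# Read the file as raw text so no CSV parsing is involved yet.
lines <- readLines(tmp, warn = FALSE)

# Count the double-quote characters on each line.
n_quotes <- vapply(
  regmatches(lines, gregexpr('"', lines, fixed = TRUE)),
  length,
  integer(1)
)

# Lines with an odd count are candidates for an unbalanced quote.
odd_lines <- which(n_quotes %% 2 == 1)
print(odd_lines)
print(lines[odd_lines])
```

On the real file, the first flagged line before line 547364 would be the natural place to start looking.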

I am totally lost on how to debug the script.

Here is my environment information:
R version 3.1.1, RStudio version 0.98.1028 (32-bit), operating system Windows 8.1 (64-bit).

PS: I tried all the related methods on Stack Overflow and none of them works. If I set quote="", the lines come out wrong. fread won't work because the CSV file contains \" sequences.
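To illustrate why quote="" makes the lines come out wrong, here is a small sketch using count.fields on a made-up file (the rows are hypothetical). With quoting disabled, a comma inside a quoted field is treated as a separator, so that row gets one field too many.

```r
# A made-up CSV where one quoted field contains a comma.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  'id,remark',
  '1,"a, b"',
  '2,"c"'
), tmp)

# Fields per line with normal double-quote handling.
with_quotes <- count.fields(tmp, sep = ",", quote = "\"")

# Fields per line with quoting disabled, as with quote="" in read.csv.
without_quotes <- count.fields(tmp, sep = ",", quote = "")

print(with_quotes)
print(without_quotes)
```

Comparing the two vectors shows which rows are split differently once quoting is switched off.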

Best Solution

I ran into the very same error, and after hours of searching I think this should help you. Set the locale before reading the file:

Sys.setlocale("LC_ALL", "English")
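A minimal sketch of how that call might be combined with read.csv is shown below. The helper name read_csv_in_locale is hypothetical, and the idea of restoring the previous locale afterwards is an addition, not part of the original answer. "English" is a Windows locale name; the demo at the end uses the portable "C" locale and a made-up temporary file so that it runs on any platform. Why the locale matters here is not stated in the answer; one plausible assumption is that in a multibyte locale some bytes in the file get combined with a following quote character.

```r
# Hypothetical helper: read a CSV under a temporary locale, then restore the old one.
read_csv_in_locale <- function(path, locale = "English", ...) {
  old_locale <- Sys.getlocale()                        # remember current settings
  on.exit(Sys.setlocale("LC_ALL", old_locale), add = TRUE)
  Sys.setlocale("LC_ALL", locale)                      # the switch suggested in the answer
  read.csv(path, stringsAsFactors = FALSE, ...)
}

# Demo on a small made-up file, using the portable "C" locale.
tmp <- tempfile(fileext = ".csv")
writeLines(c('id,remark', '1,"a, b"', '2,"c"'), tmp)
demo <- read_csv_in_locale(tmp, locale = "C")
print(demo)

# On Windows, the call for the file in the question would look like:
# storm_data <- read_csv_in_locale("./data/repdata/StormData.csv", "English")
```

If the row count after the switch matches the 903871 lines seen in Excel, minus the header, the locale was the likely culprit.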

Details can be found here:

coursera