When reading a text file with base R's read.table, the documentation says:
"If row.names is not specified and the header line has one less entry than the number of columns, the first column is taken to be the row names. This allows data frames to be read in from the format in which they are printed. If row.names is specified and does not refer to the first column, that column is discarded from such files."
But how can I read such a file using the tidyverse's readr?
Consider this file (let's call it test.txt):
col1 col2
sample1 2 3
sample2 2 5
It is tab-separated: the first line has two items separated by one tab, while the 2nd and 3rd lines each have three items separated by two tabs.
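To make the example reproducible, the file can be written from R itself. This is only a sketch: it assumes the current working directory is writable and uses base R's writeLines and count.fields to confirm the field counts described above.

```r
# Write the three-line example file with literal tab separators.
writeLines(c("col1\tcol2",
             "sample1\t2\t3",
             "sample2\t2\t5"),
           "test.txt")

# Count the tab-separated fields on each line: the header has one
# fewer field than the data rows.
fields <- count.fields("test.txt", sep = "\t")
print(fields)
```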
Base R:
> read.table("test.txt")
col1 col2
sample1 2 3
sample2 2 5
R with readr:
> read_delim("test.txt", delim = "\t")
-- Column specification --------------------------------------------------------------------------------------------------
cols(
col1 = col_character(),
col2 = col_double()
)
Warning: 2 parsing failures.
row col expected actual file
1 -- 2 columns 3 columns 'C:\Users\moje4671\Desktop\test.txt'
2 -- 2 columns 3 columns 'C:\Users\moje4671\Desktop\test.txt'
# A tibble: 2 x 2
col1 col2
<chr> <dbl>
1 sample1 2
2 sample2 2
Unfortunately I do have quite a few files floating around that obey this convention (I won't discuss its merits).
I find it hard to imagine that there is no simple readr way to read this sort of file, which is, after all, a legitimate R file format (so to speak).
Of course, a workaround is along the lines of:
> as.tibble(read.table("test.txt"))
# A tibble: 2 x 2
col1 col2
<int> <int>
1 2 3
2 2 5
(plus some magic to preserve the row names, admittedly)
but this rather defeats the purpose of using readr (faster, no automatic type conversion, etc.). Is there a better way?
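One possible direction, sketched here rather than offered as an established readr idiom, is to read the header line separately and hand the column names to readr explicitly. The sketch assumes the header is always exactly one field short of the data rows. The column name "rowname" and the temporary file are illustrative choices, not something taken from the original files.

```r
library(readr)

# Recreate the example file in a temporary location (tab-separated,
# header has one fewer field than the data rows).
path <- tempfile(fileext = ".txt")
writeLines(c("col1\tcol2", "sample1\t2\t3", "sample2\t2\t5"), path)

# Read only the header line and split it on tabs.
header <- strsplit(readLines(path, n = 1), "\t", fixed = TRUE)[[1]]

# Skip the header and supply the names explicitly, prepending a name
# for the unnamed first column ("rowname" is an arbitrary, assumed name).
dat <- read_tsv(path, skip = 1, col_names = c("rowname", header))

print(dat)
```

Tibbles do not carry row names, so the sample labels end up in an ordinary character column, and the file is still parsed entirely by readr.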
question from:
https://stackoverflow.com/questions/65860981/dealing-with-r-type-text-files-with-rownames-as-the-first-unamed-col-in-readr