Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Is there a faster way to trim the tail of each element to 15 characters?

I have a vector A with over 60 thousand elements.

Input 
head(A, n = 11) = 
"ENSG00000000003.15" "ENSG00000000005.6"  "ENSG00000000419.13"
"ENSG00000000457.14" "ENSG00000000460.17" "ENSG00000000938.13"
"ENSG00000000971.16" "ENSG00000001036.14" "ENSG00000001084.13"
"ENSG00000001167.14" "ENSG00000002586.20_PAR_Y"

I want to trim the tail of each element a so that nchar(a) == 15.

Output
head(A, n = 11) = 
"ENSG00000000003" "ENSG00000000005"  "ENSG00000000419"
"ENSG00000000457" "ENSG00000000460" "ENSG00000000938"
"ENSG00000000971" "ENSG00000001036" "ENSG00000001084"
"ENSG00000001167" "ENSG00000002586"

I could try using gsub to tackle this, but I would need to be careful not to cut off anything before the dot, and I'm not great at regex, so I've opted to use a while loop instead.

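# Drop one trailing character at a time until each element has 15 characters
for (i in seq_along(A)) {
  while (nchar(A[i]) > 15) {
    A[i] <- substr(A[i], 1, nchar(A[i]) - 1)
  }
}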

This is obviously not ideal as it takes a long time to run. Does anyone see an alternative?

Question from: https://stackoverflow.com/questions/66051217/is-there-a-faster-way-to-trim-the-tail-of-each-element-to-be-15-characters


1 Answer


If you only need the part of each string before the first dot, you can use gsub. Both functions here are vectorized, so no loop is needed:

gsub("\\..*", "", A)

If you need the first 15 characters regardless of where the dot is, you can use substr:

substr(A, 1, 15)
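
As a quick check, here is a minimal sketch that runs both options on three of the IDs from the question. The names by_regex and by_width are only illustrative. It uses sub rather than gsub, which should be equivalent here because the pattern consumes everything after the first dot in a single match. Note that the two options agree only as long as every ID has exactly 15 characters before the dot, which holds for this sample but is an assumption about the rest of the vector.

```r
# Sample IDs copied from the question
A <- c("ENSG00000000003.15", "ENSG00000000005.6", "ENSG00000002586.20_PAR_Y")

# Option 1: drop everything from the first literal dot onwards.
# "\\." is an escaped dot; ".*" matches the rest of the string.
by_regex <- sub("\\..*", "", A)

# Option 2: keep only the first 15 characters of each element
by_width <- substr(A, 1, 15)

print(by_regex)

# Do the two options give the same result on this sample?
print(identical(by_regex, by_width))

# Is every trimmed element exactly 15 characters long?
print(all(nchar(by_width) == 15))
```

If some IDs could have a different prefix length, the regex version is probably the safer choice, since it does not depend on the width.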
