That is not how it works. The transformers library provides different types of tokenizers. For DistilBERT it is a WordPiece tokenizer with a fixed vocabulary that was used to train the corresponding model, so it does not offer such modifications (as far as I know). What you can do instead is use the split() method of the Python string:
text = "Don't you love 🤗 Transformers? We sure do."
tokens = text.split()
print("Tokens: ", tokens)
Output:
Tokens: ["Don't", 'you', 'love', '??', 'Transformers?', 'We', 'sure', 'do.']
In case you are looking for a slightly more sophisticated tokenization that also takes punctuation into account, you can use the basic_tokenizer of the DistilBertTokenizer:
from transformers import DistilBertTokenizer
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased')
# the basic tokenizer splits on whitespace and punctuation only,
# without applying the WordPiece subword step
tokens = tokenizer.basic_tokenizer.tokenize(text)
print("Tokens: ", tokens)
Output:
Tokens: ['Don', "'", 't', 'you', 'love', '🤗', 'Transformers', '?', 'We', 'sure', 'do', '.']
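If you only need this whitespace-plus-punctuation splitting and don't want to download a tokenizer for it, a small regex from the standard library gives essentially the same tokens as the basic_tokenizer above (a minimal sketch, not part of transformers):
import re

# \w+ grabs runs of word characters; \S catches every remaining
# non-whitespace character (punctuation, emoji) as its own token
tokens = re.findall(r"\w+|\S", text)
print("Tokens: ", tokens)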