As you have noted, the column names follow a fixed format, so they can be parsed and rebuilt:
- transpose as it's simpler to manipulate rows than columns
- use regex to parse out components
- re-construct the label to your requirements, then set_index() with the new value
- transpose to return to original structure
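import re
import pandas as pd

# Capture groups: name, number, name, number, trailing label (e.g. "vdW")
myre = re.compile(r"^([A-Z]+)([0-9]+) - ([A-Z]+)([0-9]+) ([A-Za-z]+)$")
data = [[1, 2, 4], [3, 5, 4], [2, 7, 6]]
df = pd.DataFrame(data, columns=["ARG15 - ILE10 vdW", "VAL16 - ILE10 vdW", "VAL16 - VAL19 vdW"])
df = (
    df.T.assign(bits=lambda dfa: dfa.index)  # copy the old labels into a column
    # rebuild each label with both numbers incremented by 1
    .assign(bits=lambda dfa: dfa.bits.apply(lambda s: "".join(
        f"{a}{int(b)+1} - {c}{int(d)+1} {e}" for a, b, c, d, e in re.findall(myre, s))))
    .set_index("bits")  # the rebuilt labels become the index
    .T  # transpose back to the original orientation
)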
|   | ARG16 - ILE11 vdW | VAL17 - ILE11 vdW | VAL17 - VAL20 vdW |
|---|---|---|---|
| 0 | 1 | 2 | 4 |
| 1 | 3 | 5 | 4 |
| 2 | 2 | 7 | 6 |
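If the transpose round trip feels heavy, the same relabelling can probably be done directly on the columns. The following is a minimal sketch of that alternative, not the approach used above: it assumes the same fixed label format, and `shift_label` and its `offset` parameter are hypothetical names introduced here for illustration. It relies on `DataFrame.rename` accepting a callable for `columns`.

```python
import re
import pandas as pd

# Same fixed format as above: name, number, " - ", name, number, trailing label
pattern = re.compile(r"^([A-Z]+)([0-9]+) - ([A-Z]+)([0-9]+) ([A-Za-z]+)$")

def shift_label(label, offset=1):
    """Hypothetical helper: add `offset` to both numbers in a column label."""
    m = pattern.match(label)
    if m is None:
        # Leave labels that do not match the expected format untouched
        return label
    a, b, c, d, e = m.groups()
    return f"{a}{int(b) + offset} - {c}{int(d) + offset} {e}"

data = [[1, 2, 4], [3, 5, 4], [2, 7, 6]]
df = pd.DataFrame(data, columns=["ARG15 - ILE10 vdW", "VAL16 - ILE10 vdW", "VAL16 - VAL19 vdW"])

# rename() applies the callable to every column label and returns a new frame
renamed = df.rename(columns=shift_label)
print(renamed.columns.tolist())
```

Labels that do not match the pattern are passed through unchanged in this sketch; whether that or raising an error is preferable depends on how strict the format really is.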