Assuming your CSV is well-formed (i.e. no " besides those used to delimit string fields, or besides ones escaped like \"), you can split on a comma that's followed by an even number of non-escaped "-marks. (If you're inside a set of "", there's only an odd number left in the line.)
The regex you've tried looks like it's almost there.
The following looks for a comma followed by an even number of any sort of quote marks:
,(?=([^"]*"[^"]*")*[^"]*$)
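As a quick check of the pattern above, here is a minimal sketch in Python (chosen only because it is easy to run; the regex syntax is the same). The sample line and the name SIMPLE_SPLIT are made up for illustration. The groups are written as non-capturing (?:...), which is my change and not part of the pattern above, because Python's re.split returns captured text as extra items in the result.

```python
import re

# Comma followed by an even number of double quotes up to the end of the line.
# Non-capturing groups (?:...) keep re.split from adding captured text
# to the returned list.
SIMPLE_SPLIT = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

# Hypothetical sample line: the middle field contains a comma inside quotes.
line = '1,"Smith, John",42'

fields = SIMPLE_SPLIT.split(line)
print(fields)  # ['1', '"Smith, John"', '42']
```

.NET's Regex.Split likewise includes text captured by groups in its result, so the non-capturing form (or RegexOptions.ExplicitCapture) may be worth considering there as well.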
To modify it to look for an even number of non-escaped quote marks (assuming quote marks are escaped with a backslash, like \"), I replace each [^"] with ([^"\\]|\\.). This means "match a character that isn't a " and isn't a backslash, OR match a backslash and the character immediately following it".
,(?=(([^"\\]|\\.)*"([^"\\]|\\.)*")*([^"\\]|\\.)*$)
See it in action here.
(The reason the backslash is doubled is I want to match a literal backslash).
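To see what the escape-aware version buys you, here is a small Python sketch comparing the two patterns on a line that contains a backslash-escaped quote. The sample line and the names SIMPLE_SPLIT and ESCAPED_SPLIT are assumptions for illustration, and the groups are again non-capturing so that re.split returns only the fields.

```python
import re

# Counts every double quote, escaped or not.
SIMPLE_SPLIT = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

# Treats backslash + any character as one unit, so \" is not counted as a quote.
ESCAPED_SPLIT = re.compile(
    r',(?=(?:(?:[^"\\]|\\.)*"(?:[^"\\]|\\.)*")*(?:[^"\\]|\\.)*$)'
)

# Hypothetical sample: the quoted field holds an escaped quote and a comma.
line = r'1,"a \", b",2'

simple_fields = SIMPLE_SPLIT.split(line)
escaped_fields = ESCAPED_SPLIT.split(line)

print(simple_fields)   # ['1,"a \\", b"', '2']
print(escaped_fields)  # ['1', '"a \\", b"', '2']
```

The simple pattern sees three quote marks after the first comma (an odd number), so it refuses to split there; the escape-aware pattern skips the \" and splits into the expected three fields.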
Now to get it into vb.net you just need to double all your quote marks:
splitRegex = ",(?=(([^""\\]|\\.)*""([^""\\]|\\.)*"")*([^""\\]|\\.)*$)"