Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
671 views
in Technique[技术] by (71.8m points)

notepad++ - Regex match all lines that don't end with ,0 and ,1

I have a malformed CSV file which has two columns: Text,Value

The value is either 1 or 0, but some lines are malformed and span two lines:

1. "This line is fine, but there are some that are not like this",0
2. "Another good line",1
4. "Oh, I'm so bad!!
5. I spanned two lines!",0
6. "Why did you break me? FileHelpers can't read two lines!!",1

Line 4 and 5 are supposed to be one line, but the CSV file I got is broken and they span two lines, this causes the FileHelpers engine to fail while reading the csv file.

I have two CSV files with about 3000 lines each and I will only need to fix them once. I want to use notepad++ to find all the lines that are not ending in ,0 or ,1, what kind of regex can I use for that? Or maybe to regular expressions, one for the ,0 case the other one for the ,1 case.

Update:
Dan's answer works without the comma [^01]$ instead of ,[^01]$, but it only matches lines that are not ending with 0 or 1... it works sufficiently well in my case, but it does skip lines that are broken and actually end with 0 or 1.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The expression you would use is

([^,].|,[^01])$

But unfortunately, notepad++ does not support alternation (the | operator). [1] You can match the broken lines with these two expressions then:

[^,].$
,[^01]$

Except, of course, if the "Text" part does end in ,0 or ,1 itself. :-)

[1] http://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Unsupported_Regex_Operators


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...