To sort by the header column and remove the blank rows, try this code (it assumes "Lot ID" is the first column):
    // Needs: using System.Collections.Generic; using System.Data; using System.IO; using System.Linq;
    private void Button1_Click(object sender, EventArgs e)
    {
        if (openFile.ShowDialog() == DialogResult.OK)
        {
            // One string[] per line of the file, one element per comma-separated cell
            List<string[]> rows = File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
            DataTable dt = new DataTable();
            List<string> headerNames = rows[0].ToList();
            foreach (var header in rows[0])
            {
                dt.Columns.Add(header);
            }
            foreach (var x in rows.Skip(1).OrderBy(r => r.First())) // sort based on the first column of each row
            {
                if (x.SequenceEqual(headerNames)) // LINQ check that both sequences have the same elements in the same order
                    continue; // skip the row with repeated headers
                if (x.All(val => string.IsNullOrWhiteSpace(val))) // if every cell in the row is empty or whitespace, skip the row
                    continue;
                dt.Rows.Add(x);
            }
            dataGridView1.DataSource = dt;
        }
    }
As a kind of hackish way to remove a duplicated header line, you could try this:
    if (x[0] == "Lot ID")
        continue;
instead of
    if (x.SequenceEqual(headerNames))
        continue;
It's not very elegant, but it will work.
Here is some explanation of the LINQ methods used:
    File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
This reads all the lines in the file. The .Select goes through each line and splits it on commas (since it is CSV). Split returns an array of the split values, and ToList() collects the results, so this line returns a List of string arrays. Each array holds the individual cell values of one row, and the list holds the rows.
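To see what that line produces, here is a minimal sketch that runs the same Select/Split/ToList chain on a few in-memory strings instead of a file. The sample lines are made up for illustration, and the code is written as top-level statements, which assumes C# 9 or later. Note that a plain Split(',') does not understand quoted fields that contain commas, so this only suits simple CSV.

```csharp
// Sketch only: in-memory lines stand in for File.ReadLines(openFile.FileName).
using System;
using System.Collections.Generic;
using System.Linq;

string[] lines = { "Lot ID,Qty", "B2,5", "", ",," };   // made-up sample data

// Same shape as the line above: one string[] per row, one element per cell.
List<string[]> rows = lines.Select(x => x.Split(',')).ToList();

foreach (string[] row in rows)
{
    // Print the cell count, then the cells joined with '|' so empty cells are visible.
    Console.WriteLine($"{row.Length}: {string.Join("|", row)}");
}
```

An empty line becomes a one-element array holding an empty string, and a line of only commas becomes an array of empty strings. That is why the x.All(val => string.IsNullOrWhiteSpace(val)) check catches both kinds of blank row.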
    List<string> headerNames = rows[0].ToList();
This saves the first row, which contains all the header names, into a separate List that we can use later.
    foreach (var x in rows.Skip(1).OrderBy(r => r.First()))
Skip(1) skips the first element of the list (the header row) and returns all the others. OrderBy() sorts in ascending order; since the key here is a string, that means alphabetically. r => r.First() means: for each row "r", use its first column as the sort key. "x" represents each row in turn.
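The skip, filter and sort steps can also be tried without a form or a file. The sketch below is a variation, not the code from the answer: it filters with Where() before sorting instead of using continue inside the loop, and it passes StringComparer.Ordinal so the order does not depend on the current culture. The rows are made-up sample data, and it assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up rows, already split into cells; the first row is the header.
var rows = new List<string[]>
{
    new[] { "Lot ID", "Qty" },
    new[] { "C3", "7" },
    new[] { "", "" },            // blank row
    new[] { "A1", "4" },
    new[] { "Lot ID", "Qty" },   // repeated header
};
string[] header = rows[0];

List<string[]> kept = rows
    .Skip(1)                                        // drop the header row
    .Where(r => !r.SequenceEqual(header))           // drop repeated headers
    .Where(r => !r.All(string.IsNullOrWhiteSpace))  // drop blank rows
    .OrderBy(r => r[0], StringComparer.Ordinal)     // sort by the first column
    .ToList();

foreach (string[] row in kept)
    Console.WriteLine(string.Join(",", row));
```

Because the key is compared as a string, a "Lot ID" of "10" sorts before "2". If the IDs are numeric, parsing the key to a number inside OrderBy would be one way around that.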
    if (x[0] == "Lot ID")
This is not LINQ anymore. It just checks whether the first column of this row is "Lot ID"; if it is, "continue" skips to the next row in the foreach.
Hope my explanations helped you learn! A link to some basic LINQ is in the comments.