Your current algorithm seems to be O(n*m*s*s), where n = number of existing items, m = number of potential matches, and s = average number of suppliers per existing item/potential match. You could reduce the running time to O(n*m*s)
by using a hash-set for the supplier matching.
A generic version would look like this:
public static IEnumerable<(T1, T2)> SetJoin<T1, T2, TKey>(
    IEnumerable<T1> t1s,
    IEnumerable<T2> t2s,
    Func<T1, IEnumerable<TKey>> t1Key,
    Func<T2, IEnumerable<TKey>> t2Key) where TKey : IEquatable<TKey>
{
    foreach (var t1 in t1s)
    {
        // Build the key set once per t1; membership checks are then O(1).
        var t1Keys = new HashSet<TKey>(t1Key(t1));
        foreach (var t2 in t2s)
        {
            // t2Key(t2) is re-evaluated for every t1; it might be worth
            // pre-computing it once per t2 (see the sketch below).
            if (t2Key(t2).Any(t1Keys.Contains))
            {
                yield return (t1, t2);
            }
        }
    }
}
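If the key selector is expensive, the pre-computation mentioned in the comment could look like the sketch below. SetJoinPrecomputed is a hypothetical name for this variant, not something from your code; it materializes each t2's keys once instead of re-enumerating them for every t1:

public static IEnumerable<(T1, T2)> SetJoinPrecomputed<T1, T2, TKey>(
    IEnumerable<T1> t1s,
    IEnumerable<T2> t2s,
    Func<T1, IEnumerable<TKey>> t1Key,
    Func<T2, IEnumerable<TKey>> t2Key) where TKey : IEquatable<TKey>
{
    // Evaluate t2Key exactly once per t2; ToList forces a single enumeration.
    var t2Keys = t2s
        .Select(t2 => (Item: t2, Keys: t2Key(t2).ToList()))
        .ToList();

    foreach (var t1 in t1s)
    {
        var t1Keys = new HashSet<TKey>(t1Key(t1));
        foreach (var (item, keys) in t2Keys)
        {
            if (keys.Any(t1Keys.Contains))
            {
                yield return (t1, item);
            }
        }
    }
}

As an aside, HashSet<TKey> also has an Overlaps(IEnumerable<TKey>) method that performs the same "any common element" test and could replace the Any call.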
And call it like this:

SetJoin<ExistingItem, PotentialMatch, int>(
    existingItems,
    potentialMatches,
    e => e.Suppliers.Select(s => s.Id),
    p => p.Suppliers.Select(s => s.Id));
Also, while LINQ results in compact and readable code, it is often faster to write the equivalent logic using regular loops when performance matters.
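For example, the Any call in SetJoin could be replaced with an explicit loop. ContainsAny below is a hypothetical helper sketched only for illustration; it avoids the per-element delegate invocation that the LINQ version incurs:

static bool ContainsAny<TKey>(IEnumerable<TKey> keys, HashSet<TKey> set)
    where TKey : IEquatable<TKey>
{
    // Equivalent to keys.Any(set.Contains), but without a delegate
    // call per element or the LINQ iterator machinery.
    foreach (var key in keys)
    {
        if (set.Contains(key))
        {
            return true;
        }
    }

    return false;
}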