LINQ Mysteries: The Distinct Function
Recently I had to use the Distinct function introduced with LINQ. To my surprise, the query returns different results depending on where you put the Distinct clause.
Let us take the following example. Let dtAnswers be a DataTable that has two columns, named answer_value and answer_comment. What I wanted was the count of the distinct answer_value values across all the rows. Knowing how this is done in SQL, I wrote the following LINQ query:
Dim nDifferentValues As Integer = _
    (From answer In dtAnswers.Rows _
     Distinct Select answer("answer_value").ToString()).Count()
To my surprise, this always returned the same result, no matter which rows the table contained or what values were in the answer_value column. After struggling for several hours, I decided to call Distinct as a function on the result of the query instead:
Dim nDifferentValues As Integer = _
    (From answer In dtAnswers.Rows _
     Select answer("answer_value").ToString()).Distinct.Count()
This version returns the correct result.
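My guess is that the first query computes Distinct on the rows first (comparing them by reference) and only then selects answer("answer_value").ToString() from the result. Since every DataRow is a separate object, no row is ever dropped, so the count is simply the number of rows. The second query first selects answer("answer_value").ToString() and only then computes Distinct, so the strings themselves are compared.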
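To make the difference concrete, here is a minimal sketch that runs both clause orders against a small in-memory table. The module name, the helper functions and the three sample rows are made up for illustration; only the column names come from the example above. It also assumes a compiler that supports implicit line continuation (VB 2010 or later) and uses Cast(Of DataRow)() so that it compiles with Option Strict On.

```vb
Option Strict On
Imports System
Imports System.Data
Imports System.Linq

Module DistinctDemo
    ' Hypothetical sample data: three rows, two different answer_value values.
    Function BuildAnswers() As DataTable
        Dim dt As New DataTable("answers")
        dt.Columns.Add("answer_value", GetType(String))
        dt.Columns.Add("answer_comment", GetType(String))
        dt.Rows.Add("yes", "first")
        dt.Rows.Add("no", "second")
        dt.Rows.Add("yes", "third")
        Return dt
    End Function

    ' Distinct comes before Select, so it compares DataRow objects.
    ' DataRow uses reference equality, so every row counts as distinct.
    Function CountDistinctRows(dt As DataTable) As Integer
        Return (From answer In dt.Rows.Cast(Of DataRow)()
                Distinct
                Select answerValue = answer("answer_value").ToString()).Count()
    End Function

    ' Select comes before Distinct, so it compares the projected strings.
    Function CountDistinctValues(dt As DataTable) As Integer
        Return (From answer In dt.Rows.Cast(Of DataRow)()
                Select answerValue = answer("answer_value").ToString()
                Distinct).Count()
    End Function

    Sub Main()
        Dim dt As DataTable = BuildAnswers()
        Console.WriteLine(CountDistinctRows(dt))   ' prints 3 (the row count)
        Console.WriteLine(CountDistinctValues(dt)) ' prints 2 ("yes" and "no")
    End Sub
End Module
```

Note that the second function stays entirely in query syntax: placing the Distinct clause after Select should have the same effect as calling .Distinct on the result of the query.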
So be careful where you put your LINQ functions 😉