I can't speak to whether using this field this way is good or not.
I would set a threshold on a string distance metric (e.g. Levenshtein distance, maybe divided by string length) and display only one of the values when the distance falls below it. Maybe that's too clever. But it would also catch cases where the only difference is a punctuation mark or something similarly minor.
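To make the idea concrete, here is a rough sketch in Python. Everything in it is an assumption on my part rather than something from your setup: the function names (`levenshtein`, `normalized_distance`, `values_to_display`) are made up for illustration, the 0.2 threshold is an arbitrary starting point you would need to tune, and dividing by the length of the longer string is just one reasonable way to normalize.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character inserts,
    deletes, or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer string's length (0.0 to 1.0)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(a, b) / longest


# Hypothetical cutoff; tune it against real data.
THRESHOLD = 0.2


def values_to_display(first: str, second: str, threshold: float = THRESHOLD) -> list[str]:
    """Return only the first value when the two are near-duplicates,
    otherwise return both."""
    if normalized_distance(first, second) < threshold:
        return [first]
    return [first, second]
```

With that cutoff, `values_to_display("Hello, world", "Hello world")` returns just the first string, since the two differ by one character out of twelve, while `values_to_display("apple", "orange")` returns both. The quadratic cost of the distance calculation is fine for short field values; for long text you would probably want a library implementation or a cheaper check first.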