Join large list of pairs

Problem Detail: 

I have a list of millions of pairs of strings, and I want to join all of the pairs that have matching members into lists without duplicates.

Example input:

[["A", "B"],  ["A", "D"],  ["M", "Q"],  ["A", "F"],  ["D", "E"],  ["Q", "Z"]] 

Example output:

[["A", "B", "D", "E", "F"],  ["M", "Q", "Z"]]

Does anyone know of an efficient algorithm for this? I'm somewhat constrained by memory, so anything whose memory use grows quadratically with the size of the input is not an option.

Asked By : Cory Gagliardi
Answered By : FrankW

You can use a two pass approach:

In the first pass, identify all the distinct strings appearing in your input. (This can be done in various ways, e.g. with a hash table, a trie, or a binary search tree.)

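For the second pass, initialize a disjoint-set (union-find) data structure with the strings found in the first pass and perform a union operation for each pair in the input. The sets that remain at the end are the joined lists; the structure needs memory proportional to the number of distinct strings only.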
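The two passes could be sketched roughly as follows. This is only an illustration, not code from the answer: the function name `join_pairs`, the use of a dict to assign integer ids, and the sorted output are all assumptions, and `pairs` is assumed to be something that can be iterated twice (a list, or a file that is re-read).

```python
from collections import defaultdict


def join_pairs(pairs):
    """Group strings that are connected through shared pair members.

    Hypothetical helper: `pairs` must be re-iterable (e.g. a list).
    """
    # Pass 1: give every distinct string a small integer id (hashing via dict).
    ids = {}
    for a, b in pairs:
        for s in (a, b):
            if s not in ids:
                ids[s] = len(ids)

    # Disjoint-set forest in flat lists: one slot per distinct string.
    parent = list(range(len(ids)))
    size = [1] * len(ids)

    def find(x):
        # Walk up to the root, then point every node on the path at it.
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx == ry:
            return
        # Union by size: attach the smaller tree under the larger one.
        if size[rx] < size[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        size[rx] += size[ry]

    # Pass 2: merge the sets of the two members of every pair.
    for a, b in pairs:
        union(ids[a], ids[b])

    # Collect the members of each set under its root.
    groups = defaultdict(list)
    for s, i in ids.items():
        groups[find(i)].append(s)

    # Sorting is only for a stable, readable result; it is not required.
    return sorted(sorted(g) for g in groups.values())


example = [["A", "B"], ["A", "D"], ["M", "Q"],
           ["A", "F"], ["D", "E"], ["Q", "Z"]]
result = join_pairs(example)
```

On the example input from the question, `result` is `[["A", "B", "D", "E", "F"], ["M", "Q", "Z"]]`. Apart from the input itself, the sketch holds one dict entry and two integers per distinct string, so memory stays linear in the number of distinct strings.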

Best Answer from StackOverflow
