Problem Statement:-
Two strings X
and Y
are similar if we can swap two letters (in different positions) of X
, so that it equals Y
. Also two strings X
and Y
are similar if they are equal.
For example, "tars"
and "rats"
are similar (swapping at positions 0
and 2
), and "rats"
and "arts"
are similar, but "star"
is not similar to "tars"
, "rats"
, or "arts"
.
Together, these form two connected groups by similarity: {"tars", "rats", "arts"}
and {"star"}
. Notice that "tars"
and "arts"
are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list strs
of strings where every string in strs
is an anagram of every other string in strs
. How many groups are there?
Link: https://leetcode.com/problems/similar-string-groups/description/
Problem Explanation with examples:-
Example 1
Input: strs = ["tars","rats","arts","star"]
Output: 2
Explanation: "tars" and "arts" are in the same connected component, so they are similar. "rats" and "tars" are also in the same connected component, so they are similar. Notice that "star" is not similar to "tars", "rats", or "arts". So, there are 2 connected components.
Example 2
Input: strs = ["omv","ovm"]
Output: 1
Explanation: "omv" and "ovm" are similar. So, there is only 1 connected component.
Constraints
1 <= strs.length <= 300
1 <= strs[i].length <= 300
strs[i]
consists of lowercase letters only.All words in
strs
have the same length and are anagrams of each other.
Intuition:-
We can think of it like a graph problem where each string is a node and there is an edge between two nodes if they are similar.
So, we just need to find the number of connected components in the graph.
There can be a DFS function that will mark all the nodes connected to a node as visited and while calling this we can increase the count of connected components which will be the answer.
Solution:-
Define a dfs function that takes the index of the current node, the list of strings, and the visited array as parameters.
It marks the current node as visited and then iterates over all the nodes and if the node is not visited and is similar to the current node, then we call the dfs function on that node.
Next, we define a function that checks if two strings are similar or not. It iterates over the strings and if the characters at the same index are not equal, then it increases the count. If the count is 2 or 0, then the strings are similar and we return True else we return False.
Next, we define a variable group which will store the number of connected components and initialize it to 0.
Also define a variable n which will store the length of the list of strings.
Create a visited array of size n and initialize all the values to False.
Iterate over all the nodes and if the node is not visited, then we increase the count of connected components and call the dfs function on that node.
Finally, we return the value of groups.
Code:-
JAVA Solution
public class Solution {
public int numSimilarGroups(String[] strs) {
boolean[] vis = new boolean[strs.length];
int groups = 0;
for (int i = 0; i < strs.length; i++) {
if (vis[i]) continue;
groups++;
dfs(i, strs, vis);
}
return groups;
}
private void dfs(int i, String[] strs, boolean[] vis) {
vis[i] = true;
for (int j = 0; j < strs.length; j++) {
if (vis[j]) continue;
if (isSimilar(strs[i], strs[j])) {
dfs(j, strs, vis);
}
}
}
private boolean isSimilar(String a, String b) {
int count = 0;
for (int i = 0; i < a.length(); i++) {
if (a.charAt(i) != b.charAt(i)) count++;
}
return count == 2 || count == 0;
}
}
Python Solution
class Solution:
def numSimilarGroups(self, strs: List[str]) -> int:
def dfs(i, strs, vis):
vis[i] = True
for j in range(len(strs)):
if vis[j]: continue
if is_similar(strs[i], strs[j]):
dfs(j, strs, vis)
def is_similar(a,b):
count = 0
for i in range(len(a)):
if a[i] != b[i]: count += 1
return count == 2 or count == 0
groups = 0
n = len(strs)
vis = [False] * n
for i in range(n):
if vis[i]: continue
groups += 1
dfs(i, strs, vis)
return groups
Complexity Analysis:-
TIME:-
The time complexity of the code is O(n^2 * m), where n is the length of the input array strs
, and m is the length of the strings in strs
. This is because for each string in strs
, we check it against all other strings in strs
to see if they are similar, and each string comparison takes O(m) time.
SPACE:-
The space complexity of the code is O(n), where n is the length of the input array strs
. This is because we use a boolean array vis
to keep track of visited strings during the DFS traversal, which has length n
.