website/content/ChapterFour/0600~0699/0609.Find-Duplicate-File-in-System.md
Given a list paths of directory info, including the directory path, and all the files with contents in this directory, return all the duplicate files in the file system in terms of their paths. You may return the answer in any order.
A group of duplicate files consists of at least two files that have the same content.
A single directory info string in the input list has the following format:
"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"It means there are n files (f1.txt, f2.txt ... fn.txt) with content (f1_content, f2_content ... fn_content) respectively in the directory "root/d1/d2/.../dm". Note that n >= 1 and m >= 0. If m = 0, it means the directory is just the root directory.
The output is a list of groups of duplicate file paths. For each group, it contains all the file paths of the files that have the same content. A file path is a string that has the following format:
"directory_path/file_name.txt"Example 1:
Input: paths = ["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)","root 4.txt(efgh)"]
Output: [["root/a/2.txt","root/c/d/4.txt","root/4.txt"],["root/a/1.txt","root/c/3.txt"]]
Example 2:
Input: paths = ["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)"]
Output: [["root/a/2.txt","root/c/d/4.txt"],["root/a/1.txt","root/c/3.txt"]]
Constraints:
1 <= paths.length <= 2 * 1041 <= paths[i].length <= 30001 <= sum(paths[i].length) <= 5 * 105paths[i] consist of English letters, digits, '/', '.', '(', ')', and ' '.Follow up:
给定一个目录信息列表,包括目录路径,以及该目录中的所有包含内容的文件,您需要找到文件系统中的所有重复文件组的路径。一组重复的文件至少包括二个具有完全相同内容的文件。输入列表中的单个目录信息字符串的格式如下:"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"。这意味着有 n 个文件(f1.txt, f2.txt ... fn.txt 的内容分别是 f1_content, f2_content ... fn_content)在目录 root/d1/d2/.../dm 下。注意:n>=1 且 m>=0。如果 m=0,则表示该目录是根目录。该输出是重复文件路径组的列表。对于每个组,它包含具有相同内容的文件的所有文件路径。文件路径是具有下列格式的字符串:"directory_path/file_name.txt"
package leetcode
import "strings"
func findDuplicate(paths []string) [][]string {
cache := make(map[string][]string)
for _, path := range paths {
parts := strings.Split(path, " ")
dir := parts[0]
for i := 1; i < len(parts); i++ {
bracketPosition := strings.IndexByte(parts[i], '(')
content := parts[i][bracketPosition+1 : len(parts[i])-1]
cache[content] = append(cache[content], dir+"/"+parts[i][:bracketPosition])
}
}
res := make([][]string, 0, len(cache))
for _, group := range cache {
if len(group) >= 2 {
res = append(res, group)
}
}
return res
}