I solvedthis problem from LeetCode:
Given a pattern and a string str, find if str follows the same pattern.
Here follow means a full match, such that there is a bijection between a letter in pattern and a non-empty word in str
Examples:
pattern = "abba", str = "dog cat cat dog" should return true.
pattern = "abba", str = "dog cat cat fish" should return false.
pattern = "aaaa", str = "dog cat cat dog" should return false.
pattern = "abba", str = "dog dog dog dog" should return false.
Notes:
You may assume pattern contains only lowercase letters, and str contains lowercase letters separated by a single space.
I was able to solve this problem by using aDictionary to keep track of the relation between thepattern and thestr:
public class Solution {public bool WordPattern(string pattern, string str) { var result = str.Split(' ').ToList(); var mapPattern = new Dictionary<char, string>(); if (result.Count != pattern.Count()) { return false; } string matchstr; int index = 0; foreach (var c in pattern) { if (mapPattern.TryGetValue(pattern[index], out matchstr)) { if (matchstr != result[index]) { return false; } } else { if(mapPattern.ContainsValue(result[index])) { return false; } mapPattern.Add(c, result[index]); } ++index; } return true; }}The approach I have currently looks up theDictionary twice to see if it already exists as akey or as avalue in theDictionary, which I believe would be the reason for slowing this by a significant amount.
Can you determine if I can do away with looking up twice? This solution only runs better than around 36% of the other submissions for this current problem. So, I believe there must be a better and faster way.
- 1\$\begingroup\$1) The .ToList() on the
Split()can be removed. (Not sure how much it will save you). 2) Try resubmitting the same code a few times. I have received a large variation on the performance\$\endgroup\$AlanT– AlanT2016-11-25 15:30:53 +00:00CommentedNov 25, 2016 at 15:30
3 Answers3
Overall, this is a nice, clean solution, congratulations!
Performance
Can you suggest If I can do away with looking up twice. Currently this solution only runs better that around 36% of the other submissions for this current problem. So, I believe there must be a better and faster way.
"Looking up twice" is not the appropriate term. You do one lookup by key, and another lookup by value. These are very different operations, and lumping them together as "two lookups" hides important details. Imagine a program that does one lookup in an array and one lookup with a Google search on the internet, and the author would wonder if slowness might be caused by doing "two lookups". It would hide a crucial detail that one of the "lookups" is clearly much slower than the other.
The lookup in the dictionary by key is very fast, an \$O(1)\$ operation.
The lookup by value in a typical dictionary (or hash map) implementation is much slower, \$O(n)\$, because dictionaries areindexed by key, not by value. They are designed for fast lookups by key, not by value.
To make the lookup by value faster, you can add a set for values.
Scope
It's good to limit the scope of variables to the minimum needed.
matchstrshould be declared inside the loop, as it is not needed outsidemapPatternshould be initialized after the early return, as it might not be needed
- \$\begingroup\$Thanks for the tip. My initial solution had two
Dictionarylike you had suggest but it execute a bit slower than this one. I'll try that one once again as well. Also leetcode run time can be unpredictable at times .\$\endgroup\$thebenman– thebenman2016-11-25 07:32:45 +00:00CommentedNov 25, 2016 at 7:32
Just a few small issues, the main thing (using aHashSet to keep track of the values) has already been covered.
var result = str.Split(' ').ToList();You don't need this to be aList - it's fine asstring[].
result is a bad name. May I suggestwords ortokens?
var mapPattern = new Dictionary<char, string>();if (result.Count != pattern.Count())You should check for the early returnbefore creating your dictionary.Withresult renamed towords and as an array, theif would be better as:
if (words.Length != pattern.Length)string matchstr;word ortoken would be a better choice depending on what you call the result ofstring.Split().
if (mapPattern.TryGetValue(pattern[index], out matchstr)){ if (matchstr != result[index]) {You can combine these:
if (mapPattern.TryGetValue(pattern[index], out word) && word != words[index]){ return false;}which saves you some indentation.
Overall, it's a good approach IMO. I submitted a similar approach with an extra hashset and first got a speed ~15% and then submitted it again and got a speed at ~85% so I think it's just pot luck.
Alternate solution using HashSet:
public bool WordPattern(string pattern, string str){ var words = str.Split(' '); if (words.Length != pattern.Length) { return false; } var seenWords = new HashSet<string>(); var letterToWordLookup = new Dictionary<char, string>(); for (var i = 0; i < pattern.Length; i++) { var letter = pattern[i]; string word; if (!letterToWordLookup.TryGetValue(letter, out word)) { word = words[i]; if (!seenWords.Add(word)) { return false; } letterToWordLookup[letter] = word; continue; } if (word != words[i]) { return false; } } return true;}Tested with:
var pattern = "aaabbbccc";var test1 = "cat cat cat bat bat bat dog dog dog";var test2 = "cat bat cat bat bat bat dog dog dog";Approx twice as fast asPaparazzi's solution ontest1. Increases to more than 3 times faster ontest2. YMMV.
- \$\begingroup\$@Paparazzi - did you check with the test cases I did? That's why I chose longer test cases because the ones given are so short constant terms are likely most important for perf.\$\endgroup\$RobH– RobH2016-11-25 16:32:24 +00:00CommentedNov 25, 2016 at 16:32
Not much but you could change index
foreach (int index = 0; index < pattern.Count; index++){Use a HashSet to store current values
A related approach with no O(n) lookup
public static bool PatterMatch(string pattern, string match){ Dictionary<char, int> dlPattern = new Dictionary<char, int>(); Dictionary<string, int> dlMatch = new Dictionary<string, int>(); List<int> lPattern = new List<int>(); List<int> lMatch = new List<int>(); int index = 0; int indexOut; foreach(char p in pattern.ToCharArray()) { if (dlPattern.TryGetValue(p, out indexOut)) { lPattern.Add(indexOut); } else { dlPattern.Add(p, index); lPattern.Add(index); index++; } } index = 0; foreach (string m in match.Split(' ')) { if (dlMatch.TryGetValue(m, out indexOut)) { lMatch.Add(indexOut); } else { dlMatch.Add(m, index); lMatch.Add(index); index++; } } return lPattern.SequenceEqual(lMatch);}- \$\begingroup\$-1 for the Hungarian notation and lack of reasoning about how that approach is different and why it might be preferred.\$\endgroup\$RobH– RobH2016-11-25 11:26:03 +00:00CommentedNov 25, 2016 at 11:26
- \$\begingroup\$@RobH Sorry the difference and why it might be preferred is not apparent to you.\$\endgroup\$paparazzo– paparazzo2016-11-25 14:20:45 +00:00CommentedNov 25, 2016 at 14:20
- \$\begingroup\$I can see the difference but am wonderingwhy I would prefer this over the OP's solution? You allocate more memory and process the entire pattern and string to match even if it is incorrect early on. I see this as aworse solution in at least 2 ways - not to mention readability.\$\endgroup\$RobH– RobH2016-11-25 15:15:16 +00:00CommentedNov 25, 2016 at 15:15
- \$\begingroup\$@RobH Then see what you see. There is no O(n) lookup. Checking for a difference early is an expense and dictionary doe not offer an ordinal index. Did you test against your solution for speed.\$\endgroup\$paparazzo– paparazzo2016-11-25 15:20:01 +00:00CommentedNov 25, 2016 at 15:20
- \$\begingroup\$I was thinking of my own code which used a
HashSetfor checking for the values instead of the O(n) lookup. Sorry about that. Yes, my solution is faster.\$\endgroup\$RobH– RobH2016-11-25 15:23:16 +00:00CommentedNov 25, 2016 at 15:23
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.



