6

I'm trying to find a repeating pattern in a string of binary digits.

eg. 0010010010 or 1110111011 = ok

not. 0100101101 = bad

The strings are 10 digits long (as above) & i guess 2 iterations of the 'pattern' are the minimum.

I started manually setting a 'bank' of patterns that the program could match it with but there must be a better way using an algorithm?

Searching got me nowhere - i think the language & terminology i'm using is incorrect..

askedMay 10, 2013 at 3:50
boomshanka's user avatar
7
  • Do you mean that you want to know if the entire string can be represented as some shorter string repeated two or more times? What's your definition of a "repeating pattern"?CommentedMay 10, 2013 at 3:57
  • Any string can be considered as having a pattern that repeats but has longer period that the number of bits can shown.CommentedMay 10, 2013 at 4:16
  • +1 for choosing language according to problem. I wonder if prolog would beat javascript here.CommentedMay 10, 2013 at 6:10
  • 1
    As far as I can tell there are 62 possibilities. Please see my answer.CommentedMay 10, 2013 at 13:33
  • @groovy I make it 52 not 62? 0000000000 0000100001 0001000010 0001000100 0001100011 0010000100 0010001000 0010010010 0010100101 0011000110 0011001100 0011100111 0100001000 0100010001 0100100100 0100101001 0101001010 0101010101 0101101011 0110001100 0110011001 0110101101 0110110110 0111001110 0111011101 0111101111 1000010000 1000100010 1000110001 1001001001 1001010010 1001100110 1001110011 1010010100 1010101010 1010110101 1011010110 1011011011 1011101110 1011110111 1100011000 1100110011 1100111001 1101011010 1101101101 1101110111 1101111011 1110011100 1110111011 1110111101 1111011110 1111111111CommentedMay 10, 2013 at 20:07

7 Answers7

2

Quite a challenge. What about this function?

function findPattern(n) {    var maxlen = parseInt(n.length/2);    NEXT:    for(var i=1; i<=maxlen; ++i) {        var len=0, k=0, prev="", sub;        do {            sub = n.substring(k,k+i);            k+= i;            len = sub.length;            if(len!=i) break;            if(prev.length && sub.length==i && prev!=sub) continue NEXT;            if(!prev.length) prev = sub;        } while(sub.length);        var trail = n.substr(n.length-len);        if(!len || len && trail==n.substr(0,len)) return n.substr(0,i);    }    return false;}

It even works for any length strings with any contents. Seethe fiddle

Inspired by Jack's and Zim-Zam's answer, here is the list for brute force algorithm:

var oksubs =["001","010","011","100","101","110","0001","0010","0011","0100","0101","0110","0111","1000","1001","1010","1011","1100","1101","1110","00000","00001","00011","00101","00110","00111","01000","01001","01010","01011","01100","01101","01110","01111","10000","10001","10011","10101","10110","10111","11000","11001","11010","11011","11100","11101","11110","11111"];

Thanks to Jack's comment, here is both short and effective code:

function findPattern(n) {    var oksubs = [n.substr(0,5),n.substr(0,4),n.substr(0,3)];    for(var i=0; i<oksubs.length; ++i) {        var sub = oksubs[i];        if((sub+sub+sub+sub).substr(0,10)==n) return sub;    }    return false;}
answeredMay 10, 2013 at 5:01
Jan Turoň's user avatar
Sign up to request clarification or add additional context in comments.

2 Comments

Hey Jan, Thanks for quickly posting the code. I think, It is possible to find patterns without using oksubs. Take the first 5 or 4 or 3 digits from the original string itself. by five I mean Stringlength/2
Oh, thanks Jack! Seems like we managed to put the math, brute force and JS together!
1

You've only got 2^10 patterns, that's a small enough number that you can just precompute all of the valid strings and store the results in a 1024-element boolean array; if a string is valid, then convert it to an integer (e.g. "0000001111" = 15) and store "true" in the resulting array index. To check if a string is valid, convert it to an integer and look up the index in the precomputed boolean array.

If your strings are longer than 10 digits then you'll need to be more clever about determining if a string is valid, but since you only have 1024 strings you might as well be lazy about this.

answeredMay 10, 2013 at 4:02
Zim-Zam O'Pootertoot's user avatar

1 Comment

I'm assuming you have a minimum pattern length of 3, so check if str.substring(0, 2) = str.substring(3, 5) = str.substring(6, 8), and if str.substring(0, 0) = str.substring(9, 9). If that's false, then try a pattern of length 4; str.substring(0, 3) = str.substring(4, 7), and str.substring(0, 1) = str.substring(8, 9). If that fails, then try a pattern of length 5. A pattern of length 6 would be str.substring(0, 3) = str.substring(6, 9), but I'm assuming you've got a max pattern length of 5.
1
  1. maintaining an array of 2^10 wont help becuase it wont indicate which strings have repeating patterns.
  2. To have a repeating pattern the pattern length can be only be <= 5

  3. there can be a pattern of length 1.but pattern of length five will cover it. [STEP EDITED]

  4. if there a pattern with length 2, there is always a pattern ith length 4.

  5. from (1),(2), (3) and (4) , it is only necessary to check patterns of length 3,4 and 5

  6. that means if first three digits match with next three digits proceed till end of string else break and go to 7

  7. else match first four digits with next four if match proceed till end of stringelse break and go to 8

  8. else match first five digits with next four if match proceed till end of stringelse break and go to 9

  9. if one of 6, 7,8 is false, return failure

answeredMay 10, 2013 at 5:42
Jack's user avatar

4 Comments

It can be generalized if we consider max pattern length = max stringlength/2 and start comparing from 3 to stringthlength/2
step 3 is wrong and redundant:1111111111 has a pattern of length 1, but this is also pattern of length 5, so the conclusion in step 5 is correct with some luck
i agree. good find Jan. however, the algo. still holds good, as you rightly said. I realized one more thing, Steps 6,7 and 8 should be applied in reverse order to ensure we find the pattern with max. length.
Thanks for inspiration for the brute force algorithm, see my updated answer. I love when brute force wins :-)
1

My brute-force approach would be:

by example

  1. givenString:0010010010:

  2. create list of possible patterns for givenString0010010010:

    possiblePatterns = [00, 010, 0010, 00100, 01, 001, 0100, 10, 100]
  3. repeat them to make strings with length >= 10

    testPattern0 = 0000000000    // 00 00 00 00 00testPattern1 = 010010010010  // 010 010 010 010testPattern2 = 001000100010  // 0010 0010 0010...
  4. and check...

    for each testPattern:    if '0010010010' is a substring of testPattern ==> pattern found

    one of the matching strings:

    testPattern2: 010010010010givenString :   0010010010
  5. found patterns:

    foundPatterns = [010, 001, 100]

As one can see, this is a possibly redundant list as all patterns are basically the same, just shifted. But depending on the use case, this may actually be what you want.


Code:

function findPatterns(str){    var len = str.length;    var possible_patterns = {};  // save as keys to prevent duplicates    var testPattern;    var foundPatterns = [];    // 1) create collection of possible patterns where:    //    1 < possiblePattern.length <= str.length/2    for(var i = 0; i <= len/2; i++){        for(var j = i+2; j <= len/2; j++){            possible_patterns[str.substring(i, j)] = 0;        }    }    // 2) create testPattern to test against given str where:    //    testPattern.length >= str.length    for(var pattern in possible_patterns){        testPattern = new Array(Math.ceil(len/pattern.length)+1).join(pattern);        if(testPattern.indexOf(str) >= 0){            foundPatterns.push(pattern);        }    }    return foundPatterns;}

==>fiddle

answeredMay 10, 2013 at 5:08
tsterker's user avatar

2 Comments

Your possible patterns list is incomplete: what about0011001100? Feel free to copy the patterns from my answer.
my list of possible patterns does only contain patterns found in the given example string 0010010010. my method constructs the list of possible patterns dynamically based on the string at hand. or am I missing something?
1

As far as I can tell, there are 62 patterned binary strings of length 10 =>2^1 + 2^2 + 2^3 + 2^4 + 2^5. Here's some code that lists them and matches a patterned string:

function binComb (n){  var answer = []  for (var i=0; i<Math.pow(2,n);i++){    var str = i.toString(2)    for (var j=str.length; j<n; j++){      str = "0" + str    }    answer.push(str)  }  return answer}function allCycles(){  var answer = {}, cycled = ""  for (var i=1; i<=5; i++){    var arr = binComb(i)    for (var j=0; j<arr.length; j++){      while(cycled.length < 10){        cycled += arr[j]        if (10 - cycled.length < arr[j].length)          cycled += arr[j].substr(0,10 - cycled.length)      }      if (answer[cycled])         answer[cycled].push(arr[j])      else answer[cycled] = [arr[j]]      cycled = ""    }  }  return answer}function getPattern (str){  var patterns = allCycles()  if (patterns[str])     return patterns[str]  else return "Pattern not found."}

OUTPUT:

console.log(getPattern("0010010010"))console.log(getPattern("1110111011"))console.log(getPattern("0100101101"))console.log(getPattern("1111111111"))["001"]["1110"]Pattern not found.["1", "11", "111", "1111", "11111"]
answeredMay 10, 2013 at 13:31
גלעד ברקן's user avatar

Comments

0

In Python (again) but without regexps:

def is_repeated(text):    'check if the first part of the string is repeated throughout the string'    len_text = len(text)    for rep_len in range(len_text // 2,  0, -1):        reps = (len_text + rep_len) // rep_len        if (text[:rep_len] * reps).startswith(text):            return rep_len  # equivalent to boolean True as will not be zero    return 0     # equivalent to boolean Falsematchstr = """\100111001111101110110010010010101010101011111111110100101101"""for line in matchstr.split():    print('%r has a repetition length of %i' % (line, is_repeated(line)))

Output:

'1001110011' has a repetition length of 5'1110111011' has a repetition length of 4'0010010010' has a repetition length of 3'1010101010' has a repetition length of 4'1111111111' has a repetition length of 5'0100101101' has a repetition length of 0
answeredMay 10, 2013 at 6:49
Paddy3118's user avatar

Comments

0

This answer uses a Python regular expression to be compiled with the VERBOSE flag set which allows multi-line regexps with comments and non-significant spaces. I also use named groups in the regexp.

The regexp was developed with the aid of the Kodos tool.

The regexp searches for the longest repeating strings of length five then four then three repeating characters. (two repeating characters is redundant as it is equal to the longer four; and similarly one repeating is made redundant by five).

import rerawstr = r"""(?P<r5> .{5})    (?P=r5) |(?P<r4>(?P<_42> .{2}) .{2})   (?P=r4)   (?P=_42) |(?P<r3>(?P<_31> .{1}) .{2})   (?P=r3){2}   (?P=_31) """matchstr = """\100111001111101110110010010010101010101011111111110100101101"""for matchobj in re.finditer(rawstr, matchstr, re.VERBOSE):    grp, rep = [(g, r) for g, r in matchobj.groupdict().items()                   if g[0] != '_' and r is not None][0]    print('%r is a repetition of %r' % (matchobj.group().strip(), rep))

Gives the output:

'1001110011' is a repetition of '10011''1110111011' is a repetition of '1110''0010010010' is a repetition of '001''1010101010' is a repetition of '1010''1111111111' is a repetition of '11111'
answeredMay 10, 2013 at 6:24
Paddy3118's user avatar

2 Comments

Shouldn't 1001110011 show as having a repeating pattern of 10011?
@cHao, fixed. Thanks. I added an extra '\' when I copied the regexp from Kodos by mistake.

Your Answer

Sign up orlog in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

By clicking “Post Your Answer”, you agree to ourterms of service and acknowledge you have read ourprivacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.