Find the odd character out in a pattern

Question 1

Input

The first line will be a certain string repeated any amount of times. For example, it could beabcabcabcabc,[];[];[];, etc. It may be cut off; for example:1231231231. Always find the shortest string; for example, if the line is22222, then the string is2, not22 or22222 or anything else. The string will always be repeated at least 2 full times.

All subsequent lines will be that pattern offset by any number. For example, it could be:

abcabcabccabcabcabbcabcabca

(offset by 1), or it could be:

abcdefabcdefabcdefabccdefabcdefabcdefabcdeefabcdefabcdefabcdefa

(offset by 4).

One of the characters in the input will be wrong. (It is guaranteed not to be on the first line.) For example, in this input:

a=1a=1a=1=1a=1a=1a1a=11=1a=a=1a=1a=1=1a=1a=1a

the1 on line 3 is the odd one out.

Output

You must output the (zero-based, starting from the top left) coordinates of the odd one out. For example, in the above input, the corresponding output is4,2. You may also output4 2, or"4""2", or even[[4],[2]], or any other format, as long as you can tell what the output is supposed to be.

Test cases

Input:

codegolfcodegolfcoegolfcodegolfcodeglfcodegolfcodegoffodegolfcodegolfcodgolfcodegolfcodegofcodegolfcodegolfc

Output:16,2

Input:

][[][][[][][[][][[][][[[][][[][][[][][[][][[][[][[][][[][][[][][[][][[[][][[]]][[][][[][][[]

Output:8,3

Input:

.... ....

Output:1,1

Input:

abababababababbbabab

Output:4,2

Go!

Question 2

What characters can be contained in the string? Printable ASCII? ASCII? Unicode?

Question 3

@Dennis Just printable ASCII (that can basically be assumed for any challenge involving strings; otherwise we would have to specify that for almost every challenge :P)

Question 4

I guessed so. I'm thinking about an approach that would require an unused char, so I thought I'd ask.

Question 5

Should we check case like this:abc/cab/abc - and output0 2 here?

Question 6

@VadimR No, since only one character will be wrong.

Question 7

Perl,212191181 168 bytes

$_=<>;/^(((.*?)(.*?))\2+)\3$/;$x=$1x4;while(<>){chop;$x=~/\Q$_\E/&&next;for$i(0..y///c-1){for$r(split//,$x){$b=$_;$b=~s/(.{$i})./$1$r/;$x=~/\Q$b\E/&&die$i,$",$.-1,$/}}}

This version uses an optimized trick for catching the reply unit, learned in Dennis'answer.
Optimization by using the property that all lines have equal lengths.
Line end is also needed for the last line, otherwisechomp instead ofchop should be used.
Optimizations of ardnew'scomment added.

Old version, 212 bytes:

$_=<>;chop;/^(.+?)\1+(??{".{0,".(-1+length$1).'}'})$/;$;=$1;while(<>){$x=$;x length;chop;$x=~/\Q$_\E/&&next;for$i(0..-1+length$_){for$r(split//,$;){$b=$_;$b=~s/(.{$i})./$1$r/;$x=~/\Q$b\E/&&exit print$i,$",$.-1}}}

Ungolfed version:

$_ = <>;  # read first line/^(((.*?)(.*?))\2+)\3$/;# The repeat unit \2 consists of \3 and \4,# and the start part \2 can be added at the end (as partial or even full unit).$x = $1 x 4; # $x is long enough to cover each following line# Old version:# /^(.+?)\1+(??{ ".{0," . (-1 + length $1) . '}' })$/;# $a = $1; # $a is the repeat unit.# The unit is caught by a non-greedy pattern (.+?) that is# repeated at least once: \1+# The remaining characters must be less than the unit length.# The unit length is known at run-time, therefore a "postponed"# regular expression is used for the remainder.# process the following lines until the error is foundwhile (<>) {    # old version:    # $x = $a x length;    # $x contains the repeated string unit, by at least one unit longer    # than the string in the current line    chop; # remove line end of current line    $x =~ /\Q$_\E/ && next;          # go to next line, if current string is a substring of the repeated units;          # \Q...\E prevents the interpretation of special characters    # now each string position $x is checked, if it contains the wrong character:    for $i (0 .. y///c - 1) {  # y///c yields the length of $_        for $r (split //, $x) { #/ (old version uses $a)            # replace the character at position $i with a            # character from the repeat unit            $b = $_;            $b =~ s/(.{$i})./$1$r/;            $x =~ /\Q$b\E/               && die $i, $", $. - 1, $/;               # $" sets a space and the newline is added by $/;               # the newline prevents "die" from outputting line numbers        }    }}

Question 8

Great solution and comments, I need to learn more regex ;)

Question 9

the firstchop is unnecessary- should be removed. the finalexit print can be replaced withdie (add,$/ to hide the extra stuff (if required)). alsolength$_ can be replaced withy///c

Question 10

@ardnew: Many thanks, I have removed the firstchop, because$ matches before the newline at the end of the string. Hiding the extra stuff ofdie via the added newline seems necessary to me. Alsoy///c is much shorter thanlength$_ and one byte shorter thanlength without the unnecessary$_.

Question 11

@ardnew: I had forgotten aboutdie's verbosity. It even includes prints the line number! I'll use that in my next update.

Question 12

Bash Perl,231229218178164166138106 74 bytes

/^(((.*).*)\2+)\3$/;$_.=$1x2;$.--,die$+[1]if/^(.*)(.)(.*).*\1(?!\2).\3/

The script requires using the-n switch, which accounts for two of the bytes.

The idea of appending two copies of all full repetitions of the pattern has been taken fromMT0's answer.

In contrast to all other answers, this approach attempts to extract the pattern of the current input line in each iteration; it will fail on the line containing the odd character (and use the pattern of the previous line instead). This is done to include the pattern extraction in the loop, which manages to save a few bytes.

Ungolfed version

#!/usr/bin/perl -n# The `-n' switch makes Perl execute the entire script once for each input line, just like# wrapping `while(<>){…}' around the script would do./^(((.*).*)\2+)\3$/;# This regular expression matches if `((.*).*)' - accessible via the backreference `\2' -# is repeated at least once, followed by a single repetition of `\3" - zero or more of the# leftmost characters of `\2' - followed by the end of line. This means `\1' will contain# all full repetitions of the pattern. Even in the next loop, the contents of `\1' will be# available in the variable `$1'.$_.=$1x2;# Append two copies of `$1' to the current line. For the line, containing the odd# character, the regular expression will not have matched and the pattern of the previous# line will get appended.## Since the pattern is repeated at least two full times, the partial pattern repetition at# the end of the previous line will be shorter than the string before it. This means that# the entire line will the shorter than 1.5 times the full repetitions of the pattern, # making the two copies of the full repetitions of the pattern at least three times as # long as the input lines.$.-- , die $+[1] if# If the regular expression below matches, do the following:##   1. Decrement the variable `$.', which contains the input line number.##      This is done to obtain zero-based coordinates.##   2. Print `$+[1]' - the position of the last character of the first subpattern of the#      regular expression - plus some additional information to STDERR and exit.##      Notably, `die' prints the (decremented) current line number./^(.*)(.)(.*).*\1(?!\2).\3/;# `(.*)(.)(.*)', enclosed by `^' and a newline, divides the current input line into three# parts, which will be accesible via the backreferences `\1' to `\3'. Note that `\2'# contains a single character.## `.*\1(?!\2).\3' matches the current input line, except for the single character between# `\1' and `\3' which has to be different from that in `\2', at any position of the line# containing the pattern repetitions. Since this line is at least thrice as long as# `\1(?!\2).\3', it will be matched regardless of by how many characters the line has been# rotated.

Example

For the test case

codegolfcodegolfcoegolfcodegolfcodeglfcodegolfcodegoffodegolfcodegolfcodgolfcodegolfcodegofcodegolfcodegolfc

the output of the golfed version is

16 at script.pl line 1, <> line 2.

meaning that the odd character has the coordinates16,2.

This~~blatantly abuses~~ takes advantage of the liberal output format.

Just before exiting, the contents of some of Perl's special variables are:

$_  = lfcodegolfcodegoff\ncodegolfcodegolfcodegolfcodegolf$1  = lfcodegolfcodego$2  = f$3  = f

($n contains the match of the subpattern accessible via the backreference\n.)

Question 13

Clever catching of the reply unit. It can be optimized by one byte:^((.*?)(.*?))(?=\1+\2$)

Question 14

I switched to the language the popular kids are using. Can probably be golfed down further; this is my first Perl script in over a decade...

Question 15

...and you're over a decade late if you think perl is what the popular kids are using

Question 16

this answer isn't getting the love it deserves. looks to me like the winner @Doorknob

Question 17

C, 187 bytes

Limitations.

Don't use input strings longer than 98 characters :)

Golfed version

char s[99],z[99],*k,p,i,I,o,a;c(){for(i=0;k[i]==s[(i+o)%p];i++);return k[i];}main(){for(gets(k=s);c(p++););for(;!(o=o>p&&printf("%d,%d\n",I,a))&&gets(k=z);a++)while(o++<p&&c())I=I<i?i:I;}

Ungolfed version

char s[99],z[99],*k,p,i,I,o,a;c(){    for(i=0       ;k[i]==s[(i+o)%p]       ;i++)       ;    return k[i];}main(){    for(gets(k=s);c(p++);)         ;    for(;!(o=o>p&&printf("%d,%d\n",I,a)) && gets(k=z);a++)           while(o++ < p && c())            I=I<i?i:I;}

Question 18

Python,303 292

r=raw_inputR=ranges=r()l=len(s)m=1g=s[:[all((lambda x:x[1:]==x[:-1])(s[n::k])for n in R(k))for k in R(1,l)].index(True)+1]*l*2while 1: t=r() z=[map(lambda p:p[0]==p[1],zip(t,g[n:l+n]))for n in R(l)] any(all(y)for y in z)or exit("%d,%d"%(max(map(lambda b:b.index(False),z)),m)) m+=1

Input goes through stdin. I'll explain it if there's any demand, but it doesn't look like I'm going to win anyway.

Question 19

Perl,157 154

Edit: -3 thanks to ardnew' suggestion.

<>=~/^(((.*?).*?)\2+)\3$/;$p=$2;$n=$+[1];while(<>){s/.{$n}/$&$&/;/(\Q$p\E)+/g;$s=$p;1while/./g*$s=~/\G\Q$&/g;print$n>--($m=pos)?$m:$m-$n,$",$.-1,$/if pos}

It took me some time (on and off, of course, not 5 days ;-) ), and idea about algorithm was initially elusive (though I felt it was there), but finally (and suddenly) it became all clear.

If string length is multiple of pattern length, and even if the string doesn't start with the beginning of the pattern, concatenating string with itself will produce pattern in place of concatenation (imagine infinite repetition of a word on circular ribbon - the place of welding is not important). So, the idea is to trim the line to multiple unit length and concatenate original to it. The result, even for the string containing the wrong character, is guaranteed to match to pattern at least once. From there on it's easy to find position of offending character.

First line is shamelessly borrowed from Heiko Oberdiek's answer :-)

<>=~/^(((.*?).*?)\2+)\3$/;      # Read first line, find the repeating unit$p=$2;                          # and length of whole number of units.$n=$+[1];                       # Store as $p and $n.while(<>){                      # Repeat for each line.    s/.{$n}/$&$&/;              # Extract first $n chars and                                # append original line to them.    /(\Q$p\E)+/g;               # Match until failure (not necessarily from the                                # beginning - doesn't matter).    $s=$p;                      # This is just to reset global match position                                # for $s (which is $p) - we could do without $s,                                # $p.=''; but it's one char longer.                                # From here, whole pattern doesn't match -    1while/./g*$s=~/\G\Q$&/g;   # check by single char.                                # Extract next char (if possible), match to                                 # appropriate position in a pattern (position                                 # maintained by \G assertion and g modifier).                                # We either exhaust the string (then pos is                                 # undefined and this was not the string we're                                # looking for) or find offending char position.    print$n>--($m=pos)?$m:$m-$n,$",$.-1,$/if pos}

Question 20

nice work. i think you can replace/.{$n}/;$_=$&.$_; withs/.{$n}/$&$&/;

Question 21

JavaScript (ES6) -147133 136 Characters

s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Expects the string to be tested to be in the variables and outputs the result to the console.

var repetitionRE = /^(((.*).*)\2+)\3\n/;                                        // Regular expression to find repeating sequence                                        // without any trailing sub-string of the sequence.var sequence = repetitionRE.exec(s)[1]; // Find the sequence string.s.split('\n')                           // Split the input into an array. .map(   ( row, index ) =>                    // Anonymous function using ES6 arrow syntax   {     var testStr = row + '᛫'+ sequence + sequence;                                        // Concatenate the current row, a character which won't                                        // appear in the input and two copies of the repetitions                                        // of the sequence from the first line.     var match = /^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(testStr);                                        // Left of the ᛫ finds sub-matches for a single                                        // character and the sub-strings before and after.                                        // Right of the ᛫ looks for any number of characters                                        // then the before and after sub-matches with a                                        // different character between.      if ( match )       console.log( match[1].length, index );                                        // Output the index of the non-matching character                                        // and the row.   }          );

Test Case 1

s="codegolfcodegolfco\negolfcodegolfcodeg\nlfcodegolfcodegoff\nodegolfcodegolfcod\ngolfcodegolfcodego\nfcodegolfcodegolfc"s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Outputs

16 2

Test Case 2

s="][[][][[][][[][][[][][[\n[][][[][][[][][[][][[][\n[][[][][[][][[][][[][][\n[[][][[]]][[][][[][][[]"s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Outputs

8 3

Test Case 3

s="...\n. .\n..."s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Outputs

1 1

Test Case 4

s="ababa\nbabab\nababb\nbabab"s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Outputs

4 2

Test Case 5

s="xyxy\nyyxy"s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Outputs

0 1

Test Case 6

s="ababaababa\nababaaaaba"s.split('\n').map((x,i)=>(v=/^(.*)(.)(.*)᛫.*\1(?!\2).\3/.exec(x+'᛫'+(a=/^(((.*).*)\2+)\3\n/.exec(s)[1])+a))&&console.log(v[1].length,i))

Outputs

6 1

Question 22

Sadly, this approach fails if, e.g.,s="xyxy\nyyxy". For the second line,match[4] will beyy; it should be justy.

Question 23

Re-worked and shortened by 14 characters.

Question 24

Very nice! I had tried the very same second regex at some point, but I appended the minimal pattern twice instead of the maximal one (and, thus, failed miserable). One minor issue: The first regex will reportabab ast the pattern ofababaababa; you need to use^…$.

Question 25

/^…\n/ works or/^…$/m

Question 26

It may not need the leading^ (at least it doesn't for any of the 6 test case I've listed - but there is probably a counter example where it does so I've left it in).

Heiko Oberdiek 4,0511 gold badge19 silver badges28 bronze badges · Accepted Answer · 2017-04-13 12:39:01Z

Perl,212191181 168 bytes

$_=<>;/^(((.*?)(.*?))\2+)\3$/;$x=$1x4;while(<>){chop;$x=~/\Q$_\E/&&next;for$i(0..y///c-1){for$r(split//,$x){$b=$_;$b=~s/(.{$i})./$1$r/;$x=~/\Q$b\E/&&die$i,$",$.-1,$/}}}

This version uses an optimized trick for catching the reply unit, learned in Dennis'answer.
Optimization by using the property that all lines have equal lengths.
Line end is also needed for the last line, otherwisechomp instead ofchop should be used.
Optimizations of ardnew'scomment added.

Old version, 212 bytes:

$_=<>;chop;/^(.+?)\1+(??{".{0,".(-1+length$1).'}'})$/;$;=$1;while(<>){$x=$;x length;chop;$x=~/\Q$_\E/&&next;for$i(0..-1+length$_){for$r(split//,$;){$b=$_;$b=~s/(.{$i})./$1$r/;$x=~/\Q$b\E/&&exit print$i,$",$.-1}}}

Ungolfed version:

$_ = <>;  # read first line/^(((.*?)(.*?))\2+)\3$/;# The repeat unit \2 consists of \3 and \4,# and the start part \2 can be added at the end (as partial or even full unit).$x = $1 x 4; # $x is long enough to cover each following line# Old version:# /^(.+?)\1+(??{ ".{0," . (-1 + length $1) . '}' })$/;# $a = $1; # $a is the repeat unit.# The unit is caught by a non-greedy pattern (.+?) that is# repeated at least once: \1+# The remaining characters must be less than the unit length.# The unit length is known at run-time, therefore a "postponed"# regular expression is used for the remainder.# process the following lines until the error is foundwhile (<>) {    # old version:    # $x = $a x length;    # $x contains the repeated string unit, by at least one unit longer    # than the string in the current line    chop; # remove line end of current line    $x =~ /\Q$_\E/ && next;          # go to next line, if current string is a substring of the repeated units;          # \Q...\E prevents the interpretation of special characters    # now each string position $x is checked, if it contains the wrong character:    for $i (0 .. y///c - 1) {  # y///c yields the length of $_        for $r (split //, $x) { #/ (old version uses $a)            # replace the character at position $i with a            # character from the repeat unit            $b = $_;            $b =~ s/(.{$i})./$1$r/;            $x =~ /\Q$b\E/               && die $i, $", $. - 1, $/;               # $" sets a space and the newline is added by $/;               # the newline prevents "die" from outputting line numbers        }    }}

the firstchop is unnecessary- should be removed. the finalexit print can be replaced withdie (add,$/ to hide the extra stuff (if required)). alsolength$_ can be replaced withy///c
@ardnew: Many thanks, I have removed the firstchop, because$ matches before the newline at the end of the string. Hiding the extra stuff ofdie via the added newline seems necessary to me. Alsoy///c is much shorter thanlength$_ and one byte shorter thanlength without the unnecessary$_.
@ardnew: I had forgotten aboutdie's verbosity. It even includes prints the line number! I'll use that in my next update.

Movatterモバイル変換

Stack Exchange Network

Find the odd character out in a pattern

Input

Output

Test cases

Go!

6 Answers6

Perl,212191181 168 bytes

Ungolfed version:

Bash Perl,231229218178164166138106 74 bytes

Ungolfed version

Example

C, 187 bytes

Python,303 292

Perl,157 154

JavaScript (ES6) -147133 136 Characters

Your Answer

Sign up orlog in

Post as a guest

Linked

Related

Hot Network Questions

Subscribe to RSS