Commit c444472
Fix backslash-escaping multibyte chars in COPY FROM.
If a multi-byte character is escaped with a backslash in TEXT mode input, and the encoding is one of the client-only encodings where the bytes after the first one can have an ASCII byte "embedded" in the char, we didn't skip the character correctly. After a backslash, we only skipped the first byte of the next character, so if it was a multi-byte character, we would try to process its second byte as if it was a separate character. If it was one of the characters with special meaning, like '\n', '\r', or another '\\', that would cause trouble.

One such example is the byte sequence '\x5ca45c2e666f6f' in Big5 encoding. That's supposed to be [backslash][two-byte character][.][f][o][o], but because the second byte of the two-byte character is 0x5c, we incorrectly treat it as another backslash. And because the next character is a dot, we parse it as end-of-copy marker, and throw an "end-of-copy marker corrupt" error.

Backpatch to all supported versions.

Reviewed-by: John Naylor, Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/a897f84f-8dca-8798-3139-07da5bb38728%40iki.fi

1 parent 5e7fa18, commit c444472

1 file changed: +9 -1 lines changed


src/backend/commands/copyfromparse.c

Lines changed: 9 additions & 1 deletion
@@ -1084,7 +1084,7 @@ CopyReadLineText(CopyFromState cstate)
                     break;
                 }
                 else if (!cstate->opts.csv_mode)
-
+                {
                     /*
                      * If we are here, it means we found a backslash followed by
                      * something other than a period.  In non-CSV mode, anything
@@ -1095,8 +1095,16 @@ CopyReadLineText(CopyFromState cstate)
                      * backslashes are not special, so we want to process the
                      * character after the backslash just like a normal character,
                      * so we don't increment in those cases.
+                     *
+                     * Set 'c' to skip whole character correctly in multi-byte
+                     * encodings.  If we don't have the whole character in the
+                     * buffer yet, we might loop back to process it, after all,
+                     * but that's OK because multi-byte characters cannot have any
+                     * special meaning.
                      */
                     raw_buf_ptr++;
+                    c = c2;
+                }
             }
 
             /*
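
The failure described in the commit message can be illustrated with a much-simplified model of the scanning loop. This is only a sketch, not the real CopyReadLineText: the names big5_mblen and sees_end_marker and the skip_whole_char flag are hypothetical, buffer refills and CSV mode are ignored, and it assumes that trailing bytes of a multi-byte character are skipped based on the value left in 'c' at the bottom of the loop, which is what the new comment in the diff suggests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: treat any byte with the high bit set as the lead
 * byte of a two-byte Big5 character; everything else is a single byte. */
static int big5_mblen(unsigned char b)
{
    return (b & 0x80) ? 2 : 1;
}

/*
 * Simplified stand-in for the line scanner (an assumption, not the
 * PostgreSQL code).  Returns true if the scanner believes it has found a
 * backslash-period marker.
 *
 * skip_whole_char == false models the old behaviour: after a backslash,
 * only one byte of the escaped character is stepped over.
 * skip_whole_char == true models the fix: 'c' is set to the escaped
 * character's first byte so the multi-byte skip below covers the rest.
 */
static bool sees_end_marker(const unsigned char *buf, size_t len,
                            bool skip_whole_char)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = buf[i++];

        if (c == '\\' && i < len)
        {
            unsigned char c2 = buf[i];

            if (c2 == '.')
                return true;    /* looks like an end-of-copy marker */

            i++;                /* step over first byte of escaped char */
            if (skip_whole_char)
                c = c2;         /* mirrors the added "c = c2;" */
        }

        /* Skip the trailing byte of a multi-byte character, keyed on c. */
        if (c & 0x80)
            i += big5_mblen(c) - 1;
    }
    return false;
}
```

Fed the bytes 5c a4 5c 2e 66 6f 6f from the commit message, this model returns true with skip_whole_char set to false, because the 0x5c trail byte is read as a second backslash followed by a period, and returns false with skip_whole_char set to true.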
