forked from postgres/postgres
Commit 4b7d9e5
unaccent: Add support for quoted translated characters
As reported in bug #18057, the extension unaccent removes in its rule file whitespace characters that are intentionally specified when building unaccent.rules from UnicodeData.txt, causing an incorrect translation for some characters like numeric symbols. This is caused by the fact that all whitespaces before and after the origin and target characters are all discarded (this limitation is documented).

This commit makes possible the use of quotes around target characters, so as whitespaces can be considered part of target characters. Some target characters use a double quote, these require an extra double quote.

The documentation is updated to show how to use quoted areas, generate_unaccent_rules.py is updated to generate unaccent.rules and a couple of tests are added for numeric symbols. While working on this patch, I have implemented a fake rule file to test the parsing logic implemented, which is not included here as it would just consume extra cycles in the tests, and it requires the manipulation of an installation tree to be able to work correctly.

As this requires a change of format in unaccent.rules, this cannot be backpatched, unfortunately. The idea to use double quotes as escaped characters comes from Tom Lane.

Reported-by: Martin Schlossarek
Author: Michael Paquier
Discussion: https://postgr.es/m/18057-62712cad01bd202c@postgresql.org

1 parent 5086199 commit 4b7d9e5
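The quoting rule described in the commit message can be illustrated with a small Python sketch. This is not the C parser from unaccent.c, nor the code in generate_unaccent_rules.py; `quote_target` and `parse_rule` are hypothetical helpers written only to show the format: a target wrapped in double quotes keeps its whitespace, and a literal double quote inside it is written as two double quotes. Treating a line with only a source character as an empty translation, and raising `ValueError` on malformed input, are assumptions of this sketch.

```python
def quote_target(target: str) -> str:
    """Hypothetical generator-side helper: wrap a target in double quotes,
    doubling any double quote it contains."""
    return '"' + target.replace('"', '""') + '"'


def parse_rule(line: str):
    """Hypothetical parser-side helper: split one rule line into
    (source, target). Returns None for a blank line."""
    stripped = line.strip()
    if not stripped:
        return None

    # The source is the first whitespace-delimited token.
    parts = stripped.split(None, 1)
    source = parts[0]
    if len(parts) == 1:
        # Only a source character: assume an empty translation.
        return source, ""

    rest = parts[1]
    if not rest.startswith('"'):
        # Unquoted target: surrounding whitespace was already discarded,
        # and embedded whitespace is rejected in this sketch.
        if len(rest.split()) != 1:
            raise ValueError("unquoted target contains whitespace")
        return source, rest

    # Quoted target: read up to the closing quote, keeping whitespace
    # and turning each doubled quote into a single literal quote.
    chars = []
    i = 1
    closed = False
    while i < len(rest):
        ch = rest[i]
        if ch == '"':
            if i + 1 < len(rest) and rest[i + 1] == '"':
                chars.append('"')
                i += 2
                continue
            closed = True
            i += 1
            break
        chars.append(ch)
        i += 1

    if not closed:
        raise ValueError("unterminated quoted target")
    if rest[i:].strip():
        raise ValueError("unexpected text after closing quote")
    return source, "".join(chars)
```

For instance, `parse_rule('¼\t" 1/4"')` keeps the leading space in the target, and `parse_rule('x\t' + quote_target(' a"b '))` round-trips a target that contains both whitespace and a double quote.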
File tree
6 files changed
+166 -38 lines changed
- contrib/unaccent
  - expected
  - sql
- doc/src/sgml
Lines changed: 36 additions & 0 deletions
Lines changed: 4 additions & 0 deletions
Lines changed: 6 additions & 0 deletions
Lines changed: 76 additions & 10 deletions
Lines changed: 28 additions & 28 deletions
Lines changed: 16 additions & 0 deletions
0 commit comments