When does using swapcase twice not return an identical answer?

Question:

The Python docs for str.swapcase() say:

Note that it is not necessarily true that s.swapcase().swapcase() == s.

I’m guessing that this has something to do with Unicode; however, I wasn’t able to produce a string that changed after exactly two applications of swapcase(). What kind of a string would fail to produce an identical result?

Here’s what I tested (acquired here):

>>> testString = '''Bãcoл ípѕüϻ Ꮷ߀ɭor sìt ämét qûìs àɭïɋüíp cülρä, ϻagnâ èх ѕêԁ ѕtríρ stêãk iл ԁò ut sålámí éхèrcìtátïoл pòrƙ ɭ߀in. Téԉԁërɭ߀ín tùrkèϒ ѕáûsáɢè лùɭɭå pɑrïátûr, ƃáll típ âԁiρïѕicïԉǥ ɑᏧ c߀ԉsêquät ϻâgлã véлïsoл. Míлím àutë ѵ߀ɭüρtåte mòɭɭít tri-tíρ dèsêrùԉt. Occãècát vëԉis߀ԉ êХ eiùѕm߀d séᏧ láborüϻ pòrƙ lòïл àliɋûå ìлcíԁìԁúԉt. Sed còmϻ߀Ꮷ߀ յoɰl offícíä pòrƙ ƅèɭly témρòr lâƅòrùϻ tâiɭ sρårê ríbs toлǥue ϻêátɭòáf måɢnä.

Kièɭbàѕã in còлѕêctêtur ѵëлíàϻ pâríɑtùr p߀rk ɭ߀in êxêrcìtâtiòл älìɋúíρ câρicolɑ ρork tòлɢüê düis ԁ߀ɭoré rêpréhéԉᏧérït. Tènԁèrloiԉ ëх rèρréհeԉԁérït fûgíãt ädipìsiciԉg gr߀ünᏧ roúлd, ƅaɭɭ típ հàϻƃûrǥèr ѕɦòùlder ɭåb߀rûϻ têmρor ríƃêyë. Eѕsè hàϻ ѵëԉiam, åɭíɋùɑ ìrüre ρòrƙ cɦop ԁò ԁ߀ɭoré frânkfürter nülla påsträϻí sàusàgè sèᏧ. Eӽcêptêür ѕëd t-b߀лë հɑϻ, esѕë ut ɭàƅoríѕ ƃáll tíρ nostrúԁ sհ߀üldêr ïn shòrt ríƅs ρástrámï. Essé hamƅûrǥër ɭäƅòré, fatƃàcƙ teԉderlòïn sհ߀rt rïbs ρròìdént riƅêye ɭab߀rum. Nullɑ türԁùcƙèn л߀n, sρarè rìƅs eӽceρteur ádïρìѕìcïԉǥ êt ѕɦort ɭòin dolorë änïm dêѕêrùлt. Sհäлƙlè cúpïԁätát pork lòïn méåtbäll, ԉ߀strud réprèհéԉԁêrìt ɦɑϻburǥêr ѕâɭɑϻí Ꮷol߀rè ɑd lêberƙãs.

Boûdiл toлǥuê c߀ԉsèqûåt eà rümρ ƅálɭ tíρ ѕρâré rìbѕ ín pròiᏧent dûiѕ ϻíлïm èíuѕmòᏧ c߀rԉêᏧ ƃèèf ƅɑc߀л d߀lorè. Cornèd ƅëèf drûmsticƙ cùlpa, éлïm baɭɭ tìp ϻéatbâlɭ lab߀rê tri-tïp vënisoԉ ǥroùԉԁ ròùлԁ հɑm iл èä bãcòn. Eѕѕé ìᏧ ѕúԉt, sհoùldér ƙïeɭƃäѕà ãԁiρisïcïԉɢ ɦaϻbûrgêr út ԁòɭ߀re fåtbäcƙ ԁ߀ɭòr äлïm trï-típ. EíùsϻòᏧ nülɭã läbòruϻ лíѕi êxcéptèúr. Occåécåt Ꮷüíѕ ԁèserüлt toԉǥue ϳ߀wɭ. Rèρréɦëԉԁêrit áɭïqúíp fûǥiàt tùrkey véniãϻ qüìѕ.'''
>>> testString.swapcase().swapcase() == testString
True
Asked By: sushain97


Answers:

I tried this brute-force check in Python 2, over the first 10000 code points:

v = lambda x: x.swapcase().swapcase() == x
[unichr(x) for x in range(10000) if not v(unichr(x))]

It returns these characters:

[u'\xb5', u'\u0130', u'\u0131', u'\u017f', u'\u03c2', u'\u03d0', u'\u03d1', u'\u03d5', u'\u03d6', u'\u03f0', u'\u03f1', u'\u03f4', u'\u03f5', u'\u1e9b', u'\u1e9e', u'\u1f80', u'\u1f81', u'\u1f82', u'\u1f83', u'\u1f84', u'\u1f85', u'\u1f86', u'\u1f87', u'\u1f90', u'\u1f91', u'\u1f92', u'\u1f93', u'\u1f94', u'\u1f95', u'\u1f96', u'\u1f97', u'\u1fa0', u'\u1fa1', u'\u1fa2', u'\u1fa3', u'\u1fa4', u'\u1fa5', u'\u1fa6', u'\u1fa7', u'\u1fb3', u'\u1fbe', u'\u1fc3', u'\u1ff3', u'\u2126', u'\u212a', u'\u212b']
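
For Python 3, where `unichr` no longer exists, a rough equivalent of this scan might look like the sketch below. It assumes `chr` and `sys.maxunicode` so that every code point is covered rather than only the first 10000. The helper name `round_trips` is made up for illustration, and the exact result depends on the Unicode database shipped with the interpreter.

```python
import sys

def round_trips(s):
    """Return True if applying swapcase twice gives back the original string."""
    return s.swapcase().swapcase() == s

# Collect every single code point that does not survive a double swapcase.
changed = [chr(cp) for cp in range(sys.maxunicode + 1) if not round_trips(chr(cp))]

# Show code points rather than raw characters to keep the output ASCII-only.
print(len(changed), [hex(ord(ch)) for ch in changed[:5]])
```

On Python 3 the list should also contain characters such as ß (U+00DF), whose uppercase form is more than one character long.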
Answered By: Holy Mackerel

This happens when several distinct lowercase characters share the same uppercase letter.

For example, the micro character µ (U+00B5) and the mu character μ (U+03BC):

>>> u'\xb5'.swapcase()
u'\u039c'
>>> u'\u03bc'.swapcase()
u'\u039c'

The two are different characters, but their uppercase counterparts are the same. This means that when str.swapcase() is applied, they return the same character. However, doing this again can’t (and won’t) return both letters.

>>> u'\xb5'.swapcase().swapcase()
u'\u03bc'
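
The same collision can be checked in Python 3 with a short sketch. It uses the standard `unicodedata` module only to print the official character names; the variable names are illustrative.

```python
import unicodedata

micro = "\u00b5"  # MICRO SIGN
mu = "\u03bc"     # GREEK SMALL LETTER MU

# Two different lowercase characters...
assert micro != mu
# ...that map to the same uppercase character.
assert micro.swapcase() == mu.swapcase() == "\u039c"
# Swapping back can only produce one of them, so the micro sign is lost.
assert micro.swapcase().swapcase() == mu

for ch in (micro, mu, micro.swapcase()):
    print(hex(ord(ch)), unicodedata.name(ch))
```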
Answered By: Volatility

While Volatility brought up the example of mu and the micro sign, whose uppercase forms resolve to the same Unicode code point, here’s another interesting situation where applying swapcase twice results in a different answer:

>>> 'ß'.swapcase().swapcase()
'ss'

Confused? The German consonant ß (pronounced [s]) becomes SS after one application of swapcase and then ss after the second.
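
A small Python 3 sketch of the same effect inside a longer word shows that swapcase does not even have to preserve the length of the string. The word "Straße" is just an illustrative input.

```python
word = "Straße"

once = word.swapcase()   # ß has no single-character uppercase here, so it expands to "SS"
twice = once.swapcase()  # "SS" lowercases to "ss", not back to "ß"

# Print with ascii() so the output is safe on any terminal encoding.
print(ascii(word), len(word))
print(ascii(once), len(once))
print(ascii(twice), len(twice))
```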

Here’s the whole list of them (→ represents one swapcase; the hex value in parentheses is the code points of that string written one after another):

µ (0xb5) → Μ (0x39c) → μ (0x3bc) → Μ (0x39c)
ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
İ (0x130) → i̇ (0x69307) → İ (0x49307) → i̇ (0x69307)
ı (0x131) → I (0x49) → i (0x69) → I (0x49)
ʼn (0x149) → ʼN (0x2bc4e) → ʼn (0x2bc6e) → ʼN (0x2bc4e)
ſ (0x17f) → S (0x53) → s (0x73) → S (0x53)
ǰ (0x1f0) → J̌ (0x4a30c) → ǰ (0x6a30c) → J̌ (0x4a30c)
ͅ (0x345) → Ι (0x399) → ι (0x3b9) → Ι (0x399)
ΐ (0x390) → Ϊ́ (0x399308301) → ΐ (0x3b9308301) → Ϊ́ (0x399308301)
ΰ (0x3b0) → Ϋ́ (0x3a5308301) → ΰ (0x3c5308301) → Ϋ́ (0x3a5308301)
ς (0x3c2) → Σ (0x3a3) → σ (0x3c3) → Σ (0x3a3)
ϐ (0x3d0) → Β (0x392) → β (0x3b2) → Β (0x392)
ϑ (0x3d1) → Θ (0x398) → θ (0x3b8) → Θ (0x398)
ϕ (0x3d5) → Φ (0x3a6) → φ (0x3c6) → Φ (0x3a6)
ϖ (0x3d6) → Π (0x3a0) → π (0x3c0) → Π (0x3a0)
ϰ (0x3f0) → Κ (0x39a) → κ (0x3ba) → Κ (0x39a)
ϱ (0x3f1) → Ρ (0x3a1) → ρ (0x3c1) → Ρ (0x3a1)
ϴ (0x3f4) → θ (0x3b8) → Θ (0x398) → θ (0x3b8)
ϵ (0x3f5) → Ε (0x395) → ε (0x3b5) → Ε (0x395)
և (0x587) → ԵՒ (0x535552) → եւ (0x565582) → ԵՒ (0x535552)
ẖ (0x1e96) → H̱ (0x48331) → ẖ (0x68331) → H̱ (0x48331)
ẗ (0x1e97) → T̈ (0x54308) → ẗ (0x74308) → T̈ (0x54308)
ẘ (0x1e98) → W̊ (0x5730a) → ẘ (0x7730a) → W̊ (0x5730a)
ẙ (0x1e99) → Y̊ (0x5930a) → ẙ (0x7930a) → Y̊ (0x5930a)
ẚ (0x1e9a) → Aʾ (0x412be) → aʾ (0x612be) → Aʾ (0x412be)
ẛ (0x1e9b) → Ṡ (0x1e60) → ṡ (0x1e61) → Ṡ (0x1e60)
ẞ (0x1e9e) → ß (0xdf) → SS (0x5353) → ss (0x7373) → SS (0x5353)
ὐ (0x1f50) → Υ̓ (0x3a5313) → ὐ (0x3c5313) → Υ̓ (0x3a5313)
ὒ (0x1f52) → Υ̓̀ (0x3a5313300) → ὒ (0x3c5313300) → Υ̓̀ (0x3a5313300)
ὔ (0x1f54) → Υ̓́ (0x3a5313301) → ὔ (0x3c5313301) → Υ̓́ (0x3a5313301)
ὖ (0x1f56) → Υ̓͂ (0x3a5313342) → ὖ (0x3c5313342) → Υ̓͂ (0x3a5313342)
ᾀ (0x1f80) → ἈΙ (0x1f08399) → ἀι (0x1f003b9) → ἈΙ (0x1f08399)
ᾁ (0x1f81) → ἉΙ (0x1f09399) → ἁι (0x1f013b9) → ἉΙ (0x1f09399)
ᾂ (0x1f82) → ἊΙ (0x1f0a399) → ἂι (0x1f023b9) → ἊΙ (0x1f0a399)
ᾃ (0x1f83) → ἋΙ (0x1f0b399) → ἃι (0x1f033b9) → ἋΙ (0x1f0b399)
ᾄ (0x1f84) → ἌΙ (0x1f0c399) → ἄι (0x1f043b9) → ἌΙ (0x1f0c399)
ᾅ (0x1f85) → ἍΙ (0x1f0d399) → ἅι (0x1f053b9) → ἍΙ (0x1f0d399)
ᾆ (0x1f86) → ἎΙ (0x1f0e399) → ἆι (0x1f063b9) → ἎΙ (0x1f0e399)
ᾇ (0x1f87) → ἏΙ (0x1f0f399) → ἇι (0x1f073b9) → ἏΙ (0x1f0f399)
ᾐ (0x1f90) → ἨΙ (0x1f28399) → ἠι (0x1f203b9) → ἨΙ (0x1f28399)
ᾑ (0x1f91) → ἩΙ (0x1f29399) → ἡι (0x1f213b9) → ἩΙ (0x1f29399)
ᾒ (0x1f92) → ἪΙ (0x1f2a399) → ἢι (0x1f223b9) → ἪΙ (0x1f2a399)
ᾓ (0x1f93) → ἫΙ (0x1f2b399) → ἣι (0x1f233b9) → ἫΙ (0x1f2b399)
ᾔ (0x1f94) → ἬΙ (0x1f2c399) → ἤι (0x1f243b9) → ἬΙ (0x1f2c399)
ᾕ (0x1f95) → ἭΙ (0x1f2d399) → ἥι (0x1f253b9) → ἭΙ (0x1f2d399)
ᾖ (0x1f96) → ἮΙ (0x1f2e399) → ἦι (0x1f263b9) → ἮΙ (0x1f2e399)
ᾗ (0x1f97) → ἯΙ (0x1f2f399) → ἧι (0x1f273b9) → ἯΙ (0x1f2f399)
ᾠ (0x1fa0) → ὨΙ (0x1f68399) → ὠι (0x1f603b9) → ὨΙ (0x1f68399)
ᾡ (0x1fa1) → ὩΙ (0x1f69399) → ὡι (0x1f613b9) → ὩΙ (0x1f69399)
ᾢ (0x1fa2) → ὪΙ (0x1f6a399) → ὢι (0x1f623b9) → ὪΙ (0x1f6a399)
ᾣ (0x1fa3) → ὫΙ (0x1f6b399) → ὣι (0x1f633b9) → ὫΙ (0x1f6b399)
ᾤ (0x1fa4) → ὬΙ (0x1f6c399) → ὤι (0x1f643b9) → ὬΙ (0x1f6c399)
ᾥ (0x1fa5) → ὭΙ (0x1f6d399) → ὥι (0x1f653b9) → ὭΙ (0x1f6d399)
ᾦ (0x1fa6) → ὮΙ (0x1f6e399) → ὦι (0x1f663b9) → ὮΙ (0x1f6e399)
ᾧ (0x1fa7) → ὯΙ (0x1f6f399) → ὧι (0x1f673b9) → ὯΙ (0x1f6f399)
ᾲ (0x1fb2) → ᾺΙ (0x1fba399) → ὰι (0x1f703b9) → ᾺΙ (0x1fba399)
ᾳ (0x1fb3) → ΑΙ (0x391399) → αι (0x3b13b9) → ΑΙ (0x391399)
ᾴ (0x1fb4) → ΆΙ (0x386399) → άι (0x3ac3b9) → ΆΙ (0x386399)
ᾶ (0x1fb6) → Α͂ (0x391342) → ᾶ (0x3b1342) → Α͂ (0x391342)
ᾷ (0x1fb7) → Α͂Ι (0x391342399) → ᾶι (0x3b13423b9) → Α͂Ι (0x391342399)
ι (0x1fbe) → Ι (0x399) → ι (0x3b9) → Ι (0x399)
ῂ (0x1fc2) → ῊΙ (0x1fca399) → ὴι (0x1f743b9) → ῊΙ (0x1fca399)
ῃ (0x1fc3) → ΗΙ (0x397399) → ηι (0x3b73b9) → ΗΙ (0x397399)
ῄ (0x1fc4) → ΉΙ (0x389399) → ήι (0x3ae3b9) → ΉΙ (0x389399)
ῆ (0x1fc6) → Η͂ (0x397342) → ῆ (0x3b7342) → Η͂ (0x397342)
ῇ (0x1fc7) → Η͂Ι (0x397342399) → ῆι (0x3b73423b9) → Η͂Ι (0x397342399)
ῒ (0x1fd2) → Ϊ̀ (0x399308300) → ῒ (0x3b9308300) → Ϊ̀ (0x399308300)
ΐ (0x1fd3) → Ϊ́ (0x399308301) → ΐ (0x3b9308301) → Ϊ́ (0x399308301)
ῖ (0x1fd6) → Ι͂ (0x399342) → ῖ (0x3b9342) → Ι͂ (0x399342)
ῗ (0x1fd7) → Ϊ͂ (0x399308342) → ῗ (0x3b9308342) → Ϊ͂ (0x399308342)
ῢ (0x1fe2) → Ϋ̀ (0x3a5308300) → ῢ (0x3c5308300) → Ϋ̀ (0x3a5308300)
ΰ (0x1fe3) → Ϋ́ (0x3a5308301) → ΰ (0x3c5308301) → Ϋ́ (0x3a5308301)
ῤ (0x1fe4) → Ρ̓ (0x3a1313) → ῤ (0x3c1313) → Ρ̓ (0x3a1313)
ῦ (0x1fe6) → Υ͂ (0x3a5342) → ῦ (0x3c5342) → Υ͂ (0x3a5342)
ῧ (0x1fe7) → Ϋ͂ (0x3a5308342) → ῧ (0x3c5308342) → Ϋ͂ (0x3a5308342)
ῲ (0x1ff2) → ῺΙ (0x1ffa399) → ὼι (0x1f7c3b9) → ῺΙ (0x1ffa399)
ῳ (0x1ff3) → ΩΙ (0x3a9399) → ωι (0x3c93b9) → ΩΙ (0x3a9399)
ῴ (0x1ff4) → ΏΙ (0x38f399) → ώι (0x3ce3b9) → ΏΙ (0x38f399)
ῶ (0x1ff6) → Ω͂ (0x3a9342) → ῶ (0x3c9342) → Ω͂ (0x3a9342)
ῷ (0x1ff7) → Ω͂Ι (0x3a9342399) → ῶι (0x3c93423b9) → Ω͂Ι (0x3a9342399)
Ω (0x2126) → ω (0x3c9) → Ω (0x3a9) → ω (0x3c9)
K (0x212a) → k (0x6b) → K (0x4b) → k (0x6b)
Å (0x212b) → å (0xe5) → Å (0xc5) → å (0xe5)
ff (0xfb00) → FF (0x4646) → ff (0x6666) → FF (0x4646)
fi (0xfb01) → FI (0x4649) → fi (0x6669) → FI (0x4649)
fl (0xfb02) → FL (0x464c) → fl (0x666c) → FL (0x464c)
ffi (0xfb03) → FFI (0x464649) → ffi (0x666669) → FFI (0x464649)
ffl (0xfb04) → FFL (0x46464c) → ffl (0x66666c) → FFL (0x46464c)
ſt (0xfb05) → ST (0x5354) → st (0x7374) → ST (0x5354)
st (0xfb06) → ST (0x5354) → st (0x7374) → ST (0x5354)
ﬓ (0xfb13) → ՄՆ (0x544546) → մն (0x574576) → ՄՆ (0x544546)
ﬔ (0xfb14) → ՄԵ (0x544535) → մե (0x574565) → ՄԵ (0x544535)
ﬕ (0xfb15) → ՄԻ (0x54453b) → մի (0x57456b) → ՄԻ (0x54453b)
ﬖ (0xfb16) → ՎՆ (0x54e546) → վն (0x57e576) → ՎՆ (0x54e546)
ﬗ (0xfb17) → ՄԽ (0x54453d) → մխ (0x57456d) → ՄԽ (0x54453d)
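
A list like this can be regenerated with a short Python 3 sketch along the following lines. The helper names `describe` and `swapcase_chain` are hypothetical, the code points are printed in `U+XXXX` form instead of the concatenated hex used above, and the rows produced depend on the Unicode version of the interpreter.

```python
import sys

def describe(s):
    """Render a string with its code points, e.g. 'SS (U+0053 U+0053)'."""
    points = " ".join(f"U+{ord(ch):04X}" for ch in s)
    return f"{s} ({points})"

def swapcase_chain(ch, steps=3):
    """Return [ch, ch swapped once, swapped twice, ...] for the given number of steps."""
    chain = [ch]
    for _ in range(steps):
        chain.append(chain[-1].swapcase())
    return chain

# One row per code point that does not survive a double swapcase.
rows = []
for cp in range(sys.maxunicode + 1):
    ch = chr(cp)
    if ch.swapcase().swapcase() != ch:
        rows.append(" → ".join(describe(s) for s in swapcase_chain(ch)))

print(len(rows))
```

Printing the rows themselves, for example with `print("\n".join(rows))`, needs a terminal encoding that can display the characters.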
Answered By: sushain97