Blogger: 初学MPEG

Python UnicodeEncodeError: 'ascii' codec can't encode character

Category: python  Status: 6, replies allowed, members can link (good)  Views: 11236  Comments: 0  Time: Sept. 16, 2011, 12:19 p.m.

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa1' 
in position 0: ordinal not in range(128)

If you've ever gotten this error, Django's smart_str function might be able to help. I found it in James Bennett's article, "Unicode in the real world". He gives a very good explanation of Python's Unicode strings and bytestrings, their use in Django, and how to use Django's Unicode utilities when working with Python libraries that are not Unicode-friendly. Here are my notes from his article as they apply to the above error. Much of the wording is taken directly from James Bennett's article.

This error occurs when you pass a Unicode string containing non-ASCII characters (characters with code points of 128 or higher) to something that expects an ASCII bytestring. The default encoding for a Python bytestring is ASCII, "which handles exactly 128 (English) characters". This is why trying to convert a Unicode character with a code point of 128 or higher produces the error.
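
To see the failure in isolation, the implicit conversion can be written as an explicit .encode('ascii') call. This is only a minimal sketch; the helper name ascii_encodable is made up for illustration and is not part of Python or Django.

```python
text = u'\xa1'   # INVERTED EXCLAMATION MARK, code point 161


def ascii_encodable(s):
    """Return True if every character in s fits in ASCII (code points 0-127)."""
    try:
        s.encode('ascii')
        return True
    except UnicodeEncodeError:
        # Same exception class as in the error message above.
        return False


print(ascii_encodable(u'hello'))   # True: all code points are below 128
print(ascii_encodable(text))       # False: 161 is outside the ASCII range
```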

The good news is that you can encode Python bytestrings in encodings other than ASCII. Django's smart_str function, in the django.utils.encoding module, converts a Unicode string to a bytestring using a default encoding of UTF-8.
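
As a quick sketch of encoding to something other than ASCII without Django, the same character can be encoded by hand. The variable names here are illustrative only.

```python
text = u'\xa1'                          # the character from the error message

utf8_bytes = text.encode('utf-8')       # two bytes: 0xC2 0xA1
latin1_bytes = text.encode('latin-1')   # one byte: 0xA1

# Decoding with the same encoding gives back the original Unicode string.
assert utf8_bytes.decode('utf-8') == text
assert latin1_bytes.decode('latin-1') == text

# Decoding with a different encoding than the one used to encode does not.
assert utf8_bytes.decode('latin-1') != text
```

The point is that a bytestring carries no record of its encoding, so whoever reads the bytes has to know which one was used.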

Here is an example using the built-in function, str:



a = u'\xa1'
print str(a) # this throws an exception

Results:



Traceback (most recent call last):
  File "unicode_ex.py", line 3, in <module>
    print str(a) # this throws an exception
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa1' in position 0: ordinal not in range(128)

Here is an example using smart_str:



from django.utils.encoding import smart_str, smart_unicode

a = u'\xa1'
print smart_str(a)

Results:



¡
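
If Django is not installed, the behaviour described above can be approximated with a small helper. This is a simplified stand-in under my own assumptions, not Django's implementation: the name to_bytestring is hypothetical, and the handling of non-string objects is a guess at reasonable behaviour.

```python
try:
    text_type = unicode      # Python 2: the Unicode string type
except NameError:
    text_type = str          # Python 3: str is already Unicode


def to_bytestring(value, encoding='utf-8'):
    """Return value as a bytestring, encoding Unicode text with `encoding`."""
    if isinstance(value, bytes):
        # Already a bytestring; return it unchanged.
        return value
    if not isinstance(value, text_type):
        # Non-string objects (numbers, etc.) are converted to text first.
        value = text_type(value)
    return value.encode(encoding)


encoded = to_bytestring(u'\xa1')
print(len(encoded))   # 2, because U+00A1 takes two bytes in UTF-8
```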

Definitions

  • Unicode string: sequence of Unicode characters
  • Python bytestring: a series of bytes which represents a sequence of characters. Its default encoding is ASCII. This is the "normal", non-Unicode string in Python <3.0.
  • encoding: a code that pairs a sequence of characters with a series of bytes
  • ASCII: an encoding which handles 128 characters: unaccented English letters, digits, punctuation, and control codes
  • UTF-8: a popular encoding used for Unicode strings which is backwards compatible with ASCII for the first 128 characters. It uses one to four bytes for each character.
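
The UTF-8 definition above can be checked with a short sketch. The sample characters are my own choice, picked so that each needs a different number of bytes.

```python
samples = [
    u'A',            # U+0041, inside the ASCII range
    u'\xa1',         # U+00A1, inverted exclamation mark
    u'\u20ac',       # U+20AC, euro sign
    u'\U0001f600',   # U+1F600, outside the Basic Multilingual Plane
]

# Number of bytes each character occupies once encoded as UTF-8.
byte_lengths = [len(ch.encode('utf-8')) for ch in samples]
print(byte_lengths)   # [1, 2, 3, 4]

# For the first 128 characters, UTF-8 and ASCII produce identical bytes.
assert u'A'.encode('utf-8') == u'A'.encode('ascii')
```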