Python Basics: Strings - 猿码集

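1. Introduction to Strings

A Python string is an immutable sequence of characters. String literals are written with single quotes, double quotes, or triple quotes. Single and double quotes are equivalent, and a string delimited by one kind can contain the other without escaping. Triple quotes allow a literal to span multiple lines.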
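
The points above can be checked with a short sketch. The variable names are illustrative only and are not part of the original article.

```python
# Single and double quotes produce the same string.
a = 'hello'
b = "hello"
print(a == b)  # prints: True

# One quote style can contain the other without escaping.
quote = "It's fine"
print(quote)  # prints: It's fine

# Triple quotes keep the line break inside the literal.
multi = """first line
second line"""
print(multi.splitlines())  # prints: ['first line', 'second line']

# Strings are immutable: item assignment raises TypeError.
try:
    a[0] = 'H'
except TypeError as exc:
    print('TypeError:', exc)

# To "change" a string, build a new one instead.
capitalized = 'H' + a[1:]
print(capitalized)  # prints: Hello
```

Because every operation that appears to modify a string actually returns a new one, the original value `a` is still `'hello'` at the end.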

1.1 Creating Strings

A string can be created by assigning a literal directly or by calling the `str()` constructor.


# Direct assignment
s1 = 'hello world'
s2 = "hello world"

# Calling the str() constructor
s3 = str(123)                           # '123'
s4 = str([1, 2, 3])                     # '[1, 2, 3]'
s5 = str((1, 2, 3))                     # '(1, 2, 3)'
s6 = str({1: 'a', 2: 'b', 3: 'c'})      # "{1: 'a', 2: 'b', 3: 'c'}"

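1.2 Indexing and Slicing

Each character in a string can be accessed by its index. The first character has index 0, and the last character can also be reached with index -1.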


s = 'hello world'
print(s[0])     # prints: h
print(s[-1])    # prints: d

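A substring can be extracted by slicing, with the syntax `string[start:end:step]`. Here `start` is the index where the slice begins, `end` is the index where it stops (the character at `end` is not included), and `step` is the stride, which defaults to 1. An omitted `start` or `end` means the beginning or the end of the string.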


s = 'hello world'
print(s[2:5])   # prints: llo
print(s[:5])    # prints: hello
print(s[6:])    # prints: world
print(s[::2])   # prints: hlowrd
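
Slicing also accepts negative indices and a negative step. The following sketch, which goes slightly beyond the cases listed above, shows a few such slices on the same sample string.

```python
s = 'hello world'

print(s[-5:])     # prints: world        (the last five characters)
print(s[::-1])    # prints: dlrow olleh  (a negative step walks backwards)
print(s[1:8:3])   # prints: eoo          (indices 1, 4 and 7)
print(s[3:100])   # prints: lo world     (out-of-range slice bounds are clamped)

# Indexing, unlike slicing, raises on an out-of-range position.
try:
    s[100]
except IndexError as exc:
    print('IndexError:', exc)
```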

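2. String Methods

2.1 Case Conversion

Case conversion is done with methods such as `upper()` and `lower()`. Each returns a new string and leaves the original unchanged.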


s = 'Hello World'
print(s.upper())    # prints: HELLO WORLD
print(s.lower())    # prints: hello world
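
Besides `upper()` and `lower()`, `str` has a few related methods that the text above does not list. A brief sketch of them, with an illustrative sample string:

```python
s = 'hello WORLD'

print(s.capitalize())   # prints: Hello world  (first character upper, rest lower)
print(s.title())        # prints: Hello World  (each word capitalized)
print(s.swapcase())     # prints: HELLO world  (case of every letter inverted)

# casefold() is a more aggressive lower() intended for caseless comparison.
print('Straße'.lower() == 'strasse')                   # prints: False
print('Straße'.casefold() == 'STRASSE'.casefold())     # prints: True

# None of these calls modified the original string.
print(s)                # prints: hello WORLD
```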

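2.2 Searching and Replacing

Searching and replacing are done with methods such as `find()`, `index()`, and `replace()`.

`find(sub[, start[, end]])` returns the index of the first occurrence of the substring `sub`, or -1 if it is not found. The optional `start` and `end` arguments restrict the search range.

`index(sub[, start[, end]])` returns the index of the first occurrence of the substring `sub`, and raises `ValueError` if it is not found. The optional `start` and `end` arguments restrict the search range.

`replace(old, new[, count])` returns a new string in which every occurrence of `old` is replaced with `new`. If `count` is given, only the first `count` occurrences are replaced.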


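s = 'Hello World'
print(s.find('o'))               # prints: 4
print(s.find('x'))               # prints: -1
print(s.index('o'))              # prints: 4
try:
    print(s.index('x'))          # raises ValueError
except ValueError as exc:
    print(exc)                   # prints: substring not found
print(s.replace('o', '*'))       # prints: Hell* W*rld
print(s.replace('o', '*', 1))    # prints: Hell* World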
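
A few related operations are often used alongside `find()` and `index()`. The sketch below is an addition to the article's examples and uses only built-in `str` features.

```python
s = 'Hello World'

# The "in" operator is the simplest test when the position is not needed.
print('World' in s)       # prints: True
print('world' in s)       # prints: False  (the search is case-sensitive)

# rfind() searches from the right; find() accepts a start offset.
print(s.rfind('o'))       # prints: 7
print(s.find('o', 5))     # prints: 7

# count() returns the number of non-overlapping occurrences.
print(s.count('o'))       # prints: 2

# replace() returns a new string; the original is unchanged.
replaced = s.replace('World', 'Python')
print(replaced)           # prints: Hello Python
print(s)                  # prints: Hello World
```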

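2.3 Splitting and Joining

Splitting and joining are done with methods such as `split()` and `join()`.

`split(sep=None, maxsplit=-1)` returns the list of substrings obtained by splitting the string at `sep`. `maxsplit` limits the number of splits (the default of -1 means no limit). When `sep` is omitted or `None`, the string is split on runs of whitespace. An empty string is not a valid `sep` and raises `ValueError`; to get a list of single characters, use `list(s)` instead.

`join(iterable)` concatenates all the strings in `iterable` into one new string, using the string on which the method is called as the separator.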


s = 'Hello World'
print(s.split())                    # prints: ['Hello', 'World']
print(s.split('o', 1))              # prints: ['Hell', ' World']
print(''.join(['a', 'b', 'c']))     # prints: abc
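
The difference between whitespace splitting, an explicit separator, and an empty separator is easy to get wrong, so here is a small sketch of each case. The sample strings are made up for illustration.

```python
# With no separator, runs of whitespace are collapsed and trimmed.
print('a  b \t c\n'.split())        # prints: ['a', 'b', 'c']

# With an explicit separator, empty fields are kept.
print('a,,b'.split(','))            # prints: ['a', '', 'b']

# maxsplit limits the number of splits, not the number of parts.
print('a b c d'.split(' ', 2))      # prints: ['a', 'b', 'c d']

# An empty separator is rejected.
try:
    'abc'.split('')
except ValueError as exc:
    print('ValueError:', exc)

# list() is the usual way to get single characters.
print(list('abc'))                  # prints: ['a', 'b', 'c']

# join() needs strings, so convert other values first.
print(', '.join(['a', 'b', 'c']))               # prints: a, b, c
print('-'.join(str(n) for n in [1, 2, 3]))      # prints: 1-2-3
```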

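2.4 String Formatting

Strings can be formatted with the `format()` method or the `%` operator.

`format(*args, **kwargs)` fills the `{}` placeholders in the string with its arguments. A placeholder can be empty (filled in order), contain a positional index such as `{0}`, or contain a keyword name such as `{name}`.

`%` is the older formatting operator. It has known quirks (for example, a tuple passed as a single value is unpacked as several arguments) and is generally not recommended for new code. To use it, put `%` placeholders such as `%s` or `%d` in the string, then follow the string with `%` and a tuple holding the values.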


name = 'World'
age = 18
print('Hello, {}'.format(name))                 # prints: Hello, World
print('Hello, {0}, {1}'.format(name, age))      # prints: Hello, World, 18
print('Hello, {name}, {age}'.format(name=name, age=age))  # prints: Hello, World, 18
print('My name is %s, I\'m %d years old.' % (name, age))  # prints: My name is World, I'm 18 years old.
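
The article covers `format()` and `%`. Python 3.6 and later also provide f-strings, which use the same format-spec syntax as `format()`. The following sketch is an addition, not part of the original text, and the variable `pi` is introduced only for illustration.

```python
name = 'World'
age = 18
pi = 3.14159

# f-strings evaluate the expression inside the braces.
print(f'Hello, {name}')                    # prints: Hello, World
print(f'{name} is {age + 1} next year')    # prints: World is 19 next year

# The part after the colon is a format spec, shared with str.format().
print(f'{pi:.2f}')                         # prints: 3.14
print(f'[{name:>10}]')                     # prints: [     World]
print(f'{age:05d}')                        # prints: 00018
print('{:,}'.format(1234567))              # prints: 1,234,567
```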

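3. String Test Methods

Strings have many test methods, such as `isalpha()`, `isdigit()`, and `isnumeric()`. Each returns a boolean that says whether the string satisfies the corresponding condition. A few commonly used ones are described below.

3.1 Testing Whether a String Contains Only Letters and Digits

Use the `isalnum()` method. It returns True if the string is non-empty and every character is a letter or a digit, and False otherwise.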


s1 = '123abc'
s2 = '123-abc'
print(s1.isalnum())     # prints: True
print(s2.isalnum())     # prints: False

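3.2 Testing Whether a String Contains Only Digits

Use the `isdigit()` method. It returns True if the string is non-empty and every character is a digit, and False otherwise. Note that a sign character is not a digit, so `isdigit()` returns False for a negative number string such as `'-123'`.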


s1 = '123'
s2 = '123abc'
print(s1.isdigit())     # prints: True
print(s2.isdigit())     # prints: False
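
Since `isdigit()` rejects a leading sign, one common workaround is to attempt the conversion and catch the failure. This is a minimal sketch; `looks_like_int` is a hypothetical helper name, not a built-in.

```python
def looks_like_int(text):
    """Return True if int() accepts text as a base-10 integer (hypothetical helper)."""
    try:
        int(text)
    except ValueError:
        return False
    return True


print('-123'.isdigit())         # prints: False  (the sign is not a digit)
print(looks_like_int('-123'))   # prints: True
print(looks_like_int('+7'))     # prints: True
print(looks_like_int('12.5'))   # prints: False
print(looks_like_int(''))       # prints: False
```

Note that `int()` is more permissive than a strict digit check: it also accepts surrounding whitespace and underscores between digits, which may or may not be what a given application wants.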

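3.3 Testing Whether a String Contains Only Numeric Characters

Use the `isnumeric()` method. It returns True if the string is non-empty and every character is a numeric character, and False otherwise. Numeric characters include the digits as well as other Unicode characters with a numeric value, such as Roman numerals and fractions.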


s1 = '123'
s2 = 'Ⅲ'
print(s1.isnumeric())   # prints: True
print(s2.isnumeric())   # prints: True
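
The difference between `isdigit()` and `isnumeric()` is easiest to see side by side. The sketch below also includes `isdecimal()`, the strictest of the three, which the article does not mention. The sample characters are chosen for illustration.

```python
samples = ['123', '²', 'Ⅲ', '½', '1.5']

# Columns: text, isdecimal, isdigit, isnumeric
for text in samples:
    print(repr(text), text.isdecimal(), text.isdigit(), text.isnumeric())

# Output:
# '123' True True True
# '²' False True True
# 'Ⅲ' False False True
# '½' False False True
# '1.5' False False False
```

The last row shows that none of the three methods treats a decimal point as part of a number, so they are not suitable for validating floating-point input.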

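4. Advanced String Operations

4.1 Encoding and Decoding

Character encoding is the process of converting characters into a byte sequence. Common encodings include ASCII, UTF-8, and GBK. In Python 3, a `str` is a sequence of Unicode code points, and UTF-8 is the default encoding used by `encode()` and `decode()`. Use the string's `encode()` method to turn it into a `bytes` object, and the `decode()` method of `bytes` to turn the bytes back into a string.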


s1 = 'hello world'
b = s1.encode('utf-8')      # encode to bytes
print(b)                    # prints: b'hello world'
s2 = b.decode('utf-8')      # decode back to a string
print(s2)                   # prints: hello world
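
With ASCII-only text every encoding mentioned above gives the same bytes, so the choice of encoding only becomes visible with other characters. The following sketch uses a mixed sample string (chosen for illustration) to show the size difference and what happens when the wrong codec is used.

```python
text = 'Python字符串'

utf8_bytes = text.encode('utf-8')
gbk_bytes = text.encode('gbk')

# 9 characters; each CJK character takes 3 bytes in UTF-8 and 2 in GBK.
print(len(text), len(utf8_bytes), len(gbk_bytes))   # prints: 9 15 12

# ASCII cannot represent the CJK characters at all.
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    print('encode failed:', exc.reason)

# Decoding with the wrong codec fails as well.
try:
    utf8_bytes.decode('ascii')
except UnicodeDecodeError as exc:
    print('decode failed:', exc.reason)

# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
lossy = utf8_bytes.decode('ascii', errors='replace')
print(lossy.startswith('Python'))                   # prints: True

# Decoding with the codec that produced the bytes round-trips exactly.
print(gbk_bytes.decode('gbk') == text)              # prints: True
```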

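4.2 Validating String Formats

Sometimes a string has to be checked against a format, such as an email address or an IP address. In Python this can be done with regular expressions, which are supported by the standard `re` module.

The example below validates an email address to show how regular expressions are used.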


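import re

def is_valid_email(email):
    """Return True if email matches a simplified address pattern."""
    # Simplified on purpose: only letters, digits, "_" and "-" are allowed,
    # so an address with a dot before the "@" is rejected.
    pattern = r'[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+'
    # fullmatch() requires the whole string to match, which also rejects
    # a trailing newline that match() with "$" would let through.
    return bool(re.fullmatch(pattern, email))

# Test
print(is_valid_email('123@qq.com'))     # prints: True
print(is_valid_email('123@qq'))         # prints: False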
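
The same approach can be applied to the other format mentioned above, IP addresses. The sketch below checks dotted-decimal IPv4 addresses only; `is_valid_ipv4` and `IPV4_PATTERN` are hypothetical names, and the rule that leading zeros are accepted is an assumption rather than something the article specifies.

```python
import re

# Four groups of one to three ASCII digits separated by dots.
IPV4_PATTERN = re.compile(r'([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})')


def is_valid_ipv4(address):
    """Return True if address looks like a dotted-decimal IPv4 address (sketch)."""
    match = IPV4_PATTERN.fullmatch(address)
    if match is None:
        return False
    # The regex only checks the shape; the numeric range is checked here.
    return all(int(part) <= 255 for part in match.groups())


print(is_valid_ipv4('192.168.1.1'))     # prints: True
print(is_valid_ipv4('256.1.1.1'))       # prints: False  (octet out of range)
print(is_valid_ipv4('1.2.3'))           # prints: False  (only three parts)
print(is_valid_ipv4('1.2.3.4\n'))       # prints: False  (fullmatch rejects the newline)
```

Splitting the check into a shape test and a range test keeps the pattern short. For production use, the standard library's `ipaddress` module is usually a better choice than a hand-written pattern.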

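4.3 Encrypting and Decrypting Strings

Encryption converts plaintext into ciphertext, and decryption restores the plaintext from the ciphertext. The two common families are symmetric and asymmetric encryption. In Python, the `pycryptodome` package can be used for both operations. AES, DES, and RSA are among the commonly seen algorithms.

The example below uses AES encryption and decryption to show how `pycryptodome` is used.

First, install the `pycryptodome` package.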

pip install pycryptodome

Then the following code performs AES encryption and decryption.


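from Crypto.Cipher import AES

def encrypt(plaintext, key):
    """Encrypt a str with AES in EAX mode; return (nonce, ciphertext, tag)."""
    cipher = AES.new(key, AES.MODE_EAX)
    nonce = cipher.nonce
    ciphertext, tag = cipher.encrypt_and_digest(plaintext.encode('utf-8'))
    return nonce, ciphertext, tag

def decrypt(nonce, ciphertext, tag, key):
    """Decrypt and authenticate; return the plaintext, or None if the tag check fails."""
    cipher = AES.new(key, AES.MODE_EAX, nonce=nonce)
    try:
        # Verify the tag together with decryption so tampered data is never returned.
        plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    except ValueError:
        return None
    return plaintext.decode('utf-8')

# Test
key = b'1234567890123456'   # 16-byte key (AES-128); fixed here for the demo only
plaintext = 'hello world'
nonce, ciphertext, tag = encrypt(plaintext, key)
print(nonce, ciphertext, tag)                   # the nonce is random, so these bytes differ on every run
print(decrypt(nonce, ciphertext, tag, key))     # prints: hello world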

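5. Summary

A Python string is an immutable sequence with a rich set of methods and operations. This article covered creating strings, indexing and slicing, case conversion, searching and replacing, splitting and joining, formatting, test methods, encoding and decoding, format validation, and encryption and decryption. For Python beginners, getting comfortable with these common string operations is a quick way to write code more efficiently.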
