{
"cells": [
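{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Practice 8: Text Data\n",
"\n",
"\n",
"**Learning objectives**\n",
"1. Write string literals fluently.  \n",
"2. Use basic string operations and functions to carry out computations.  \n",
"3. Design algorithms for typical problems on text data.  \n",
"\n"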
]
},
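{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. String literals  \n",
"**Understand the three ways to write a string literal and when to use each.**  \n",
"**Understand what the escape sequences (\\n and \\t) and the space character do.**"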
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"To be or not to be, that's a question.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print('古云:\"临渊羡鱼,不如退而结网。\"')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print('''富贵必从勤苦得,\n",
"男儿须读五车书。\n",
"--杜甫''')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Newline escape \\n\n",
"print('富贵必从勤苦得,\\n男儿须读五车书。\\n--杜甫')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Difference between a space and the tab escape \\t\n",
"from random import randint\n",
"n=int(input(\"n=\"))\n",
"for i in range(1,n+1):\n",
"    print(randint(0,10000),end=\" \") # separate the numbers with a space\n",
"    if i % 10 == 0: # start a new line after every 10 numbers\n",
"        print()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from random import randint # repeated so that this cell runs on its own\n",
"n=int(input(\"n=\"))\n",
"for i in range(1,n+1):\n",
"    print(randint(0,10000),end=\"\\t\") # separate the numbers with a tab\n",
"    if i % 10 == 0:\n",
"        print()"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The code of a is 97\n"
]
}
],
"source": [
"txt = 'a'\n",
"code = ord(txt)\n",
"print('The code of', txt, 'is', code)\n"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The code of 0 is 48\n"
]
}
],
"source": [
"txt = '0'\n",
"code = ord(txt)\n",
"print('The code of', txt, 'is', code)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"101 corresponds to the character e\n"
]
}
],
"source": [
"code = 101\n",
"txt = chr(code)\n",
"print(code,\"corresponds to the character\",txt)\n"
]
},
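{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Accessing strings by slicing\n",
"**Run the expressions below and work out what each slice does.**\n",
"\n",
"`str[start:end:step]`  \n",
"1. Indices may count from the left or from the right: the first character is 0 and the last is -1.  \n",
"2. The character at index end is not included in the result.  \n",
"3. A step of -1 walks the string in reverse.  \n",
"4. In a reverse slice, start is the right-hand position in the source string and end is the left-hand position, and end is still excluded.  \n",
"For example, to take Python world out of s in reverse order, use start -2, end 5 and step -1.  \n"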
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s=\"Hello,Python world!\"\n",
"s[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[5]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[:5]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[6:-7]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[:-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[::-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[:5:-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[-2::-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s[-2:5:-1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. String operations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s1=\"Hello,\"\n",
"s2=\" Python\"\n",
"s=s1+s2\n",
"s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s1 in s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s=s1+s2*3\n",
"s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#\"program\" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#\"prolan\" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#\"amamam\""
]
},
{
"attachments": {
"image-2.png": {
"image/png": ""
}
},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. String functions and built-in string methods\n",
"![image-2.png](attachment:image-2.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#\n",
"s=\"Python String\"\n",
"s.upper()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.lower()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.find('i')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.replace('ing','gni')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"t = s.split()\n",
"t"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s1 = \"-\".join(t)\n",
"s1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"t = s1.split(\"-\")\n",
"t"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.find('t')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.find('t',3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.find('t',9)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.index('t')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.index('t',9)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. The string module\n",
"`import string`  \n",
"`string.digits` is '0123456789'  \n",
"`string.ascii_lowercase` is 'abcdefghijklmnopqrstuvwxyz'  \n",
"`string.ascii_uppercase` is 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'  \n",
"`string.punctuation` is '!\"#$%&\\'()*+,-./:;<=>?@[\\\\]^_`{|}~'  "
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [],
"source": [
"import string"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'0123456789'"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"string.digits "
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"! \" # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \\ ] ^ _ ` { | } ~ "
]
}
],
"source": [
"for ch in string.punctuation:\n",
" print(ch,end=\" \")"
]
},
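{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Try it\n",
"\n",
"Using s1, s2 and string operations (slicing, concatenation, repetition, string methods), write expressions that produce each of the following results.  \n",
"s1='programming'  \n",
"s2='language'  \n",
"(1)\"program\"  \n",
"(2)\"prolan\"  \n",
"(3)\"amamam\"  \n",
"(4)\"progr@mming l@ngu@ge\"  \n",
"(5)'Programming Language'  \n"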
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s1='programming'\n",
"s2='language'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#(1)\"program\" \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#(2)\"prolan\" \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#(3)\"amamam\"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#(4)\"progr@mming l@ngu@ge\"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#(5)'Programming Language'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Try it:\n",
"Find every position at which the substring sub occurs in a source string s.  \n",
"Try to implement this in several different ways.  \n",
"Sample run:  \n",
" `  \n",
"s=do not,for one repuls,forgo the purpose that you resolved to effort  \n",
"sub=o  \n",
"1\t4\t8\t11\t23\t26\t36\t46\t52\t59\t64\tover `  "
]
},
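{
"cell_type": "raw",
"metadata": {},
"source": [
"Method 1: use the find method with a while statement\n",
"Building the loop\n",
"(1) Loop body (the step repeated on every pass)\n",
"index=s.find(sub,start)\n",
"print(index)\n",
"start=index+1\n",
"(2) Loop control: the loop ends when index is -1\n",
"Initial value of index: the result of the first find call\n",
"Final value of index: -1\n",
"Update of index: call find again\n",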
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Method 2: implement with the while True ... if ... break pattern\n"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"Method 3: use the index method\n",
"\n",
"(1) Loop body (the step repeated on every pass)\n",
"index=s.index(sub,start)\n",
"print(index)\n",
"start=index+1\n",
"(2) Loop control: when index raises ValueError, use break to leave the loop"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Typical string algorithms  \n",
"(1) Parsing an encoded string  \n",
"(2) Reversing a number  \n",
"(3) Counting characters by category  \n"
]
},
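{
"cell_type": "markdown",
"metadata": {},
"source": [
"### (1) Parsing an encoded string  \n",
"*Use slicing to extract substrings*  \n",
"*Use the format method and concatenation (+) to build strings*  \n",
"[Example] Parsing an ID card number:  \n",
"Read a nickname and an 18-digit ID card number.  \n",
"Extract the date of birth and the gender from the ID number, then print the nickname and the date of birth.  \n",
"If the person was born in June, add the note \"Prepare a gift\";  \n",
"if that person is also female, append \" + flowers\" to the note.  \n",
"In an 18-digit ID number, characters 7 to 14 hold the date of birth as YYYYMMDD, and the 17th digit is even for women and odd for men."
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Enter nickname: 红太狼\n",
"Enter ID number: 309012199606230083\n",
"红太狼\t1996-06-23\n",
"Prepare a gift + flowers\n"
]
}
],
"source": [
"nickname=input(\"Enter nickname: \")\n",
"ids=input(\"Enter ID number: \")\n",
"\n",
"birthDay=\"{}-{}-{}\".format(ids[6:10],ids[10:12],ids[12:14]) # build the date of birth\n",
"msg=\"{}\\t{}\\n\".format(nickname,birthDay) # build the output string\n",
"if ids[-8:-6]=='06': # born in June\n",
"    msg+=\"Prepare a gift\"\n",
"    if int(ids[-2])%2==0: # female\n",
"        msg+=\" + flowers\"\n",
"print(msg)\n"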
]
},
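{
"cell_type": "raw",
"metadata": {},
"source": [
"The format method of str builds a string from a format string: each {} in the format string is replaced by the corresponding argument, displayed as the format specifies.\n",
"ids[10:12] and ids[-8:-6] are the same substring, the month; the first uses indices counted from the left, the second indices counted from the right.\n",
"msg is the output string. Note how it is built up step by step: an assignment first stores line 1 (nickname and date of birth), then concatenation appends the text of line 2. A single string variable can hold several lines of text.\n"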
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### (2) Reversing a number  \n",
"Integers and strings can be converted to each other, and string operations make reversal easy to implement.  "
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"[Example] Read integers until quit is entered, then print the sum of their digit reversals.\n",
"Sample run\n",
"input x:45\n",
"input x:-12\n",
"input x:30\n",
"input x:quit\n",
"54+(-21)+3 = 36\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s=0 # accumulator\n",
"outstr=\"\" # output string\n",
"while True:\n",
"    x=input(\"input x:\") # read the next item; quit ends the loop\n",
"    if x==\"quit\":\n",
"        break\n",
"    # reverse the digits with a slice\n",
"    if x[0]==\"-\":\n",
"        x=int( x[:0:-1])*-1\n",
"    else:\n",
"        x=int(x[::-1])\n",
"    # build the output string with a format string and concatenation\n",
"    if x<0:\n",
"        outstr+=\"({})+\".format(x)\n",
"    else:\n",
"        outstr+=str(x)+\"+\"\n",
"    # add the reversed number to the sum\n",
"    s=s+x\n",
"print(outstr[:-1],\"=\",s)\n"
]
},
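{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notes:\n",
"1. input returns a string, and strings make it easy to reverse the digits and to build the parentheses and minus sign in the output.\n",
"2. The running sum is integer arithmetic, so int is used to convert the type.\n",
"3. Understand the difference between the slices x[:0:-1] and x[::-1]: whether the character \"-\" is kept. For a negative number the \"-\" must be dropped before reversing.\n",
"4. outstr builds the final output string; each pass through the loop appends one number and a \"+\". The \"+\" after the last number has to be removed, which the slice outstr[:-1] does."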
]
},
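{
"cell_type": "markdown",
"metadata": {},
"source": [
"### (3) Counting characters\n",
"Pattern for traversing a string:\n",
"```python\n",
"for ch in s:\n",
"    ...ch...\n",
"```\n",
"A character's category can be tested in several ways:\n",
"* by comparing ASCII values\n",
"* with str methods\n",
"* with the character-set constants of the string module"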
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"[Example] Write a program that counts characters by category. Read a string and count its uppercase letters, lowercase letters and digits, plus the total number of all other characters.\n",
"Sample run:\n",
"please input char:HELLO python 123!\n",
"uppercase letters: 5, lowercase letters: 6, digits: 3, other characters: 3\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test by comparing ASCII values\n",
"instr=input('please input char:')\n",
"upper,lower,digit,other=0,0,0,0\n",
"for c in instr:\n",
"    if 'A'<=c<='Z':\n",
"        upper=upper+1\n",
"    elif 'a'<=c<='z':\n",
"        lower=lower+1\n",
"    elif '0'<=c<='9':\n",
"        digit=digit+1\n",
"    else:\n",
"        other=other+1\n",
"print('uppercase letters: {}, lowercase letters: {}, digits: {}, other characters: {}'.format(upper,lower,digit,other))\n"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"please input char:please input char:HELLO python 123! \n",
"uppercase letters: 5, lowercase letters: 21, digits: 3, other characters: 8\n"
]
}
],
"source": [
"# Test with str methods\n",
"instr=input('please input char:')\n",
"upper,lower,digit,other=0,0,0,0\n",
"for c in instr:\n",
"    if c.isupper():\n",
"        upper=upper+1\n",
"    elif c.islower():\n",
"        lower=lower+1\n",
"    elif c.isdigit():\n",
"        digit=digit+1\n",
"    else:\n",
"        other=other+1\n",
"\n",
"\n",
"msg = f'uppercase letters: {upper}, lowercase letters: {lower}, digits: {digit}, other characters: {other}'\n",
"print(msg)\n"
]
},
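{
"cell_type": "raw",
"metadata": {},
"source": [
"\n",
"str.isalnum()  Returns True if the string is non-empty and every character is a letter or a digit; otherwise False.\n",
"str.isalpha()  Returns True if the string is non-empty and every character is a letter; otherwise False.\n",
"str.isdigit()  Returns True if the string is non-empty and every character is a digit; otherwise False.\n",
"str.islower()  Returns True if the string has at least one cased character and all cased characters are lowercase; otherwise False.\n",
"str.isupper()  Returns True if the string has at least one cased character and all cased characters are uppercase; otherwise False.\n",
"str.istitle()  Returns True if every word starts with an uppercase letter followed only by lowercase letters; otherwise False.\n",
"str.isspace()  Returns True if the string is non-empty and every character is whitespace; otherwise False.\n"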
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test with the character-set constants of the string module\n",
"import string\n",
"instr=input('please input char:')\n",
"upper,lower,digit,other=0,0,0,0\n",
"for c in instr:\n",
"    if c in string.ascii_uppercase:\n",
"        upper=upper+1\n",
"    elif c in string.ascii_lowercase:\n",
"        lower=lower+1\n",
"    elif c in string.digits:\n",
"        digit=digit+1\n",
"    else:\n",
"        other=other+1\n",
"print('uppercase letters: {}, lowercase letters: {}, digits: {}, other characters: {}'.format(upper,lower,digit,other))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Practice exercises"
]
},
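{
"cell_type": "raw",
"metadata": {},
"source": [
"(1) Write a program that converts a binary IP address to a decimal IP address.\n",
"An IP address consists of four bytes of binary code, 8 bits per byte. Read a valid IP address written in binary and print it in dotted decimal form (invalid input need not be handled).\n",
"Sample run:\n",
"input:11001100100101000001010101110010\n",
"output:204.148.21.114\n",
"\n",
"Hint: int(str,base=2) converts a binary string to a decimal integer\n",
">>> int(\"110111101\",2)\n",
"445 "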
]
},
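{
"cell_type": "markdown",
"metadata": {},
"source": [
"(2) Write a program that generates 50 random integers between -1000 and 1000, prints those whose digit reversal is greater than the original number, and counts them. Separate the integers with spaces.\n",
"\n",
"Sample output:  \n",
"687 -564 -662 -873 519 367 -625 375 436 -981 -742 -231 -671 -382 -32 -30 -958 -920 -520 -97 -350 69 29  \n",
"23 numbers in total"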
]
},
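{
"cell_type": "raw",
"metadata": {},
"source": [
"(3) Write a program that encrypts a message.\n",
"A line of text has been turned into a cipher by the following rule:\n",
"A-->Z a-->z\n",
"B-->Y b-->y\n",
"C-->X c-->x\n",
" ...... ......\n",
"That is, the first letter becomes the 26th letter, the i-th letter becomes the (26-i+1)-th letter, and non-letter characters are unchanged. Recover the original text from the cipher and print it. Because the mapping is its own inverse, the same program both encrypts and decrypts.\n",
"\n",
"Sample run\n",
"input:ABC123abc\n",
"output:ZYX123zyx\n",
"Sample run\n",
"input:Life is like an ice cream, enjoy it before it melts.\n",
"output:Oruv rh orpv zm rxv xivzn, vmqlb rg yvuliv rg nvogh.\n"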
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 2
}