Python Exercise: Problem 4

python学习网 2020-02-18 17:40:01

Original GitHub repository: https://github.com/Yixiaohan/show-me-the-code

Problem: Given any plain-text file in English, count how many times each word occurs in it.

Code:

import collections

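# Open the text file
with open('words.txt', 'rt') as f:
    # Remove commas and periods, then split the text on whitespace.
    # The list is named `words` so that it does not shadow the built-in `str`.
    words = f.read().replace(',', '').replace('.', '').split()

print(words)  # the words in the text

# Count the words with a Counter object
counters = collections.Counter(words)

# Print the result
print(counters)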

Original text:
Today, a lot of people’s life pace is so fast that they feel tired about life. Some young people are lost when they talk about future, because they are stuck in their work. Most Chinese people lack of passion about their life, but life needs passion, or we are just walking dead. The way to find passion is to see the beauty of life, such as spending more time with families. Even some people live far away, they still need to contact with their friends and families. Talking with them can always bring happy memories and forget annoyance. The other effective way to gain passion is to travel. When people are in vacation, they often feel easy and regain the power. After appreciating beautiful scenery, they will broaden their vision and have the passion to fight again.

After splitting the text into words:
['Today', 'a', 'lot', 'of', 'people’s', 'life', 'pace', 'is', 'so', 'fast', 'that', 'they', 'feel', 'tired', 'about', 'life', 'Some', 'young', 'people', 'are', 'lost', 'when', 'they', 'talk', 'about', 'future', 'because', 'they', 'are', 'stuck', 'in', 'their', 'work', 'Most', 'Chinese', 'people', 'lack', 'of', 'passion', 'about', 'their', 'life', 'but', 'life', 'needs', 'passion', 'or', 'we', 'are', 'just', 'walking', 'dead', 'The', 'way', 'to', 'find', 'passion', 'is', 'to', 'see', 'the', 'beauty', 'of', 'life', 'such', 'as', 'spending', 'more', 'time', 'with', 'families', 'Even', 'some', 'people', 'live', 'far', 'away', 'they', 'still', 'need', 'to', 'contact', 'with', 'their', 'friends', 'and', 'families', 'Talking', 'with', 'them', 'can', 'always', 'bring', 'happy', 'memories', 'and', 'forget', 'annoyance', 'The', 'other', 'effective', 'way', 'to', 'gain', 'passion', 'is', 'to', 'travel', 'When', 'people', 'are', 'in', 'vacation', 'they', 'often', 'feel', 'easy', 'and', 'regain', 'the', 'power', 'After', 'appreciating', 'beautiful', 'scenery', 'they', 'will', 'broaden', 'their', 'vision', 'and', 'have', 'the', 'passion', 'to', 'fight', 'again']

Final output:
Counter({'they': 6, 'to': 6, 'life': 5, 'passion': 5, 'people': 4, 'are': 4, 'their': 4, 'and': 4, 'of': 3, 'is': 3, 'about': 3, 'the': 3, 'with': 3, 'feel': 2, 'in': 2, 'The': 2, 'way': 2, 'families': 2, 'Today': 1, 'a': 1, 'lot': 1, 'people’s': 1, 'pace': 1, 'so': 1, 'fast': 1, 'that': 1, 'tired': 1, 'Some': 1, 'young': 1, 'lost': 1, 'when': 1, 'talk': 1, 'future': 1, 'because': 1, 'stuck': 1, 'work': 1, 'Most': 1, 'Chinese': 1, 'lack': 1, 'but': 1, 'needs': 1, 'or': 1, 'we': 1, 'just': 1, 'walking': 1, 'dead': 1, 'find': 1, 'see': 1, 'beauty': 1, 'such': 1, 'as': 1, 'spending': 1, 'more': 1, 'time': 1, 'Even': 1, 'some': 1, 'live': 1, 'far': 1, 'away': 1, 'still': 1, 'need': 1, 'contact': 1, 'friends': 1, 'Talking': 1, 'them': 1, 'can': 1, 'always': 1, 'bring': 1, 'happy': 1, 'memories': 1, 'forget': 1, 'annoyance': 1, 'other': 1, 'effective': 1, 'gain': 1, 'travel': 1, 'When': 1, 'vacation': 1, 'often': 1, 'easy': 1, 'regain': 1, 'power': 1, 'After': 1, 'appreciating': 1, 'beautiful': 1, 'scenery': 1, 'will': 1, 'broaden': 1, 'vision': 1, 'have': 1, 'fight': 1, 'again': 1})
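
In the result, 'The' and 'the' are counted as two different words because the count is case-sensitive, and only commas and periods are removed. If case should be ignored (an assumption, since the exercise does not say so), the idea could be sketched roughly as follows. The function name `count_words_ignore_case` and the sample sentence are made up for illustration and are not part of the original solution.

```python
import collections
import string


def count_words_ignore_case(text):
    """Count words in text, ignoring case and surrounding ASCII punctuation.

    Hypothetical helper for illustration only.
    """
    words = []
    for token in text.lower().split():
        # Strip ASCII punctuation from both ends of each token
        word = token.strip(string.punctuation)
        if word:
            words.append(word)
    return collections.Counter(words)


# Sample sentence taken from the text above
sample = "The way to find passion is to see the beauty of life."
counts = count_words_ignore_case(sample)
print(counts["the"])   # "The" and "the" are now counted together
print(counts["life"])  # the trailing period no longer sticks to the word
```

Note that `string.punctuation` only covers ASCII characters, so a typographic apostrophe such as the one in 'people’s' is left inside the word.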

Another approach found online:

import collections
import re

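file_name = "The Old Man and the Sea.txt"

c = collections.Counter()
# Read from file_name (the original opened 'words.txt' and left file_name unused)
with open(file_name, 'r') as f:
    # Match runs of letters and apostrophes, so contractions stay whole
    c.update(re.findall(r'\b[a-zA-Z\']+\b', f.read()))
    # Alternative that matches letters only:
    # c.update(re.findall(r'\b[a-zA-Z]+\b', f.read()))

# Write one "word,count" line per word, most frequent first
with open("WordCount.txt", 'w') as wf:
    for word, count in c.most_common():
        wf.write(word + ',' + str(count) + '\n')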
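
To see what the regular expression in this approach actually matches, here is a small sketch that applies the same pattern to two short strings. The name `WORD_PATTERN` and both sample strings are chosen for illustration. The pattern keeps a word containing a straight apostrophe together, but it does not treat the typographic apostrophe used in 'people’s' as part of a word.

```python
import re

# Same pattern as in the approach above: runs of ASCII letters and
# straight apostrophes, delimited by word boundaries
WORD_PATTERN = re.compile(r"\b[a-zA-Z']+\b")

straight = WORD_PATTERN.findall("I don't know.")
curly = WORD_PATTERN.findall("people’s life")

print(straight)  # ['I', "don't", 'know']
print(curly)     # ['people', 's', 'life']
```

So on the sample text used earlier, this approach would count 'people' and 's' separately instead of 'people’s'.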