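This article shows the basics of parsing an email message with the email module from Python's standard library.

  I assume you have already retrieved a message via POP, IMAP, or a similar protocol, or that you have an existing .eml file. The code below focuses on the basic steps of parsing the message content.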
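For the .eml case mentioned above, here is a minimal sketch of loading a message from raw bytes. The sample message and the names `raw_eml` and `load_eml` are made up for illustration; with a real file you would pass in the bytes read from it in binary mode. Note that this sketch uses `policy.default`, which decodes encoded headers automatically and is a different approach from the `decode_header` calls used in the main listing.

```python
from email import message_from_bytes, policy

# Hypothetical sample standing in for the contents of an .eml file.
raw_eml = (
    b"From: Alice <alice@example.com>\r\n"
    b"To: Bob <bob@example.com>\r\n"
    b"Subject: =?UTF-8?B?5rWL6K+V?=\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"hello\r\n"
)


def load_eml(raw_bytes):
    """Parse raw message bytes; policy.default decodes encoded headers."""
    return message_from_bytes(raw_bytes, policy=policy.default)


msg = load_eml(raw_eml)
print(msg["subject"])                       # decoded subject text
print(msg["from"].addresses[0].addr_spec)   # sender address without the name
print(msg.get_content().strip())            # decoded text body
```

Parsing from bytes rather than from a decoded string means you do not have to guess the encoding of the whole message before the parser has read its headers.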

import imaplib
from email.header import decode_header
from email.parser import BytesParser
from email.utils import getaddresses, parseaddr, parsedate_to_datetime

# Connect and log in; replace the placeholders with real values
imap_user = 'your email address'
imap_object = imaplib.IMAP4_SSL(host="imap.xxx.com", port=993)
imap_object.login(imap_user, 'password or authorization code')
imap_object.select('INBOX')
typ, msg_ids = imap_object.search(None, 'ALL')

# msg_ids[0] is a bytes string of space-separated message numbers
message_id_list = msg_ids[0].split()
int_mail_num = len(message_id_list)
print('The inbox contains %s message(s)' % int_mail_num)

# Fetch the raw source of the first message (assumes the inbox is not empty)
results, data = imap_object.fetch(message_id_list[0], "(RFC822)")
imap_object.close()
imap_object.logout()
bytes_source = data[0][1]

# Parse the raw bytes directly: a message is not guaranteed to be valid UTF-8,
# so decoding the whole source before parsing can fail
msg_email = BytesParser().parsebytes(bytes_source)


def decode_str(str_raw):
    """Decode a header value that may contain RFC 2047 encoded words."""
    list_parts = []
    for value, charset in decode_header(str_raw):
        # decode_header returns bytes for encoded parts and str for plain ones
        if isinstance(value, bytes):
            value = value.decode(charset or 'utf-8', errors='replace')
        list_parts.append(value)
    return ''.join(list_parts)


# Sender: parseaddr splits "Name <address>" and also copes with a missing
# or unquoted display name, where a quote-based regex would fail
str_from = msg_email.get("from", "")
str_from_name, str_from_address = parseaddr(str_from)
str_from_name = decode_str(str_from_name)
print(">>From:%s<%s>" % (str_from_name, str_from_address))

# Recipients: the To header may list several addresses
str_to = msg_email.get("to", "")
for str_to_name, str_to_address in getaddresses([str_to]):
    str_to_name = decode_str(str_to_name)
    print(">>To:%s<%s>" % (str_to_name, str_to_address))

# Date: parsedate_to_datetime understands both numeric offsets and "GMT"
# (assumes the message has a Date header)
str_date = msg_email["date"]
dtime_date = parsedate_to_datetime(str_date)
print('>>Date:%s' % dtime_date)

# Subject: may be split into several encoded words, so join all of them
str_subject = decode_str(msg_email.get("subject", ""))
print('>>Subject:%s' % str_subject)


def decode_mime(msg):
    """Recursively walk the MIME tree and print every text part."""
    if msg.is_multipart():
        parts = msg.get_payload()
        for part in parts:
            decode_mime(part)
    else:
        str_content_type = msg.get_content_type()
        # Fall back to UTF-8 when the part does not declare a charset
        str_charset = msg.get_content_charset(failobj=None) or 'utf-8'
        if str_content_type in ('text/plain', 'text/html'):
            bytes_content = msg.get_payload(decode=True)
            str_content = bytes_content.decode(str_charset, errors='replace')
            print('>>Body(%s):%s' % (str_content_type, str_content))


decode_mime(msg_email)
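
The recursive `decode_mime` above can also be written with `Message.walk()`, which visits every part of the MIME tree in turn. The sketch below is one possible variant, not the only way to do it; `collect_text_parts` and the `demo` message are hypothetical names, and the message is built in memory so the example runs without a mail server.

```python
from email.message import EmailMessage


def collect_text_parts(msg):
    """Return (content_type, text) for each text/plain or text/html part."""
    found = []
    for part in msg.walk():
        # Multipart containers carry no text of their own
        if part.is_multipart():
            continue
        content_type = part.get_content_type()
        if content_type not in ('text/plain', 'text/html'):
            continue
        payload = part.get_payload(decode=True)
        # Fall back to UTF-8 when the part does not declare a charset
        charset = part.get_content_charset() or 'utf-8'
        found.append((content_type, payload.decode(charset, errors='replace')))
    return found


# Hypothetical message with a plain-text body and an HTML alternative
demo = EmailMessage()
demo['Subject'] = 'demo'
demo.set_content('plain body')
demo.add_alternative('<p>html body</p>', subtype='html')

for content_type, text in collect_text_parts(demo):
    print('>>Body(%s):%s' % (content_type, text.strip()))
```

Returning the parts instead of printing them makes the function easier to reuse, for example to prefer the plain-text part and fall back to HTML only when no plain version exists.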
