需求場景:
老大讓我利用爬蟲爬取的數(shù)據(jù)寫到或更新到mysql數(shù)據(jù)庫中,百度了兩種方法
1 是使用pymysql連接mysql,通過操作原生的sql語句進行增刪改查數(shù)據(jù);
2 是使用sqlalchemy連接mysql,通過ORM模型建表并操作數(shù)據(jù)庫,不需要寫原生的sql語句,相對簡單些;
以下就是本次使用sqlalchemy的經(jīng)驗之談。
實現(xiàn)流程:連接數(shù)據(jù)庫》通過模型類創(chuàng)建表》建立會話》執(zhí)行創(chuàng)建表語句》通過會話進行增刪改查
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
|
from sqlalchemy import exists, Column, Integer, String, ForeignKey, exists from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker # 創(chuàng)建的數(shù)據(jù)庫引擎 engine = create_engine( "mysql+pymysql://user:pwd@ip/數(shù)據(jù)庫名?charset=utf8" ) #創(chuàng)建session類型 DBSession = sessionmaker(bind = engine) # 實例化官宣模型 - Base 就是 ORM 模型 Base = declarative_base() # 創(chuàng)建服務(wù)單表 class ServiceOrder(Base): __tablename__ = 'serviceOrderTable' id = Column(Integer, primary_key = True , autoincrement = True ) serviceOrderId = Column(String( 32 ), nullable = False , index = True , comment = '服務(wù)單ID' ) serviceDesc = Column(String( 268 ), comment = '服務(wù)說明' ) oneLevelName = Column(String( 32 ), comment = 'C類別' ) twoLevelName = Column(String( 32 ), comment = 'T子類' ) threeLevelName = Column(String( 32 ), comment = 'I項目' ) fourLevelName = Column(String( 32 ), comment = 'S子項' ) transferTimes = Column(String( 32 ), comment = '轉(zhuǎn)派次數(shù)' ) overDueStatus = Column(String( 32 ), comment = '過期狀態(tài)' ) serviceTimeLimit = Column(String( 32 ), comment = '服務(wù)時限' ) serTimeLimitTypeName = Column(String( 16 ), comment = '時限類型' ) # 一對多: # serviceWorkOrder = relationship("ServiceWorkOrder", backref="serviceorder") # 多對一:多個服務(wù)工單可以屬于服務(wù)單 class ServiceWorkOrder(Base): __tablename__ = 'serviceWorkOrderTable' id = Column(Integer, primary_key = True , autoincrement = True ) serviceWorkOrderId = Column(String( 32 ), nullable = False , index = True , comment = '服務(wù)工單ID' ) workOrderName = Column(String( 268 ), comment = '工單名稱' ) fromId = Column(String( 32 ), comment = '服務(wù)單ID' ) createUserSectionName = Column(String( 32 ), comment = '創(chuàng)建人室' ) createUserName = Column(String( 32 ), comment = '創(chuàng)建人' ) handlerName = Column(String( 32 ), comment = '處理人' ) statusName = Column(String( 32 ), comment = '工單狀態(tài)' ) createTime = Column(String( 32 ), comment = '創(chuàng)建時間' ) # “多”的一方的book表是通過外鍵關(guān)聯(lián)到user表的: # serviceOrder_id = Column(Integer, ForeignKey('serviceOrderTable.id')) # 創(chuàng)建數(shù)據(jù)庫 如果數(shù)據(jù)庫已存在 則不會創(chuàng)建 會根據(jù)庫名直接連接已有的庫 def init_db(): Base.metadata.create_all(engine) def drop_db(): Base.metadata.drop_all(engine) def insert_update(): # all_needed_data_lists 是需要插入數(shù)據(jù)庫的數(shù)據(jù) 格式[{key: value, ... }, { }, { }...] for item in all_needed_data_lists: ServiceOrderRow = ServiceOrder(serviceOrderId = item[ 'serviceOrderId' ], serviceDesc = item[ 'serviceDesc' ], oneLevelName = item[ 'oneLevelName' ], twoLevelName = item[ 'twoLevelName' ], threeLevelName = item[ 'threeLevelName' ], fourLevelName = item[ 'fourLevelName' ], transferTimes = item[ 'transferTimes' ], overDueStatus = item[ 'overDueStatus' ], serviceTimeLimit = item[ 'serviceTimeLimit' ], serTimeLimitTypeName = item[ 'serTimeLimitTypeName' ], ) try : # 利用exists判斷目標(biāo)對象是否存在,返回True或Faults it_exists = session.query( exists().where(ServiceOrder.serviceOrderId = = item[ 'serviceOrderId' ] ) ).scalar() except Exception as e: self .log.error(e) break try : # 如果不存在,進行新增;存在的話就更新現(xiàn)存的數(shù)據(jù) if not it_exists: session.add(ServiceOrderRow) else : session.query(ServiceOrder). filter (ServiceOrder.serviceOrderId = = item[ 'serviceOrderId' ])\ .update(item) except Exception as e: self .log.error(e) break try : session.commit() self .log.info( '數(shù)據(jù)更新成功!' ) except : session.rollback() self .log.info( '數(shù)據(jù)更新失敗!' ) if __name__ = = "__main__" : # 創(chuàng)建數(shù)據(jù)庫 如果數(shù)據(jù)庫已存在 則不會創(chuàng)建 會根據(jù)庫名直接連接已有的庫 init_db() # 創(chuàng)建session對象,進行增刪改查: session = DBSession() # 利用session 增 改數(shù)據(jù) 記得提交 insert_update() |
以上就是本文的全部內(nèi)容,希望對大家的學(xué)習(xí)有所幫助,也希望大家多多支持服務(wù)器之家。
原文鏈接:https://www.cnblogs.com/We612/p/12105135.html