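Preface
CAPTCHAs? Can I crack them too?
CAPTCHAs need little introduction: all kinds of them pop up in everyday life. For a student, the one encountered most often is the CAPTCHA on the academic affairs office system, such as the one below:
[Image: sample CAPTCHA from the academic affairs system]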
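Recognition approach
Simulating a login involves many complicated steps. Here we ignore everything else and handle a single task: given a CAPTCHA image as input, return the answer string.
To create interference, a CAPTCHA image is rendered in all sorts of colors, so the first job is to remove that interference. This step takes repeated experimentation; boosting the color, increasing the contrast, and similar adjustments can all help.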
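The denoising idea can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the contrast factor of 7 and the "anything not pure black becomes white" rule are taken from the sample code in this article, while the function name `remove_interference` and the tiny synthetic image are assumptions made for the demo.

```python
from PIL import Image, ImageEnhance

BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def remove_interference(image, contrast_factor=7.0):
    """Boost contrast, then turn every pixel that is not pure black white."""
    rgb = image.convert("RGB")
    boosted = ImageEnhance.Contrast(rgb).enhance(contrast_factor)
    width, height = boosted.size
    for y in range(height):
        for x in range(width):
            if boosted.getpixel((x, y)) != BLACK:
                boosted.putpixel((x, y), WHITE)
    return boosted


# Synthetic 4x4 stand-in for a CAPTCHA (assumption for the demo):
# white background, one black "ink" pixel, one orange "noise" pixel.
sample = Image.new("RGB", (4, 4), WHITE)
sample.putpixel((1, 1), BLACK)
sample.putpixel((2, 2), (255, 160, 0))

cleaned = remove_interference(sample)
print(cleaned.getpixel((1, 1)), cleaned.getpixel((2, 2)))
```

The right factor depends on the CAPTCHA at hand, so expect to tune it by trial and error.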
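After trying many different image operations, I finally settled on a fairly clean way to remove the interference. In the best case, what remains after denoising is a clean black-and-white image of the characters. One image contains four characters, and they cannot all be recognized in one go, so the image has to be cropped into small tiles that hold one character each, and each tile is then recognized separately.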
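A minimal sketch of the cropping step is given here. The offsets (6, 19, 32, 45) and the 13x20 tile size come from the sample code in this article; the helper name `split_characters` and the 60x20 blank stand-in image are assumptions for the demo.

```python
from PIL import Image

# Left edge of each character cell, as used in this article's sample code.
CHAR_OFFSETS = (6, 19, 32, 45)
CHAR_WIDTH = 13
CHAR_HEIGHT = 20


def split_characters(image):
    """Return one cropped tile per character position."""
    return [
        image.crop((left, 0, left + CHAR_WIDTH, CHAR_HEIGHT))
        for left in CHAR_OFFSETS
    ]


# Blank stand-in image; 60x20 is an assumed size that covers all four cells.
captcha = Image.new("RGB", (60, 20), (255, 255, 255))
captcha.putpixel((19, 5), (0, 0, 0))  # one black pixel inside the second cell

tiles = split_characters(captcha)
print([tile.size for tile in tiles])
```

Fixed offsets only work because this particular CAPTCHA always draws its characters at the same positions; a CAPTCHA with shifting characters would need a real segmentation step.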
Next comes recognizing the characters. We first convert each tile into a matrix of 0s and 1s; each matrix represents one character.
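The conversion can be sketched like this, assuming the tile is an RGB image in which character pixels are pure black. The function name `tile_to_bits` and the 3x2 demo tile are illustrative assumptions; real tiles in this article are 13x20.

```python
from PIL import Image


def tile_to_bits(tile):
    """Flatten a tile row by row: 1 for a black pixel, 0 for anything else."""
    width, height = tile.size
    return [
        1 if tile.getpixel((x, y)) == (0, 0, 0) else 0
        for y in range(height)
        for x in range(width)
    ]


# 3x2 stand-in tile with two black pixels (assumption for the demo).
tile = Image.new("RGB", (3, 2), (255, 255, 255))
tile.putpixel((1, 0), (0, 0, 0))
tile.putpixel((2, 1), (0, 0, 0))

bits = tile_to_bits(tile)
print(bits)  # [0, 1, 0, 0, 0, 1]
```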
For example, the matrix for the digit six:
num_6 = [
    0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,1,1,0,0,0,0,0,0,
    0,0,0,0,1,1,1,0,0,0,0,0,0,
    0,0,0,1,1,1,0,0,0,0,0,0,0,
    0,0,0,1,1,0,0,0,0,0,0,0,0,
    0,0,1,1,0,0,0,0,0,0,0,0,0,
    0,0,1,1,0,0,0,0,0,0,0,0,0,
    0,1,1,1,1,1,1,1,0,0,0,0,0,
    0,1,1,1,1,1,1,1,1,0,0,0,0,
    0,1,1,0,0,0,0,1,1,1,0,0,0,
    0,1,1,0,0,0,0,0,1,1,0,0,0,
    0,1,1,0,0,0,0,0,1,1,0,0,0,
    0,1,1,1,0,0,0,1,1,1,0,0,0,
    0,0,1,1,1,1,1,1,1,0,0,0,0,
    0,0,0,1,1,1,1,1,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,
    0,0,0,0,0,0,0,0,0,0,0,0,0,
]
Seen from a distance with your eyes half closed, the digit is still recognizable.
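If squinting is not your thing, a few lines of code can draw the matrix instead. The sketch below re-enters the rows of the digit-six matrix as strings (purely for brevity) and prints `#` for 1 and `.` for 0; the `render` helper is a hypothetical name, not part of the original code.

```python
# Rows of the digit-six matrix from this article, written as strings.
NUM_6_ROWS = [
    "0000000000000",
    "0000000000000",
    "0000011000000",
    "0000111000000",
    "0001110000000",
    "0001100000000",
    "0011000000000",
    "0011000000000",
    "0111111100000",
    "0111111110000",
    "0110000111000",
    "0110000011000",
    "0110000011000",
    "0111000111000",
    "0011111110000",
    "0001111100000",
    "0000000000000",
    "0000000000000",
    "0000000000000",
    "0000000000000",
]

# Flatten to the same 260-element 0/1 list used for matching.
NUM_6 = [int(ch) for row in NUM_6_ROWS for ch in row]


def render(bits, width=13):
    """Turn a flat 0/1 list into text rows, '#' for 1 and '.' for 0."""
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]
    return "\n".join("".join("#" if b else "." for b in row) for row in rows)


print(render(NUM_6))
```

Printing a freshly extracted tile this way is also a quick check that the denoising and cropping steps produced what you expect.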
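Because this CAPTCHA is very regular and every digit sits at a fixed position, no machine learning algorithm is needed. A simple matrix comparison is enough: among all the matrices prepared in advance, pick the one most similar to the tile. Many comparison methods would work here; since the data is simple, anything that identifies the characters correctly will do.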
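One simple comparison, and the one the sample code in this article uses, is to count the positions where the tile and a template hold the same value and keep the template with the highest count. The sketch below shows that idea on hypothetical 3x3 templates; the names `similarity`, `best_match`, and `TEMPLATES` are assumptions, and the real templates hold 260 values each.

```python
def similarity(bits, template):
    """Count the positions where two equal-length 0/1 lists agree."""
    return sum(1 for a, b in zip(bits, template) if a == b)


def best_match(bits, templates):
    """Return the label of the template that agrees with bits most often."""
    return max(templates, key=lambda label: similarity(bits, templates[label]))


# Hypothetical 3x3 templates, just large enough to show the idea.
TEMPLATES = {
    "1": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "7": [1, 1, 1,
          0, 0, 1,
          0, 0, 1],
}

# A "7" with one flipped pixel still agrees with the "7" template most often.
noisy_seven = [1, 1, 1,
               0, 0, 1,
               0, 1, 1]

print(best_match(noisy_seven, TEMPLATES))
```

Because the match is a plain agreement count, a few stray noise pixels lower the score slightly but rarely change the winner.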
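With that, the CAPTCHA recognition work is done.
The recognition here mainly relies on Python's PIL for image manipulation. The complete code for simulating the login and filling in the CAPTCHA automatically can be found here:
Sample code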
import io
import json
import re

import requests
from bs4 import BeautifulSoup
from PIL import Image
from PIL import ImageEnhance

import mdata  # local module with the template matrices (not shown here)


class Student:
    def __init__(self, user, password):
        self.user = str(user)
        self.password = str(password)
        self.s = requests.Session()

    def login(self):
        url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
        res = self.s.get(url).text
        imageUrl = 'http://202.118.31.197/' + re.findall('<img src="(.+?)" width="55"', res)[0]
        im = Image.open(io.BytesIO(self.s.get(imageUrl).content))

        # Boost the contrast, then turn every pixel that is not pure black white.
        enhancer = ImageEnhance.Contrast(im)
        im = enhancer.enhance(7)
        x, y = im.size
        for i in range(y):
            for j in range(x):
                if im.getpixel((j, i)) != (0, 0, 0):
                    im.putpixel((j, i), (255, 255, 255))

        # Left edge of each of the four characters; every tile is 13x20 pixels.
        num = [6, 19, 32, 45]
        verifyCode = ""
        for n in range(4):
            a = im.crop((num[n], 0, num[n] + 13, 20))

            # Flatten the tile row by row into a 0/1 list (1 = black pixel).
            l = []
            x, y = a.size
            for i in range(y):
                for j in range(x):
                    if a.getpixel((j, i)) == (0, 0, 0):
                        l.append(1)
                    else:
                        l.append(0)

            # Keep the template that agrees with the tile in the most positions.
            his = 0
            chrr = ""
            for key in mdata.data:
                r = 0
                for j in range(260):
                    if l[j] == mdata.data[key][j]:
                        r += 1
                if r > his:
                    his = r
                    chrr = key
            verifyCode += chrr
        # print("CAPTCHA filled in automatically:", verifyCode)

        data = {
            'WebUserNO': str(self.user),
            'Password': str(self.password),
            'Agnomen': verifyCode,
        }
        url = "http://202.118.31.197/ACTIONLOGON.APPPROCESS?mode=4"
        t = self.s.post(url, data=data).text
        if re.findall("images/Logout2", t) == []:
            # Login failed: pull the message out of the page's alert(...) call.
            l = '[0,"' + re.findall('alert((.+?));', t)[1][1][2:-2] + '"]' + " " + self.user + " " + self.password + "\n"
            # print(l)
            # return '[0,"'+re.findall('alert((.+?));',t)[1][1][2:-2]+'"]'
            return [False, l]
        else:
            # NOTE: this pattern was damaged in the source text (it was split
            # across a line break); adjust it to the actual page markup.
            # '登录成功' means "login succeeded".
            l = '登录成功 ' + re.findall('! (.+?) ', t)[0] + " " + self.user + " " + self.password + "\n"
            # print(l)
            return [True, l]

    def getInfo(self):
        imageUrl = 'http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS'
        data = self.s.get('http://202.118.31.197/ACTIONQUERYBASESTUDENTINFO.APPPROCESS?mode=3').text  # student record info
        data = BeautifulSoup(data, "lxml")
        q = data.find_all("table", attrs={'align': "left"})

        # Collect the text of every cell in the first two left-aligned tables.
        a = []
        for i in q[0]:
            if type(i) == type(q[0]):
                for j in i:
                    if type(j) == type(i):
                        a.append(j.text)
        for i in q[1]:
            if type(i) == type(q[1]):
                for j in i:
                    if type(j) == type(i):
                        a.append(j.text)

        # Cells alternate between field name and field value.
        data = {}
        for i in range(1, len(a), 2):
            data[a[i - 1]] = a[i]
        # data['照片'] = io.BytesIO(self.s.get(imageUrl).content)  # '照片' = photo
        return json.dumps(data)

    def getPic(self):
        imageUrl = 'http://202.118.31.197/ACTIONDSPUSERPHOTO.APPPROCESS'
        pic = Image.open(io.BytesIO(self.s.get(imageUrl).content))
        return pic

    def getScore(self):
        score = self.s.get('http://202.118.31.197/ACTIONQUERYSTUDENTSCORE.APPPROCESS').text  # transcript
        score = BeautifulSoup(score, "lxml")
        q = score.find_all(attrs={'height': "36"})[0]
        point = q.text
        # '平均学分绩点' is the grade point average label as it appears on the page.
        print(point[point.find('平均学分绩点'):])
        table = score.html.body.table
        people = table.find_all(attrs={'height': '36'})[0].string
        r = table.find_all('table', attrs={'align': 'left'})[0].find_all('tr')

        # The first row holds the column names; every row becomes one dict.
        subject = []
        lesson = []
        for i in r[0]:
            if type(r[0]) == type(i):
                subject.append(i.string)
        for i in r:
            k = 0
            temp = {}
            for j in i:
                if type(r[0]) == type(j):
                    temp[subject[k]] = j.string
                    k += 1
            lesson.append(temp)
        lesson.pop()    # drop the last row
        lesson.pop(0)   # drop the header row
        return json.dumps(lesson)

    def logoff(self):
        return self.s.get('http://202.118.31.197/ACTIONLOGOUT.APPPROCESS').text


if __name__ == "__main__":
    a = Student(20150000, 20150000)
    r = a.login()
    print(r[1])
    if r[0]:
        r = json.loads(a.getScore())
        for i in r:
            for j in i:
                print(i[j], end=" ")
            print()
        q = json.loads(a.getInfo())
        for i in q:
            print(i, q[i])
        a.getPic().show()
        a.logoff()
Summary
That is all for this article. I hope it is of some help to your study or use of Python. If you have questions, feel free to leave a comment. Thank you for supporting 服务器之家.
Original link: http://www.cnblogs.com/xfangs/p/6500611.html