Monday, September 23, 2013

Urgent! A question about HttpClient, experts please advise

This post was last edited by tengshao on 2013-09-23 18:28:52
Here is the situation: I am using HttpClient to write a small web crawler. Because it crawls quickly, after a while the site requires a verification code (captcha). The code is an image and quite complex, so automatic recognition in the program has too low a success rate. I therefore want to type the code manually in the console and press Enter to let the program continue. That is where the problem comes in.
When my crawler hits a page that asks for the verification code, I grab the address of the code image, open that address in my local browser to look at the picture, type the code into the console, and submit it. The response says the verification code has expired.
I think this is because the site generates the verification code dynamically and refreshes it on every request, so by the time I open the picture in the browser the code has already changed.
I do not know how to solve this.
------ Solution --------------------------------------------
Although the code changes on every request, the code generated by your most recent request for the image is stored in the session.
You can have HttpClient fetch the picture first, and then enter the code manually.
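
A minimal sketch of that idea, under some assumptions: the thread does not say which HttpClient library is in use, so this uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) as a stand-in, and the URLs and form field name are hypothetical command-line arguments, not the real site's. The point is only that one client with one cookie store downloads the image and submits the code, so both requests share a session. Opening the saved file in a local image viewer does not send another request to the site.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Scanner;

class CaptchaSession {
    // One client with one cookie store: every request it sends carries the same session cookie.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .build();
    }

    // Build a URL-encoded form body such as "code=ab12" (the field name is site-specific).
    static String formBody(String fieldName, String code) {
        return URLEncoder.encode(fieldName, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(code, StandardCharsets.UTF_8);
    }

    // Download the captcha image through the session client and save it to a local file.
    static Path downloadCaptcha(HttpClient client, URI captchaUri, Path target)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(captchaUri).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofFile(target)).body();
    }

    // Submit the typed code with the SAME client, so the server sees the same session.
    static String submitCode(HttpClient client, URI submitUri, String fieldName, String code)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(submitUri)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(fieldName, code)))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: CaptchaSession <captchaUrl> <submitUrl> <fieldName>");
            return;
        }
        HttpClient client = newSessionClient();
        Path image = downloadCaptcha(client, URI.create(args[0]), Path.of("captcha.png"));
        System.out.println("Open " + image.toAbsolutePath() + " locally, then type the code:");
        String code = new Scanner(System.in).nextLine().trim();
        System.out.println(submitCode(client, URI.create(args[1]), args[2], code));
    }
}
```

With a different HttpClient library the calls will differ, but the design choice is the same: keep a single client (or a single cookie store) for the page request, the image request, and the submission.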
------ Solution --------------------------------------------
The browser and HttpClient each hold their own session, so a picture fetched through one does not belong to the same session as a picture fetched through the other. Because the two requests arrive under different sessions, the server treats them as two different users. A code that the original poster reads from the browser's picture and then submits through HttpClient is therefore almost certain to fail.

Solutions:
1. Recognize the code programmatically: improve the recognition algorithm to raise the hit rate;
2. Write a GUI that displays the verification code picture on screen, then type in the corresponding code. Drop the browser.
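
Option 2 could look roughly like the sketch below. It assumes the image bytes have already been fetched through the same HttpClient session that will submit the code; the class and method names are made up for illustration, and Swing's `JOptionPane` is just one simple way to show a picture with an input field.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;
import javax.swing.ImageIcon;
import javax.swing.JLabel;
import javax.swing.JOptionPane;

class CaptchaPrompt {
    // Decode raw image bytes; returns null if no installed reader understands the format.
    static BufferedImage decode(byte[] imageBytes) {
        try {
            return ImageIO.read(new ByteArrayInputStream(imageBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Show the picture in a modal dialog; returns the typed code, or null if cancelled.
    static String promptForCode(BufferedImage image) {
        return JOptionPane.showInputDialog(
                null, new JLabel(new ImageIcon(image)),
                "Enter verification code", JOptionPane.QUESTION_MESSAGE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: CaptchaPrompt <imageFile>");
            return;
        }
        // For a quick try-out the bytes come from a file; in the crawler they would
        // come from the response body of the captcha request.
        BufferedImage image = decode(Files.readAllBytes(Path.of(args[0])));
        if (image == null) {
            System.out.println("Unrecognized image format");
            return;
        }
        System.out.println("Typed code: " + promptForCode(image));
    }
}
```

Because the dialog only displays bytes the program already holds, no second request is made and the code stored in the session stays valid until it is submitted.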
------ For reference only ----------------------------------

eimhee, hello. I understand that the code is saved in the session, but everything HttpClient returns is HTML. What I am stuck on now is how to get the captcha image through HttpClient on that first visit.
------ For reference only ----------------------------------


preferme, hello. Thank you for pointing out my mistake. I have solved the problem, thanks.
