Project Title: Data Mining from SAFE Mortgage Licensing Site

Project Description:

I am looking for a solution for extracting and verifying information from the SAFE mortgage licensing site (http://www.nmlsconsumeraccess.org/). The NMLS site provides some VERY detailed information about every licensed (or registered) person in the US who is authorized to talk with a consumer about a residential mortgage. But their search system is VERY limited. It fails when a search returns more than 500 hits, and there is no way to list or browse through the entire database. In total there are somewhere around 250,000 records in the database.

Right now the search on the site is pretty much useless unless you know the name of the person and you want to verify their license. What I am looking for is someone to SCRAPE the information out of this database so that I can build a database that actually CAN be searched no matter how many hits are generated by the search term. Specifically, I would like to be able to generate lists of all the lenders registered in the system and, under each lender, lists of all its employees that are licensed/registered.
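As a rough illustration of the target database, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (companies, individuals, employment, reg_id) are assumptions made up for this example and are not taken from the NMLS site; the real field list would have to come from inspecting the pages.

```python
import sqlite3


def build_db(path=":memory:"):
    """Create a small lender/employee schema (hypothetical table names)."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS companies (
            reg_id INTEGER PRIMARY KEY,   -- unique registration number
            name   TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS individuals (
            reg_id INTEGER PRIMARY KEY,   -- unique registration number
            name   TEXT NOT NULL
        );
        -- One row per employment period, so history is kept as well.
        CREATE TABLE IF NOT EXISTS employment (
            individual_id INTEGER NOT NULL REFERENCES individuals(reg_id),
            company_id    INTEGER NOT NULL REFERENCES companies(reg_id),
            start_date    TEXT,
            end_date      TEXT            -- NULL while still employed
        );
        """
    )
    return conn


def employees_of(conn, company_id):
    """Return (reg_id, name) for everyone linked to a company, sorted by name."""
    return conn.execute(
        """
        SELECT DISTINCT i.reg_id, i.name
        FROM individuals AS i
        JOIN employment AS e ON e.individual_id = i.reg_id
        WHERE e.company_id = ?
        ORDER BY i.name
        """,
        (company_id,),
    ).fetchall()
```

Because the listing is an ordinary SQL query against a local file, it has no result cap, so a lender with thousands of employees lists just as easily as one with ten.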

Please contact me and tell me what you think you might be able to do to assist with this project.

I am looking to find out WHAT information you can capture from that site for me. The link I provided goes to the main site. You need to enter a name to search for, and then you will need to complete a captcha to gain access to the search area. Once past that, you will be able to see the layout of the pages and the fields of information that are available. There are SEVERAL levels to drill down through to reach the information about each individual person/company.

There is a UNIQUE registration number for each person and company within the system, so that is the FIRST field, along with the name of the person/company, that I need to capture. The problem is that you cannot do a search for anything in this stupid system that generates MORE THAN 500 results. If it does, it does NOT give you the first 500 results; it fails and gives NOTHING. I think the best strategy to mine the data is to generate a list of names and ID numbers of all the COMPANIES in the database, and then start to pull up the data about individuals. The problem is that when a company has more than 500 employees, you cannot get a list of them simply by searching using the company name.
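One possible way around the 500-result limit is to split any search that fails into narrower searches until each one fits under the cap. The sketch below shows that idea only. It assumes a hypothetical search callable that takes a name prefix and returns a list of (registration number, name) pairs, or None when the query has too many hits. How such a callable would talk to the real site, and whether the site matches names by prefix at all, is not known from this posting.

```python
import string
from typing import Callable, Dict, List, Optional, Tuple

Record = Tuple[int, str]  # (registration number, name)
# Hypothetical: returns None when the query has too many hits to be answered.
SearchFn = Callable[[str], Optional[List[Record]]]

DEFAULT_ALPHABET = string.ascii_lowercase + string.digits + " "


def enumerate_by_prefix(
    search: SearchFn,
    prefix: str = "",
    alphabet: str = DEFAULT_ALPHABET,
    max_depth: int = 6,
) -> Dict[int, str]:
    """Collect records by lengthening the prefix whenever a search overflows."""
    # An empty prefix is treated as an overflow, since a name must be entered.
    results = search(prefix) if prefix else None
    if results is not None:
        return {reg_id: name for reg_id, name in results}

    if len(prefix) >= max_depth:
        raise RuntimeError(f"prefix {prefix!r} still overflows at max depth")

    found: Dict[int, str] = {}
    for ch in alphabet:
        # Keying on the registration number removes duplicates across searches.
        found.update(enumerate_by_prefix(search, prefix + ch, alphabet, max_depth))
    return found
```

Only prefixes that overflow are split further, so the number of searches grows with how crowded a prefix is rather than with the full alphabet at every depth. A name that is exactly equal to an overflowing prefix, or that contains characters outside the chosen alphabet, would be missed by this sketch and would need separate handling.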

The database contains a MASSIVE amount of information about each individual AND their employment history. I want to capture all of that. Please take a look at the site and tell me how you would approach this project, and how long it will take you to scrape all the data out of a record from this database.

For similar work requirements, feel free to email us at info@webscrapingexpert.com