Simple web data extractor in PHP (cURL)
$30-250 USD
Paid on delivery
We would like a PHP file that extracts data from websites. I suppose you should use cURL, but I'm not an expert.
Steps for the PHP script:
1- The script must connect to a remote MySQL server configured in a [login to view URL]
2- The script must find the first unprocessed record (lastchange=null) and lock it (to prevent it from being used by another PHP process)
3- The script must do the work described below
4- The script must write the results to the MySQL table and unlock the record.
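The find, lock, write and unlock steps above could be sketched as follows. This is only one possible approach, under assumptions the listing does not state: the connection is a PDO handle in exception mode, the table is InnoDB (as in the schema), and the "lock" is an InnoDB row lock taken with SELECT ... FOR UPDATE and held until the results are committed. The function names claimNextUrl and saveAndRelease are made up for illustration.

```php
<?php
// Sketch only. Assumes $db is a PDO connection to MySQL with
// PDO::ATTR_ERRMODE set to PDO::ERRMODE_EXCEPTION.

/**
 * Start a transaction and lock the first row whose lastchange is NULL.
 * Returns the row, or null when there is nothing left to process.
 * The row lock is held until saveAndRelease() commits.
 */
function claimNextUrl(PDO $db): ?array
{
    $db->beginTransaction();
    $stmt = $db->query(
        'SELECT codigo, url FROM url
         WHERE lastchange IS NULL
         ORDER BY codigo LIMIT 1 FOR UPDATE'
    );
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        $db->rollBack(); // nothing to lock, end the transaction
        return null;
    }
    return $row;
}

/**
 * Write the extracted values, stamp lastchange, and commit,
 * which also releases the row lock taken by claimNextUrl().
 */
function saveAndRelease(PDO $db, int $codigo, array $data): void
{
    $stmt = $db->prepare(
        'UPDATE url
         SET Title = :title, description = :description,
             keywords = :keywords, telefono = :telefono, fax = :fax,
             lastchange = NOW()
         WHERE codigo = :codigo'
    );
    $stmt->execute([
        ':title'       => $data['title'] ?? null,
        ':description' => $data['description'] ?? null,
        ':keywords'    => $data['keywords'] ?? null,
        ':telefono'    => $data['telefono'] ?? '', // column is NOT NULL
        ':fax'         => $data['fax'] ?? null,
        ':codigo'      => $codigo,
    ]);
    $db->commit();
}
```

Holding a transaction open while pages are fetched keeps the schema unchanged, but a slow site can make other workers wait on the lock. A separate "locked" column would avoid that, at the cost of a schema change the listing does not mention.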
Process for the PHP script:
1- Visit a URL and extract from the home page the meta tags `Title` + `description` + `keywords` and `date_of_html`
2- Spider the first 10 links found on the home page (internal links only, not external) to extract: emails + phones + fax
Once the extracted info has been organized, update the MySQL record, unlock the used record and start on a new URL from the table.
All outbound links and extra emails found in the process will be added to Mysql2.
To avoid exhausting resources, the script should free memory (or something similar) after each process.
If no records are found, an alert should be sent by email to an administrator so that new records can be added to the MySQL table.
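As a rough illustration of the "emails + phones + fax" extraction, the sketch below uses regular expressions on the page text. The patterns are assumptions, not part of the listing: they are deliberately loose, will miss some formats and accept some false positives, and a number is only treated as a fax when the word "fax" appears right before it. The function names are hypothetical.

```php
<?php
// Sketch only: heuristic patterns, not a validated parser.

/** Return unique, lower-cased email addresses found in $text. */
function extractEmails(string $text): array
{
    preg_match_all('/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/', $text, $m);
    return array_values(array_unique(array_map('strtolower', $m[0])));
}

/** Return unique phone-like strings containing 7 to 15 digits. */
function extractPhones(string $text): array
{
    // Optional "+", then digits mixed with spaces, dots, dashes or parentheses.
    preg_match_all('/\+?\d[\d\s().-]{5,}\d/', $text, $m);
    $phones = [];
    foreach ($m[0] as $candidate) {
        $digits = preg_replace('/\D/', '', $candidate);
        if (strlen($digits) >= 7 && strlen($digits) <= 15) {
            $phones[] = trim($candidate);
        }
    }
    return array_values(array_unique($phones));
}

/** Return unique numbers that directly follow the label "fax". */
function extractFaxes(string $text): array
{
    preg_match_all('/fax\s*[:.]?\s*(\+?\d[\d\s().-]{5,}\d)/i', $text, $m);
    return array_values(array_unique(array_map('trim', $m[1])));
}
```

Because extractPhones also returns the fax numbers, a caller would probably subtract the result of extractFaxes before filling the `telefono` column.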
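Separating internal links (to spider) from outbound links (to store in Mysql2) might look like the sketch below. It assumes PHP's DOM extension is available, treats a link as internal only when its host matches the start URL's host exactly (so `www.` variants count as external), and resolves relative links naively against the site root. The function name splitLinks is hypothetical.

```php
<?php
// Sketch only: simplified URL handling, see the assumptions above.

/**
 * Split the <a href> targets of $html into internal and external links.
 * At most $maxInternal internal links are returned.
 */
function splitLinks(string $html, string $baseUrl, int $maxInternal = 10): array
{
    $internal = [];
    $external = [];
    if ($html === '') {
        return ['internal' => $internal, 'external' => $external];
    }

    $baseHost = strtolower((string) parse_url($baseUrl, PHP_URL_HOST));

    // Real-world HTML is often malformed; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href === '' || $href[0] === '#'
            || preg_match('/^(mailto|tel|javascript):/i', $href)) {
            continue; // not a page to visit
        }
        $host = parse_url($href, PHP_URL_HOST);
        if ($host === null || $host === false) {
            // Relative link: naive join with the site root.
            $internal[] = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
        } elseif (strtolower($host) === $baseHost) {
            $internal[] = $href;
        } else {
            $external[] = $href;
        }
    }

    return [
        'internal' => array_slice(array_values(array_unique($internal)), 0, $maxInternal),
        'external' => array_values(array_unique($external)),
    ];
}
```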
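The overall loop (take a record, process it, free memory, alert when the table is empty) could be arranged roughly as below. The three callables and the runWorker name are placeholders introduced for this sketch, and the cap on jobs per run is an assumption added so that a long-running process exits and restarts now and then.

```php
<?php
// Sketch only: $claim, $process and $alert are supplied by the caller.

/**
 * Process records until none are left or $maxJobs is reached.
 * $claim returns the next record or null, $process handles one record,
 * $alert receives a message when no records remain.
 * Returns the number of records processed.
 */
function runWorker(callable $claim, callable $process, callable $alert, int $maxJobs = 100): int
{
    $done = 0;
    while ($done < $maxJobs) {
        $job = $claim();
        if ($job === null) {
            $alert('No unprocessed URLs left; please add new records.');
            break;
        }
        $process($job);
        $done++;

        // Drop references and collect cycles so memory does not grow per record.
        unset($job);
        gc_collect_cycles();
    }
    return $done;
}
```

For the alert, PHP's built-in mail() function could be wrapped in a closure, for example `fn (string $msg) => mail('admin@example.com', 'URL table is empty', $msg)`, where the address is a placeholder.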
--> Mysql1 (table for URLs), as in the attached .sql <--
CREATE TABLE `url` (
`codigo` int(11) NOT NULL auto_increment,
`email` varchar(50) default NULL,
`origendeldato` varchar(30) default NULL,
`url` varchar(50) NOT NULL,
`Title` varchar(250) default NULL,
`description` varchar(250) default NULL,
`keywords` varchar(250) default NULL,
`telefono` varchar(20) NOT NULL,
`fax` varchar(20) default NULL,
`pais` char(2) default NULL,
`empresa` varchar(50) default NULL,
`nombre` varchar(50) default NULL,
`rubro` varchar(20) default NULL,
`lastchange` timestamp NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (`codigo`),
UNIQUE KEY `base` (`base`),
UNIQUE KEY `email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
--> Mysql2 for extra emails <--
`email` varchar(50) default NULL,
`url` varchar(50) NOT NULL,
When I said `Title`+`description`+`keywords`, I meant the HTML meta tags.
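Reading those three values from a page's HTML could be done along these lines. The sketch assumes the DOM extension is available and that the HTML has already been downloaded (for example with cURL) into a string. `date_of_html` is left out because the listing does not say where it should come from. The function name extractMeta is hypothetical.

```php
<?php
// Sketch only: no character-set handling beyond DOMDocument's defaults.

/**
 * Return the <title> text and the content of the description and
 * keywords meta tags, or null for any value that is missing.
 */
function extractMeta(string $html): array
{
    $result = ['title' => null, 'description' => null, 'keywords' => null];
    if ($html === '') {
        return $result;
    }

    // Real-world HTML is often malformed; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // The name attribute is matched case-insensitively.
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'description' || $name === 'keywords') {
            $result[$name] = trim($meta->getAttribute('content'));
        }
    }
    return $result;
}
```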
Project ID: #1004781
About the project
Awarded to:
We can help with your project. Please check PMB and our ratings/reviews to get an idea of our experience.